Mohan Pichika, a product manager at Google Cloud, elaborated on the sixth-generation TPU, code-named Trillium, and emphasized that Google Cloud currently offers a diverse range of accelerated computing options to meet different customers' service deployment needs.

TPU is just one part of Google Cloud's accelerated computing
Asked whether, given the market's current pursuit of artificial general intelligence (AGI), a single unified accelerator could support all computing needs, Mohan Pichika believes that this ideal is not yet achievable. The key, therefore, is to provide the appropriate accelerator for each computing need.
For example, in addition to providing its own TPU accelerators, Google Cloud cooperates with processor manufacturers such as Intel, AMD, and NVIDIA, and will also offer Axion, a customized processor built on the Arm Neoverse architecture, thereby covering the differing computing deployment needs of different customers.
The first-generation TPU, introduced in 2015, was designed primarily to accelerate large-scale computation for Google's own services, such as YouTube and Google Search. Since the second-generation TPU, launched in 2017, Google has made the TPU and subsequent iterations available to Google Cloud customers worldwide (including Taiwan). This includes the sixth-generation TPU, code-named Trillium, launched last year; while availability may vary by region, it is generally open to all Google Cloud customers.
However, Mohan Pichika also explained that each accelerator is optimized for different workloads. Weighing performance against price, Google Cloud therefore continues to offer a range of accelerator options, letting each customer choose the component best suited to its services and to the scale of its AI models.
During last year's Google Cloud Next '24 keynote, Mark Lohmeyer, Vice President of Google Cloud and General Manager of Compute and AI/ML Infrastructure, revealed that Google may plan to launch more customized processors in the future, meaning processors with differentiated functional designs tailored to different computing needs.
TPU itself aims to accelerate large-scale computing and improve machine learning efficiency.
The TPU itself is primarily designed to accelerate large-scale computing and improve machine learning efficiency. In the sixth-generation TPU, this includes an enhanced matrix multiplication unit (MXU), a higher operating clock speed, and greater HBM memory capacity, all aimed at boosting computing efficiency. Furthermore, the Jupiter network architecture enables larger Pod computing scales (up to 256 chips per Pod) that can be expanded further to supercomputing scale, allowing all TPUs to interconnect and operate at petabit-per-second transmission speeds.
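To illustrate how software exploits a Pod's chip-to-chip interconnect, here is a minimal sketch in JAX, Google's library for TPU programming. The mesh simply spans whatever devices are available (TPU chips in a Pod, or a single CPU device when run locally for demonstration); the matrix shapes are illustrative assumptions, not figures from the article.

```python
import numpy as np
import jax
import jax.numpy as jnp
from jax.sharding import Mesh, NamedSharding, PartitionSpec as P

# Build a 1-D device mesh over all available devices. On a TPU Pod this
# would cover the chips linked by the inter-chip interconnect; on a
# laptop it falls back to a single CPU device.
devices = np.array(jax.devices())
mesh = Mesh(devices, axis_names=("x",))

# Shard the left operand's rows across the mesh axis. Each chip's MXU
# computes its local tile of the matmul; the interconnect carries any
# data exchanged between chips.
a = jnp.ones((8, 512), dtype=jnp.float32)
b = jnp.ones((512, 256), dtype=jnp.float32)
a_sharded = jax.device_put(a, NamedSharding(mesh, P("x", None)))

@jax.jit
def matmul(x, y):
    return x @ y

out = matmul(a_sharded, b)
print(out.shape)  # (8, 256); each entry is the sum of 512 ones = 512.0
```

The same program scales transparently: on a 256-chip Pod, `jax.devices()` returns all chips, and the sharding annotation distributes the work without code changes.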
However, Google did not specify whether the sixth-generation TPU's use of customized optical communication technology for direct chip-to-chip connections, which increases data transmission bandwidth, resembles NVIDIA's NVLink or the UltraFusion design Apple uses in its Apple Silicon processors.
Based on the current sixth-generation TPU design, training performance is more than 4 times that of the previous generation, peak inference throughput is 3 times higher, energy efficiency improves by 67%, and peak performance per chip is 4.7 times that of its predecessor. HBM memory capacity and inter-chip interconnect bandwidth have both doubled, and the Jupiter network architecture can connect up to 100,000 TPUs to form a massive computing network.
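A quick back-of-the-envelope check shows how the quoted per-chip multipliers compound at Pod and network scale. The baseline of 1.0 is an arbitrary normalization for illustration, not a published figure:

```python
# Normalize each 5th-generation metric to 1.0, then apply the
# multipliers quoted for the 6th-generation TPU (Trillium).
prev_gen = 1.0

trillium = {
    "training_perf": prev_gen * 4,       # >4x training performance
    "peak_per_chip": prev_gen * 4.7,     # 4.7x peak performance per chip
    "hbm_capacity": prev_gen * 2,        # HBM capacity doubled
    "interconnect_bw": prev_gen * 2,     # inter-chip bandwidth doubled
}

# Aggregate peak of a full Jupiter-connected deployment scales roughly
# linearly with chip count (assuming ideal scaling, which real
# workloads will not reach).
aggregate_peak = trillium["peak_per_chip"] * 100_000
print(round(aggregate_peak))  # 470000 (in normalized units)
```

The point of the arithmetic is that the headline gains multiply: a 4.7x faster chip in a 100,000-chip network yields roughly 470,000x the normalized single-chip baseline, before accounting for real-world scaling losses.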
In addition, the sixth-generation TPU delivers 2.5 times the training performance per dollar of the previous generation and 1.4 times the inference performance per dollar, which Google highlights as a significant improvement in sustainable performance.
There are currently no plans to follow up with local AI device solutions or small supercomputers.
At CES 2025, NVIDIA unveiled "Project DIGITS," an ultra-compact AI supercomputer built on its Blackwell-architecture GPU and a customized Arm-based processor, while Qualcomm has used its Cloud AI accelerators to build local AI device solutions. Is Google interested in launching a similar product?
Mohan Pichika said that Google currently has no such plan: Google is not a chip design and manufacturing company, and its focus is on meeting customers' cloud deployment needs. Providing TPU accelerators while cooperating with Intel, AMD, NVIDIA, and others on different computing components, he believes, makes it easier to meet those needs.