Google unveiled its 6th-generation TPU, codenamed "Trillium," at Google I/O 2024. It has since announced the 7th-generation TPU, codenamed "Ironwood," at the Cloud Next '25 conference, billing it as its highest-performing TPU yet, designed to accelerate artificial intelligence "thinking."
Unlike previous TPU generations, which were designed primarily for training, Google emphasizes that "Ironwood" is built first and foremost for inference: it is not only the highest-performing TPU the company has launched, but also its most power-efficient design. It can accelerate the "thinking" of AI models and proactively generate insights, allowing more AI agent services to run faster.
A full "Ironwood" pod comprises 9,216 liquid-cooled chips linked through Google's Inter-Chip Interconnect (ICI). At that scale it delivers 42.5 exaflops of compute, roughly 24 times that of El Capitan, the world's largest supercomputer, and can handle the largest-scale parallel AI workloads; each chip reaches a peak of 4,614 TFLOPS.
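As a quick sanity check on the figures above (assuming the quoted per-chip peak of 4,614 TFLOPS applies uniformly across the pod), the pod-level number follows directly from the chip count:

```python
# Back-of-the-envelope check of the Ironwood pod figures quoted above.
chips_per_pod = 9216          # chips in a full Ironwood pod, as quoted
peak_tflops_per_chip = 4614   # peak TFLOPS per chip, as quoted

# 1 EFLOPS = 1,000,000 TFLOPS
pod_exaflops = chips_per_pod * peak_tflops_per_chip / 1_000_000
print(f"{pod_exaflops:.1f} EFLOPS")  # -> 42.5 EFLOPS, matching the quoted pod figure
```

The two quoted numbers are consistent with each other: 9,216 × 4,614 TFLOPS ≈ 42.5 exaflops.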
In addition to the 9,216-chip liquid-cooled configuration, Ironwood is also offered in a 256-chip liquid-cooled configuration to suit different computing scales.
Google also stressed that Ironwood's memory and network architecture safeguards the accuracy of computed data, and that an enhanced SparseCore accelerator handles advanced ranking and recommendation workloads, extending its use to larger-scale AI models and to scientific and financial data processing.
The Pathways AI runtime developed by the Google DeepMind team makes it easier for developers to harness Ironwood's compute, even combining hundreds of thousands of "Ironwood" chips across pods to drive AI workloads with even greater aggregate computing power.
Compared to the 6th-generation TPU, "Trillium," launched last year, "Ironwood" delivers a 2x improvement in performance per watt, meaning it can provide more AI compute within the same power envelope. Combined with further refinements to the chip design and liquid-cooling solution, it can sustain higher AI workload performance. It is also nearly 30 times more energy-efficient than the first-generation Cloud TPU launched in 2018.
Among other specifications, each "Ironwood" chip is equipped with 192 GB of high-bandwidth memory (HBM), six times that of "Trillium." This allows it to handle larger AI models and datasets while reducing frequent data transfers, further improving execution efficiency.
HBM bandwidth has likewise risen, reaching 7.2 Tbps per "Ironwood" chip, 4.5 times that of "Trillium." Meanwhile, the Inter-Chip Interconnect design lifts bidirectional chip-to-chip bandwidth to 1.2 Tbps, 1.5 times that of "Trillium," improving the efficiency of large-scale distributed training and inference.
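The quoted multiples over Trillium can be checked the same way. Note that the Trillium baselines below are *implied* by the ratios given in the text, not stated in it:

```python
# Working backwards from the quoted Ironwood figures and their multiples over Trillium.
ironwood_hbm_gb = 192        # HBM capacity per chip, quoted as 6x Trillium
ironwood_hbm_bw_tbps = 7.2   # HBM bandwidth, quoted as 4.5x Trillium
ironwood_ici_tbps = 1.2      # bidirectional ICI bandwidth, quoted as 1.5x Trillium

# Implied Trillium baselines (assumption: the multiples are exact)
print(round(ironwood_hbm_gb / 6, 2))        # implied Trillium HBM capacity in GB
print(round(ironwood_hbm_bw_tbps / 4.5, 2)) # implied Trillium HBM bandwidth in Tbps
print(round(ironwood_ici_tbps / 1.5, 2))    # implied Trillium ICI bandwidth in Tbps
```

The implied baselines (32 GB HBM, 1.6 Tbps HBM bandwidth, 0.8 Tbps ICI) are internally consistent with the article's per-chip Ironwood figures.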
Google expects to officially deploy Ironwood by the end of the year to meet demand for more intensive computing and the growing market for artificial intelligence. Google also confirmed that the recently announced Gemini 2.5 model and the new version of AlphaFold, its protein-structure prediction program, will run on the Ironwood acceleration architecture.
Google also announced that its supercomputing infrastructure built around "Ironwood" will be able to support almost all AI workloads at better cost-effectiveness. For example, Gemini 2.0 Flash delivers roughly 24 times the performance per dollar of OpenAI's GPT-4o and about five times that of DeepSeek-R1.
In addition to providing supercomputing resources based on TPUs, Google announced a partnership with NVIDIA at the recent GTC 2025 event to add NVIDIA B200 and GB200 NVL72 GPU options to its A4 and A4X VM instances, respectively. These will be combined with new 400 Gbps Cloud Interconnect and Cross-Cloud Interconnect designs to increase transmission bandwidth from on-premises environments or other cloud platforms into Google Cloud services.
In a subsequent update, Google also emphasized that it will continue to cooperate with NVIDIA and expects to bring NVIDIA's next-generation CPU, codenamed "Vera," and graphics-architecture GPU, codenamed "Rubin," into Google Cloud services to provide more accelerated-computing resources.