Following the unveiling of its seventh-generation TPU, code-named "Ironwood," at this year's Next '25 conference, Google Cloud has announced that the chip, touted as its highest-performing TPU yet and designed to accelerate AI "thinking," will be generally available in the coming weeks. It will support large-scale model training and high-volume, low-latency AI inference, while also meeting the massive computing demands driven by agentic AI workflows.
Google says Ironwood delivers more than four times the training and inference performance of its predecessor, the sixth-generation TPU code-named "Trillium." It is the company's most powerful and energy-efficient custom chip to date, built to accelerate the "thinking" and proactive insights of AI models, thereby letting more AI agent services run faster.
Google Cloud also confirmed that it recently signed a multi-year agreement with AI partner Anthropic worth tens of billions of dollars. The deal covers the newly launched Ironwood and will provide up to one million TPUs to support the training and serving of Anthropic's Claude family of models.
Aiming at the future: Ironwood introduces 9.6 Tb/s chip-to-chip interconnect speeds and 1.77 PB of shared HBM
Google has previously explained that an Ironwood pod consists of 9,216 liquid-cooled chips linked through its inter-chip interconnect (ICI), delivering 42.5 ExaFLOPS of compute, roughly 24 times that of El Capitan, the world's largest supercomputer, and able to handle the largest-scale parallel AI workloads. Each chip offers a peak of 4,614 TFLOPS.
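As a quick check, the pod-level figure follows directly from the per-chip peak; a minimal Python sketch using only the numbers Google quotes:

```python
# Sanity-check the pod math from the quoted figures.
# 1 TFLOPS = 1e12 FLOPS, 1 ExaFLOPS = 1e18 FLOPS.
chips_per_pod = 9216
peak_per_chip_tflops = 4614

pod_exaflops = chips_per_pod * peak_per_chip_tflops * 1e12 / 1e18
print(f"{pod_exaflops:.1f} ExaFLOPS")  # -> 42.5 ExaFLOPS, matching Google's figure
```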
Google emphasizes that the TPU is the core of its "AI Hypercomputer" integrated supercomputing system. The newly launched Ironwood offers striking scalability and system-level performance: a chip-to-chip interconnect running at up to 9.6 Tb/s eliminates the data bottlenecks of conventional configurations, allowing thousands of chips to work together like a single brain.
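For a sense of what "working together like a single brain" looks like at the programming level, here is a minimal JAX sketch (JAX being Google's usual framework for TPUs) that shards one logical array across every visible chip. The array shapes and axis name are illustrative assumptions; on a TPU pod slice, the cross-chip traffic rides ICI transparently:

```python
import numpy as np
import jax
import jax.numpy as jnp
from jax.sharding import Mesh, NamedSharding, PartitionSpec as P

devices = np.array(jax.devices())            # one entry per visible TPU chip
mesh = Mesh(devices, axis_names=("chips",))  # 1-D mesh spanning all chips
sharding = NamedSharding(mesh, P("chips"))   # split dim 0 across the mesh

# One logical tensor physically spread over every chip in the slice.
x = jax.device_put(jnp.ones((len(devices) * 1024, 8192)), sharding)

@jax.jit
def total_energy(x):
    # The reduction spans all shards; XLA inserts the cross-chip
    # collectives, which run over the inter-chip interconnect (ICI).
    return jnp.sum(x * x)

print(total_energy(x))
```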
• Massive shared memory:
This scale gives the pod shared access to up to 1.77 petabytes (PB) of high-bandwidth memory (HBM). Google likens it to a record-setting "shared workspace" for an AI super-brain: even the largest AI models can be loaded in full, significantly improving computational efficiency and reducing total cost of ownership (TCO). (A back-of-the-envelope sketch of the per-chip implication follows this list.)
• High reliability (OCS):
Optical Circuit Switching (OCS) acts as a dynamic fabric that reroutes traffic the moment an interruption is detected, keeping critical AI services running uninterrupted and providing the highest level of operational resilience.
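As referenced above, the pod-level memory figure also implies a per-chip number. The short Python sketch below infers it; the per-chip split is our inference from the quoted totals, not a figure Google states here:

```python
# Infer per-chip HBM from the quoted pod totals (decimal units: 1 PB = 1e6 GB).
chips_per_pod = 9216
pod_hbm_pb = 1.77            # shared HBM across one Ironwood pod

per_chip_gb = pod_hbm_pb * 1e6 / chips_per_pod
print(f"~{per_chip_gb:.0f} GB of HBM per chip")  # -> ~192 GB
```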
Supplementary information from Google adds that an Ironwood pod delivers 118 times the FP8 ExaFLOPS of "the next closest competitor," underscoring its strength in AI-specific compute.
Anthropic's million-unit TPU order solidifies Google's position in AI infrastructure.
Anthropic's commitment to procure up to one million TPUs from Google Cloud is undoubtedly a significant endorsement of Google's AI infrastructure.
Google points out that its own models, including Gemini, Veo, and Imagen, as well as Anthropic's Claude, are all trained and served on TPUs.
This collaboration also echoes Google Cloud's earnings report released last week, which emphasized the unprecedented demand for AI infrastructure (especially TPUs) as one of its main growth drivers.
Axion CPUs updated in tandem; N4A instances enter preview
Alongside its dedicated AI accelerator, Google emphasized that agentic AI workflows require close collaboration between general-purpose CPUs and AI accelerators, and accordingly updated its Arm-based Axion CPU product line:
• N4A instances (built on the Axion CPU):
N4A, the fourth generation of Google's N-series VMs (virtual machines) and the first built on the new Axion CPU, is currently in preview (see the provisioning sketch after this list).
• Efficiency:
N4A is touted as offering up to twice the price-performance of comparable current-generation x86-based VMs, along with an 80% improvement in performance per watt.
• C4A metal (bare metal):
C4A metal, the first bare-metal instance built on the Axion processor, will be available in preview soon.
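As referenced above, here is a minimal sketch of what provisioning an N4A VM might look like with the Compute Engine Python client. The machine type name "n4a-standard-4", the project, and the zone are assumptions for illustration; check the N4A preview documentation for the actual supported values:

```python
from google.cloud import compute_v1

project, zone = "my-project", "us-central1-a"   # placeholders

instance = compute_v1.Instance(
    name="axion-n4a-demo",
    # "n4a-standard-4" is an assumed machine type name for this sketch.
    machine_type=f"zones/{zone}/machineTypes/n4a-standard-4",
    disks=[compute_v1.AttachedDisk(
        boot=True,
        auto_delete=True,
        initialize_params=compute_v1.AttachedDiskInitializeParams(
            source_image="projects/debian-cloud/global/images/family/debian-12",
        ),
    )],
    network_interfaces=[compute_v1.NetworkInterface(network="global/networks/default")],
)

# insert() returns a long-running operation; real code should wait on it.
compute_v1.InstancesClient().insert(
    project=project, zone=zone, instance_resource=instance
)
```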
The long-term strategy of "system-level co-design"
Google attributes the success of its AI infrastructure to a long-term strategy of "system-level co-design," meaning model research, software, and hardware development all happen under one roof. From building the first TPU a decade ago, to the birth of the Transformer architecture eight years ago, to today's gigawatt-scale deployments of advanced liquid cooling running at 99.999% uptime, everything reflects this sustained strategy.