AWS has announced the launch of Amazon EC2 Trn3 UltraServers, built around Trainium3, its in-house AI chip manufactured on a 3nm process, aiming to address the cost and computing-power bottlenecks currently facing AI model training and inference. Official figures put its computing performance at up to 4.4 times that of the previous-generation design, with 40% higher energy efficiency. Through more competitive pricing, AWS also aims to make the infrastructure needed to train large-scale AI models affordable for more enterprises.

A single unit packs 144 Trainium3 chips, with inter-chip latency below 10 microseconds.
The core of the Amazon EC2 Trn3 UltraServer, which officially launched today, lies in its highly integrated architecture: a single system accommodates up to 144 Trainium3 chips and delivers up to 362 PFLOPs of FP8 AI computing power.
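For a sense of per-chip scale, the two official figures above imply the following back-of-the-envelope throughput per chip (a simple division of the quoted numbers, not a separately published spec):

```python
# Back-of-the-envelope math using only the figures AWS quotes above.
chips_per_ultraserver = 144      # Trainium3 chips in a single Trn3 UltraServer
total_fp8_pflops = 362           # aggregate FP8 compute per UltraServer (PFLOPs)

per_chip_pflops = total_fp8_pflops / chips_per_ultraserver
print(f"Implied FP8 compute per Trainium3 chip: ~{per_chip_pflops:.2f} PFLOPs")
# -> roughly 2.51 PFLOPs of FP8 per chip
```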


To address communication bottlenecks in distributed computing, AWS has introduced the new NeuronSwitch-v1 along with enhanced Neuron Fabric networking technology, reducing inter-chip communication latency to below 10 microseconds (μs). This matters especially for agentic AI and mixture-of-experts (MoE) models, which move large volumes of data between chips.
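To illustrate why that latency figure matters for MoE workloads: each MoE layer routes token activations to experts that may sit on other chips, so per-token latency accumulates one or more network hops per layer. The sketch below uses assumed workload numbers; the layer count and hops per layer are illustrative, not AWS figures:

```python
# Illustrative estimate of how inter-chip latency compounds in an MoE model.
hop_latency_us = 10      # AWS's quoted upper bound for chip-to-chip latency
moe_layers = 60          # assumed: number of MoE layers in a large model
hops_per_layer = 2       # assumed: dispatch tokens to experts, then gather

comm_floor_us = moe_layers * hops_per_layer * hop_latency_us
print(f"Communication floor per token: ~{comm_floor_us} us "
      f"(~{comm_floor_us / 1000:.1f} ms)")
# -> ~1200 us (1.2 ms); lower hop latency shrinks this floor proportionally
```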
AWS further points out that, through the EC2 UltraClusters 3.0 design, customers can connect thousands of UltraServers, scaling them into a massive computing cluster of as many as a million Trainium chips, 10 times the size of the previous generation.
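As a rough cross-check of that scaling claim (the server count below is an assumption standing in for "thousands"):

```python
# Rough cross-check of the UltraClusters 3.0 scale claim.
chips_per_ultraserver = 144
ultraservers = 7_000     # assumed: "thousands" of connected UltraServers

total_chips = ultraservers * chips_per_ultraserver
print(f"{ultraservers:,} UltraServers -> ~{total_chips:,} Trainium chips")
# -> 7,000 UltraServers -> ~1,008,000 chips, i.e. on the order of a million
```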


Decart and Anthropic are among the first adopters, cutting costs by up to 50%.
As real-world examples, AWS cited results from several partner customers.
Decart, which specializes in generative AI video, said that using Trainium3 for real-time video generation delivered inference speeds up to four times faster at only half the cost of GPU-accelerated computing. Customers such as Anthropic, Karakuri, and Ricoh have likewise cut training and inference costs by as much as 50% with Trainium chips.
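Taken together, Decart's two figures imply a price-performance multiple (a simple derivation from the quoted numbers, not an additional claim from AWS or Decart):

```python
# Implied price-performance from Decart's quoted figures (upper bounds).
speedup = 4.0            # inference up to 4x faster than the GPU baseline
relative_cost = 0.5      # at half the cost of GPU-accelerated compute

perf_per_dollar = speedup / relative_cost
print(f"Implied throughput per dollar vs. GPU baseline: ~{perf_per_dollar:.0f}x")
# -> ~8x at the top of the quoted ranges
```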
Separately, AWS shared an update on "Project Rainier," its collaboration with Anthropic. According to the latest update, the project has brought more than 500,000 Trainium2 chips online, making it one of the world's largest AI computing clusters, with five times the compute Anthropic used to train its previous generation of models.

The announcement also indicates that Trainium4 will support NVIDIA NVLink Fusion interconnect technology, breaking down the barriers between camps.
Looking ahead, AWS confirmed that it is developing the next-generation Trainium4 chip, expected to offer a 6x increase in performance (for FP4 operations), a 4x increase in memory bandwidth, and a 2x increase in memory capacity. It also stated that the chip will support NVIDIA NVLink Fusion high-speed interconnect technology.
This means that, in the future, Trainium4 and Graviton processors will be able to work seamlessly alongside NVIDIA GPUs in a common MGX rack, blurring the once-clear boundary between self-developed chips and the GPU camp and giving customers more flexible hybrid-architecture options.
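One observation worth drawing out of those multipliers: compute is slated to grow faster than memory bandwidth, so workloads will need higher arithmetic intensity (more FLOPs per byte moved) to stay compute-bound on Trainium4. A minimal calculation, assuming the quoted multipliers share the same Trainium3 baseline:

```python
# Ratio of compute growth to memory-bandwidth growth, per the quoted figures.
compute_gain = 6.0       # FP4 performance, Trainium4 vs. Trainium3
bandwidth_gain = 4.0     # memory bandwidth, Trainium4 vs. Trainium3

intensity_shift = compute_gain / bandwidth_gain
print(f"FLOPs available per byte moved grows ~{intensity_shift:.1f}x")
# -> ~1.5x: kernels need ~1.5x more arithmetic intensity to stay compute-bound
```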