Following the Blackwell architecture, NVIDIA officially announced at CES 2026 that its next-generation AI computing platform, codenamed "Rubin", has already entered mass production. NVIDIA CEO Jensen Huang emphasized that the Rubin platform was created to meet the needs of next-generation AI factories, especially complex tasks such as agentic AI, mixture-of-experts (MoE) models, and long-context reasoning. Through what NVIDIA calls "extreme codesign", the Rubin platform can reduce the token generation cost of AI inference by as much as 10 times.
Six core chips: Vera CPU and Rubin GPU lead the way
The core of the Rubin platform consists of six brand-new chips, among which the most eye-catching are the Rubin GPU and Vera CPU. The Rubin GPU is built using TSMC's 3nm process and has a built-in third-generation Transformer Engine. Its NVFP4 AI inference performance reaches 50 PFLOPS, which is 5 times that of the previous generation Blackwell architecture, and its training performance is also improved by 3.5 times.
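NVFP4 is a 4-bit floating-point format. As a rough feel for how coarse 4-bit precision is, the sketch below rounds values onto the E2M1 value grid commonly used for FP4; this is an illustrative simplification, since the real NVFP4 format additionally applies per-block scaling factors that this sketch omits.

```python
# Minimal sketch of rounding to the 4-bit E2M1 value grid (the magnitudes
# representable in FP4). Real NVFP4 adds per-block scale factors; this only
# shows the coarse grid that scaled tensor values are snapped to.

E2M1_GRID = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]

def quantize_e2m1(x: float) -> float:
    """Round x to the nearest representable E2M1 magnitude, keeping its sign."""
    mag = min(E2M1_GRID, key=lambda v: abs(abs(x) - v))
    return mag if x >= 0 else -mag

print([quantize_e2m1(v) for v in [0.3, 1.2, 2.4, 5.2, -7.0]])
# → [0.5, 1.0, 2.0, 6.0, -6.0]
```

The small number of representable values is why low-bit inference depends on hardware support like the Transformer Engine to keep accuracy acceptable while multiplying throughput.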
The Vera CPU was designed to work with powerful GPUs. NVIDIA emphasizes that this is an Arm architecture CPU designed for AI inference, featuring 88 custom Olympus cores. Compared to the previous Grace CPU, the Vera CPU offers double the performance and boasts a memory transfer bandwidth of up to 1.2TB/s, enabling more efficient handling of large-scale data throughput.
High-speed interconnect: NVLink 6 and Spectrum-6
To enable these chips to work together, NVIDIA introduced the NVLink 6 Switch, providing up to 3.6TB/s of bandwidth per GPU, which is crucial for training large-scale MoE models. Network transmission is handled by the ConnectX-9 SuperNIC and Spectrum-6 Ethernet switches, which support end-to-end connection speeds of up to 800Gb/s, ensuring high-speed data flow within the AI factory.
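To get a feel for what 3.6TB/s per GPU means for MoE training, where expert activations are shuffled between GPUs at every step, here is a back-of-envelope sketch; the payload size is an assumed example for illustration, not an NVIDIA figure.

```python
# Back-of-envelope: time to move an assumed MoE activation payload over
# NVLink at a given per-GPU bandwidth. Illustrative numbers only.

def transfer_time_us(payload_bytes: int, bandwidth_tb_s: float) -> float:
    """Microseconds to move payload_bytes at bandwidth_tb_s terabytes/second."""
    return payload_bytes / (bandwidth_tb_s * 1e12) * 1e6

# Assume 8192 tokens x 8192 hidden dim x 2 bytes (FP16) routed per step (~134 MB).
payload = 8192 * 8192 * 2
print(f"{transfer_time_us(payload, 3.6):.1f} us at 3.6 TB/s")
print(f"{transfer_time_us(payload, 1.8):.1f} us at 1.8 TB/s")
```

Because expert routing happens on every layer of every step, halving this transfer time compounds directly into end-to-end training throughput.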
DGX SuperPOD: The Infrastructure of an AI Factory
Alongside the chip updates, NVIDIA's supercomputer architecture DGX SuperPOD has also received a Rubin platform version.
• DGX Vera Rubin NVL72: This is a rack-level solution for extreme performance, integrating 8 systems with 576 Rubin GPUs and 36 Vera CPUs connected via NVLink 6. This allows the 576 Rubin GPUs to operate like one super-large GPU with a unified memory space, making it particularly suitable for processing very large models.
• DGX Rubin NVL8: For enterprises that require flexible deployment, the NVL8 comes in a smaller liquid-cooled form factor and is paired with an x86 CPU, letting enterprises adopt Rubin's computing power more flexibly.
Storage and Cybersecurity: Powered by BlueField-4 DPU
To address the key-value cache bottleneck during large model inference, NVIDIA introduced the Inference Context Memory Storage Platform based on the BlueField-4 DPU. This technology allows multiple GPUs to share context memory at high speed, improving inference speed and energy efficiency by 5 times.
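The scale of that key-value cache bottleneck is easy to illustrate with some arithmetic. The model shape below is an assumed 70B-class example, not an NVIDIA spec: even one long-context sequence can pin tens of gigabytes of cache, which is why sharing context memory across GPUs matters.

```python
# Illustrative back-of-envelope: size of the key-value cache a transformer
# keeps resident during inference. Model shape is assumed, not an NVIDIA spec.

def kv_cache_gb(layers: int, kv_heads: int, head_dim: int,
                seq_len: int, batch: int, bytes_per_elem: int = 2) -> float:
    """KV cache in GB: keys + values (factor of 2) for every layer and token."""
    return 2 * layers * kv_heads * head_dim * seq_len * batch * bytes_per_elem / 1e9

# A hypothetical 70B-class model: 80 layers, 8 KV heads, 128-dim heads, FP16.
print(f"{kv_cache_gb(80, 8, 128, 128_000, 1):.1f} GB per 128k-token sequence")
```

At roughly 42 GB per long sequence under these assumptions, a handful of concurrent users already exceeds a single GPU's memory, motivating a DPU-backed shared context store.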
In terms of cybersecurity, the Rubin platform also integrates cybersecurity solutions from partners such as Armis, Check Point, and F5, and uses BlueField DPU for real-time hardware acceleration protection to ensure the security of AI workloads.
Ecosystem support: All cloud giants join the effort
The NVIDIA Rubin platform has gained widespread support in the industry. Major cloud providers including Microsoft, AWS, Google Cloud, and Oracle have all announced their adoption of the Rubin system.
Microsoft will deploy the Vera Rubin NVL72 system in its next-generation "Fairwater" AI Gigafactory; CoreWeave, which focuses on AI computing power, will also be one of the first adopters.