Following the announcements at GTC 2026 detailing the Vera Rubin platform, the BlueField-4 STX storage architecture, and space computing, NVIDIA CEO Jensen Huang further revealed the NVIDIA Accelerated Computing product roadmap through 2028. This roadmap not only formally confirms the continued integration of the Groq LPU design but also indicates that the "Feynman" next-generation accelerated computing platform will further expand its optoelectronic converged architecture, delivering the higher bandwidth that AI inference requires.
The Rubin generation will incorporate the Groq LPU, with the Rubin Ultra slated for release in 2027.
According to the latest architecture roadmap announced by Jensen Huang at GTC 2026, after establishing a partnership with AI chip startup Groq, NVIDIA has officially incorporated Groq's LPU (Language Processing Unit) into its long-term product plan.
According to the plan, the Vera Rubin platform, expected to begin shipping in the second half of 2026, will integrate Groq's LP30 LPU. The enhanced Rubin Ultra, expected in 2027, will integrate the LP35 LPU with support for NVFP4 precision. The design is then expected to advance to the LP40 LPU in the Feynman generation, connected via NVLink.
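NVFP4 is NVIDIA's 4-bit floating-point format for low-precision inference, which pairs very narrow elements with shared per-block scale factors. The sketch below is illustrative only: it quantizes to an E2M1-style 4-bit value grid with a per-block scale. The function name, block size, and scaling rule are assumptions for demonstration, not NVIDIA's exact hardware encoding.

```python
# Illustrative sketch of block-scaled 4-bit float quantization in the spirit
# of NVFP4. An E2M1 element (1 sign, 2 exponent, 1 mantissa bit) can only
# represent a handful of magnitudes; a shared per-block scale stretches that
# tiny grid over each block's actual dynamic range.

E2M1_GRID = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]  # representable magnitudes

def quantize_block(values, block=16):
    """Quantize-dequantize `values` with one scale per `block` elements."""
    out = []
    for i in range(0, len(values), block):
        chunk = values[i:i + block]
        amax = max(abs(v) for v in chunk) or 1.0
        scale = amax / 6.0  # map the block's largest magnitude onto E2M1's max (6)
        for v in chunk:
            # snap the scaled magnitude to the nearest representable grid point
            mag = min(E2M1_GRID, key=lambda g: abs(abs(v) / scale - g))
            out.append((mag if v >= 0 else -mag) * scale)
    return out

print(quantize_block([0.02, -0.15, 0.4, 1.1], block=4))
```

The design point this illustrates: per-block scaling keeps quantization error proportional to each block's local range, which is why formats like NVFP4 can hold inference accuracy at 4 bits where a single global scale could not.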
Feynman Generation: Stacked Architecture, Custom HBM Design, Extended Optoelectronic Parallel Network Architecture
The next-generation Feynman architecture, expected to launch in 2028, will be the first to adopt a stacked die design and integrate custom HBM (high-bandwidth memory). It will also connect to the LP40 LPU over NVLink to achieve more efficient collaboration with the GPU.
In addition, the Feynman generation will include a brand-new CPU called Rosa, paired with the next-generation BlueField-5 DPU. Following the Rubin generation's use of co-packaged optics (CPO) in the Spectrum 6 102T network chip, the Feynman generation will extend this design to NVLink 8 (eighth-generation NVLink) and the next Spectrum 7 204T network chip, supporting higher data transmission bandwidth. The network controller chip will also be updated to the CX10.
However, Jensen Huang has not yet revealed whether the Feynman generation will also get an Ultra version like the Rubin generation, though a similar product plan is expected, in line with the previously announced one-update-per-year development cadence.
Rack strategy shift: Oberon and Kyber run in parallel, with the latter taking on massive scaling.
Regarding rack system planning, NVIDIA has also established a dual-track strategy for the future. According to the roadmap, with the Kyber rack system expected to launch in 2027, NVIDIA's rack-class product line will undergo strategic specialization.
Jensen Huang confirmed that the current Oberon rack system and the next-generation Kyber rack system will continue to run in parallel. In the Feynman generation, launching in 2028, Kyber will replace Oberon as the main design for hyperscale deployments, growing from NVL144 scale in the Rubin generation to NVL1152 scale in the Feynman generation, thereby supporting the higher computing density of future AI factories.
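To put those rack-scale figures in perspective, the NVL suffix denotes the number of GPUs in a single NVLink domain (exactly how NVIDIA counts dies versus packages per generation is not specified here, so treat this as rough arithmetic). A one-line comparison of the two scales named in the roadmap:

```python
# Rough comparison of the NVLink-domain scales named in the roadmap.
# NVL144 (Rubin generation) vs. NVL1152 (Feynman generation, Kyber rack).
rubin_domain = 144
feynman_domain = 1152

print(feynman_domain / rubin_domain)  # → 8.0
```

That is an 8x jump in NVLink-domain size in a single generation, which is what motivates moving the scale-out burden from the Oberon design to Kyber.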
Analysis: NVIDIA's "System-Level Warfare" and the Challenges of Competitors Catching Up
The product roadmap released by NVIDIA at GTC 2026 has a strategic significance that goes far beyond a simple chip specification upgrade. It symbolizes that NVIDIA's competitive dimension has officially upgraded from a "chip-to-chip" arms race to a "system-to-system" total war.
First, the strategic intent behind integrating Groq's technology is clear: to use the LPU to shore up NVIDIA's weaker position in AI inference (compared with accelerated training, inference favors more agile computation that does not consume excessive power and cost), thereby providing a seamless experience from training to inference. This matters greatly for the market's next wave of demand: agentic AI applications.
Second, the expanded optoelectronic fusion architecture announced for the Feynman generation will be a key NVIDIA competitive advantage over the next decade. It will continue to improve AI model training efficiency through NVIDIA's own GPU-accelerated computing advantage, boost inference efficiency with Groq's LPU, and achieve scale-out of overall computing power through optoelectronic integration, while also putting competitive pressure on Broadcom in co-packaged optics deployments.