NVIDIA unveiled its Spectrum-XGS Ethernet technology at the Hot Chips conference. Building on algorithms from the Spectrum-X platform, the technology lets data centers in different locations operate as a single supernode through automatic distance congestion control and latency management. NVIDIA claims Spectrum-XGS can nearly double the performance of the NVIDIA Collective Communications Library (NCCL) for multi-GPU, multi-node communication, delivering predictable, near-linear performance scaling for AI training and large-scale inference.
This means that the "supercomputer" power that was once confined to a single large data center can now transcend distance and building limitations, connecting multiple independent data centers into a single "giga-scale AI superfactory." CoreWeave, a company specializing in AI cloud infrastructure, will be one of the first partners to adopt Spectrum-XGS technology.
Building a cross-domain AI superfactory
According to NVIDIA, Spectrum-XGS uses automatic distance congestion control and latency management to optimize communication between GPUs and across nodes, nearly doubling NCCL performance. For AI clusters spanning different cities or even regions, this means the combined computing power can be treated as a single, ultra-large-scale pool, with performance almost as predictable as within a single data center.
In other words, distributed data centers that were previously limited by physical distance will be able to break through geographical boundaries via Spectrum-XGS connectivity, becoming the core of cross-domain collaborative AI computing with more flexible scalability.
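To give a concrete sense of the traffic Spectrum-XGS is meant to accelerate, the minimal sketch below times an NCCL all-reduce across nodes using PyTorch's distributed API. The launch command, tensor size, and iteration counts are illustrative assumptions; the script measures generic NCCL collective performance and says nothing specific about Spectrum-XGS itself.

```python
# Minimal multi-node NCCL all-reduce timing sketch (illustrative only).
# Assumed launch, one command per node:
#   torchrun --nnodes=2 --nproc_per_node=8 --rdzv_backend=c10d \
#            --rdzv_endpoint=<head-node>:29500 allreduce_bench.py
import os
import time
import torch
import torch.distributed as dist

def main():
    # torchrun sets RANK, WORLD_SIZE, and LOCAL_RANK for us.
    dist.init_process_group(backend="nccl")
    local_rank = int(os.environ["LOCAL_RANK"])
    torch.cuda.set_device(local_rank)

    # 1 GiB of fp16 values as a stand-in for one training step's gradient all-reduce.
    tensor = torch.ones(512 * 1024 * 1024, dtype=torch.float16, device="cuda")

    # Warm up so NCCL can establish its rings/trees before timing.
    for _ in range(5):
        dist.all_reduce(tensor)
    torch.cuda.synchronize()

    iters = 20
    start = time.perf_counter()
    for _ in range(iters):
        dist.all_reduce(tensor)
    torch.cuda.synchronize()
    elapsed = (time.perf_counter() - start) / iters

    if dist.get_rank() == 0:
        gib = tensor.numel() * tensor.element_size() / 2**30
        print(f"all-reduce of {gib:.1f} GiB took {elapsed * 1000:.1f} ms per iteration")

    dist.destroy_process_group()

if __name__ == "__main__":
    main()
```

Runs of a script like this on clusters within one site versus across sites are the kind of comparison behind NVIDIA's "nearly double NCCL performance" claim for cross-site traffic.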
Comparison with Broadcom's Ethernet technology
In the field of Ethernet switching, Broadcom has long played a central role. Its Tomahawk and Trident series ASICs (application-specific integrated circuits) are near-standard equipment in large data center switches. Broadcom's strengths lie in high port density, low power consumption, and a mature ecosystem that broadly supports the needs of cloud providers and telecom operators. However, while Broadcom's solutions can deliver switching rates of up to hundreds of Tbps, they are still primarily oriented toward optimizing traditional network traffic and are not fully tailored to the highly synchronized GPU-to-GPU communication required during AI training.
In contrast, NVIDIA Spectrum-XGS is positioned squarely as a dedicated AI network. Its algorithms adapt to distributed AI workloads, with automatic distance congestion control, cross-data-center latency compensation, and tight integration with the NVIDIA hardware and software stack, including NCCL and NVLink. Spectrum-XGS is therefore not simply competing on port count or bandwidth; it is optimized directly for the communication patterns of distributed AI model training.
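NVIDIA has not published the internals of these algorithms, but the general idea of distance-aware congestion control can be illustrated with a toy sketch: a sender sizes its in-flight data window from the measured round-trip time so that long inter-site links stay full without overrunning switch buffers. Everything below, the class name, constants, and the AIMD-style update, is an assumption for illustration, not NVIDIA's implementation.

```python
# Toy illustration of distance-aware congestion control (not NVIDIA's algorithm).
# Idea: derive the allowed in-flight window from the bandwidth-delay product of
# the measured RTT, then back off multiplicatively on congestion signals.
from dataclasses import dataclass

@dataclass
class DistanceAwareWindow:
    link_gbps: float            # nominal link bandwidth in Gbit/s
    window_bytes: float = 0.0   # current allowed bytes in flight

    def on_rtt_sample(self, rtt_ms: float) -> None:
        # Bandwidth-delay product: a longer (more distant) path needs a
        # proportionally larger window to keep the pipe full.
        bdp_bytes = (self.link_gbps * 1e9 / 8) * (rtt_ms / 1e3)
        # Grow gently toward the BDP (additive-increase flavor).
        self.window_bytes = min(self.window_bytes + 0.1 * bdp_bytes, bdp_bytes)

    def on_congestion_signal(self) -> None:
        # Back off multiplicatively when ECN marks or telemetry report queueing.
        self.window_bytes *= 0.5

if __name__ == "__main__":
    # A 400 Gbit/s inter-site link: compare a 0.1 ms metro RTT with a 10 ms
    # long-haul RTT. The long-haul path needs roughly 100x more data in flight.
    for rtt in (0.1, 10.0):
        w = DistanceAwareWindow(link_gbps=400)
        for _ in range(50):  # let the window converge toward the BDP
            w.on_rtt_sample(rtt)
        print(f"RTT {rtt:5.1f} ms -> window {w.window_bytes / 2**20:8.1f} MiB")
```

The point of the sketch is only that the right amount of in-flight data scales with distance; a fixed window tuned for a single building would leave a long-haul link badly underutilized.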
To put the comparison simply, if Broadcom's Ethernet technology is the broad "highway network backbone" of data centers, then NVIDIA Spectrum-XGS is more like a "dedicated express lane" built specifically for AI computing. The former offers economies of scale and maturity; the latter emphasizes reducing training time and improving cross-region performance predictability in the AI era. For companies investing heavily in the AI cloud, the two are not necessarily replacements but complements: Broadcom Ethernet can provide the universal connectivity backbone while NVIDIA supplies an acceleration layer focused on AI computing.
As AI models continue to grow, future cloud data centers will increasingly embrace multi-domain, cross-distance, massively collaborative designs. The introduction of NVIDIA Spectrum-XGS not only demonstrates its commitment to integrating network hardware and software, but also signals that AI infrastructure is evolving beyond the traditional data center framework, toward cross-regional integration and giga-scale AI superfactories.