mashdigi-Technology, new products, interesting news, trends

NVIDIA further explains the details of its "Blackwell" GPU architecture, maintaining computing flexibility while achieving higher performance output.

Author: Mash Yang
2024-03-20
in Exhibition, Market dynamics, Hardware, Processor, Topics

Following its earlier announcement of the "Blackwell" GPU architecture, NVIDIA has explained more details of the architecture and introduced three accelerated computing designs built on it: the B100, B200, and GB200 Superchip.

▲The GB200 Superchip is composed of a single "Grace" CPU and two "Blackwell" GPUs

NVIDIA CEO Jensen Huang explained that the "Blackwell" GPU architecture was created by challenging the limits of physics while balancing actual performance and cost.

The "Blackwell" architecture is designed for AI models at the trillion-parameter scale. It is produced on TSMC's customized 4nm process and can deliver 20 PetaFLOPS of computing power from a single GPU. Each GPU contains 208 billion transistors, and compared with the previously launched "Hopper", NVIDIA claims 4 times the training efficiency, 30 times the inference efficiency, and 25 times the energy efficiency.

In terms of architecture, "Blackwell" integrates a second-generation Transformer Engine, Tensor Cores that support FP4/FP6 low-precision floating-point operations, and fifth-generation NVLink interconnect technology. It can link up to 576 GPUs simultaneously, supports data decompression at up to 800GB per second, and adds a stronger data-encryption protection mechanism to safeguard operation.

In addition, "Blackwell" adopts a special dual-die design: two reticle-sized dies communicate internally over the NVLink-HBI interface at a data rate of 10TB per second, allowing the pair to operate as a single GPU.

As Huang noted, combining two dies into a single GPU strikes a clear balance between current process-technology yields and manufacturing cost, while the pairing also raises the computing performance of the "Blackwell" architecture.

▲In the "Blackwell" design, both dies share the same data transfer rate, so the whole package accelerates computing as a single "GPU".

In FP8 mode, "Blackwell" delivers 10 PetaFLOPS of computing power, and in FP4 mode 20 PetaFLOPS. It integrates 192GB of HBM3e high-density memory with 8TB per second of bandwidth, and can exchange data over NVLink at 1.8TB per second.

To further enhance "Blackwell's" efficiency in multimodal AI applications, NVIDIA also provides data transfer rates of up to 100GB per second through its InfiniBand transmission interface, allowing computing data to be synchronized between GPUs in a large-scale computing cluster. Combined with the fifth-generation NVLink design, computing nodes comprising up to 576 GPUs can maintain computational precision.


Three accelerated computing designs: B100, B200, and GB200 Superchip

The "Blackwell" architecture currently underpins three accelerated computing designs: the B100, the B200, and the GB200 Superchip, which combines a single "Grace" CPU with two "Blackwell" GPUs.

Both the B100 and B200 are equipped with HBM3e high-density memory with a total capacity of 192GB and a bandwidth of 8TB per second. Since this matches the GPU's own data transfer rate, the architecture can sustain faster data processing.

The biggest difference between the B100 and B200 is operating power. The B100 draws up to 700W, can be air-cooled, and can be dropped directly into HGX rack slots designed for the H100 accelerator. The B200 typically draws 1000W and can still be air-cooled, though whether it fits existing H200 rack spaces depends on the deployment; when its power is raised further to 1200W, it must be liquid-cooled and the rack must be redesigned accordingly.

▲Different performance outputs can be matched through power consumption and combination differences

The GB200 Superchip is designed for AI training acceleration and operates fully liquid-cooled.

The GB200 Superchip must be fully liquid-cooled, but this reduces the need for space-consuming heat sinks while the liquid loop maintains operational stability. Compared with a DGX H100 system, which draws 10.2kW in an 8U rack design, the space occupied is reduced to one-eighth while maintaining similar computing performance. The liquid cooling also reduces the space required for heat exchange and lowers noise during operation.

Relative to the H100, the GB200 Superchip offers 6 times the computing power, enough to handle models on the order of GPT-3's 175 billion parameters. For multimodal workloads in specific domains, performance can reach 30 times, handling up to 1.8 trillion parameters.

The GB200 NVL72, which connects 36 GB200 Superchips via NVLink, can achieve 720 PetaFLOPS of computing power for training and 1440 PetaFLOPS for inference. It supports models of up to 27 trillion parameters, with a multi-node transmission bandwidth of 130TB per second and a maximum transmission volume of 260TB per second.
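The NVL72 figures quoted above can be sanity-checked arithmetically, assuming the rack holds 36 Superchips of 2 GPUs each and the per-GPU figures of 10 PetaFLOPS (FP8) and 20 PetaFLOPS (FP4) stated earlier:

```python
# Sanity check: GB200 NVL72 throughput derived from per-GPU figures.
# Assumes 36 GB200 Superchips (1 "Grace" CPU + 2 "Blackwell" GPUs each).
superchips = 36
gpus = superchips * 2              # 72 GPUs in one NVL72 rack
fp8_training_pflops = gpus * 10    # 10 PFLOPS per GPU in FP8
fp4_inference_pflops = gpus * 20   # 20 PFLOPS per GPU in FP4
print(gpus, fp8_training_pflops, fp4_inference_pflops)  # 72 720 1440
```

The totals line up with the 720 PFLOPS training and 1440 PFLOPS inference figures, suggesting those numbers are simple per-GPU aggregates rather than measured cluster throughput.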

▲The GB200 NVL72 is built by connecting 36 GB200 Superchips via NVLink

In addition, connecting eight GB200 NVL72 systems builds a DGX GB200 SuperPOD, integrating 288 "Grace" CPUs and 576 "Blackwell" GPUs with 240TB of high-speed memory. In FP4 mode it can deliver 11.5 ExaFLOPS of computing power, achieving 30 times the inference efficiency, 4 times the training efficiency, and 25 times the energy efficiency.
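The SuperPOD scaling follows the same arithmetic, assuming eight NVL72 systems (36 Superchips, 72 GPUs each) and 20 PetaFLOPS of FP4 per GPU:

```python
# Sanity check: DGX GB200 SuperPOD totals from eight NVL72 systems.
systems = 8
cpus = systems * 36                 # "Grace" CPUs (one per Superchip)
gpus = systems * 72                 # "Blackwell" GPUs
fp4_exaflops = gpus * 20 / 1000     # 20 PFLOPS FP4 per GPU -> ExaFLOPS
print(cpus, gpus, fp4_exaflops)     # 288 576 11.52
```

576 GPUs at 20 PetaFLOPS each gives 11.52 ExaFLOPS, matching the roughly 11.5 ExaFLOPS FP4 figure for the SuperPOD.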

Maintaining configuration flexibility, but favoring Arm-based pairings amid the AI development trend

Currently, NVIDIA maintains the flexibility of the "Blackwell" GPU architecture by offering the option of pairing it with either an x86 or an Arm CPU. The B100 is also compatible with existing H100 racks, and the B200 can be used with existing racks in certain circumstances, preserving deployment and upgrade flexibility while significantly improving computing performance.

However, for AI deployments, NVIDIA says the best pairing is still an Arm-based CPU. This is mainly due to the limitations of x86 CPUs' I/O ports and other channel designs, the upper limit on the number of NVLink connections they can support, and the additional cooling infrastructure x86 CPUs require. For AI inference and training purposes, pairing with the "Grace" CPU therefore remains the primary recommendation.

▲Increasing the number of GPUs that NVLink can connect simultaneously makes artificial intelligence training faster
Tags: B100, B200, Blackwell, GB200 NVL72, GB200 Superchip, GTC, GTC 2024, NVIDIA
Mash Yang

Founder and editor of mashdigi.com, and student of technology journalism.

Copyright © 2017 mashdigi.com
