mashdigi-Technology, new products, interesting news, trends

Ending "Context Inflation" and "Thinking Tax"! NVIDIA Unveils Nemotron 3 Super Open Model Designed Specifically for Agent-Based AI

NVIDIA once again demonstrates its strength in hardware-software integration, showing the industry what the engine of a next-generation AI agent should look like.

Author: Mash Yang
2026-03-12
in exhibition, Market dynamics, Life, network, software

As enterprises shift from simple chatbots to multi-agent systems, the underlying AI models face unprecedented performance and cost challenges. To address these pain points, NVIDIA announced the new Nemotron 3 Super, an open-weight model with 120 billion parameters built on a Mixture of Experts (MoE) architecture. With a context window of up to 1 million tokens and deep optimization for NVIDIA's next-generation Blackwell computing platform, Nemotron 3 Super not only increases data throughput by 5 times, but also squarely targets the "context inflation" and "thinking tax" problems that plague complex agent workflows.

Two major constraints on agentic AI: context inflation and the thinking tax

Why do existing large language models (LLMs) struggle with complex agent tasks? NVIDIA points to two major bottlenecks currently facing enterprises:

• First, "context inflation": in collaborative workflows involving multiple AI agents, the system must continuously exchange complete histories, tool outputs, and intermediate reasoning traces. This generates more than 15 times as many tokens as a typical conversational interaction. The flood of data not only drives up computational costs, but also frequently causes the AI to "forget" or drift from its original goal over the course of a lengthy task.

• Second, the "thinking tax": a competent autonomous agent must perform deep reasoning at every step of task execution. But if every tiny subtask requires calling a massive model with hundreds of billions of parameters, the application becomes extremely slow and prohibitively expensive, making large-scale enterprise deployment impossible.
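To make the "15 times" figure concrete, here is a back-of-the-envelope sketch of why multi-agent context grows so fast; all per-step token counts are illustrative assumptions, not NVIDIA's numbers:

```python
# Back-of-the-envelope sketch of "context inflation": every agent step
# re-reads the entire accumulated history (messages, tool outputs,
# reasoning traces), so total tokens processed grow quadratically with
# the number of steps. All token counts are illustrative.

def single_chat_tokens(turns: int, tokens_per_turn: int = 500) -> int:
    """A plain chatbot processes each turn's text once."""
    return turns * tokens_per_turn

def agent_workflow_tokens(steps: int, tokens_per_step: int = 500) -> int:
    """Each agent step re-processes the whole context accumulated so far."""
    total = 0
    context = 0
    for _ in range(steps):
        context += tokens_per_step  # new history appended and shared
        total += context            # entire context read again this step
    return total

print(single_chat_tokens(10))     # 5000
print(agent_workflow_tokens(10))  # 27500
```

With 10 steps the workflow already processes about 5.5 times as many tokens as a plain chat; the ratio grows linearly with the number of steps and passes 15x at around 30 steps.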

Hybrid architecture unleashed: Mamba combined with Transformer

To address these issues, Nemotron 3 Super offers a massive context window of 1 million tokens, allowing agents to keep the complete workflow state in memory. At the architecture level, NVIDIA introduces three key innovations that together deliver a 5x increase in data throughput and a 2x increase in accuracy over the predecessor model:


• Hybrid architecture: breaking with single-architecture orthodoxy, Nemotron 3 Super combines two kinds of neural network layers. Mamba layers deliver up to 4 times better memory and computational efficiency (especially on very long texts), while traditional Transformer layers drive complex higher-order reasoning.

• Advanced MoE and Latent MoE: although the model has 120 billion parameters in total, only a small fraction of them are active for any given token during inference, significantly reducing the computational burden. Even more groundbreaking is the Latent Mixture of Experts (Latent MoE) technique, which predicts the next token at "the computational cost of one expert while effectively activating four," squeezing out higher accuracy without extra compute.

• Multi-token prediction: instead of emitting one token at a time, the model can predict several future tokens simultaneously, tripling overall inference speed.
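The sparse-activation idea behind MoE can be sketched in a few lines; the router, expert count, and dimensions below are toy assumptions, not Nemotron's actual design:

```python
import math

# Toy Mixture-of-Experts (MoE) layer: a router scores every expert for an
# input, but only the top-k experts actually run, so the "active" parameter
# count per token is a small fraction of the total. All values are toy sizes.

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def moe_forward(x, experts, router_weights, k=2):
    """Run only the k highest-gated experts and mix their outputs."""
    scores = [sum(w * xi for w, xi in zip(wv, x)) for wv in router_weights]
    gates = softmax(scores)
    top = sorted(range(len(experts)), key=lambda i: gates[i], reverse=True)[:k]
    norm = sum(gates[i] for i in top)
    out = [0.0] * len(x)
    for i in top:  # sparse activation: the other experts never execute
        y = experts[i](x)
        out = [o + (gates[i] / norm) * yi for o, yi in zip(out, y)]
    return out, top

# Eight experts in total, but each token only pays for two of them.
experts = [lambda x, s=s: [xi * s for xi in x] for s in range(1, 9)]
router = [[0.1 * (i + 1), 0.05 * i] for i in range(8)]
y, used = moe_forward([1.0, 2.0], experts, router, k=2)
print(used)  # [7, 6]: only two of the eight experts ran
```

The same principle is what lets a very large total parameter count coexist with a modest per-token compute bill.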

Optimized for the Blackwell architecture, fully open source to support the ecosystem

Beyond its software innovations, Nemotron 3 Super is also a demonstration of NVIDIA's hardware-software co-design for the Blackwell GPU platform. On Blackwell, the model can run in the extremely low-precision NVFP4 format, delivering inference up to four times faster than the previous-generation Hopper platform (running FP8) without sacrificing accuracy.
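As a rough intuition for what a 4-bit floating-point format buys, here is a generic E2M1-style quantizer with a shared per-block scale; this is only a sketch of the general idea, not NVIDIA's actual NVFP4 implementation:

```python
import math

# Toy sketch of 4-bit float (E2M1-style) quantization with a shared
# per-block scale, the general idea behind formats like NVFP4. Real
# hardware formats differ; this only shows the precision/size trade-off.

# Representable magnitudes of an E2M1 mini-float (plus sign and zero).
FP4_GRID = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]

def quantize_block(values):
    """Scale the block so its max maps to the largest FP4 value, then round."""
    amax = max(abs(v) for v in values) or 1.0
    scale = amax / 6.0  # one higher-precision scale shared by the block
    quantized = []
    for v in values:
        mag = min(FP4_GRID, key=lambda g: abs(abs(v) / scale - g))
        quantized.append(math.copysign(mag, v) if mag else 0.0)
    return quantized, scale

def dequantize_block(quantized, scale):
    return [q * scale for q in quantized]

q, s = quantize_block([0.02, -0.45, 0.31, 1.2])
print(q)                       # [0.0, -2.0, 1.5, 6.0]
print(dequantize_block(q, s))  # values restored to within FP4 rounding error
```

Each stored value needs only 4 bits plus one shared scale per block, roughly quartering memory traffic versus FP16 at the cost of coarser rounding.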

On the open-source front, NVIDIA has been unusually generous this time. Nemotron 3 Super not only ships open weights under a permissive license, but also fully discloses its training dataset of more than 10 trillion tokens, 15 reinforcement learning environments, and its complete evaluation methodology.

Companies including Perplexity, Amdocs, Palantir, Dassault Systèmes, and Siemens have already begun deploying Nemotron 3 Super to drive internal software development or vertical-domain automation. Enterprise developers can access the model as an NVIDIA NIM microservice starting today via build.nvidia.com, Hugging Face, or major public cloud platforms such as Google Cloud, Oracle, and Microsoft Azure.

Analysis

The launch of the Nemotron 3 Super once again proves that NVIDIA is not just a "hardware company that sells chips".

While OpenAI and Anthropic are still haggling over subscription fees for closed-source models, NVIDIA has chosen a completely different strategy: give away its best software and models for free, as long as you keep buying its hardware.

The most formidable aspect of Nemotron 3 Super lies in its complete optimization for NVIDIA's own hardware. By tackling the memory cost of long texts with the hybrid Mamba + Transformer architecture, and by exploiting NVFP4 precision to wring the most out of Blackwell GPUs, NVIDIA is effectively setting the standard for hardware-software integrated agentic AI. The full release of its 10-trillion-token training dataset is a bombshell for the open-source community and will significantly accelerate the move of enterprise AI agents from the lab to real production lines.

However, the real game-changer may be "NemoClaw", NVIDIA's rumored answer to the popular "green lobster" agent, focused on enterprise-grade AI agent applications and expected to be unveiled at GTC 2026. The technology could break down hardware silos, letting enterprises integrate seamlessly even when their underlying AI infrastructure does not run on NVIDIA chips. It is reportedly already rolling out to enterprise software giants such as Salesforce, Cisco, Google, Adobe, and CrowdStrike, with specific details expected at GTC 2026.

Tags: agentic AI, AI Agent, Blackwell, GTC, GTC 2026, NemoClaw, Nemotron 3 Super, Nvidia, agent-based AI, Mixture of Experts, lobster
Mash Yang

Founder and editor of mashdigi.com, and student of technology journalism.


Copyright © 2017 mashdigi.com
