OpenAI earlier unveiled GPT-5.3-Codex-Spark, its first new AI model running on a Cerebras Systems chip. This marks the first time OpenAI has moved the computing foundation of its products off NVIDIA chips, signaling the start of its strategy to diversify its chip supply chain and giving a strong boost to AI accelerators outside the NVIDIA camp.
Billed as offering "lightning-fast reasoning," the lightweight model is built specifically for coding.
The newly released GPT-5.3-Codex-Spark is a lightweight version of Codex, OpenAI's code automation tool. Its design prioritizes efficiency over peak performance on complex tasks.
For software engineers, what matters most in an AI programming assistant is responsiveness. GPT-5.3-Codex-Spark lets developers quickly complete routine tasks such as modifying code and running tests, and even allows them to interrupt the current job at any time to assign new ones, significantly reducing the time spent waiting for the AI to generate results.
Behind all this speed is a multi-billion-dollar contract OpenAI signed with AI chip startup Cerebras Systems last month. The model currently runs on Cerebras Systems' flagship Wafer Scale Engine 3 (WSE-3), a massive AI accelerator built specifically for high-speed inference.
A major overhaul of the underlying pipeline cut latency by 80%.
To complement the hardware architecture of Cerebras Systems, OpenAI not only optimized for the new chip but also significantly improved the overall inference pipeline. These underlying upgrades resulted in substantial performance improvements:
• Round-trip latency reduced by 80%: communication between the client and the server is dramatically faster.
• Time to first token (TTFT) reduced by 50%: the time before the AI outputs its first piece of code is halved.
• Cost per token reduced by 30%: computational cost has dropped significantly.
• Faster WebSocket connections: enabled by default to keep conversations stable and responsive.
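To make the TTFT metric above concrete, here is a minimal sketch of how one might measure time to first token against a streaming client. The `fake_token_stream` generator is a stand-in assumption, not OpenAI's actual API; any real streaming client that yields tokens could be substituted.

```python
import time

def measure_ttft(stream):
    """Return (first_token, seconds elapsed until the first token arrives)."""
    start = time.monotonic()
    first = next(stream)           # block until the stream yields its first token
    return first, time.monotonic() - start

def fake_token_stream(tokens, delay=0.01):
    # Hypothetical stand-in for a streaming inference client:
    # yields tokens one at a time with simulated per-token latency.
    for tok in tokens:
        time.sleep(delay)
        yield tok

tok, ttft = measure_ttft(fake_token_stream(["def", " add", "(a, b):"]))
```

Halving TTFT in this framing means the `next(stream)` call returns in half the time, which is why it dominates perceived responsiveness for interactive coding.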
Currently, GPT-5.3-Codex-Spark is a text-only model with a 128K context window; it does not yet support image or other multimodal input. The model is available as a "research preview" for ChatGPT Pro subscribers, with wider availability expected in the coming weeks.
Even as OpenAI actively expands its network of partners, NVIDIA remains the "main player."
This collaboration represents a major breakthrough for Cerebras Systems, a crack in NVIDIA's long-standing market dominance. For OpenAI, it is simply the latest in a series of recent moves to diversify vendor risk.
Back in October of last year, OpenAI had already reached a multi-year agreement with AMD, with plans to deploy up to 6 GW of GPU computing power. In the same month, it also signed a contract with Broadcom to develop custom ASICs and network components.
However, OpenAI quickly moved to quell rumors of a strained relationship with NVIDIA. A spokesperson emphasized that the partnership with NVIDIA is "foundational" and reiterated that NVIDIA's hardware remains central to OpenAI's training and inference architecture; the introduction of Cerebras Systems, AMD, and Broadcom chips is purely about "expanding the ecosystem."