As generative AI increasingly permeates enterprise workflows, OpenAI has officially launched GPT-5.4, a new foundation model billed as "designed specifically for professional work," alongside GPT-5.4 Pro, which targets maximum performance. This upgrade no longer chases only natural-sounding conversation; it focuses squarely on code writing, data analysis, and agentic workflows.
GPT-5.4 is not only OpenAI's first general-purpose model with "native computer operation capabilities"; it also shows major gains in spreadsheet processing and presentation generation.
As key partners like Microsoft begin to incorporate models from other competitors, OpenAI has clearly realized that to truly gain a foothold in the enterprise market, its models must be able to actually "get the job done," not just "provide suggestions."
AI directly controls your mouse and keyboard: native computer operation capabilities
One of the biggest technological breakthroughs of GPT-5.4 is that it is OpenAI's first general-purpose model with native, state-of-the-art "computer-use capabilities".
In the past, AI could mostly only generate code or list steps inside a text box, but GPT-5.4 lets AI agents operate the computer directly and execute complex workflows across multiple applications. According to OpenAI's data, on OSWorld-Verified, a benchmark of desktop navigation ability, GPT-5.4 achieved a success rate of 75.0%, far surpassing the 47.3% of its predecessor, GPT-5.2, and even exceeding the human baseline of 72.4%. In practice, this means the model can interpret screenshots and issue accurate mouse and keyboard commands.
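To make the "look at a screenshot, then issue a mouse or keyboard command" cycle concrete, here is a minimal sketch of an observe-act loop. Everything in it is an assumption for illustration: `FakeDesktop`, `Action`, `scripted_policy`, and `run_agent` are hypothetical names, the "screenshot" is a small dictionary rather than pixels, and the scripted policy merely stands in for the model. It does not show OpenAI's actual API.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Action:
    """One low-level command. All names here are hypothetical."""
    kind: str          # "click", "type", or "done"
    x: int = 0
    y: int = 0
    text: str = ""


class FakeDesktop:
    """Toy stand-in for a desktop: one text field at (10, 10), a Save button at (50, 10)."""

    def __init__(self) -> None:
        self.field = ""
        self.focused = False
        self.saved = False

    def screenshot(self) -> dict:
        # A real agent would receive pixels; a dict keeps the sketch runnable.
        return {"field": self.field, "focused": self.focused, "saved": self.saved}

    def apply(self, action: Action) -> None:
        if action.kind == "click" and (action.x, action.y) == (10, 10):
            self.focused = True
        elif action.kind == "click" and (action.x, action.y) == (50, 10):
            self.saved = True
        elif action.kind == "type" and self.focused:
            self.field += action.text


def scripted_policy(observation: dict) -> Action:
    """Placeholder for the model: choose the next action from the observation."""
    if observation["saved"]:
        return Action("done")
    if not observation["focused"]:
        return Action("click", 10, 10)
    if observation["field"] == "":
        return Action("type", text="Q3 report")
    return Action("click", 50, 10)


def run_agent(desktop: FakeDesktop,
              policy: Callable[[dict], Action],
              max_steps: int = 10) -> List[Action]:
    """Observe, decide, act; stop when the policy says "done" or steps run out."""
    history: List[Action] = []
    for _ in range(max_steps):
        action = policy(desktop.screenshot())
        history.append(action)
        if action.kind == "done":
            break
        desktop.apply(action)
    return history


if __name__ == "__main__":
    desktop = FakeDesktop()
    for step in run_agent(desktop, scripted_policy):
        print(step)
```

The `max_steps` cap is a deliberate design choice in this sketch: an agent that acts on a live screen needs a hard stop in case the policy never reaches a "done" state.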
Saving Office Workers: A Complete Evolution of Spreadsheets, Presentations, and "Thinking Patterns"
For professionals who rely on office software, GPT-5.4 offers a significant upgrade.
• Investment-bank-level spreadsheets: In an internal test simulating a junior investment banking analyst's spreadsheet modeling tasks, GPT-5.4 scored 87.3%, significantly outperforming GPT-5.2's 68.4%.
• More aesthetically pleasing presentations: When evaluating presentation generation, human raters were 68.0% more likely to prefer GPT-5.4 outputs due to their superior aesthetic design, greater visual variety, and more effective use of image generation tools.
• A transparent "Thinking" mode: In ChatGPT, the new GPT-5.4 Thinking mode presents its plan up front, allowing users to "adjust direction at any time" while the response is being generated, so the final output better meets their requirements and needs fewer revisions.
• Significantly fewer hallucinations: GPT-5.4 is OpenAI's most factually accurate model to date. Compared to its predecessor, it reduces the probability of a single statement being incorrect by 33% and the probability of the overall response containing errors by 18%.
Million-token context support and "tool search" to cut costs and improve efficiency
On the developer side, GPT-5.4 supports context lengths of up to 1 million tokens, enabling AI agents to plan, execute, and verify tasks over extremely long time spans.
Even more noteworthy is the new "Tool search" feature introduced in the API.
Previously, when a model was equipped with multiple tools, all tool definitions had to be stuffed into the prompt, resulting in huge token consumption. Now, GPT-5.4 can dynamically query the required tool definitions. This change successfully reduced total token usage by 47% in the MCP Atlas benchmark test while maintaining the same accuracy.
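The idea of looking up tool definitions on demand, instead of sending all of them up front, can be illustrated with a toy sketch. The article does not describe how OpenAI implements tool search, so the keyword matching below is a stand-in technique, not the real mechanism. The tool names, the `search_tools` helper, and the whitespace-based `rough_tokens` counter are all hypothetical.

```python
from typing import Dict

# Hypothetical tool catalog: name -> definition text that would go into a prompt.
TOOLS: Dict[str, str] = {
    "get_weather": "Return the current weather forecast for a given city name.",
    "send_email": "Send an email message to a recipient with a subject and body.",
    "query_database": "Run a read-only SQL query against the sales database.",
    "create_invoice": "Create a PDF invoice for a customer and an amount.",
}


def rough_tokens(text: str) -> int:
    """Very rough proxy for token count: number of whitespace-separated words."""
    return len(text.split())


def search_tools(query: str, tools: Dict[str, str], limit: int = 1) -> Dict[str, str]:
    """Return up to `limit` tools whose name or description shares words with the query."""
    words = set(query.lower().split())

    def score(item) -> int:
        name, description = item
        haystack = set(name.lower().replace("_", " ").split())
        haystack |= set(description.lower().rstrip(".").split())
        return len(words & haystack)

    ranked = sorted(tools.items(), key=score, reverse=True)
    return dict(item for item in ranked[:limit] if score(item) > 0)


if __name__ == "__main__":
    # Old approach: every definition is placed in the prompt.
    upfront = sum(rough_tokens(d) for d in TOOLS.values())

    # On-demand approach: only the definitions matching the task are fetched.
    selected = search_tools("send an email to the customer", TOOLS)
    on_demand = sum(rough_tokens(d) for d in selected.values())

    print("selected tools:", list(selected))
    print("definition words up front:", upfront, "on demand:", on_demand)
```

With only four tools the difference is small, but the gap between the two counts grows with the size of the catalog, which is where savings of this kind would come from.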
Regarding pricing and availability, paid ChatGPT Plus, Team, and Pro users can use GPT-5.4 Thinking starting today; it replaces the existing GPT-5.2 Thinking. As for API pricing, token efficiency is higher, but the unit price has also gone up: GPT-5.4 costs $2.50 per million input tokens (up from GPT-5.2's $1.75) and $15 per million output tokens.
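As a rough worked example of what these prices mean in practice, the sketch below estimates the cost of a workload and computes how far input token usage would have to fall for GPT-5.4's input spend to match GPT-5.2's. The prices come from the figures above, the $15 figure is read as a per-million-output-token price, and the workload sizes are made up for illustration. The article does not give GPT-5.2's output price, so the comparison covers input tokens only.

```python
# Prices in USD per million tokens, as quoted in the article.
GPT_5_4_INPUT = 2.50
GPT_5_4_OUTPUT = 15.00
GPT_5_2_INPUT = 1.75


def cost_usd(input_tokens: int, output_tokens: int,
             input_price: float, output_price: float) -> float:
    """Total cost for a workload, given per-million-token prices."""
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


def break_even_input_reduction(old_price: float, new_price: float) -> float:
    """Fraction by which input tokens must shrink so that new spend equals old spend."""
    return 1 - old_price / new_price


if __name__ == "__main__":
    # Hypothetical workload: 2M input tokens and 0.5M output tokens.
    total = cost_usd(2_000_000, 500_000, GPT_5_4_INPUT, GPT_5_4_OUTPUT)
    print(f"GPT-5.4 cost for the sample workload: ${total:.2f}")

    needed = break_even_input_reduction(GPT_5_2_INPUT, GPT_5_4_INPUT)
    print(f"Input-token reduction needed to break even: {needed:.0%}")
```

Under these assumptions, input usage would need to drop by about 30% to offset the higher input price. The 47% figure quoted for tool search refers to total tokens on one benchmark, so whether a given workload clears that bar depends on how it is built.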



