GPT-5.4: OpenAI Launches Its Most Powerful Frontier Model

OpenAI just made a big move. GPT-5.4 is available today in ChatGPT (under the name GPT-5.4 Thinking), in the API, and in Codex. OpenAI describes it as the most capable and efficient frontier model it has built for professional work. GPT-5.4 Pro is also available for those who want maximum performance on the most demanding tasks.

The key takeaways in 30 seconds

GPT-5.4 combines advanced reasoning, state-of-the-art coding, and agentic workflows in a single model. It adds a 1-million-token context window, native computer use, and intelligent tool search, and it posts benchmark scores that exceed the human baseline on certain tasks. It is available to Plus, Team, and Pro subscribers.

What GPT-5.4 actually changes

GPT-5.4 brings together the best of OpenAI's recent advances into a single model. It integrates the coding capabilities of GPT-5.3-Codex while dramatically improving work with tools, software environments, and professional tasks involving spreadsheets, presentations, and documents.

The result: a model that handles complex work with precision and efficiency, delivering what you asked for with fewer back-and-forth exchanges. No more asking three times to get the right spreadsheet format or the correct layout.

1 million tokens: memory that finally matches the ambition

GPT-5.4 supports up to 1 million context tokens, more than double the 400,000 tokens of GPT-5.2. In practice, the model can ingest entire codebases, complete documentation libraries, or lengthy conversation histories without losing track.
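To get a feel for what "an entire codebase" means against these two window sizes, here is a minimal sketch that estimates the token count of a directory. It assumes a rough heuristic of about 4 characters per token, which is not the model's real tokenizer, and the names `estimate_tokens` and `fits` are made up for illustration.

```python
import os

CHARS_PER_TOKEN = 4          # assumption: crude heuristic, not a real tokenizer
WINDOW_GPT_5_4 = 1_000_000   # context size quoted in the article
WINDOW_GPT_5_2 = 400_000     # context size quoted in the article


def estimate_tokens(root, extensions=(".py", ".md", ".txt")):
    """Walk a directory and estimate the tokens in files with the given extensions."""
    total_chars = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(extensions):
                path = os.path.join(dirpath, name)
                with open(path, encoding="utf-8", errors="ignore") as handle:
                    total_chars += len(handle.read())
    return total_chars // CHARS_PER_TOKEN


def fits(tokens, window, reserve_for_output=0):
    """Return True if the input plus a reserved output budget fits in the window."""
    return tokens + reserve_for_output <= window
```

A project estimated at 600,000 tokens would pass `fits(600_000, WINDOW_GPT_5_4)` but fail `fits(600_000, WINDOW_GPT_5_2)`. For a real decision, count tokens with the actual tokenizer rather than this approximation.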

This extended memory comes with much better retention: GPT-5.4 remembers your instructions and context across long sessions, so directives forgotten after 20 messages should be far less of a problem. For developers using Codex, this is a major shift: the model can plan, execute, and verify tasks across long sequences.

'Extreme' reasoning: the xhigh mode

GPT-5.4 introduces a new reasoning level called xhigh. This mode allocates significantly more compute to thinking before responding. It is a deliberately slower strategy that trades response time for quality, and it proves decisive for specialized topics, complex analyses, and multi-step tasks.

In ChatGPT, GPT-5.4 Thinking can now present an upfront thinking plan, allowing you to adjust its direction mid-course while it works. You get a final result more aligned with your expectations without having to restart the conversation.

Computer Use: GPT-5.4 controls your computer

This is the most striking new capability. GPT-5.4 is OpenAI's first generalist model with native computer use abilities. It can browse the web, fill out forms, send emails, interact with user interfaces -- all by interpreting screenshots and sending keyboard/mouse commands.
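The screenshot-in, command-out cycle described above can be sketched as a simple loop. This is a minimal illustration, not OpenAI's API: the `Action` type and the three callables (`take_screenshot`, `choose_action`, `perform`) are hypothetical stand-ins for the screen capture, the model call, and the keyboard/mouse driver.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Action:
    """One UI step. The fields are illustrative, not a real API schema."""
    kind: str      # "click", "type", or "done"
    x: int = 0
    y: int = 0
    text: str = ""


def run_agent(
    take_screenshot: Callable[[], bytes],
    choose_action: Callable[[bytes, List[Action]], Action],
    perform: Callable[[Action], None],
    max_steps: int = 20,
) -> List[Action]:
    """Observe the screen, ask the model for one action, execute it, repeat."""
    history: List[Action] = []
    for _ in range(max_steps):
        screenshot = take_screenshot()               # observe the current screen
        action = choose_action(screenshot, history)  # the model picks the next step
        history.append(action)
        if action.kind == "done":                    # the model says the task is finished
            break
        perform(action)                              # send the click or keystrokes
    return history
```

The `max_steps` cap is a design choice worth keeping in any real agent: it bounds cost and stops a model that never signals completion.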

On OSWorld-Verified, which measures a model's ability to navigate a desktop environment, GPT-5.4 achieves a 75.0% success rate, shattering GPT-5.2's 47.3% and surpassing the human baseline of 72.4%. In other words, on this benchmark the model now scores above the human reference at operating a computer through screenshots.

Benchmark	GPT-5.4	GPT-5.2	Human
OSWorld-Verified (desktop)	75.0%	47.3%	72.4%
WebArena-Verified (browser)	67.3%	65.4%	-
Online-Mind2Web (browser)	92.8%	-	-

GPT-5.4 computer use performance

Professional work: spreadsheets, presentations, documents

OpenAI placed particular emphasis on improving GPT-5.4's ability to create and edit spreadsheets, presentations, and documents. On an internal benchmark of spreadsheet modeling tasks (junior analyst level in investment banking), GPT-5.4 scores 87.3%, compared to 68.4% for GPT-5.2.

For presentations, human evaluators preferred GPT-5.4's slides over GPT-5.2's in 68% of cases, citing better aesthetics, more visual variety, and more effective use of image generation.

On GDPval, which tests agent capabilities on real-world work tasks across 44 professions, GPT-5.4 sets a new record: it matches or outperforms industry professionals in 83% of comparisons, up from 70.9% for GPT-5.2.

Fewer hallucinations, greater accuracy

GPT-5.4 is OpenAI's most factual model to date. On a set of queries where users had flagged factual errors, GPT-5.4's individual claims are 33% less likely to be false and its complete responses are 18% less likely to contain any errors, compared to GPT-5.2.

Coding: merging GPT-5.3-Codex strengths

GPT-5.4 merges the coding capabilities of GPT-5.3-Codex with its own strengths in reasoning and computer use. It matches or surpasses GPT-5.3-Codex on SWE-Bench Pro (57.7% vs 56.8%) while being faster at every reasoning level.

The /fast mode in Codex delivers up to 1.5x token generation speed with GPT-5.4. Same model, same intelligence, just faster. OpenAI also notes that the model excels at complex frontend tasks, producing more visually polished results than anything they have shipped before.

Tool Search: managing thousands of tools intelligently

GPT-5.4 introduces Tool Search, a game-changing feature for agentic workflows. Previously, all tool definitions were included in the prompt, which could add tens of thousands of tokens to each request. With Tool Search, the model receives a lightweight list of available tools and only loads the full definition when it actually needs it.

The result on the MCP Atlas benchmark with 36 MCP servers: 47% fewer tokens for the same accuracy. For MCP servers with tens of thousands of tokens in tool definitions, the savings are substantial.
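The deferred-loading idea can be illustrated with a small sketch. Everything here is hypothetical: the three tools, their schemas, and the 4-characters-per-token size estimate are made up for illustration, and the code only mimics the principle rather than OpenAI's actual Tool Search implementation.

```python
import json

# Hypothetical catalogue: names, descriptions, and schemas are invented.
FULL_DEFINITIONS = {
    "get_weather": {
        "name": "get_weather",
        "description": "Return the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {
                "city": {"type": "string", "description": "City name, e.g. Paris"},
                "units": {"type": "string", "enum": ["metric", "imperial"]},
            },
            "required": ["city"],
        },
    },
    "create_invoice": {
        "name": "create_invoice",
        "description": "Create an invoice for a customer.",
        "parameters": {
            "type": "object",
            "properties": {
                "customer_id": {"type": "string", "description": "Internal customer ID"},
                "amount": {"type": "number", "description": "Total amount due"},
                "currency": {"type": "string", "enum": ["USD", "EUR"]},
            },
            "required": ["customer_id", "amount"],
        },
    },
    "search_tickets": {
        "name": "search_tickets",
        "description": "Search support tickets by keyword.",
        "parameters": {
            "type": "object",
            "properties": {
                "query": {"type": "string", "description": "Free-text search query"},
                "status": {"type": "string", "enum": ["open", "closed", "any"]},
                "limit": {"type": "integer", "description": "Maximum results"},
            },
            "required": ["query"],
        },
    },
}


def lightweight_index(definitions):
    """Keep only the name and one-line description of each tool."""
    return [{"name": n, "description": d["description"]} for n, d in definitions.items()]


def load_definition(definitions, name):
    """Fetch the full schema for one tool, only when it is needed."""
    return definitions[name]


def approx_tokens(payload):
    """Crude size proxy (assumption): about 4 characters of JSON per token."""
    return len(json.dumps(payload)) // 4


def prompt_cost(definitions, tools_used):
    """Return (upfront, deferred) token estimates for a single request."""
    upfront = approx_tokens(list(definitions.values()))
    deferred = approx_tokens(lightweight_index(definitions)) + sum(
        approx_tokens(load_definition(definitions, name)) for name in tools_used
    )
    return upfront, deferred
```

Calling `prompt_cost(FULL_DEFINITIONS, ["get_weather"])` gives a smaller deferred estimate than the upfront one, because two of the three schemas are never loaded. The gap widens as the catalogue grows and the share of tools actually used shrinks.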

Detailed benchmarks

Benchmark	GPT-5.4	GPT-5.4 Pro	GPT-5.2
GDPval (professional work)	83.0%	82.0%	70.9%
SWE-Bench Pro (coding)	57.7%	-	55.6%
OSWorld-Verified (computer use)	75.0%	-	47.3%
BrowseComp (web search)	82.7%	89.3%	65.8%
Toolathlon (tools)	54.6%	-	45.7%
ARC-AGI-2 (reasoning)	73.3%	83.3%	52.9%
GPQA Diamond (science)	92.8%	94.4%	92.4%
Humanity's Last Exam	52.1%	58.7%	45.5%

GPT-5.4 vs GPT-5.2 performance on key benchmarks

Pricing and availability

GPT-5.4 Thinking is available today for ChatGPT Plus, Team, and Pro subscribers, replacing GPT-5.2 Thinking. The latter will remain accessible for 3 months in the Legacy Models section before being retired on June 5, 2026. GPT-5.4 Pro is reserved for Pro and Enterprise plans.

API Model	Input price	Cached input	Output price
gpt-5.2	$1.75 / M tokens	$0.175 / M tokens	$14 / M tokens
gpt-5.4	$2.50 / M tokens	$0.25 / M tokens	$15 / M tokens
gpt-5.4-pro	$30 / M tokens	-	$180 / M tokens

GPT-5.4 API pricing

GPT-5.4 costs more per token than GPT-5.2, but its greater token efficiency reduces the total number of tokens needed for many tasks, which can offset the higher rate. Batch and Flex processing are available at half the standard price.
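To compare the models on your own workloads, the pricing table above translates into a small cost calculator. This is a sketch under stated assumptions: the function name and parameters are made up for illustration, the 50% discount is applied as a flat multiplier, and cached tokens are billed at the normal input rate when the table lists no cached price.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Price:
    """USD per million tokens, copied from the pricing table in the article."""
    input: float
    cached_input: Optional[float]
    output: float


PRICES = {
    "gpt-5.2": Price(1.75, 0.175, 14.0),
    "gpt-5.4": Price(2.50, 0.25, 15.0),
    "gpt-5.4-pro": Price(30.0, None, 180.0),
}


def request_cost(model, input_tokens, output_tokens, cached_tokens=0, half_price=False):
    """Estimate the USD cost of one request.

    cached_tokens is the part of input_tokens served from cache.
    half_price applies the Batch/Flex discount as a flat 50% multiplier.
    """
    if cached_tokens > input_tokens:
        raise ValueError("cached_tokens cannot exceed input_tokens")
    price = PRICES[model]
    # Assumption: with no listed cached price, cached tokens cost the normal input rate.
    cached_rate = price.cached_input if price.cached_input is not None else price.input
    fresh_tokens = input_tokens - cached_tokens
    usd = (
        fresh_tokens * price.input
        + cached_tokens * cached_rate
        + output_tokens * price.output
    ) / 1_000_000
    return usd * 0.5 if half_price else usd
```

With hypothetical token counts of 100,000 input and 20,000 output, `request_cost("gpt-5.2", 100_000, 20_000)` comes to about $0.455. GPT-5.4 only works out cheaper for the same task if it finishes with noticeably fewer tokens, which is the efficiency argument made above.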

What this means for ChatGPT users

For the everyday ChatGPT user, GPT-5.4 delivers three major improvements: more accurate responses with fewer hallucinations, better context tracking across long conversations, and the ability to see and adjust the model's thinking plan as it works.

For developers and professionals, computer use and Tool Search are the game changers. The ability to build agents that browse the web, fill out forms, and chain complex tasks autonomously opens possibilities that were previously limited to custom-built solutions.

GPT-5.2 Thinking will be retired on June 5, 2026. If you have workflows or API integrations based on this model, plan your migration to GPT-5.4 in the coming weeks.

The model race is not slowing down

With GPT-5.4, OpenAI is directly responding to competitive pressure. Anthropic's Claude is advancing in reasoning and coding, Google's Gemini is pushing on multimodal and long context, and DeepSeek continues to surprise on efficiency. This launch is clearly an attempt to reclaim ground lost in recent months.

The real question remains one of sustainability. GPT-5.4 is impressive today, but in a market where a new frontier model seems to ship every week, how long will these benchmarks stay on top?
