OpenAI GPT-5.4
ClosedMultimodalOpenAI
OpenAI's flagship frontier model with perfect AIME 2025 score, leading agentic capabilities, and multiple size variants including Standard, Pro, Mini, and Nano.
Overview
GPT-5.4, released on March 5, 2026, is OpenAI's flagship frontier model and the culmination of their mixture-of-experts architecture scaling. It features a 1.05M token context window (922K input + 128K output), the largest usable context among closed-source competitors at launch. The model comes in five variants — Standard, Thinking, Pro, Mini, and Nano — spanning from a $180/month Pro subscription tier down to the lightweight Nano designed for on-device deployment.
The headline benchmark result is a perfect 100% on AIME 2025, the first model to achieve a flawless score on the American Invitational Mathematics Examination. But arguably more significant is its OSWorld performance: 75% versus the human expert baseline of 72.4%, marking the first time an AI model has surpassed human-level performance on real-world computer use tasks including web browsing, file management, and application interaction.
GPT-5.4's architecture uses a mixture-of-experts approach with a knowledge cutoff of August 2025. The Thinking variant introduces extended reasoning with a notable trade-off: time-to-first-token can reach 153 seconds on complex problems as the model works through its chain-of-thought before generating output. The Pro variant ($30 input / $180 output per MTok) unlocks the deepest reasoning capabilities, while Mini ($0.40/$1.60) offers a compelling balance for production workloads.
One practical consideration is cached input pricing at $1.25/MTok, which makes repeated prompts with shared prefixes substantially cheaper. The model scores well across the board — 92.8% GPQA Diamond, 57.7% SWE-bench Pro, and a tied-#1 AA Intelligence Index of 57 — positioning it as a strong generalist that particularly excels at agentic computer use and mathematical reasoning.
Release Date
2026-03-05
Parameters
Unknown (MoE)
Context Window
1.1M tokens
Input Price
$2.5 / 1M tokens
Output Price
$15 / 1M tokens
Speed
83.5 tokens/sec
Benchmarks
| Benchmark | Score | Max |
|---|---|---|
| AIME 2025 | 100% | 100% |
| GPQA Diamond | 92.8% | 100% |
| AA Index | 57% | 100% |
| SWE-bench Pro | 57.7% | 100% |
| Arena ELO | 1484 | 2000 |
| HLE | 44.3% | 100% |
Capabilities
Mathematical Reasoning
Perfect 100% on AIME 2025, first model to achieve a flawless score on this competition-level math exam
Agentic Computer Use
75% on OSWorld surpassing human expert baseline (72.4%), handling real web browsing and app interaction
Science Reasoning
GPQA Diamond 92.8%, strong across physics, chemistry, and biology graduate-level questions
Extended Context
1.05M token window (922K input + 128K output) with cached input at $1.25/MTok
Model Variants
Five tiers from Nano (on-device) to Pro ($30/$180) covering every deployment scenario
Software Engineering
SWE-bench Pro 57.7% with strong multi-file codebase understanding
Getting Started
Get API Access
Sign up at platform.openai.com and create an API key — choose between Standard, Mini, or Pro tiers based on your needs
Install the SDK
Run `pip install openai` (Python) or `npm install openai` (Node.js)
Make Your First Call
Use `client.chat.completions.create(model="gpt-5.4")` — add `reasoning_effort` parameter for the Thinking variant
Try ChatGPT
Access GPT-5.4 directly through chatgpt.com with a Plus ($20/mo) or Pro ($200/mo) subscription
Pros & Cons
Strengths
- +Highest AA Intelligence Index (tied #1 at 57)
- +Perfect AIME 2025 score (100%)
- +Strong agentic capabilities (GDPval #1)
- +Surpasses human expert on OSWorld (75%)
- +Multiple variants (Standard/Pro/Mini/Nano)
Weaknesses
- -Expensive ($2.50/$15 per MTok)
- -Closed source
- -Slow time-to-first-token for reasoning (153s)
- -Knowledge cutoff August 2025