OpenAI GPT-5.4

Closed · Multimodal

OpenAI

OpenAI's flagship frontier model, with a perfect AIME 2025 score, leading agentic capabilities, and multiple size variants including Standard, Pro, Mini, and Nano.

Overview

GPT-5.4, released on March 5, 2026, is OpenAI's flagship frontier model and the culmination of its mixture-of-experts architecture scaling. It features a 1.05M token context window (922K input + 128K output), the largest usable context among closed-source competitors at launch. The model comes in five variants (Standard, Thinking, Pro, Mini, and Nano), spanning from the top-end Pro tier ($30 input / $180 output per MTok) down to the lightweight Nano designed for on-device deployment.
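To make the 922K input + 128K output split concrete, here is a minimal sketch of a pre-flight budget check. It assumes the two figures act as separate limits on the prompt and on the requested completion. The article does not confirm how the API enforces them, and the function names are hypothetical.

```python
# Context-window figures quoted in this article (assumed to be separate limits).
MAX_INPUT_TOKENS = 922_000
MAX_OUTPUT_TOKENS = 128_000


def fits_context(prompt_tokens: int, max_output_tokens: int) -> bool:
    """Return True if the prompt and the requested completion both fit."""
    if prompt_tokens < 0 or max_output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    return (prompt_tokens <= MAX_INPUT_TOKENS
            and max_output_tokens <= MAX_OUTPUT_TOKENS)


def remaining_input_budget(prompt_tokens: int) -> int:
    """Tokens still available for additional input (never negative)."""
    return max(0, MAX_INPUT_TOKENS - prompt_tokens)


if __name__ == "__main__":
    print(fits_context(900_000, 100_000))   # both within their limits
    print(fits_context(950_000, 10_000))    # prompt exceeds the input limit
    print(remaining_input_budget(900_000))  # room left for more input
```

Token counts would have to come from a tokenizer matching the model; that step is left out here.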

The headline benchmark result is a perfect 100% on AIME 2025, which makes GPT-5.4 the first model to achieve a flawless score on the American Invitational Mathematics Examination. Arguably more significant is its OSWorld performance: 75% versus the human expert baseline of 72.4%, marking the first time an AI model has surpassed human-level performance on real-world computer use tasks including web browsing, file management, and application interaction.

GPT-5.4 uses a mixture-of-experts architecture and has a knowledge cutoff of August 2025. The Thinking variant introduces extended reasoning with a notable trade-off: time-to-first-token can reach 153 seconds on complex problems as the model works through its chain-of-thought before generating output. The Pro variant ($30 input / $180 output per MTok) unlocks the deepest reasoning capabilities, while Mini ($0.40 input / $1.60 output per MTok) offers a compelling balance for production workloads.
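As a rough illustration of what the time-to-first-token trade-off means for end-to-end latency, the sketch below adds the wait before the first token to the streaming time for the rest of the output. It assumes the 83.5 tokens/sec speed listed on this page also applies after a long reasoning phase, which the article does not state. The function name is hypothetical.

```python
def estimate_latency_seconds(output_tokens: int,
                             ttft_seconds: float,
                             tokens_per_second: float = 83.5) -> float:
    """Approximate wall-clock time: wait for the first token, then stream the rest.

    The default throughput is the speed figure quoted in this article;
    real throughput varies with load and variant.
    """
    if output_tokens < 0 or ttft_seconds < 0:
        raise ValueError("output_tokens and ttft_seconds must be non-negative")
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return ttft_seconds + output_tokens / tokens_per_second


if __name__ == "__main__":
    # Worst case quoted above: 153 s before the first token, then 2,000 tokens.
    worst = estimate_latency_seconds(2_000, ttft_seconds=153.0)
    print(f"about {worst:.0f} s end to end")
```

An estimate like this can help pick a client-side timeout that leaves headroom for the Thinking variant rather than failing on slow but valid responses.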

One practical consideration is cached input pricing at $1.25/MTok, half the standard $2.50 input rate, which makes repeated prompts with shared prefixes substantially cheaper. The model scores well across the board (92.8% GPQA Diamond, 57.7% SWE-bench Pro, and a tied-#1 AA Intelligence Index of 57), positioning it as a strong generalist that particularly excels at agentic computer use and mathematical reasoning.
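The effect of cached input pricing is easiest to see with a worked estimate. The sketch below uses only the Standard-tier prices quoted on this page ($2.50 input, $1.25 cached input, $15 output per 1M tokens). The `Rates` class and `estimate_cost` function are hypothetical helpers, and how many prompt tokens actually count as cached is decided by the provider, not by this code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rates:
    """Prices in USD per 1M tokens."""
    input: float
    cached_input: float
    output: float


# Standard-tier figures as quoted in this article; verify against current pricing.
STANDARD = Rates(input=2.50, cached_input=1.25, output=15.00)


def estimate_cost(rates: Rates, input_tokens: int, output_tokens: int,
                  cached_tokens: int = 0) -> float:
    """Estimated USD cost of one request; cached_tokens is a subset of input_tokens."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    if not 0 <= cached_tokens <= input_tokens:
        raise ValueError("cached_tokens must be between 0 and input_tokens")
    fresh_tokens = input_tokens - cached_tokens
    total = (fresh_tokens * rates.input
             + cached_tokens * rates.cached_input
             + output_tokens * rates.output)
    return total / 1_000_000


if __name__ == "__main__":
    # A 100K-token prompt with 2K tokens of output, without and with
    # a 90K-token shared prefix served from cache.
    cold = estimate_cost(STANDARD, 100_000, 2_000)
    warm = estimate_cost(STANDARD, 100_000, 2_000, cached_tokens=90_000)
    print(f"cold: ${cold:.4f}  warm: ${warm:.4f}")
```

In this example the cached request costs about $0.17 instead of $0.28, so the saving grows with the share of the prompt that is a reusable prefix.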

Release Date

2026-03-05

Parameters

Unknown (MoE)

Context Window

1.05M tokens (922K input + 128K output)

Input Price

$2.50 / 1M tokens

Output Price

$15 / 1M tokens

Speed

83.5 tokens/sec

Benchmarks

Benchmark        Score    Max
AIME 2025        100%     100%
GPQA Diamond     92.8%    100%
AA Index         57       100
SWE-bench Pro    57.7%    100%
Arena ELO        1484     2000
HLE              44.3%    100%

Capabilities

Mathematical Reasoning

Perfect 100% on AIME 2025, first model to achieve a flawless score on this competition-level math exam

Agentic Computer Use

75% on OSWorld surpassing human expert baseline (72.4%), handling real web browsing and app interaction

Science Reasoning

GPQA Diamond 92.8%, strong across physics, chemistry, and biology graduate-level questions

Extended Context

1.05M token window (922K input + 128K output) with cached input at $1.25/MTok

Model Variants

Five tiers from Nano (on-device) to Pro ($30 input / $180 output per MTok), covering a wide range of deployment scenarios

Software Engineering

SWE-bench Pro 57.7% with strong multi-file codebase understanding

Getting Started

1. Get API Access

Sign up at platform.openai.com and create an API key — choose between Standard, Mini, or Pro tiers based on your needs

2. Install the SDK

Run `pip install openai` (Python) or `npm install openai` (Node.js)

3. Make Your First Call

Use `client.chat.completions.create(model="gpt-5.4")`, and add the `reasoning_effort` parameter for the Thinking variant

4. Try ChatGPT

Access GPT-5.4 directly through chatgpt.com with a Plus ($20/mo) or Pro ($200/mo) subscription

Pros & Cons

Strengths

  • Highest AA Intelligence Index (tied #1 at 57)
  • Perfect AIME 2025 score (100%)
  • Strong agentic capabilities (GDPval #1)
  • Surpasses human expert baseline on OSWorld (75% vs. 72.4%)
  • Multiple variants (Standard/Thinking/Pro/Mini/Nano)

Weaknesses

  • Expensive ($2.50 input / $15 output per MTok)
  • Closed source
  • Slow time-to-first-token for reasoning (up to 153 s)
  • Knowledge cutoff of August 2025

Best For

  • General-purpose frontier tasks
  • Agentic workflows
  • Math reasoning
  • Enterprise applications
