Google Gemini 3.1 Pro
Closed · Multimodal
Google's frontier multimodal model with the #1 GPQA Diamond score, native video and audio input, a 1M-token context window, and fast inference at competitive pricing.
Overview
Gemini 3.1 Pro, released on February 19, 2026, is Google DeepMind's frontier multimodal model. It claims the #1 spot on GPQA Diamond, one of the hardest graduate-level science reasoning benchmarks, with a score of 94.3%. Unlike models that process different modalities through separate encoders stitched together, Gemini 3.1 Pro handles text, images, audio, and video natively within a unified architecture, with a 1M-token context window and a 65K output token limit.
The model introduces configurable thinking levels (Low, Medium, High) that let developers control the depth of reasoning per request. At High thinking, it tackles complex multi-step problems with extended chain-of-thought; at Low, it responds quickly for simpler queries. This flexibility is particularly valuable for production systems where latency budgets vary across different use cases within the same application.
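One way to use the thinking levels in a production system is to choose one per request from a latency budget. The sketch below is illustrative only: the function name, the millisecond thresholds, and the lowercase level strings are assumptions, not values taken from Google's documentation.

```python
from enum import Enum


class ThinkingLevel(str, Enum):
    """The three levels named above; the lowercase string values are an assumption."""

    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"


def pick_thinking_level(latency_budget_ms: int) -> ThinkingLevel:
    """Map a per-request latency budget to a thinking level.

    The thresholds are hypothetical placeholders; tune them against
    latencies measured in your own application.
    """
    if latency_budget_ms < 0:
        raise ValueError("latency_budget_ms must be non-negative")
    if latency_budget_ms < 2_000:
        return ThinkingLevel.LOW
    if latency_budget_ms < 15_000:
        return ThinkingLevel.MEDIUM
    return ThinkingLevel.HIGH


# Different use cases inside one application can get different levels.
for name, budget in [("autocomplete", 800), ("chat", 5_000), ("batch report", 60_000)]:
    print(f"{name}: {pick_thinking_level(budget).value}")
```

Keeping the mapping in one small function makes it easy to adjust the thresholds later without touching the call sites.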
Gemini 3.1 Pro also stands out for its native video and audio input processing. You can feed it raw video files and audio streams directly, with no preprocessing, transcription step, or frame extraction. This suits tasks like analyzing meeting recordings, understanding video content, or processing podcast audio, where other models may require separate toolchains. It scored 92.6% on MMMLU (multilingual understanding) and 80.6% on SWE-bench Verified, which indicates that it competes at the frontier across domains.
Google offers a free tier on Google AI Studio for experimentation, with expected GA pricing of around $2/$12 per million input/output tokens, rising to $4/$18 for extended context beyond 200K tokens. The 119 tok/s inference speed and Arena ELO of 1493 make it a serious contender for teams already in the Google Cloud ecosystem or those that need true multimodal input processing.
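To get a rough feel for what a request costs, the arithmetic can be sketched as below. It uses the $2 / $12 figures from the Input Price and Output Price fields on this page and the $4 / $18 extended-context rates. The function name is hypothetical, and the rule that the higher rates apply to the whole request once the prompt exceeds 200K tokens is an assumption. Preview pricing may also change.

```python
LONG_CONTEXT_THRESHOLD = 200_000  # prompt tokens; above this the extended rates apply

# USD per 1M tokens, as listed on this page; preview pricing may change.
STANDARD_RATES = {"input": 2.00, "output": 12.00}
EXTENDED_RATES = {"input": 4.00, "output": 18.00}


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of a single request in USD.

    Assumption: once the prompt exceeds the threshold, the extended rates
    apply to the whole request (input and output), not only to the excess.
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    extended = input_tokens > LONG_CONTEXT_THRESHOLD
    rates = EXTENDED_RATES if extended else STANDARD_RATES
    return (input_tokens * rates["input"] + output_tokens * rates["output"]) / 1_000_000


# A 100K-token prompt with a 5K-token answer stays on the standard rates.
print(f"${estimate_cost_usd(100_000, 5_000):.2f}")   # $0.26
# A 300K-token prompt with a 10K-token answer uses the extended rates.
print(f"${estimate_cost_usd(300_000, 10_000):.2f}")  # $1.38
```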
Release Date
2026-02-19
Parameters
Unknown (MoE)
Context Window
1.0M tokens
Input Price
$2 / 1M tokens
Output Price
$12 / 1M tokens
Speed
119 tokens/sec
Benchmarks
| Benchmark | Score | Max |
|---|---|---|
| GPQA Diamond | 94.3% | 100% |
| AA Index | 57% | 100% |
| Arena ELO | 1493 | 2000 |
| SWE-bench Verified | 80.6% | 100% |
| MMMLU | 92.6% | 100% |
| HLE | 51.4% | 100% |
Capabilities
Science Reasoning
#1 GPQA Diamond at 94.3%, the strongest graduate-level science reasoning among all models
Native Video & Audio
Process raw video files and audio streams directly without preprocessing or transcription
Configurable Thinking
Three levels (Low/Medium/High) to balance reasoning depth against latency per request
Long Context
1M token context window with 65K output limit for comprehensive document analysis
Multilingual Understanding
MMMLU 92.6%, strong performance across languages and cultural contexts
Software Engineering
SWE-bench Verified 80.6%, competitive with frontier coding models
Getting Started
Try for Free
Open Google AI Studio (aistudio.google.com) and select Gemini 3.1 Pro — no API key needed for the free tier
Get API Access
Create a project in Google AI Studio or Google Cloud, then generate an API key
Install the SDK
Run `pip install google-genai` (Python) or `npm install @google/genai` (Node.js)
Make Your First Call
Call `client.models.generate_content(model="gemini-3.1-pro", contents=...)`, passing video or audio files directly as input parts
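The steps above can be sketched as a small script using the `google-genai` Python SDK. This is a minimal sketch under several assumptions: the model ID is copied from the step above and may differ for the preview release, the API key is read from a `GEMINI_API_KEY` environment variable, and `meeting.mp4` is a hypothetical local file. The helper names `guess_media_type` and `describe_media` are illustrative and not part of the SDK.

```python
import mimetypes
import os

MODEL_ID = "gemini-3.1-pro"  # copied from the step above; the live model ID may differ


def guess_media_type(path: str) -> str:
    """Return the MIME type of a local media file, based on its extension."""
    mime_type, _ = mimetypes.guess_type(path)
    if mime_type is None or not mime_type.startswith(("video/", "audio/", "image/")):
        raise ValueError(f"unsupported or unknown media type for {path!r}")
    return mime_type


def describe_media(path: str, prompt: str) -> str:
    """Upload a media file and ask the model a question about it."""
    # Imported lazily so the helper above works without the SDK installed.
    from google import genai

    guess_media_type(path)  # fail early on unexpected file types
    client = genai.Client(api_key=os.environ["GEMINI_API_KEY"])
    # Large videos may need some time to finish processing after upload.
    uploaded = client.files.upload(file=path)
    response = client.models.generate_content(
        model=MODEL_ID,
        contents=[uploaded, prompt],
    )
    return response.text


# Only call the API when a key and the sample file are actually present.
if __name__ == "__main__" and "GEMINI_API_KEY" in os.environ and os.path.exists("meeting.mp4"):
    print(describe_media("meeting.mp4", "Summarize the key decisions in this recording."))
```

Checking the file type locally before uploading is a design choice, not an SDK requirement; it avoids spending an upload on a file the request was never meant to include.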
Pros & Cons
Strengths
- #1 GPQA Diamond (94.3%): best science reasoning
- Tied #1 AA Intelligence Index (57)
- 1M token context window
- Fast inference (119 tok/s)
- Supports video and audio input natively
- Free tier on Google AI Studio
Weaknesses
- Still in preview (not GA)
- Extended-context pricing is higher ($4/$18 beyond 200K tokens)
- Closed source
- Pricing may change at GA