Learn anything
A smarter model to help you learn, plan and build like never before.
Understand complex topics in a way that makes sense for you – with clear, concise and helpful responses
Bring your ideas to life – from sketches and prompts to interactive tools and experiences
Delegate tasks and multi-step projects to get things done faster than ever before
Smart, concise, direct responses – with genuine insight over cliché and flattery.
Text, images, video, audio – even code. Gemini 3.1 Pro is state-of-the-art on reasoning, with unprecedented depth and nuance.
Gemini 3.1 brings exceptional instruction following – with meaningfully improved tool use and agentic coding.
Better tool use. Simultaneous, multi-step tasks. Gemini 3.1’s agentic capabilities can build more helpful and intelligent personal AI assistants.
Gemini 3.1 Pro uses advanced reasoning to configure live telemetry streams and build dynamic applications, like this aerospace dashboard.
Gemini 3.1 Pro codes an immersive starling murmuration, complete with hand-tracking manipulation and dynamic generative audio.
From terrain generation to traffic flow, Gemini 3.1 Pro uses advanced reasoning to code and assemble the many layers of a simulated city.
Gemini 3.1 Pro understands design intent, converting static SVGs into animated, code-based graphics for faster, cleaner web development.
Gemini 3.1 Pro reasons through the atmospheric tone of a novel to build a modern, personalized portfolio.
Build with our new agentic development platform
Leap from prompt to production
Get started building with cutting-edge AI models
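As a minimal sketch of getting started, the example below calls a Gemini model through the official `google-genai` Python SDK. The model id `"gemini-3.1-pro"` is an assumption based on the naming used on this page – check the API reference for the exact identifier, and set `GEMINI_API_KEY` in your environment before running.

```python
import os

# Assumed model id, inferred from this page's naming; verify against the API docs.
MODEL_ID = "gemini-3.1-pro"

def make_request(prompt: str) -> dict:
    """Bundle the model id and prompt into the keyword arguments
    expected by generate_content, so the call site stays simple."""
    return {"model": MODEL_ID, "contents": prompt}

def ask_gemini(prompt: str) -> str:
    """Send a single text prompt and return the model's text reply."""
    from google import genai  # pip install google-genai
    client = genai.Client(api_key=os.environ["GEMINI_API_KEY"])
    response = client.models.generate_content(**make_request(prompt))
    return response.text

if __name__ == "__main__":
    # Only issue a network call when a key is actually configured.
    if os.environ.get("GEMINI_API_KEY"):
        print(ask_gemini("Explain starling murmurations in two sentences."))
```

The same `generate_content` entry point accepts images, audio and other parts in `contents`, which is how the multimodal scenarios described above are driven.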
| Benchmark | Notes | Gemini 3.1 Pro Thinking (High) | Gemini 3 Pro Thinking (High) | Sonnet 4.6 Thinking (Max) | Opus 4.6 Thinking (Max) | GPT-5.2 Thinking (xhigh) | GPT-5.3-Codex Thinking (xhigh) |
|---|---|---|---|---|---|---|---|
| Humanity's Last Exam Academic reasoning (full set, text + MM) | No tools | 44.4% | 37.5% | 33.2% | 40.0% | 34.5% | — |
| | Search (blocklist) + Code | 51.4% | 45.8% | 49.0% | 53.1% | 45.5% | — |
| ARC-AGI-2 Abstract reasoning puzzles | ARC Prize Verified | 77.1% | 31.1% | 58.3% | 68.8% | 52.9% | — |
| GPQA Diamond Scientific knowledge | No tools | 94.3% | 91.9% | 89.9% | 91.3% | 92.4% | — |
| Terminal-Bench 2.0 Agentic terminal coding | Terminus-2 harness | 68.5% | 56.9% | 59.1% | 65.4% | 54.0% | 64.7% |
| | Other best self-reported harness | — | — | — | — | 62.2% (Codex) | 77.3% (Codex) |
| SWE-Bench Verified Agentic coding | Single attempt | 80.6% | 76.2% | 79.6% | 80.8% | 80.0% | — |
| SWE-Bench Pro (Public) Diverse agentic coding tasks | Single attempt | 54.2% | 43.3% | — | — | 55.6% | 56.8% |
| LiveCodeBench Pro Competitive coding problems from Codeforces, ICPC and IOI | Elo | 2887 | 2439 | — | — | 2393 | — |
| SciCode Scientific research coding | — | 59% | 56% | 47% | 52% | 52% | — |
| APEX-Agents Long horizon professional tasks | — | 33.5% | 18.4% | — | 29.8% | 23.0% | — |
| GDPval-AA Elo Expert tasks | — | 1317 | 1195 | 1633 | 1606 | 1462 | — |
| τ2-bench Agentic and tool use | Retail | 90.8% | 85.3% | 91.7% | 91.9% | 82.0% | — |
| | Telecom | 99.3% | 98.0% | 97.9% | 99.3% | 98.7% | — |
| MCP Atlas Multi-step workflows using MCP | — | 69.2% | 54.1% | 61.3% | 59.5% | 60.6% | — |
| BrowseComp Agentic search | Search + Python + Browse | 85.9% | 59.2% | 74.7% | 84.0% | 65.8% | — |
| MMMU-Pro Multimodal understanding and reasoning | No tools | 80.5% | 81.0% | 74.5% | 73.9% | 79.5% | — |
| MMMLU Multilingual Q&A | — | 92.6% | 91.8% | 89.3% | 91.1% | 89.6% | — |
| MRCR v2 (8-needle) Long context performance | 128k (average) | 84.9% | 77.0% | 84.9% | 84.0% | 83.8% | — |
| | 1M (pointwise) | 26.3% | 26.3% | Not supported | Not supported | Not supported | — |