Claude Opus 4.6
Hybrid reasoning model that pushes the frontier for coding and AI agents, featuring a 1M context window
Announcements
Claude Opus 4.6
Feb 5, 2026
Claude Opus 4.6 is our most capable model to date. Building on the intelligence of Opus 4.5, it brings new levels of reliability and precision to coding, agents, and enterprise workflows.
Claude Opus 4.5
Nov 24, 2025
Claude Opus 4.5 is our most intelligent model to date. It sets a new standard across coding, agents, computer use, and enterprise workflows. Opus 4.5 is a meaningful step forward in what AI systems can do.
Claude Opus 4.1
Aug 5, 2025
Claude Opus 4.1 is a drop-in replacement for Opus 4 that delivers superior performance and precision for real-world coding and agentic tasks. It handles complex, multi-step problems with more rigor and attention to detail.
Claude Opus 4
May 22, 2025
Claude Opus 4 pushes the frontier in coding, agentic search, and creative writing. We’ve also made it possible to run Claude Code in the background, enabling developers to assign long-running coding tasks for Opus to handle independently.
Availability and pricing
For business users and consumers who want to collaborate with our most powerful model on complex tasks, Opus 4.6 is available on Claude for Pro, Max, Team, and Enterprise users.
For developers interested in building AI solutions that demand frontier intelligence, Opus 4.6 is available on the Claude Platform natively, and in Amazon Bedrock, Google Cloud’s Vertex AI, and Microsoft Foundry. The 1M token context window is currently available in beta on the Claude Platform only.
Pricing for Opus 4.6 starts at $5 per million input tokens and $25 per million output tokens, with up to 90% cost savings with prompt caching and 50% savings with batch processing. To learn more, check out our pricing page. To get started, use claude-opus-4-6 via the Claude API.
For workloads that need to run in the US, US-only inference is available at 1.1x pricing for input and output tokens.
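As a rough illustration of how the list prices above combine, the following sketch estimates the cost of a single workload. It is not an official calculator. The function name and parameters are hypothetical, and it makes several simplifying assumptions: cached input tokens are billed at the full 90% discount (the page says "up to" 90%), cache-write charges are ignored, the batch discount applies to both input and output, and the batch and US-only multipliers stack multiplicatively. Check the pricing page for the actual rules.

```python
# Illustrative cost estimate based on the list prices quoted above.
# All names here are hypothetical; the stacking rules are assumptions.

INPUT_USD_PER_MTOK = 5.00     # $5 per million input tokens
OUTPUT_USD_PER_MTOK = 25.00   # $25 per million output tokens
CACHE_DISCOUNT = 0.90         # assumed: best-case "up to 90%" saving on cached input
BATCH_DISCOUNT = 0.50         # 50% saving with batch processing
US_ONLY_MULTIPLIER = 1.1      # 1.1x pricing for US-only inference


def estimate_cost_usd(input_tokens, output_tokens,
                      cached_fraction=0.0, batch=False, us_only=False):
    """Return an estimated cost in USD under the assumptions stated above."""
    if not 0.0 <= cached_fraction <= 1.0:
        raise ValueError("cached_fraction must be between 0 and 1")

    # Split input tokens into cached (discounted) and uncached (full price).
    cached = input_tokens * cached_fraction
    uncached = input_tokens - cached
    billable_input = uncached + cached * (1 - CACHE_DISCOUNT)

    total = (billable_input * INPUT_USD_PER_MTOK
             + output_tokens * OUTPUT_USD_PER_MTOK) / 1_000_000

    if batch:
        total *= 1 - BATCH_DISCOUNT
    if us_only:
        total *= US_ONLY_MULTIPLIER
    return round(total, 6)


if __name__ == "__main__":
    # 2M input tokens and 500K output tokens at list price.
    print(estimate_cost_usd(2_000_000, 500_000))              # 22.5
    # The same workload submitted through batch processing.
    print(estimate_cost_usd(2_000_000, 500_000, batch=True))  # 11.25
```

For example, 2 million input tokens and 500,000 output tokens come to 2 x $5 + 0.5 x $25 = $22.50 at list price, or $11.25 with the batch discount applied.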
Use cases
Opus 4.6 is a premium model that works best for tasks no prior model could handle and where performance matters most. It’s built for professional software engineering, complex agentic workflows, and high-stakes enterprise tasks.
Opus 4.6 offers hybrid reasoning that allows for instant responses or extended thinking. API users have fine-grained controls for adjusting the overall effort applied to a response, balancing performance with latency and cost. Popular use cases include:
Advanced coding
Opus 4.6 can confidently deliver production-ready code with minimal oversight. It plans carefully, runs for longer with sustained effort, and operates reliably in larger codebases. Strong code review and debugging skills mean it catches its own mistakes. Senior engineers can delegate complex tasks with confidence.
AI agents
Opus 4.6 makes agents meaningfully more useful. It handles longer, more complex task chains with fewer errors and less hand-holding, adapting its approach as conditions change. It is ideal for complex, multi-step agentic workflows where reliability and autonomy matter the most.
Enterprise workflows
Opus 4.6 brings a level of consistency that makes AI practical for sustained, high-stakes work. It maintains context and quality across large projects and shows strong performance on everyday tasks like working with documents, spreadsheets, and presentations, running financial analyses, reading charts and diagrams, and doing research. It delivers the precision and consistency that enterprise work demands.
Benchmarks
Claude Opus 4.6 is state-of-the-art across a wide range of coding and agentic capabilities.
Opus 4.6 demonstrates strong performance across many domains. It achieves industry-leading results with 65.4% on Terminal-Bench 2.0. It is also our best computer-using model, reaching 72.7% on OSWorld.

Trust and safety
Extensive testing and evaluation, conducted in partnership with external experts, ensures the release of Opus 4.6 meets Anthropic’s standards for safety, security, and reliability. The accompanying model card covers safety results in depth.
Hear from our customers
Claude Opus 4.6 is a huge leap for agentic planning. It breaks complex tasks into independent subtasks, runs tools and subagents in parallel, and identifies blockers with real precision.
Claude Opus 4.6 is the best model we've tested yet. Its reasoning and planning capabilities have been exceptional at powering our AI Teammates. It's also a fantastic coding model – its ability to navigate a large codebase and identify the right changes to make is state of the art.
Claude Opus 4.6 is the strongest model Anthropic has shipped. It takes complicated requests and actually follows through, breaking them into concrete steps, executing, and producing polished work even when the task is ambitious. For Notion users, it feels less like a tool and more like a capable collaborator.
Claude Opus 4.6 reasons through complex problems at a level we haven't seen before. It considers edge cases that other models miss and consistently lands on more elegant, well-considered solutions. We're particularly impressed with Opus 4.6 in Devin Review, where it's increased our bug catching rates.
Across 40 cybersecurity investigations, Claude Opus 4.6 produced the best results 38 of 40 times in blind ranking against Claude 4.5 models. Each model ran end-to-end on the same agentic harness with up to 9 subagents and 100+ tool calls.
Claude Opus 4.6 is the new frontier on long-running tasks from our internal benchmarks and testing. It's also been highly effective at reviewing code.
Claude Opus 4.6 achieved the highest BigLaw Bench score of any Claude model at 90.2%. With 40% perfect scores and 84% above 0.8, it's remarkably capable for legal reasoning.
Claude Opus 4.6 autonomously closed 13 issues and assigned 12 issues to the right team members in a single day, managing a ~50-person organization across 6 repositories. It handled both product and organizational decisions while synthesizing context across multiple domains, and knew when to escalate to a human.
Claude Opus 4.6 is an uplift in design quality. It works beautifully with our design systems and it's more autonomous, which is core to Lovable's values. People should be creating things that matter, not micromanaging AI.
Both hands-on testing and evals show Claude Opus 4.6 is a meaningful improvement for design systems and large codebases, use cases that drive enormous enterprise value. It also one-shotted a fully functional physics engine, handling a large multi-scope task in a single pass.
Claude Opus 4.6 is the biggest leap I've seen in months. I'm more comfortable giving it a sequence of tasks across the stack and letting it run. It's smart enough to use subagents for the individual pieces.
Claude Opus 4.6 handled a multi-million-line codebase migration like a senior engineer. It planned upfront, adapted its strategy as it learned, and finished in half the time.
Global enterprise clients bring us their hardest problems. Claude Opus 4.6 sets a new bar: reasoning sustains at depth, it self-catches errors, and produces stronger outputs faster. Our 1,300 people can spend less time correcting and more time solving.
We only ship models in v0 when developers will genuinely feel the difference. Claude Opus 4.6 passed that bar with ease. Its frontier-level reasoning, especially with edge cases, helps v0 to deliver on our number one aim: to let anyone elevate their ideas from prototype to production.
Claude Opus 4.6 achieved 85% recall on our biopharma competitive intelligence benchmark, a 12-point lift over baseline (p<0.02; 100% Bayesian probability of improvement), through autonomous 15-minute discovery loops with zero prompt tuning. On the hardest tasks, the improvement exceeded 30 points. For users who need to find every competitor, not just the obvious ones, this lift makes a critical difference.
Claude Opus 4.6 scores 69% on Terminal Bench 2 in Droid, a clear jump from Opus 4.5. For autonomous software engineering, that's a meaningful step forward.
Our hardest benchmark contains 200 analytical reasoning problems. Claude Opus 4.6 beat every model we've had in production. It's a clear candidate for production traffic.
Claude Opus 4.6 is the best orchestration model we've used for complex multi-agent work. It tracks how sub-agents are doing, proactively steers them, and terminates when needed. That kind of active management is new.
The performance jump with Claude Opus 4.6 feels almost unbelievable. Real-world tasks that were challenging for Opus suddenly became easy. This feels like a watershed moment for spreadsheet agents on Shortcut.
Claude Opus 4.6 is showing gains on solubility editing where previous models couldn't. It's the first improvement we've seen on one of the most challenging tasks in molecular design.
With Claude Opus 4.6, creating financial PowerPoints that used to take hours now takes minutes. We're seeing tangible improvements in attention to detail, spatial layout, and content structuring.
Claude Opus 4.6 generates complex, interactive apps and prototypes in Figma Make with an impressive creative range. The model translates detailed designs and multi-layered tasks into code on the first try, making it a powerful starting point for teams to explore and build ideas.
Early testing shows Claude Opus 4.6 delivering on the complex, multi-step coding work developers face every day, especially agentic workflows that demand planning and tool calling. This starts unlocking long-horizon tasks at the frontier.
Claude Opus 4.6 just keeps working through problems without needing to be nudged. I ran it headlessly for much longer than any model we've used before. It's significantly more persistent and agentic.
Claude Opus 4.6 represents a meaningful leap in long-context performance. In our testing, we saw it handle much larger bodies of information with a level of consistency that strengthens how we design and deploy complex research workflows. Progress in this area gives us more powerful building blocks to deliver truly expert-grade systems professionals can trust.
Claude in Excel powered by Opus 4.6 represents a significant leap forward. From due diligence to financial modeling, it’s proving to be a remarkably powerful tool for our team - taking unstructured data and intelligently working with minimal prompting to meaningfully automate complex analysis. It’s an excellent example of AI augmenting investment professionals’ capabilities in tangible, time-saving ways.
As one of Canada’s largest institutional investors, we’re constantly innovating and see AI at the forefront of shaping our future. Claude Opus 4.6's enhanced speed, precision, and capacity for complex tasks, like multi-tab analysis in Claude for Excel, unlock exciting possibilities for how we work.
Claude Opus 4.6 feels noticeably better than Opus 4.5 in Windsurf, especially on tasks that require careful exploration like debugging and understanding unfamiliar codebases. We've noticed Claude Opus 4.6 thinks longer, which pays off when deeper reasoning is needed.
Claude Opus 4.6 is now our default model. It outperforms other models on real workloads, especially data retrieval and tool use.
Opus 4.6 is the best Anthropic model we’ve tested. It understands intent with minimal prompting and went above and beyond, exploring & creating details I didn't even know I wanted until I saw them. It felt like I was working with the model, not waiting on it.
Claude Opus 4.6 excels in high-reasoning tasks, like multi-source analysis, across legal, financial, and technical content. Box’s eval showed a 10% lift in performance, reaching 68% vs. a 58% baseline, and near-perfect scores in technical domains.
Greptile pushes the frontier on long-horizon coding and reasoning tasks. Claude Opus 4.6 marks a large step forward in this space. We are excited to use it.
Anthropic already had the best coding model in the world and Opus 4.6 continues that trajectory. In our internal Auggie bench eval, this is the first time we've consistently seen the model’s coding output truly compare to expert human quality.
Claude Opus 4.6 delivers the depth and structure our users need on complex research queries. It gives thorough, evidence-backed responses that consistently outperform what we've seen from any other model.
Frequently asked questions
We offer Claude models across the spectrum of speed, price, and performance. Opus 4.6 is our most capable model to date. We recommend Opus 4.6 for your most demanding use cases where you need frontier intelligence, particularly production-ready code, sophisticated AI agents, and complex document creation.
Pricing depends on how you want to use Opus 4.6. To learn more, check out our pricing page.
Opus 4.6 is both a standard model and a hybrid reasoning model in one. You can pick when you want the model to answer normally and when you want it to use extended thinking.
Extended thinking mode is best for use cases where performance and accuracy matter more than latency. It significantly improves response quality for complex reasoning tasks, extended agentic work, multi-step coding projects, and deep research. The thinking summaries help you understand key aspects of the model’s reasoning process.