Anthropic dropped Claude Opus 4.6 on Thursday, and this one feels different: the company's flagship model can now run multiple AI agents simultaneously, coordinating complex tasks like a real team rather than a single assistant working through a queue.

Just three months after releasing Opus 4.5 last November, Anthropic is pushing hard to expand what its most powerful model can do. The headline feature, "agent teams," addresses a fundamental limitation that's frustrated developers building with AI: the sequential bottleneck. Until now, even the most capable AI could only work on one thing at a time. That constraint just got lifted.


Agent Teams: Multiple AIs Working Together

The most significant addition to Opus 4.6 is what Anthropic calls "agent teams," a feature that lets multiple Claude instances split larger projects into parallel workstreams. Instead of one agent grinding through tasks sequentially, you can now distribute work across several agents that each own their piece and coordinate directly with each other.

Scott White, Head of Product at Anthropic, compared it to having a talented human team at your disposal. The agents can "coordinate in parallel [and work] faster" by dividing responsibilities rather than waiting in line. In practical terms, this means a codebase review that might have taken hours can now happen simultaneously across multiple sections, with agents communicating about dependencies and sharing context as they go.

The feature works through a lead session that coordinates the overall project, assigns tasks to team members, and synthesizes results. Each team member runs as an independent session with its own context window, but they can communicate directly and access a shared task list. Agents can claim tasks themselves or be assigned specific jobs, then work on different problems at the same time.
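
To make the shape of that workflow concrete, here is a minimal, purely illustrative sketch of the lead/worker pattern in plain Python threads. Nothing below touches Anthropic's API; in agent teams each "worker" is a full Claude session and this orchestration is handled for you.

```python
# Illustrative only: a toy version of the lead/worker pattern described above.
# Real agent teams run each worker as an independent Claude session.
import queue
import threading

tasks = queue.Queue()    # the shared task list the lead populates
results = {}             # what the lead synthesizes at the end
lock = threading.Lock()

def worker(name: str) -> None:
    while True:
        try:
            task = tasks.get_nowait()          # a worker "claims" a task
        except queue.Empty:
            return
        outcome = f"{name} reviewed {task}"    # stand-in for real agent work
        with lock:
            results[task] = outcome
        tasks.task_done()

# Lead: split the project into parallel workstreams.
for section in ("auth module", "payments module", "frontend"):
    tasks.put(section)

team = [threading.Thread(target=worker, args=(f"agent-{i}",)) for i in range(3)]
for t in team:
    t.start()
for t in team:
    t.join()

print(results)   # the lead combines the results into a final answer
```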

Currently, agent teams are available as a research preview in Claude Code for API users and subscribers. You activate the feature through an environment variable (CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS=1), and each agent instance is billed separately. Anthropic positions the feature for complex collaboration scenarios where multiple perspectives or parallel solution approaches are required.
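
If you prefer to script the preview rather than export the variable by hand, a minimal sketch (assuming the Claude Code CLI is installed and available as `claude` on your PATH) looks like this:

```python
# Sketch: launch Claude Code with the agent teams research preview enabled.
# Assumes the `claude` CLI is installed; each agent the team spawns is billed separately.
import os
import subprocess

env = os.environ.copy()
env["CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS"] = "1"   # opt-in flag for the preview

subprocess.run(["claude"], env=env, check=False)    # start an interactive session
```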

The timing is notable. OpenAI released its updated GPT-5.3 Codex almost simultaneously, which also enables multi-agent orchestration. The parallel development suggests both companies see agent coordination as the next critical capability for AI development tools.


A Million Tokens of Memory

Opus 4.6 brings a one million token context window, the first model in Anthropic's Opus class to reach that threshold. This matches what the company's Sonnet models (versions 4 and 4.5) currently offer and dramatically expands what the model can keep in mind during a single session.

For context, one million tokens translates to roughly 750,000 words, or about 1,500 pages of dense technical documentation. In practical terms, this means the model can process entire codebases, lengthy legal documents, or extensive research papers without losing track of early details by the time it reaches the end.
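
The arithmetic behind those figures is rule-of-thumb stuff, roughly 0.75 English words per token and about 500 words per dense page:

```python
# Rough arithmetic behind the figures above (approximate ratios, not exact counts).
context_tokens = 1_000_000
words = context_tokens * 0.75        # ~0.75 English words per token -> ~750,000 words
pages = words / 500                  # ~500 words per dense page     -> ~1,500 pages
print(f"{words:,.0f} words, about {pages:,.0f} pages")
```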

The context window remains in beta, but it represents a meaningful upgrade for enterprise workflows that involve large document sets or complex codebases. When you're debugging across hundreds of files or analyzing a stack of contracts, the ability to maintain coherent understanding across that volume of information matters.

Additionally, maximum output length has been doubled to 128,000 tokens, letting the model produce longer responses when needed. Combined with a new feature called "context compaction" that summarizes older context to make room for new inputs, Opus 4.6 can handle longer-running tasks without hitting limits that previously forced conversation restarts.
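
In API terms, the larger ceiling is just a bigger max_tokens budget. Here is a sketch using the Anthropic Python SDK, streamed because responses this long take a while to generate; the model identifier is an assumption, so check Anthropic's current model list for the exact string:

```python
# Sketch of a long-output request with the Anthropic Python SDK.
# "claude-opus-4-6" is an assumed model ID, not confirmed by this article.
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

with client.messages.stream(
    model="claude-opus-4-6",    # assumption: replace with the real identifier
    max_tokens=128_000,         # the doubled output ceiling
    messages=[{"role": "user", "content": "Write detailed docs for every module in this repo."}],
) as stream:
    for text in stream.text_stream:
        print(text, end="", flush=True)
```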


Adaptive Thinking: Smarter Resource Allocation

Opus 4.6 introduces "adaptive thinking," which lets the model gauge how much computational effort a prompt requires and adjust accordingly. Previously, developers could only enable or disable extended thinking as a binary option. Now Claude can read contextual clues and scale its reasoning depth automatically.

For developers who want explicit control, the /effort parameter lets you specify four levels (low, medium, high, max) to make direct tradeoffs between response quality, inference speed, and cost. Simple queries get quick answers; complex problems get deeper analysis. This granularity helps optimize both performance and spending.
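
How an effort level is attached to an API request isn't documented here, so the sketch below only illustrates the tradeoff itself: the level names come from the /effort options above, while the selection heuristic is entirely hypothetical, a manual stand-in for what adaptive thinking automates.

```python
# Illustrative only: choosing an effort level by prompt complexity.
# The level names mirror the /effort options; the heuristic itself is made up.
EFFORT_LEVELS = ("low", "medium", "high", "max")

def pick_effort(prompt: str) -> str:
    """Cheap and fast for simple prompts, deeper reasoning for complex ones."""
    if len(prompt) < 200:
        return "low"       # quick factual question
    if len(prompt) < 2_000:
        return "medium"    # moderate analysis
    return "high"          # large multi-step problem ("max" reserved for manual override)

print(pick_effort("What's 17% of 2,340?"))   # -> low
```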

The feature addresses a common frustration with frontier models: you often pay for heavy computation on straightforward questions that don't need it. Adaptive thinking means simple tasks stay fast and cheap while complex challenges get the reasoning resources they require.


PowerPoint Integration

Among the more practical additions, Opus 4.6 integrates Claude directly into PowerPoint as an accessible side panel. This represents an upgrade from previous integrations where users could ask Claude to create a presentation, but then had to transfer the file to PowerPoint for editing.

Now the entire workflow happens within PowerPoint. You can craft presentations with Claude's assistance without the friction of export, import, and manual adjustment. For knowledge workers who build decks regularly, this eliminates a repetitive handoff that added time to every project.

The integration reflects Anthropic's broader push to make Opus useful beyond software development. While the model built its reputation on coding capabilities, the company sees opportunity in the wider knowledge worker market.


Beyond Developers: A Broader Audience

White told TechCrunch that Opus has evolved from a model highly capable in software development into one that could be "really useful for a broader set" of knowledge workers. The observation came from watching how people actually use the tools.

"We noticed a lot of people who are not professional software developers using Claude Code simply because it was a really amazing engine to do tasks," White said. The user base has expanded to include product managers, financial analysts, and people from a variety of industries who adopted the tool for its task execution capabilities rather than its coding strength.

This broadening appeal aligns with how Anthropic has positioned Opus 4.6's improvements. The model reportedly excels at producing documents, spreadsheets, and presentations that look and read like expert-created work. According to Anthropic, it understands the conventions and norms of professional domains, making it suitable for finance, legal, and other precision-critical industries.


Benchmark Performance: Setting New Standards

Anthropic claims Opus 4.6 delivers major improvements across virtually every benchmark compared to its predecessor and many competitors. On Terminal-Bench 2.0, which tests agent-based programming, the model scores 65.4% versus 59.8% for Opus 4.5. On the OSWorld agentic computer use benchmark, scores rose from 66.3% to 72.7%.

Perhaps more significantly, Opus 4.6 shows substantial gains on the GDPval-AA test, which measures how well AI models can perform economically relevant work tasks. Here, the model surpasses OpenAI's GPT-5.2 by 144 Elo points and its direct predecessor Opus 4.5 by 190 Elo points.

The model also leads in the "Humanity's Last Exam" reasoning benchmark and performs well on tests examining tool usage and complex bug diagnosis. According to Anthropic, it "gets much closer to production-ready quality on the first try than what we've seen with any model."

Some regressions appeared on SWE-bench Verified and on the MCP Atlas benchmark for tool usage. Those anomalies stand out given the model's strong performance on similar benchmarks, suggesting specific edge cases rather than broader capability gaps.


Pricing and Availability

Opus 4.6 maintains the same pricing as its predecessor: $5 per million input tokens and $25 per million output tokens. For premium requests exceeding 200,000 tokens, prices increase to $10 and $37.50 respectively. A new option allows customers to run inference exclusively in the United States for a 10% surcharge, addressing digital sovereignty requirements for regulated industries.
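
A quick back-of-the-envelope example at those rates, ignoring prompt caching and the premium long-context tier:

```python
# Cost of one Opus 4.6 request at the listed standard rates.
INPUT_PER_MTOK = 5.00      # USD per million input tokens
OUTPUT_PER_MTOK = 25.00    # USD per million output tokens

input_tokens = 150_000     # e.g. a large codebase excerpt (below the 200K premium tier)
output_tokens = 8_000      # a detailed review in response

cost = (input_tokens / 1e6) * INPUT_PER_MTOK + (output_tokens / 1e6) * OUTPUT_PER_MTOK
print(f"${cost:.2f}")               # -> $0.95
print(f"${cost * 1.10:.2f}")        # with the 10% US-only inference surcharge
```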

The model is already available through Anthropic's API and has launched on both Amazon Bedrock and Microsoft Azure's Foundry platform. These cloud partnerships expand access for enterprise customers who prefer to run AI workloads within their existing infrastructure relationships.

Agent teams and the million-token context window both carry beta or research preview designations, suggesting ongoing refinement. But the core model is production-ready and available now for organizations that want to integrate it.


What This Means Going Forward

The release of Opus 4.6 accelerates a trend that's been building across the AI industry: the shift from single-agent assistants to coordinated multi-agent systems. When you can split complex work across multiple AI instances that communicate and collaborate, the ceiling on what AI can accomplish rises substantially.

For developers, agent teams open new architectural possibilities. Complex projects that previously required careful sequential orchestration can now run in parallel, cutting time to completion while maintaining coherence across workstreams. The feature is especially promising for read-heavy work like codebase reviews, where multiple agents can analyze different sections simultaneously.

For enterprise knowledge workers, the combination of expanded context windows, native productivity tool integrations, and improved professional output quality makes Opus 4.6 a more practical choice for real workflows. The model handles large codebases, lengthy documents, and complex presentations with fewer limitations than before.

The AI landscape continues accelerating, with Anthropic, OpenAI, and Google all pushing frontier capabilities on parallel tracks. Opus 4.6 represents Anthropic's current answer to what comes after the single-agent paradigm, and the agent teams feature may signal where the entire industry is heading.


FAQ

What is Claude Opus 4.6?

Claude Opus 4.6 is Anthropic's newest flagship AI model, released in February 2026. It introduces agent teams that enable multiple AI instances to work in parallel on complex tasks, a one million token context window (in beta), and improved performance across coding, reasoning, and professional work benchmarks. It succeeds Opus 4.5, which was released in November 2025.

What are agent teams in Opus 4.6?

Agent teams allow developers to run multiple Claude Code instances simultaneously, with each agent owning a piece of a larger project. A lead session coordinates work, assigns tasks, and synthesizes results while team members communicate directly and access shared task lists. The feature enables parallel processing of complex tasks rather than sequential execution.

How much does Claude Opus 4.6 cost?

Pricing remains the same as Opus 4.5: $5 per million input tokens and $25 per million output tokens. For premium requests exceeding 200,000 tokens, prices increase to $10 and $37.50 respectively. A 10% surcharge applies for inference running exclusively in the United States. Agent teams incur higher costs since each instance is billed separately.

What is the context window size for Opus 4.6?

Opus 4.6 offers a one million token context window, equivalent to approximately 750,000 words or 1,500 pages of documentation. This matches the context window available in Anthropic's Sonnet 4 and 4.5 models. The feature remains in beta and enables work with larger codebases and longer documents within a single session.

How does Opus 4.6 compare to GPT-5.2?

According to Anthropic's benchmarks, Opus 4.6 outperforms OpenAI's GPT-5.2 on several key metrics, including the GDPval-AA test for economically relevant work tasks by 144 Elo points. It also shows strong performance on Terminal-Bench 2.0 for agentic coding and OSWorld for computer use. OpenAI released its competing GPT-5.3 Codex almost simultaneously.

What is adaptive thinking in Opus 4.6?

Adaptive thinking allows Opus 4.6 to automatically determine how much computational effort to invest in a prompt based on contextual clues. Previously, developers could only enable or disable extended thinking as a binary option. Now the model scales reasoning depth automatically, or developers can use the /effort parameter to specify low, medium, high, or max effort levels.

Is Opus 4.6 available on cloud platforms?

Yes, Opus 4.6 is available through Anthropic's API and has launched on Amazon Bedrock and Microsoft Azure's Foundry platform. Enterprise customers can run the model within their existing cloud infrastructure relationships with full access to the new capabilities including agent teams and extended context windows.

What industries benefit most from Opus 4.6?

According to Anthropic, Opus 4.6 serves a broad range of knowledge workers beyond software developers. The company specifically mentions product managers, financial analysts, legal professionals, and workers in precision-critical industries. The model's improved ability to produce professional-quality documents, spreadsheets, and presentations makes it suitable for finance, legal, and enterprise workflows.

How do I enable agent teams?

Agent teams are currently available as a research preview. Developers can activate the feature by setting the environment variable CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS=1. The feature works best for complex collaboration scenarios requiring multiple perspectives or parallel solution approaches. Note that each agent instance is billed separately, increasing token costs.

What new integrations does Opus 4.6 include?

Opus 4.6 integrates Claude directly into PowerPoint as an accessible side panel, allowing users to create and edit presentations within PowerPoint with AI assistance. Previously, users had to create presentations with Claude and then transfer files to PowerPoint for editing. The native integration eliminates this workflow friction for knowledge workers who build decks regularly.

