What's new?
Anthropic is shipping Claude Opus 4.6 with a feature built for technical teams: Agent Teams. Multiple autonomous agents work in parallel, coordinated by a "lead" agent — just like a DevOps team splitting up tasks.
- Agent Teams: Multi-agent architecture (in Claude Code). A lead agent delegates to specialized sub-agents that run in parallel
- 1M token context: Beta mode (vs. 200K standard). Pass an entire monorepo or 50+ config files in a single context
- 128K output tokens: Double the previous limit (64K). Generate complete IaC stacks in one pass — no truncation
- Adaptive Thinking: The model decides when to engage extended reasoning — use the `effort` parameter (low/medium/high/max) instead of the deprecated `budget_tokens`
- Compaction API (beta): Server-side automatic context summarization — near-infinite conversations without blowing your token budget
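As a concrete illustration, here is a minimal sketch of building a request that uses the new controls. The model id `claude-opus-4-6` and the exact placement of the `effort` field are assumptions for illustration — check the current API reference before relying on them.

```python
# Sketch of a messages-API payload using effort instead of budget_tokens.
# The model id and the top-level position of "effort" are assumed, not confirmed.

def build_request(prompt: str, effort: str = "high") -> dict:
    """Build a request payload; effort replaces the deprecated budget_tokens."""
    assert effort in {"low", "medium", "high", "max"}
    return {
        "model": "claude-opus-4-6",   # assumed model id
        "max_tokens": 128_000,        # the new doubled output ceiling
        "effort": effort,             # adaptive-thinking control (assumed placement)
        "messages": [{"role": "user", "content": prompt}],
    }

req = build_request("Generate a Terraform module for an EKS cluster.")
```

The payload would then be sent with the standard client (`client.messages.create(**req)`); building it as a plain dict keeps the effort/output settings in one inspectable place.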
On Anthropic's published benchmarks, Opus 4.6 dominates. It leads GPT-5.2 by +144 Elo on GDPval-AA (knowledge work) and takes first place at launch on Terminal-Bench 2.0 (agentic coding). On BrowseComp (hard-to-find retrieval), it hits #1 industry-wide.
| Capability | Opus 4.5 | Opus 4.6 | Improvement |
|---|---|---|---|
| Context window | 200K standard | 200K (1M beta) | +400% in beta |
| Max output | 64K tokens | 128K tokens | x2 |
| Agent Teams | No | Yes (Claude Code) | New |
| Compaction API | No | Beta | New |
| MRCR v2 (1M needle)* | 18.5% (Sonnet 4.5) | 76% | x4 |
* Compared to Sonnet 4.5 (no Opus 4.5 data published for this benchmark)
What does this mean for DevOps and Platform Engineering teams?
Your next Kubernetes migration, fully orchestrated by agents:
- Agent 1 (Infrastructure): Generates Terraform manifests for the new EKS cluster
- Agent 2 (Configuration): Adapts Helm charts and ConfigMaps for the new environment
- Agent 3 (Migration): Writes the migration runbook and validation scripts
- Lead Agent: Coordinates all three agents, manages dependencies (Agent 2 waits for Agent 1 to finish), and compiles everything into a coherent strategy
The 128K output limit kills truncation. Request a complete AWS stack — VPC, subnets, security groups, ECS, RDS, monitoring — in a single prompt. No more splitting generation into chunks and stitching them back together.
The Compaction API solves the long-debugging-session problem. You're deep in a complex incident — two hours in, logs everywhere, a dozen hypotheses on the table. The API automatically summarizes older context server-side, preserving the thread of the conversation without blowing your token budget.
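The server-side details are still in beta, so here is a client-side analogue of the same idea to make the mechanism concrete: once history exceeds a budget, fold the oldest turns into a single summary turn. `summarize` is a stub for what would be another model call; all names here are hypothetical.

```python
# Client-side sketch of compaction: keep recent turns verbatim,
# collapse older turns into one summary message.

def summarize(turns: list[dict]) -> dict:
    # In practice this would be a model call; here it is a stub.
    return {"role": "user", "content": f"[summary of {len(turns)} earlier turns]"}

def compact(history: list[dict], max_turns: int = 6, keep_recent: int = 4) -> list[dict]:
    """Return history unchanged if short enough; otherwise summary + recent tail."""
    if len(history) <= max_turns:
        return history
    old, recent = history[:-keep_recent], history[-keep_recent:]
    return [summarize(old)] + recent

history = [{"role": "user", "content": f"log line {i}"} for i in range(10)]
compacted = compact(history)
```

The appeal of the real Compaction API is that this bookkeeping happens server-side, so the client never has to manage the summarization pass itself.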
Our take
Agent Teams is currently limited to Claude Code — not yet in the standard API. But it's a preview of where AI-assisted DevOps orchestration is headed. You can already replicate this pattern with the API today: build your own orchestrator that spawns multiple Claude threads with distinct roles.
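One way to replicate that pattern over the standard API today is to give each worker its own system prompt. The role texts and model id below are illustrative assumptions, not Anthropic's implementation; each payload would be sent with `client.messages.create(**payload)`:

```python
# DIY orchestrator sketch: one request payload per specialized role.
# Role prompts and the model id are illustrative, not official.

ROLES = {
    "infrastructure": "You generate Terraform for the target environment.",
    "configuration": "You adapt Helm charts and ConfigMaps.",
    "migration": "You write migration runbooks and validation scripts.",
}

def worker_request(role: str, task: str) -> dict:
    """Build the payload for one worker thread with its role as system prompt."""
    return {
        "model": "claude-opus-4-6",  # assumed model id
        "max_tokens": 8_000,
        "system": ROLES[role],
        "messages": [{"role": "user", "content": task}],
    }

requests = [worker_request(role, "Plan the EKS migration.") for role in ROLES]
```

Dispatching these payloads from a thread pool and merging the responses gives you the lead-agent behavior without waiting for Agent Teams to reach the API.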
The feature we think matters most long-term is the Compaction API. It's a server-side solution that maintains near-infinite technical conversations without manual context pruning. For teams doing AI pair-programming on complex systems, that's the real differentiator for Opus 4.6.
One more thing: OpenAI responded the same day with GPT-5.3-Codex. The agentic model race is on, and technical teams are the winners. Standard pricing stays the same ($5/$25 per million tokens input/output), though the extended 1M token context carries a premium rate.