The Advisor Strategy: How Anthropic Is Rethinking Model Costs
Pairing a cheap executor with an expensive Opus advisor that only speaks at decision forks. The numbers are hard to dismiss — and the mental model behind them matters more than the benchmarks.
Anthropic just mass-produced the senior engineer. Their new Advisor Strategy, announced on the company blog, pairs a cheap executor model — Sonnet or Haiku — with an expensive Opus advisor that only gets consulted at decision forks. The executor handles all tool calls, iteration, and user-facing output. The advisor never touches tools directly; it responds with 400-700 tokens of guidance and steps back. A new advisor_20250301 tool in the API makes the handoff explicit and programmable.
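To make the handoff concrete, here is a minimal sketch of what a request using the advisor tool might look like. Only the tool name `advisor_20250301` comes from the announcement; every field name and model identifier below is an assumption for illustration, not documented API surface.

```python
def build_request(task: str, max_uses: int = 3) -> dict:
    """Hypothetical request payload pairing a cheap executor with an
    Opus advisor. Field shapes are assumed, not official."""
    return {
        "model": "claude-sonnet",  # cheap executor (assumed model id)
        "tools": [{
            "type": "advisor_20250301",   # tool name from the announcement
            "name": "advisor",
            "advisor_model": "claude-opus",  # assumed parameter name
            "max_uses": max_uses,            # cap on advisor consultations
        }],
        "messages": [{"role": "user", "content": task}],
    }
```

The executor model sits in the top-level `model` field and does all the work; the advisor appears only as a tool entry, which is what makes the routing explicit and programmable rather than buried in prompt engineering.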
The headline numbers make the case. Sonnet paired with an Opus advisor gains 2.7 percentage points on SWE-bench while costing 11.9% less than running Opus end-to-end. The Haiku numbers are more dramatic: performance on BrowseComp more than doubled, jumping from 19.7% to 41.2%, at 85% less cost than running Sonnet solo. These aren't cherry-picked micro-benchmarks. SWE-bench tests real software engineering tasks, and BrowseComp measures complex information retrieval — both proxies for the kind of multi-step work agentic systems actually do in production.
What makes this interesting isn't the benchmark lift. It's the underlying insight about where intelligence actually matters in an agentic loop. Most of the work — calling tools, reading file contents, formatting output, retrying on errors — is mechanical. A smaller model handles it fine. The moments that determine success or failure are the decision forks: which file to edit, whether to refactor or patch, how to decompose an ambiguous task. The advisor pattern applies expensive reasoning at exactly those inflection points and nowhere else.
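The routing logic itself is simple enough to sketch in a few lines. This is an illustrative loop, not Anthropic's implementation: the `is_fork`, `executor`, and `advisor` callables are stand-ins for whatever fork detection and model calls a real system would use.

```python
from typing import Callable

def run_agent(steps: list[str],
              is_fork: Callable[[str], bool],
              executor: Callable[[str], str],
              advisor: Callable[[str], str]) -> list[str]:
    """Cheap executor handles every step; the expensive advisor is
    consulted only when a step is a decision fork."""
    transcript = []
    for step in steps:
        if is_fork(step):
            # Advisor returns short guidance (~400-700 tokens in the
            # announced design) and steps back; it never touches tools.
            guidance = advisor(step)
            transcript.append(executor(f"{step} [advisor: {guidance}]"))
        else:
            # Mechanical work (tool calls, retries, formatting) stays cheap.
            transcript.append(executor(step))
    return transcript
```

The point of the sketch is the asymmetry: the advisor's output is injected into the executor's context at the fork, but the executor remains the only party that acts.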
The cost-control mechanism is worth noting for anyone building production systems. Developers set a max_uses parameter to cap how many times the advisor gets consulted per task, making spend predictable rather than open-ended. This is a meaningful design choice. Unpredictable API costs have been a real barrier to deploying agentic workflows at scale, and hard caps on the expensive component change the economics from hope-based budgeting to something you can actually model in a spreadsheet.
The deeper implication is architectural. This isn't a new model — it's a new mental model for allocating intelligence across a system. It mirrors how effective engineering teams already work: junior engineers handle execution velocity, senior engineers weigh in at the hard calls. Expect this pattern to generalize beyond Anthropic's API. Any system that routes between models of different capability and cost will benefit from making the routing explicit rather than implicit. The competitive advantage shifts from who has the biggest model to who has the smartest routing.