Dispatches from the frontier

Weekly analysis, honest takes, and hidden gems. No engagement bait.

Claude Opus 4.8: The Upgrade Is the Workflow, Not the Model

Opus 4.8 ships 41 days after 4.7 at the same price, with a one-point SWE-bench bump and a new engine that fans a single task across hundreds of parallel subagents. The frontier race is moving from IQ to orchestration.

Most Agentic AI Projects Are Already Failing

Only 15% of organizations are production-ready for agents, yet 41% are running them anyway. The bottleneck isn't model intelligence — it's the data and ops layer underneath.

Mythos: The First Frontier Model Too Dangerous to Ship

Anthropic built Claude Mythos to autonomously find zero-day vulnerabilities, then refused to ship it publicly, gating access through the Project Glasswing consortium.

Gemini 3.5 Flash: Faster, Cheaper, and Beating 3.1 Pro

Gemini 3.5 Flash beats its own 3.1 Pro flagship on agentic and coding benchmarks at $1.50/$9 per million tokens — the most cost-competitive frontier-class model launched this week.

GLM-4.6: The MIT-Licensed Model Closing the Open Gap

Z.ai's GLM-4.6 lands within ~60 Elo of the closed frontier and ships under a no-strings MIT license — the best open-weight model nobody in Western dev circles is discussing.

DeepSeek V4: The Open-Source Model Frontier Labs Feared

DeepSeek V4 ships under an MIT license at $0.30 per million output tokens, 83x cheaper than Claude Opus 4.7, while scoring 80.6% on SWE-bench Verified. The agentic-coding price floor just dropped by more than an order of magnitude.

Gemini 3.1 Pro Review: The Reasoning Leader You Haven't Tested

Gemini 3.1 Pro scores 77.1% on ARC-AGI-2 — 24 points above GPT-5.5 — yet Arena Elo places it in a three-way tie. Here's what the leaderboard hides, and when the reasoning gap actually changes your routing decision.

Google Just Bet $40B on Anthropic — What That Means for Your Stack

Google committed up to $40B to Anthropic on April 24, the same week OpenAI launched a separate enterprise JV and GPT-5.5 doubled in price. The frontier market is hardening into two distribution channels, and the model is becoming the cheap part of the stack.

Grok 4.3 Is Now the Cheapest Frontier Model

xAI's April 30 release prices Grok 4.3 at $1.25 input and $2.50 output per million tokens, undercutting Gemini 3.1 Pro by 79% and GPT-5.5 by 92% on output, while landing between Opus 4.7 and Gemini 3.1 Pro on agentic Elo.

GPT-5.5 "Spud" Doubles Its Price — And Bets Agents Are Worth It

OpenAI shipped GPT-5.5 yesterday at $5/$30 per million tokens — exactly double GPT-5.4. Anthropic spent April cutting prices; OpenAI just opted out of the cost war and bet on agents instead.
