Dispatches from the frontier

Weekly analysis, honest takes, and hidden gems. No engagement bait.

DeepSeek V4: The Open-Source Model Frontier Labs Feared

DeepSeek V4 ships under MIT with $0.30/M output tokens — 83x cheaper than Claude Opus 4.7 — while scoring 80.6% on SWE-bench Verified. The agentic-coding price floor just moved an order of magnitude.

Gemini 3.1 Pro Review: The Reasoning Leader You Haven't Tested

Gemini 3.1 Pro scores 77.1% on ARC-AGI-2 — 24 points above GPT-5.5 — yet Arena Elo places it in a three-way tie. Here's what the leaderboard hides, and when the reasoning gap actually changes your routing decision.

Google Just Bet $40B on Anthropic — What That Means for Your Stack

Google committed up to $40B to Anthropic on April 24, the same week OpenAI launched a separate enterprise JV and GPT-5.5 doubled in price. The frontier market is hardening into two distribution channels, and the model is becoming the cheap part of the stack.

Grok 4.3 Is Now the Cheapest Frontier Model

xAI's April 30 release prices Grok 4.3 at $1.25 input and $2.50 output per million tokens — undercutting Gemini 3.1 Pro by 79% and GPT-5.5 by 92% on output, while landing between Opus 4.7 and Gemini on agentic Elo.

GPT-5.5 "Spud" Doubles Its Price — And Bets Agents Are Worth It

OpenAI shipped GPT-5.5 yesterday at $5/$30 per million tokens — exactly double GPT-5.4. Anthropic spent April cutting prices; OpenAI just opted out of the cost war and bet on agents instead.

Claude Opus 4.7 and the End of the Frontier Cost War

Opus 4.7 ships at the same $5 input price as 4.6, with the same 1M context window, and a 1503 Elo that tops the Arena. That nothing about pricing moved is precisely the story.

Claude Opus 4.6 Is Now 67% Cheaper — What Changes for Your Stack

Anthropic cut Opus 4.6 input pricing from $15 to $5 per million tokens. At that price it now undercuts Gemini and GPT-5.4 on input, and it breaks the conventional cost justification for tiered routing.

The Advisor Strategy: How Anthropic Is Rethinking Model Costs

The advisor strategy pairs a cheap executor with an expensive Opus advisor that speaks only at decision forks. The numbers are hard to dismiss, and the mental model behind them matters more than the benchmarks.

Claude Mythos: The AI Too Dangerous to Release

Anthropic built a model that dominates 17 of 18 benchmarks and achieves 100% on cybersecurity tasks — then decided the world isn't ready for it. Here's what that tells us about where AI is headed.

Mistral 7B: The Surprisingly Powerful Open-Source Model You're Ignoring

Don't sleep on Mistral 7B. This openly available model punches above its weight: strong performance for its size and a permissive Apache 2.0 license make it a serious option for developers.

DeepMind's Emu: The Agent Nobody's Talking About (But Should Be)

Emu, a lightweight agent from DeepMind, consistently outperforms many larger models on complex reasoning tasks — and it’s incredibly accessible. Here’s why this tiny titan is a serious contender.

Grok: xAI’s Hidden Heavyweight – It’s Not Just About Elon

Grok is rapidly emerging as a surprisingly capable model from xAI, and deserves your attention beyond the media circus. Here’s an honest look at what it does well and where it still falls short.
