Howardism · vol. 03 · quiet corner of the web

Verification as the New Bottleneck

Published: May 23, 2026 · Filed: Concept · Topic: Harness · Tags: Agent Engineering, AI Coding Workflow, AI Native Org · Reading: 5 min · Source: AI-synthesised

Fiona Fung: coding is no longer the bottleneck; verification, review, and maintenance are. Covers shift-left, TDD losing its tax, and PR-cycle-time funnel analysis.

Summary

Fiona Fung's central claim, drawn from leading engineering for Claude Code and Cowork: for years, engineering bandwidth was the expensive resource, and planning, reviews, and process all existed to protect it. Once agentic coding made coding cheap, the bottleneck moved to verification, review, and maintenance. "On the Claude Code team, coding is really not the slow part anymore." The new scarce resource is confidence that a change is correct, and it gets scarcer precisely because bandwidth (and therefore throughput) exploded.

Why verification is now the constraint

Three forces converge:

  • Volume. Bandwidth increased so much that "we have to pay even more attention to: is it correct."
  • Blurring roles. More people (designers, managers, PMs) now check in changes, so everyone needs confidence their change is correct.
  • Maintenance cost. Higher throughput means more to maintain — the cost of maintenance becomes a first-class concern, not an afterthought.

This is the org-level mirror of Karpathy's Verifiability Thesis ("LLMs automate what you can verify") and the demand side of Harness Shrinkage as Models Improve: prompt scaffolding shrinks, while mechanical verification stays load-bearing.

TDD loses its tax

One vivid sign of the shift is TDD, which used to feel like "eating broccoli": write the failing test first, confirm that it fails, then make it pass. With Claude, Fung found it "so much more fun and pleasurable… it took the tax out of test-driven development." The economics flipped: when writing the test is nearly free, the discipline that grounds verification (a test that demonstrably fails, then passes) is pure upside. (Cf. the TDD red-green-refactor discipline; the failing-test-first step is the verifier.)
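
The red-green step can be sketched in a few lines of Python. This is only an illustration: the `slugify` functions, the test, and the `red_green` helper are all hypothetical names invented here, not anything from Fung's talk or from Claude Code. What it tries to show is that a test earns its role as a verifier only after it has been seen to fail against the unfixed code.

```python
import re
from typing import Callable


def slugify_before(title: str) -> str:
    """Hypothetical buggy version: lowercases but leaves spaces in place."""
    return title.lower()


def slugify_after(title: str) -> str:
    """Hypothetical fixed version: lowercases and joins words with hyphens."""
    words = re.findall(r"[a-z0-9]+", title.lower())
    return "-".join(words)


def check_slugify(slugify: Callable[[str], str]) -> bool:
    """The test, written before the fix. True means the behavior is correct."""
    expected = "verification-is-the-bottleneck"
    return slugify("Verification Is the Bottleneck") == expected


def red_green(
    test: Callable[[Callable[[str], str]], bool],
    before: Callable[[str], str],
    after: Callable[[str], str],
) -> bool:
    """Accept a fix only if the test fails before it (red) and passes after it (green)."""
    red = not test(before)
    green = test(after)
    return red and green
```

Calling `red_green(check_slugify, slugify_before, slugify_after)` accepts the fix. Passing the fixed function as both arguments is rejected, because a test that never failed has not shown that it can detect the bug.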

Shift left

Fung's recurring phrase is "shift left": catch problems closer to the source through automation, not after a customer hits them. "What's better than me running into the bug first? Having automation in place to catch it closer to the source." As throughput rises, verification keeps up only if it is automated and early rather than manual and late.

Who reviews, and the human-in-the-loop line

Before Claude Code shipped its own code-review feature, "how do you keep up with code reviews?" was the question Fung was asked most. Her answer: Claude Code review handles style, lint, obvious bugs, and spec drift (if you check the spec into the codebase, "Claude is very good about verifying against spec drift"). Humans stay in the loop where it matters: legal review, risk tolerance, and trust boundaries. In her words, "trust but verify, and where humans bring needed expertise." The division of labor is to automate mechanical verification and reserve human judgment for risk and trust-boundary calls. (Cf. Deep Modules for Agents: reviewer in a fresh context.)
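
One way to picture that division of labor is a routing rule: every change gets automated review, and changes touching sensitive paths also get a human. The sketch below is an assumption-laden illustration, not a description of how Claude Code review works. The path patterns and function names are hypothetical, and a real team would choose its own trust boundaries.

```python
from fnmatch import fnmatchcase
from typing import Iterable, List, Sequence

# Hypothetical patterns marking trust-boundary or legal surfaces in a repo.
HUMAN_REVIEW_PATTERNS: Sequence[str] = ("auth/*", "billing/*", "legal/*", "LICENSE*")


def needs_human_review(
    changed_paths: Iterable[str],
    patterns: Sequence[str] = HUMAN_REVIEW_PATTERNS,
) -> bool:
    """True if any changed path matches a pattern reserved for human judgment."""
    return any(
        fnmatchcase(path, pattern)
        for path in changed_paths
        for pattern in patterns
    )


def route_review(changed_paths: Iterable[str]) -> List[str]:
    """Always request automated review; add a human for sensitive paths."""
    paths = list(changed_paths)
    reviewers = ["automated"]
    if needs_human_review(paths):
        reviewers.append("human")
    return reviewers
```

With these assumed patterns, a docs-only change is routed to automated review alone, while a change that touches `auth/session.py` also pulls in a human reviewer.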

Measuring the shift (and a trap)

Signals she watches: onboarding ramp-up time ↓, PR cycle time ↓, Claude-assisted commits ↑ ("I haven't seen a commit that wasn't Claude-assisted in months"). The trap is reading end-to-end PR cycle time alone; break it into funnel chunks instead. If cycle time isn't dropping, the cause may not be low AI adoption: CI or build systems could be jamming under the new throughput. And throughput isn't the goal: "find some way to measure whatever you're actually trying to solve," not just velocity.
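
A funnel breakdown of this kind can be sketched as follows. The stage names, the record shape, and the sample numbers are all made up for illustration; they are not measurements from the Claude Code team. The idea is to compare per-stage medians rather than a single end-to-end figure, so a jammed stage such as CI stands out.

```python
from statistics import median
from typing import Dict, List

# Hypothetical funnel stages, each measured in hours per pull request.
STAGES = ["first_review_wait", "review", "ci", "merge_wait"]

# Illustrative sample data only, not real measurements.
SAMPLE_PRS: List[Dict[str, float]] = [
    {"first_review_wait": 0.5, "review": 1.0, "ci": 6.0, "merge_wait": 0.5},
    {"first_review_wait": 1.0, "review": 0.5, "ci": 8.0, "merge_wait": 1.0},
    {"first_review_wait": 0.5, "review": 1.5, "ci": 7.0, "merge_wait": 0.5},
]


def stage_medians(prs: List[Dict[str, float]]) -> Dict[str, float]:
    """Median hours spent in each funnel stage across all pull requests."""
    return {stage: median(pr[stage] for pr in prs) for stage in STAGES}


def slowest_stage(prs: List[Dict[str, float]]) -> str:
    """Name of the stage with the largest median duration."""
    medians = stage_medians(prs)
    return max(medians, key=lambda stage: medians[stage])


def end_to_end_median(prs: List[Dict[str, float]]) -> float:
    """Median total cycle time, the single number that hides the funnel."""
    return median(sum(pr[stage] for stage in STAGES) for pr in prs)
```

On the sample data the end-to-end median alone says little about cause, while `slowest_stage(SAMPLE_PRS)` points at the `ci` stage, which matches the scenario where build systems, not adoption, hold cycle time up.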

Open Questions

  • Fung's own open question: "How far do you push fully automated reviews?" — where's the speed/safety balance, and how do you keep humans confident without re-introducing the review bottleneck?
  • If CI/build is the hidden jam, does verification infrastructure (test runners, CI capacity) become the actual capex of an AI-native org?
