Plate IIAI Engineering中文HOWARDISM

Vibe Coding vs. Agentic Engineering

PublishedMay 23, 2026FiledConceptDomainAI EngineeringTagsAgent Engineering AI Coding WorkflowReading7 minSourceAI-synthesised

Vibe coding raises the floor (anyone builds); agentic engineering preserves the quality bar while going faster; ">10x and widening"; hire on big projects, not puzzles

Illustration for Vibe Coding vs. Agentic Engineering

Sources#

Summary#

Andrej Karpathy coined "vibe coding" in 2025 and, a year later, names its serious successor: agentic engineering. The distinction is about which bar moves. Vibe coding raises the floor — anyone can build software now. Agentic engineering preserves the quality bar of professional software while going much faster: "you're not allowed to introduce vulnerabilities due to vibe coding; you're still responsible for your software, but can you go faster — and how do you do that properly?" It's an engineering discipline for coordinating spiky, fallible, stochastic-but-powerful agents without sacrificing quality.

The two bars#

Vibe coding — floor up. Everyone can vibe-code anything. "Amazing, incredible." Democratization (cf. Printing Press Software Democratization). Quality is not the point; access is.
Agentic engineering — ceiling up, quality held. You keep the responsibilities of professional software (security, correctness, maintainability) and use agents to go faster without dropping below that bar. "Doing that well and correctly is the realm of agentic engineering."

These are different activities, not points on one line. One lowers the entry cost; the other raises the output ceiling for people who already clear the bar.

"10x is not the speedup"#

Karpathy explicitly retires the old "10x engineer" trope as too small: "10x is not the speedup you gain… people who are very good at this peak a lot more than 10x." The ceiling on agentic-engineering capability is very high, and the spread between mediocre and AI-native practitioners widens, not narrows. (Echoes Harness Shrinkage as Models Improve: the leverage keeps growing as models improve; the binding constraint becomes the operator's taste — see Outsource Your Thinking, Not Your Understanding.)

What an AI-native practitioner looks like#

Asked to contrast a mediocre vs. a fully AI-native user of cloud code / codex / open claw, Karpathy's answer is mundane and important: invest in your setup, use all the tool's features. Same as the engineers who got the most out of Vim or VS Code — now applied to Claude Code / Codex. Mastery is configuration-and-features fluency, not a secret prompt.

Hiring has to be refactored#

A practical corollary: most teams still hire with the old paradigm (puzzles, leetcode). Karpathy argues agentic-engineering hiring should look like "give me a really big project and watch someone implement it well" — e.g., build a secure Twitter-clone-for-agents, then a red-team agent ("codex 5.4 xhigh") tries to break it and can't. Hiring should test verifiable, end-to-end build-and-defend ability, not isolated puzzle-solving. (See The Verifiability Thesis for why "and it can't be broken" is the load-bearing half.)

The human residue#

Even at the high ceiling, the human stays in charge of spec, taste, judgment, and oversight — agents do the fill-in-the-blanks. His MenuGen war story: the agent matched Stripe and Google accounts by email address instead of a persistent user ID — "such a weird thing to do," the kind of mistake Jagged Intelligence (Ghosts, Not Animals) predicts. You must design the spec ("these must be unique user IDs we tie everything to") and supply the taste; the agent handles the API details you've stopped memorizing.

The moving goalposts: supervised vs. unsupervised (Ambrosino)#

Andrew Ambrosino (OpenAI Codex) restates the same "which bar moves" distinction as a moving-goalposts observation, and welcomes the movement as evidence of progress. Asked what fraction of the product is AI-written: "If you use the goalposts from last year, 100% of our product is AI-written code. So the question is more like — is the code written supervised versus unsupervised? That's a totally different thing. I welcome the moving of goalposts, because that means we're making progress." The salient axis is no longer human-vs-AI authorship (settled) but how much human supervision the authoring still needs — which is exactly the agentic-engineering "quality bar held while going faster" line, measured as supervision cost.

He also captures the interactive form as "coding is steering the AI": the honest measure of AI's contribution isn't "what percent of my code did AI write" but "how many times did I have to steer it in the right direction" — the allocator/steerer role, restated. And on the frontier: "loops are so last week." The leading edge has moved past orchestrated loops to autonomous development and harness engineering — e.g. an agent doing overnight "garbage collection" of the codebase — though he flags it isn't there yet (models "usually increase complexity" and are bad at deleting code). This dates the vibe-coding→agentic-engineering ladder from the OpenAI side: the practitioner question is now supervised-vs-unsupervised and how autonomous a loop you can trust, not can the model write it.

Connections#

Building Is Cheap, Arguing Is Expensive — cheap generation is what lets prototypes settle technical debates
Andrej Karpathy — coined both terms
Software 3.0 — the paradigm both activities operate in
Jagged Intelligence (Ghosts, Not Animals) — why agentic engineering needs human oversight: agents make weird, spiky mistakes
The Verifiability Thesis — the discipline leans on verifiable build-and-defend tasks
Outsource Your Thinking, Not Your Understanding — taste/judgment/spec is the residual human bottleneck the discipline turns on
Harness Shrinkage as Models Improve — the ">10x and widening" leverage curve as models improve
Printing Press Software Democratization — the floor-raising half is the same democratization Boris Cherny describes
Claude Code Best Practices — concrete agentic-engineering practice (explore→plan→code, verification-driven)
Verification as the New Bottleneck — Fiona Fung's org-level account of "preserve the quality bar while going faster"
Claude Code — the surface where this plays out
Acceleration Whiplash — the dark mirror: Faros AI's industry telemetry of what the quality bar does when orgs don't practice agentic engineering — it drops (bugs/dev +54%, incidents per PR +242.7%)
AI as Primary Author — "you're still responsible for your software" is the responsibility half of the authorship/accountability gap Faros documents at scale
Returns to Expertise in Agentic Coding — the two-bars thesis, measured: Anthropic's 400K-session study finds occupation barely matters (the floor rose — anyone within 7pp of a software engineer) while domain expertise still decides success (the bar held)
Andrew Ambrosino — the OpenAI-side "supervised vs. unsupervised" / "coding is steering" / "loops are so last week" restatement
Compute Allocator — "coding is steering the AI" is the allocator role: measure the steers, not the lines
Agentic Technical Debt — the barrier to fully-unsupervised loops Ambrosino names: models increase complexity and are bad at deleting code

Open Questions#

Karpathy hints at "one domain that's very [valuable]" for founders but won't say which (didn't want to "vague-post on stage"). What verifiable RL-environment domain is he gesturing at?
If the mediocre/AI-native spread keeps widening, what does that do to team composition — a few extreme outliers plus agents, vs. broad mid-level staffing?

Sources#

Andrej Karpathy: From Vibe Coding to Agentic Engineering
OpenAI Codex lead on the new shape of product work — Ambrosino: "supervised versus unsupervised"; "coding is steering the AI"; "loops are so last week"

§ end

About this piece

Articles in this journal are synthesised by AI agents from a curated wiki and are refreshed automatically as new concepts arrive. Topics, framing, and editorial direction are curated by Howardism.

Cited by 20

Acceleration Whiplash
Faros 2026: AI floods a human-paced SDLC with output it can't absorb — throughput up (tasks +34%, epics +66%), quality…
Agent Harness Engineering
Patterns for scaffolding long-running LLM agents: environment design, progressive context disclosure, mechanical archit…
Agentic Technical Debt
Debt that *compounds* (not just accumulates) because each agentic-coding session re-derives architectural decisions wit…
AI as Primary Author
Faros 2026: the assistant→author threshold crossed without a deliberate decision, marked by AI-code acceptance rising 2…
Andrej Karpathy
Co-founder OpenAI, ex-Tesla AI, Eureka Labs; coined "vibe coding," Software 1/2/3.0, "ghosts not animals," "agentic eng…
Boris Cherny
Creator of Claude Code at Anthropic; phone-driven workflow with hundreds of agents; primary advocate of `/loop` primiti…
Building Is Cheap, Arguing Is Expensive
"In technical debate, code wins": generate three PRs vs whiteboard; prototype over design doc; reduce design docs
Claude Code
Anthropic's agentic coding product; created by Boris Cherny late 2024; TypeScript/React; CLI/desktop/web/mobile/IDE sur…
Claude Design
Anthropic Labs product (research preview, ~April 2026) for collaborating with Claude on polished visual artifacts — des…
Compute Allocator
The human's evolving role: deciding what's worth spending compute on; ~1% of generated tokens ship, 99% is scaffolding…
Harness Shrinkage as Models Improve
Prompt scaffolding shrinks each model release; Cat Wu's pruning discipline; Boris Cherny "100 lines of code a year from…
Jagged Intelligence (Ghosts, Not Animals)
"Ghosts not animals": jagged statistical circuits, no intrinsic motivation; car-wash/strawberry failures; stay in the l…
Loop Engineering
Replacing yourself as the agent's prompter by designing the system that prompts it: a recursive-goal loop built from fi…
AI Engineering & Agent Tooling
Map of Content for the ai-engineering domain — 45 concepts. Curated entry point; see Home for all domains.
Open Questions Backlog
_124 pages with open questions, as of 2026-06-19._
Outsource Your Thinking, Not Your Understanding
"You can outsource your thinking but not your understanding"; understanding as the non-delegable human bottleneck; know…
Printing Press Software Democratization
Boris Cherny's analogy: 1400s literacy expansion → AI software-writing expansion; domain knowledge displaces coding ski…
Returns to Expertise in Agentic Coding
Anthropic's 400K-session study: domain expertise (not coding skill) is what amplifies an agent — experts get 2× the act…
Software 3.0
Karpathy's taxonomy: 1.0 code, 2.0 weights, 3.0 prompting; LLM as programmable interpreter; MenuGen "shouldn't exist";…
The Verifiability Thesis
LLMs automate what you can *verify* as computers automate what you can *specify*; RL verification rewards → jagged peak…

Claude Code
Anthropic's agentic coding product; created by Boris Cherny late 2024; TypeScript/React; CLI/desktop/web/mobile/IDE sur…
Andrej Karpathy
Co-founder OpenAI, ex-Tesla AI, Eureka Labs; coined "vibe coding," Software 1/2/3.0, "ghosts not animals," "agentic eng…
Verification as the New Bottleneck
Fiona Fung: coding is no longer the bottleneck — verification, review, maintenance are; shift-left; TDD loses its tax;…
Harness Shrinkage as Models Improve
Prompt scaffolding shrinks each model release; Cat Wu's pruning discipline; Boris Cherny "100 lines of code a year from…
Outsource Your Thinking, Not Your Understanding
"You can outsource your thinking but not your understanding"; understanding as the non-delegable human bottleneck; know…

Claude Code
Anthropic's agentic coding product; created by Boris Cherny late 2024; TypeScript/React; CLI/desktop/web/mobile/IDE sur…
Andrej Karpathy
Co-founder OpenAI, ex-Tesla AI, Eureka Labs; coined "vibe coding," Software 1/2/3.0, "ghosts not animals," "agentic eng…
Verification as the New Bottleneck
Fiona Fung: coding is no longer the bottleneck — verification, review, maintenance are; shift-left; TDD loses its tax;…
Harness Shrinkage as Models Improve
Prompt scaffolding shrinks each model release; Cat Wu's pruning discipline; Boris Cherny "100 lines of code a year from…
Outsource Your Thinking, Not Your Understanding
"You can outsource your thinking but not your understanding"; understanding as the non-delegable human bottleneck; know…

Cited by 20

Acceleration Whiplash
Faros 2026: AI floods a human-paced SDLC with output it can't absorb — throughput up (tasks +34%, epics +66%), quality…
Agent Harness Engineering
Patterns for scaffolding long-running LLM agents: environment design, progressive context disclosure, mechanical archit…
Agentic Technical Debt
Debt that *compounds* (not just accumulates) because each agentic-coding session re-derives architectural decisions wit…
AI as Primary Author
Faros 2026: the assistant→author threshold crossed without a deliberate decision, marked by AI-code acceptance rising 2…
Andrej Karpathy
Co-founder OpenAI, ex-Tesla AI, Eureka Labs; coined "vibe coding," Software 1/2/3.0, "ghosts not animals," "agentic eng…
Boris Cherny
Creator of Claude Code at Anthropic; phone-driven workflow with hundreds of agents; primary advocate of `/loop` primiti…
Building Is Cheap, Arguing Is Expensive
"In technical debate, code wins": generate three PRs vs whiteboard; prototype over design doc; reduce design docs
Claude Code
Anthropic's agentic coding product; created by Boris Cherny late 2024; TypeScript/React; CLI/desktop/web/mobile/IDE sur…
Claude Design
Anthropic Labs product (research preview, ~April 2026) for collaborating with Claude on polished visual artifacts — des…
Compute Allocator
The human's evolving role: deciding what's worth spending compute on; ~1% of generated tokens ship, 99% is scaffolding…
Harness Shrinkage as Models Improve
Prompt scaffolding shrinks each model release; Cat Wu's pruning discipline; Boris Cherny "100 lines of code a year from…
Jagged Intelligence (Ghosts, Not Animals)
"Ghosts not animals": jagged statistical circuits, no intrinsic motivation; car-wash/strawberry failures; stay in the l…
Loop Engineering
Replacing yourself as the agent's prompter by designing the system that prompts it: a recursive-goal loop built from fi…
AI Engineering & Agent Tooling
Map of Content for the ai-engineering domain — 45 concepts. Curated entry point; see Home for all domains.
Open Questions Backlog
_124 pages with open questions, as of 2026-06-19._
Outsource Your Thinking, Not Your Understanding
"You can outsource your thinking but not your understanding"; understanding as the non-delegable human bottleneck; know…
Printing Press Software Democratization
Boris Cherny's analogy: 1400s literacy expansion → AI software-writing expansion; domain knowledge displaces coding ski…
Returns to Expertise in Agentic Coding
Anthropic's 400K-session study: domain expertise (not coding skill) is what amplifies an agent — experts get 2× the act…
Software 3.0
Karpathy's taxonomy: 1.0 code, 2.0 weights, 3.0 prompting; LLM as programmable interpreter; MenuGen "shouldn't exist";…
The Verifiability Thesis
LLMs automate what you can *verify* as computers automate what you can *specify*; RL verification rewards → jagged peak…