Software 3.0

Sources

Andrej Karpathy: From Vibe Coding to Agentic Engineering

Summary

Andrej Karpathy's taxonomy of three programming paradigms. Software 1.0 is explicit code. Software 2.0 is learned weights: you program by curating datasets, objectives, and architectures. Software 3.0 is prompting: the LLM is a programmable computer and "what's in the context window is your lever over the interpreter." Because the model is trained on the whole internet, it is implicitly trained on every task present in that data at once, and so becomes a general interpreter that performs computation in digital-information space. The deeper claim: 3.0 doesn't just make old programs faster, it makes whole classes of programs unnecessary and enables information-processing tasks that couldn't exist before.

The three paradigms

	Software 1.0	Software 2.0	Software 3.0
You write	Code	Datasets + objectives + architectures	Prompts / context
Executed by	CPU	Trained neural net	LLM as interpreter
The "program" is	Source	Weights	The context window
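To make the table concrete, here is a minimal sketch of one toy task (sentiment classification) written in each paradigm. Everything in it is an illustrative assumption rather than anything from the talk: the word lists, the tiny perceptron standing in for "learned weights," and the `llm` callable, which is a hypothetical stand-in for whatever model client you use.

```python
from typing import Callable

# Software 1.0: the behavior is spelled out in source code.
POSITIVE_WORDS = {"good", "great", "love"}
NEGATIVE_WORDS = {"bad", "awful", "hate"}

def classify_v1(text: str) -> str:
    words = text.lower().split()
    score = sum(w in POSITIVE_WORDS for w in words) - sum(w in NEGATIVE_WORDS for w in words)
    return "positive" if score >= 0 else "negative"  # ties default to positive

# Software 2.0: the behavior lives in weights fitted to a dataset.
def train_v2(dataset: list[tuple[str, int]], epochs: int = 20) -> dict[str, float]:
    weights: dict[str, float] = {}
    for _ in range(epochs):
        for text, label in dataset:  # label is +1 or -1
            words = text.lower().split()
            score = sum(weights.get(w, 0.0) for w in words)
            predicted = 1 if score >= 0 else -1
            if predicted != label:  # perceptron rule: update only on mistakes
                for w in words:
                    weights[w] = weights.get(w, 0.0) + label
    return weights

def classify_v2(weights: dict[str, float], text: str) -> str:
    score = sum(weights.get(w, 0.0) for w in text.lower().split())
    return "positive" if score >= 0 else "negative"

# Software 3.0: the behavior is the text placed in the context window.
# `llm` is a hypothetical callable (prompt in, completion out).
def classify_v3(llm: Callable[[str], str], text: str) -> str:
    prompt = (
        "Classify the sentiment of the review as 'positive' or 'negative'. "
        "Answer with one word.\n\nReview: " + text
    )
    return llm(prompt).strip().lower()
```

In the first version you edit code to change behavior, in the second you edit the dataset and retrain, and in the third you edit the prompt string.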

The OpenClaw installer example

Installing a complex cross-platform tool used to mean a shell script that "balloons up and becomes extremely complex" as it targets every environment. That is Software 1.0 thinking. In the 3.0 version, the install instructions are a block of text you copy-paste to your agent. The agent brings its own intelligence: it inspects your machine, picks actions that fit what it finds, and debugs in the loop. "What is the piece of text to copy-paste to your agent? That's the programming paradigm now." (See Agent-Native Infrastructure for the generalization.)
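The inspect, act, and debug loop described above can be sketched as follows. This is a rough illustration under stated assumptions, not OpenClaw's actual installer: `exampletool` is a placeholder package name, `propose` is a hypothetical callable wrapping the model (it sees the instructions plus the history so far and returns the next shell command, or `None` when done), and `execute` is a hypothetical callable that runs a command and returns its exit code and output.

```python
from typing import Callable, Optional

# Hypothetical instructions; "exampletool" is a placeholder, not a real package.
INSTALL_INSTRUCTIONS = """\
Install exampletool on this machine.
1. Detect the operating system and the available package managers.
2. Install using whatever method suits this environment.
3. Run `exampletool --version` to verify. If it fails, read the error and retry.
"""

Step = tuple[str, int, str]  # (command, exit_code, output)

def run_agent(
    instructions: str,
    propose: Callable[[str, list[Step]], Optional[str]],
    execute: Callable[[str], tuple[int, str]],
    max_steps: int = 10,
) -> list[Step]:
    """Let the model pick commands until it declares itself done or runs out of steps."""
    history: list[Step] = []
    for _ in range(max_steps):
        # The model sees every earlier command and its result, including failures,
        # which is what lets it debug in the loop.
        command = propose(instructions, history)
        if command is None:
            break
        exit_code, output = execute(command)
        history.append((command, exit_code, output))
    return history
```

The environment-specific branching that a 1.0 installer would hard-code does not appear anywhere in this code; it is left to the model's choices inside `propose`, and the only artifact the author maintains is the instruction text.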

MenuGen: the app that shouldn't exist

Karpathy built MenuGen (photograph a menu, OCR the items, generate an image of each dish) as a real Vercel app with image-gen plumbing. Then he saw the 3.0 version: hand the photo to Gemini and say "use Nano Banana to overlay the dishes onto the menu," and the model returns the same menu image with the dish pictures rendered directly into its pixels. "All of my menu gen is spurious. That app shouldn't exist." The neural net subsumes the entire application: the input is just the image and the instruction, the output is just the image, and there is no app in between.
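The contrast can be sketched in a few lines. This is a schematic under assumptions, not MenuGen's real code or any real Gemini API: `ocr`, `generate_image`, and `multimodal_model` are hypothetical callables passed in by the caller, and images are represented as raw bytes.

```python
from typing import Callable

# App-style pipeline: separate stages glued together by application code.
def menugen_app(
    photo: bytes,
    ocr: Callable[[bytes], list[str]],
    generate_image: Callable[[str], bytes],
) -> dict[str, bytes]:
    dishes = ocr(photo)  # stage 1: extract dish names from the photo
    return {dish: generate_image(dish) for dish in dishes}  # stage 2: one image per dish

# 3.0-style: a single instruction plus the photo; the model returns the finished image.
INSTRUCTION = "Use Nano Banana to overlay the dishes onto the menu."

def menugen_prompt(
    photo: bytes,
    multimodal_model: Callable[[str, bytes], bytes],
) -> bytes:
    return multimodal_model(INSTRUCTION, photo)
```

The first function still needs a front end to lay the generated images out next to the menu text, while the second returns something that is already the final artifact.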

Beyond code: new information-processing tasks

A subtler point: earlier code operated over structured data, whereas Software 3.0 enables operations that were never programs at all. Karpathy's example is the LLM wiki: "there was no code that would create a knowledge base from a bunch of facts. Now you can take these documents and recompile them in a different way… something new as a reframing of the data." He calls this "more exciting" than mere speedup: the point is not what we can do faster, but what couldn't be done before.

The extrapolation: neural net as host process

Pushed to the limit, this becomes a "completely neural computer": raw video and audio in, diffusion rendering a UI unique to that moment, with the neural net as the host process and CPUs as co-processors for deterministic appendages. He frames this as the 1950s–60s fork (calculator vs. neural net) flipping back: classical computing won the first round and neural nets are currently virtualized on top of it, but that relationship may invert. This is The Bitter Lesson taken to its architectural conclusion. He hedges that the path there is "TBD."

Connections

Andrej Karpathy — originated the taxonomy
Agent-Native Infrastructure — the OpenClaw "copy-paste to your agent" install is the practical face of 3.0
Vibe Coding vs. Agentic Engineering — vibe coding is 3.0 with the floor lowered; agentic engineering is 3.0 done at professional quality
LLM-as-Compiler Knowledge Base — his canonical "new task that wasn't a program before" example
The Bitter Lesson — the neural-net-as-host-process extrapolation is Sutton's principle pushed to the hardware layer
The Verifiability Thesis — explains which 3.0 tasks work today (the verifiable ones)
HTML as the New Markdown — Thariq Shihipar's "build a throwaway UI per task" is a 3.0-native workflow; Disposable Micro-Apps are MenuGen-style apps-that-barely-exist
Interaction Models — Thinking Machines' "diffusion-rendered UI / neural computer" direction is a concrete step toward the host-process extrapolation

Open Questions

Where is the line between "the app shouldn't exist" (MenuGen) and apps that should? That is, when is deterministic 1.0/2.0 scaffolding still the right call, and when is it spurious?
The neural-net-as-host-process flip is presented as plausible-but-TBD. What would the first production system that genuinely inverts the CPU/NN relationship look like?
