Summary#
DeepMind is Google's AI research lab. In this corpus it appears as the lab behind AI-Driven Formal Proof Search: its team (George Tsoukalas, Anton Kovsharov, Sergey Shirobokov, Swarat Chaudhuri, Pushmeet Kohli et al.) built AlphaProof Nexus and ran the first large-scale evaluation of LLM-aided formal proof search on open research mathematics (arXiv 2605.22763). It also makes the Gemini model family used throughout that work (Gemini 3.1 Pro as prover, Gemini 3.0 Flash as rater), the earlier AlphaProof olympiad theorem prover, and AlphaEvolve, whose evolutionary design inspired Evolutionary Proof Search.
Role in the corpus#
DeepMind is the third frontier-lab "voice" in the wiki alongside Anthropic and OpenAI (Symphony / Agent Harness Engineering), and the one that opens the AI-for-mathematics domain. Its contribution is methodological as much as mathematical. The paper's finding that simple agentic loops increasingly rival DeepMind's own bespoke trained systems (Agentic Loops Overtake Bespoke Systems) is a candid, self-undercutting result: a lab that built specialized RL provers reports that a plain LLM loop is catching up.
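To make "simple agentic loop" concrete, here is a minimal propose-verify-retry sketch. It is an illustration only, not DeepMind's implementation: `simple_proof_loop`, `toy_propose`, and `toy_verify` are hypothetical names, the proposer stands in for an LLM call, and the verifier stands in for running the Lean compiler.

```python
from __future__ import annotations

from typing import Callable, Optional


def simple_proof_loop(
    propose: Callable[[str, list[str]], str],
    verify: Callable[[str], Optional[str]],
    statement: str,
    max_attempts: int = 8,
) -> Optional[tuple[str, int]]:
    """Propose a candidate proof, check it, and retry with the error as feedback.

    propose(statement, feedback) returns a candidate proof (hypothetical LLM stand-in).
    verify(candidate) returns None on success or an error message
    (hypothetical stand-in for a proof checker).
    """
    feedback: list[str] = []
    for attempt in range(1, max_attempts + 1):
        candidate = propose(statement, feedback)
        error = verify(candidate)
        if error is None:
            # Accepted only because the verifier passed, not because the proposer said so.
            return candidate, attempt
        feedback.append(error)  # the failure message informs the next attempt
    return None  # budget exhausted without a verified proof


# Toy stand-ins (assumptions for illustration, not real model or compiler calls).
def toy_propose(statement: str, feedback: list[str]) -> str:
    # Pretends to "improve" after it has seen two error messages.
    return "exact Nat.add_comm a b" if len(feedback) >= 2 else "sorry"


def toy_verify(candidate: str) -> Optional[str]:
    return None if "sorry" not in candidate else "declaration uses 'sorry'"


result = simple_proof_loop(toy_propose, toy_verify, "a + b = b + a")
```

The point of the design is that correctness rests entirely on the verifier: the loop can use any proposer, so a stronger model improves results without changes to the surrounding system.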
Systems and models referenced#
- Gemini 3.1 Pro / 3.0 Flash / 3.1 Flash-Lite — the LLM backbone; Pro for proving, Flash for rating; the smaller variants solved no problems (capability is sharply scale-gated — Scale-Dependent Prompt Sensitivity).
- AlphaProof — DeepMind's RL-trained olympiad-level Lean prover; used inside Nexus as a focused subgoal tool (and the system behind earlier IMO results).
- AlphaEvolve — the evolutionary-coding system whose population/diversity approach Evolutionary Proof Search adapts; also helped formulate the bipartite graph-reconstruction variants in the paper.
- Formal Conjectures repo — DeepMind's open-source Lean formalizations of Erdős problems, the benchmark for the Erdős runs.
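As a rough illustration of what a Lean proof sketch with open subgoals looks like, the toy example below leaves two helper facts as `sorry` holes and then shows the same proof with the holes filled. The theorem names and the statement are made up for illustration and are not taken from the paper or from the Formal Conjectures repo; how Nexus actually dispatches such holes to AlphaProof is not shown here.

```lean
-- Sketch: the overall argument is fixed, but both helper facts are left
-- as `sorry` holes. Lean accepts the file with a warning that the
-- declaration uses `sorry`, so the statement is not yet proved.
theorem sketch (n : Nat) : n + 0 = 0 + n := by
  have h1 : n + 0 = n := by sorry   -- hole 1: a focused subgoal
  have h2 : 0 + n = n := by sorry   -- hole 2: a focused subgoal
  rw [h1, h2]

-- Completed: each hole is replaced by a real proof term, so the
-- compiler now checks every step.
theorem completed (n : Nat) : n + 0 = 0 + n := by
  have h1 : n + 0 = n := Nat.add_zero n
  have h2 : 0 + n = n := Nat.zero_add n
  rw [h1, h2]
```

Each hole is a small, self-contained goal, which is the kind of target a focused subgoal prover can be pointed at.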
Connections#
- AI-Driven Formal Proof Search — the paradigm DeepMind demonstrated at research scale
- AlphaProof Nexus — its framework
- Lean — the proof assistant it drives with Gemini
- Evolutionary Proof Search — adapts DeepMind's AlphaEvolve
- Agentic Loops Overtake Bespoke Systems — DeepMind's self-undercutting finding about its own bespoke systems
- Anthropic — peer frontier lab; the two anchor different domains in the corpus (alignment/coding vs. mathematics)
- Scale-Dependent Prompt Sensitivity — Gemini-model scale gating mirrors the broader model-capability-threshold theme
Open Questions#
- DeepMind reports that simple loops are catching up with its bespoke systems. Does the lab's comparative advantage then shift from systems to models, verifiers, and benchmarks (mathlib, Formal Conjectures)?
- The paper opens AI-for-math; what's DeepMind's next target domain where a sound verifier exists?
Sources#
Cited by 3
- AlphaProof Nexus
DeepMind framework for LLM-aided Lean proof generation; four agents (basic→full-featured); proof-sketch + EVOLVE-BLOCK…
- Anthropic
AI safety company / vendor of Claude; mission-as-tiebreaker culture; ~30–40 PMs across teams; Mike Krieger leads Labs r…
- Lean
Proof assistant whose compiler mechanically verifies every step; the `sorry` placeholder enables proof sketches; mathli…
Related articles
- AI-Driven Formal Proof Search
LLM generates Lean, compiler verifies every step → eliminates hallucination; DeepMind resolves 9/353 Erdős + 44/492 OEI…
- Agentic Loops Overtake Bespoke Systems
DeepMind's *basic* Ralph-loop agent matched its bespoke evolutionary+AlphaProof system as the LLM improved; the bitter…
- Client-Side Agent Optimization
AgentOpt's framing of developer-controlled agent optimization (model-per-role, budget, routing) as distinct from server…
- Agent Loop Pattern
`/loop` (cron-scheduled) and Ralph Wiggum (backlog-draining) loops as next-generation agent primitive; AFK execution, p…
