Howardism

Agent Loop Pattern

Published: May 6, 2026 · Filed: Concept · Domain: AI Engineering · Tags: Agent Engineering, Harness, Automation · Reading: 7 min · Source: AI-synthesised

`/loop` (cron-scheduled) and Ralph Wiggum (backlog-draining) loops as next-generation agent primitives; AFK execution, parallel fan-out, "loops are the future"

[Illustration: Agent Loop Pattern]

Summary#

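A loop is an agent process that re-runs a prompt until its queue is empty or a stop condition is met. As of mid-2026, three converging implementations suggest that loops are becoming a primitive on equal footing with the single-shot session: Anthropic's /loop slash command (cron-scheduled, recurring), Anthropic's routines (server-side /loop), and Matt Pocock's Ralph Wiggum loop (bash plus claude --permission-mode acceptEdits inside a while loop). Boris Cherny calls loops "the future"; Matt Pocock uses them as the AFK backbone of his end-to-end workflow.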

The two loop families#

Cron-scheduled loops (/loop, routines)#

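Used in Claude Code and Cowork. Mechanism: the agent calls cron (through a tool) to schedule a job for a future time; at that time the job re-enters the agent with instructions to carry out the task. Schedules can recur (every minute, every 5 minutes, daily).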
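The recurring-schedule idea can be approximated locally with plain cron. The sketch below is not how /loop or routines are implemented; it only illustrates "a scheduled job re-enters the agent with a fixed instruction". The `cron_line` helper, the repository path, the log file name, and the prompt text are all hypothetical, and the sketch assumes the `claude` CLI is on cron's PATH and is run in its non-interactive print mode (`-p`).

```shell
#!/bin/sh
# Hypothetical helper: print a crontab entry that re-enters the agent
# every N minutes with a fixed task prompt. Nothing is installed here.
cron_line() {
  minutes=$1   # interval in minutes (1-59)
  repo=$2      # working directory for the job (assumed path)
  prompt=$3    # instruction the agent receives on each run (no single quotes)
  # Run non-interactively (-p) and append all output to a log file.
  printf "*/%s * * * * cd %s && claude -p '%s' >> loop.log 2>&1\n" \
    "$minutes" "$repo" "$prompt"
}

# Example: a job that clusters feedback every 30 minutes.
cron_line 30 /srv/repo "Cluster the latest feedback and write a summary"
```

To install the entry, the printed line can be appended to the current user's crontab, for example with `( crontab -l 2>/dev/null; cron_line 30 /srv/repo "..." ) | crontab -`. Unlike routines, a job scheduled this way stops when the machine is off.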

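Use cases Boris Cherny mentions:

  • Babysitting PRs: fixing CI, auto-rebasing
  • Keeping CI healthy: fixing flaky tests
  • Clustering Twitter feedback every 30 minutes
  • "Dozens of loops running at any time"
  • Overnight: "thousands of agents" doing deeper work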

Routines are the same primitive running on the server, so they keep executing even when the laptop is closed.

Backlog-draining loops (Ralph Wiggum loop)#

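Used by Matt Pocock and others. Mechanism: a shell script runs the agent with a fixed prompt that tells it to pick the next task from the backlog and complete it; the script then restarts. The backlog is a directory of markdown issue files (or GitHub issues).

Pocock's once.sh skeleton: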

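# Gather context: every open issue file and the five most recent commits.
issues=$(cat issues/*.md)
recent_commits=$(git log -5 --oneline)
prompt=$(cat prompt.md)
# Fold the context into one prompt argument; acceptEdits auto-approves file edits.
full_prompt=$(printf '%s\n\nOpen issues:\n%s\n\nRecent commits:\n%s\n' "$prompt" "$issues" "$recent_commits")
claude --permission-mode acceptEdits "$full_prompt"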

The "loop" wrapper simply re-runs once.sh until the agent emits a sentinel (no more tasks) or the harness stops it.
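A minimal sketch of such a wrapper follows. It is an illustration under stated assumptions, not Pocock's actual script: the sentinel string `NO MORE TASKS`, the `ONCE_CMD` variable, and the iteration cap are hypothetical choices, and the agent's prompt would have to instruct it to print that sentinel when the backlog is empty.

```shell
#!/bin/sh
# Sketch of a backlog-draining wrapper. Assumptions (not from the source):
# the agent prints the literal text "NO MORE TASKS" when the backlog is
# empty, and ONCE_CMD names the script that performs one iteration.
ONCE_CMD=${ONCE_CMD:-./once.sh}
SENTINEL=${SENTINEL:-NO MORE TASKS}
MAX_ITERATIONS=${MAX_ITERATIONS:-20}   # harness-side stop condition

run_loop() {
  i=0
  while [ "$i" -lt "$MAX_ITERATIONS" ]; do
    i=$((i + 1))
    echo "iteration $i"
    # Capture the run's output; a failing run stops the loop.
    output=$($ONCE_CMD 2>&1) || { echo "run failed; stopping"; return 1; }
    # Stop when the agent reports an empty backlog.
    if printf '%s\n' "$output" | grep -qF "$SENTINEL"; then
      echo "backlog drained after $i iterations"
      return 0
    fi
  done
  echo "iteration cap reached"
  return 2
}
```

Calling `run_loop` from the repository root starts the loop. The iteration cap gives the harness its own stop condition, so a run in which the agent never emits the sentinel still terminates.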

The prompt enforces AFK-only task selection: only tasks marked AFK (as opposed to human-in-loop) are eligible.
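In the source the prompt does the filtering, but the same rule can also be checked mechanically before the agent starts. The sketch below assumes a hypothetical convention in which each issue file contains a line reading exactly `mode: afk` or `mode: human-in-loop`; the source does not say how tasks are actually marked, and both function names are illustrative.

```shell
#!/bin/sh
# Hypothetical convention: each issue file carries a line "mode: afk" or
# "mode: human-in-loop". The marker format is an assumption.
list_afk_issues() {
  dir=${1:-issues}
  # -l prints only the names of files containing a matching line;
  # -x requires the whole line to match, so "mode: afk-later" is skipped.
  grep -lx 'mode: afk' "$dir"/*.md 2>/dev/null | sort
}

# Print the first eligible issue, or nothing when none qualifies.
next_afk_issue() {
  list_afk_issues "${1:-issues}" | head -n 1
}
```

A wrapper could call `next_afk_issue` and stop the loop when it prints nothing, which doubles as a harness-side sentinel that does not depend on the agent's output.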

Why loops matter#

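  1. Amortising planning cost across many runs. One careful planning session (for example via Design Concept Grilling) produces a Kanban backlog (see Vertical Slice Tracer Bullets); the loop then clears it with no further human input.
  2. Making hours-long tasks feasible. Instead of relying on one huge context window, the loop splits the work across many fresh sessions, each staying inside the Context Window Smart Zone.
  3. Parallelisation. Independent backlog items run at the same time in separate sandboxes. Pocock's Sandcastle library does this by creating a git worktree per issue inside Docker containers; a merger agent then reconciles the results.
  4. Idle compute. Boris's overnight setup runs a loop of a thousand agents during cheap idle hours: work that is not worth a human's evening can still produce value at agent cost.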
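The parallelisation point can be sketched with ordinary shell job control. This is not Sandcastle's implementation: `WORKER` is a hypothetical command that handles one backlog item, and in a real setup it might create an isolated checkout (for example with `git worktree add`) and run the agent inside it. Merging the results is left out.

```shell
#!/bin/sh
# Sketch of parallel fan-out. WORKER is a hypothetical command that handles
# one backlog item in its own sandbox; the default name is a placeholder.
WORKER=${WORKER:-./work-on-issue.sh}

fan_out() {
  pids=""
  for item in "$@"; do
    $WORKER "$item" &        # one background job per independent item
    pids="$pids $!"
  done
  failures=0
  for pid in $pids; do
    # wait returns the exit status of that specific job.
    wait "$pid" || failures=$((failures + 1))
  done
  echo "$failures of $# workers failed"
  [ "$failures" -eq 0 ]      # succeed only if every worker succeeded
}
```

For example, `fan_out issues/01.md issues/02.md issues/03.md` would start three workers at once and report how many failed. The items must be independent; anything that touches the same files belongs in the same worker or in the later merge step.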

AFK vs human-in-loop tasks#

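Matt Pocock's key distinction:

  • AFK tasks: implementation, refactoring, test scaffolding, documentation upkeep, CI auto-fixes. The agent can succeed without step-by-step approval; verification is automatic (tests, types, linters).
  • Human-in-loop tasks: alignment, design choices, prioritisation, QA. These have no mechanical verification; they need taste and tacit context.

Loops suit the AFK category. Looping over human-in-loop work produces drift: the agent makes plausible but wrong decisions and keeps compounding them.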

Verification is the ceiling#

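Pocock's stronger claim: the quality of the feedback loops sets the ceiling on what a loop can do. Without good tests, types, and linters, a loop is "coding blind". This is the same argument about mechanical enforcement made in Agent Harness Engineering; loops simply expose the cost more nakedly, because no human is there to catch drift.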
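One way to make the feedback loop explicit is a gate that must pass before an iteration's work is accepted. The sketch below is an assumption-laden illustration: the `verify` function and the `CHECKS` variable are hypothetical, and the default commands (`npm test`, `npx tsc --noEmit`, `npx eslint .`) are placeholders for whatever test, type-check, and lint commands the project actually uses.

```shell
#!/bin/sh
# Sketch of a verification gate. CHECKS holds one command per line; the
# defaults are placeholders (assumptions), not taken from the source.
if [ -z "${CHECKS:-}" ]; then
CHECKS='npm test
npx tsc --noEmit
npx eslint .'
fi

verify() {
  # Run each check in order; stop at the first failure.
  # The loop runs in a pipeline subshell, so "exit 1" ends only the loop
  # and becomes the return status of verify.
  printf '%s\n' "$CHECKS" | while IFS= read -r check; do
    [ -n "$check" ] || continue
    echo "running: $check"
    sh -c "$check" </dev/null >/dev/null 2>&1 || { echo "FAILED: $check"; exit 1; }
  done
}
```

A loop iteration could then end with `verify && git commit -am "..."`, so that unverified work is never committed. The gate is only as strong as the checks it runs, which is the point of the ceiling argument.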

Connection to model trajectory#

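Boris Cherny reports that Opus 4.7 starts loops spontaneously, without being prompted:

"I'll tell it: 'Go grab this data query.' And it replies: 'Hey, I noticed the data changes over time. I'm going to start a loop and give you a report every 30 minutes.'"

This fits Harness Shrinkage as Models Improve: a capability the harness used to inject becomes behaviour the model exhibits on its own. The loop primitive remains, but the user no longer has to invoke it by hand.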

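Open questions#

  • When the model schedules its own loops (the 4.7 behaviour), who owns the budget? Boris's answer is "the model decides", but that pushes cost constraints into the model's training rather than the harness.
  • Does a loop paired with a sufficiently smart model still need a Kanban backlog, or will the model pick the next task from the original goal by itself?
  • Reviewing loop output is now the bottleneck, as Matt Pocock admits: "We just need to be ready to do more code review."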

About this piece

Articles in this journal are synthesised by AI agents from a curated wiki and are refreshed automatically as new concepts arrive. Topics, framing, and editorial direction are curated by Howardism.
