Howardism · vol. 03 · quiet corner of the web

Anthropic Institute

Published: June 7, 2026 · Filed: Entity · Domain: Entities · Tags: Entity, Org, AI Policy, Governance, Anthropic · Reading: 3 min · Source: AI-synthesised

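Anthropic's policy and governance research arm; published *When AI builds itself* (Favaro & Clark, 2026) on Recursive Self-Improvement; its agenda includes building the verification systems needed for a credible multilateral AI slowdown.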

Sources

When AI builds itself

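Summary

Anthropic Institute is Anthropic's research and policy arm, focused on the societal and governance implications of frontier AI. It published When AI builds itself (June 2026), the primary source for this wiki's coverage of Recursive Self-Improvement, and it has a public agenda to work with others on building the systems that a credible AI slowdown or pause would require (Frontier Pause Verification).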

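Key work

Public-facing trajectory analysis. When AI builds itself combines public benchmarks (Task Time-Horizon Scaling) with previously unpublished internal Anthropic data (AI Accelerating AI Development) to argue that AI is already accelerating AI R&D, and it sketches three possible futures for RSI.
Coordination infrastructure. The Institute plans to work with many others on research and action to help build the systems that a credible slowdown or pause would require: verifying that other developers have actually stopped, and ensuring that bad actors cannot use a coordinated slowdown to pull ahead in secret (Frontier Pause Verification).
Convening. In the months following publication, the Institute plans to organise conversations among policymakers, researchers, civil society, and other AI companies, and to publish the results, explicitly inviting voices from outside AI companies into the discussion.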

People

Marina Favaro and Jack Clark co-authored When AI builds itself (editorial support from Santi Ruiz; visual design by Shan Carter, Romello Goodman, and Nikki Makagiansar, with data from Brian Calvert and Jun Shern Chan).

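Related

Anthropic: parent organisation
Recursive Self-Improvement: the subject of the Institute's flagship article
Frontier Pause Verification: the Institute's concrete governance agenda
AI Accelerating AI Development: the internal evidence base the article draws on
Responsible Scaling Policy Evaluations: the Institute's external coordination work complements Anthropic's internal RSP braking mechanisms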

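Open questions

How does the Institute's policy posture, which leans toward keeping a pause available as an option, interact with Anthropic's commercial incentive to ship frontier models? The article acknowledges competitive and geopolitical pressures but does not resolve the tension.
Which concrete verification mechanisms will the Institute prototype, and on what timeline relative to the RSI trends it warns about?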

Sources

When AI builds itself: Anthropic Institute, When AI builds itself (Marina Favaro & Jack Clark, June 2026)

About this piece

Articles in this journal are synthesised by AI agents from a curated wiki and are refreshed automatically as new concepts arrive. Topics, framing, and editorial direction are curated by Howardism.

Cited by 10

AI Accelerating AI Development
The empirical core of *When AI builds itself*: measured evidence AI already speeds AI R&D at Anthropic — >80% of merged…
AI R&D Autonomy Evaluation (AECI)
How Anthropic measures whether a model can automate or dramatically accelerate AI research — the capability that drives…
Anthropic
AI safety company / vendor of Claude; mission-as-tiebreaker culture; ~30–40 PMs across teams; Mike Krieger leads Labs r…
Frontier Pause Verification
The arms-control problem of a credible, verifiable slowdown or pause of frontier AI: detectability is harder than for o…
LLM-Driven Vulnerability Research
Claude Mythos Preview's emergent cybersecurity capabilities: autonomous zero-day discovery, full exploit chains, and An…
METR
Independent AI-evaluation org behind the 'time horizons' benchmark — the task length a model can complete reliably on i…
Entities — People, Orgs, Tools & Projects
Map of Content for all 32 entity pages. See Home for concept domains.
Mythos Model
Anthropic preview-tier frontier model and the first member of the Mythos-class tier (above Opus); gated for safety, use…
Open Questions Backlog
_96 pages with open questions, as of 2026-06-14._
Recursive Self-Improvement
An AI system autonomously designing and developing its own successor; Anthropic Institute's *When AI builds itself* arg…

Related articles

AI Accelerating AI Development
The empirical core of *When AI builds itself*: measured evidence AI already speeds AI R&D at Anthropic — >80% of merged…
Recursive Self-Improvement
An AI system autonomously designing and developing its own successor; Anthropic Institute's *When AI builds itself* arg…
Mythos Model
Anthropic preview-tier frontier model and the first member of the Mythos-class tier (above Opus); gated for safety, use…
Responsible Scaling Policy Evaluations
Anthropic's RSP gates deployment on pre-release capability evaluations in CBRN, automated AI R&D, and high-stakes misal…
Claude Opus 4.8
Anthropic's most capable general-access model (May 2026); upgrade on Opus 4.7 in SWE/agentic/knowledge work; does not a…
