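Summary
The Anthropic Institute is Anthropic's research and policy arm, focused on the effects of frontier AI on society and governance. It published When AI builds itself (June 2026), which is this wiki's primary source on Recursive Self-Improvement, and it maintains a public agenda to work with others on building the systems that a credible AI slowdown or pause would require (Frontier Pause Verification).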
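Key work
- Public-facing trajectory analysis. When AI builds itself combines public benchmarks (Task Time-Horizon Scaling) with previously unpublished internal Anthropic data (AI Accelerating AI Development) to argue that AI is already accelerating AI R&D, and it sketches three possible futures for RSI.
- Coordination infrastructure. The Institute plans to work with many others on research and action to help build the systems that a credible slowdown or pause would require: verifying that other developers have actually stopped, and ensuring that malicious actors cannot exploit a coordinated slowdown to pull ahead covertly (Frontier Pause Verification).
- Convening. In the months after the essay's publication, the Institute plans to organize conversations among policymakers, researchers, civil society, and other AI companies, and to publish the results. It explicitly invites voices from outside AI companies into the discussion.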
People
- Marina Favaro and Jack Clark co-authored When AI builds itself (editorial support by Santi Ruiz; visual design by Shan Carter, Romello Goodman, and Nikki Makagiansar; data from Brian Calvert and Jun Shern Chan).
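Related links
- Anthropic: parent organization
- Recursive Self-Improvement: the subject of the Institute's flagship essay
- Frontier Pause Verification: the Institute's concrete governance agenda
- AI Accelerating AI Development: the body of internal evidence the essay draws on
- Responsible Scaling Policy Evaluations: the Institute's external coordination work complements Anthropic's internal RSP braking mechanism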
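Open questions
- How does the Institute's policy stance (leaning toward preserving the option to pause) interact with Anthropic's commercial incentive to ship frontier models? The essay acknowledges competitive and geopolitical pressures but does not resolve the question.
- Which specific verification mechanisms will the Institute prototype, and on what timeline relative to the RSI trends it warns about?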
Sources
- When AI builds itself: Anthropic Institute, When AI builds itself (Marina Favaro & Jack Clark, June 2026)
