Howardism · vol. 03 · quiet corner of the web


Thinking Machines Lab

Published: May 13, 2026 · Filed: Entity · Tags: Type/entity, AI Lab · Reading: 3 min · Source: AI-synthesised

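The AI research lab behind Interaction Models (May 2026); the thesis that the harness should be absorbed into the model; the upstream contribution of streaming-sessions to SGLang; benchmark comparisons with GPT-realtime / Gemini-live; research grants open.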


Sources

Interaction Models: A Scalable Approach to Human-AI Collaboration

What it is

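An AI research lab (publishing under the name "Thinking Machines Lab: Connectionism"). In this wiki it first appears as the organization behind Interaction Models, a May 2026 research preview that reframes real-time human-AI collaboration as a model-native capability rather than a harness-level problem.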

What they have published or claimed (as seen on this site)

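Interaction Models (May 2026 research preview): models that natively take in audio, video, and text, and that think, respond, and act in real time. First model: TML-Interaction-Small (276B MoE, 12B active parameters).
Position: interactivity should scale together with intelligence, so it must be built into the model. They cite The Bitter Lesson against harness-based real-time systems (VAD, turn detection).
Engineering footprint: contributed the streaming-sessions feature upstream to SGLang; published research on eliminating nondeterminism in LLM inference (batch-invariant kernels), which is cited for trainer-sampler alignment; previously published On-Policy Distillation.
Running a research grant for interactivity / human-AI collaboration benchmarks (details to be announced); a limited research preview of the interaction model is due in "the coming months," with a broader release "later this year"; larger models are expected later in 2026.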
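
To make the "harness-based real-time system" in the position above concrete, here is a minimal sketch of the kind of hand-tuned component it argues against: an energy-threshold VAD combined with a silence timeout that decides when the user's turn has ended. This is an illustration only, not Thinking Machines Lab's code or any vendor's pipeline. The class name, function name, threshold, frame size, and timeout are all hypothetical values chosen for the example.

```python
from dataclasses import dataclass
from math import sqrt
from typing import Iterable, List, Sequence


@dataclass
class TurnDetector:
    """Hypothetical harness component: energy VAD plus a silence timeout."""

    energy_threshold: float = 0.02  # RMS above this counts as speech (assumed value)
    frame_ms: int = 20              # duration of one audio frame (assumed value)
    end_silence_ms: int = 600       # trailing silence that ends a turn (assumed value)
    in_speech: bool = False
    silence_ms: int = 0

    def is_speech(self, frame: Sequence[float]) -> bool:
        """Classify one frame by its root-mean-square energy."""
        if not frame:
            return False
        rms = sqrt(sum(s * s for s in frame) / len(frame))
        return rms > self.energy_threshold

    def push(self, frame: Sequence[float]) -> bool:
        """Feed one frame; return True only on the frame where a turn ends."""
        if self.is_speech(frame):
            self.in_speech = True
            self.silence_ms = 0
            return False
        if not self.in_speech:
            return False  # silence before any speech: nothing to end
        self.silence_ms += self.frame_ms
        if self.silence_ms >= self.end_silence_ms:
            self.in_speech = False
            self.silence_ms = 0
            return True
        return False


def turn_end_frames(frames: Iterable[Sequence[float]], detector: TurnDetector) -> List[int]:
    """Return the indices of frames at which the detector declares end of turn."""
    return [i for i, frame in enumerate(frames) if detector.push(frame)]


if __name__ == "__main__":
    loud = [0.5] * 160   # synthetic "speech" frame
    quiet = [0.0] * 160  # synthetic silent frame
    frames = [quiet] * 5 + [loud] * 10 + [quiet] * 40
    print(turn_end_frames(frames, TurnDetector()))  # [44]
```

Every constant here is a fixed rule that sits outside the model: the response can only start 600 ms after speech stops, and a thinking pause longer than that is treated as the end of the turn. That fixed, hand-engineered structure is what the Bitter Lesson argument targets.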

How it connects

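Differs from the lab stance criticized in Turn-Based Interface Bottleneck ("AI labs over-optimize for autonomy"); implicitly opposed to the autonomy-first framing embodied by Anthropic's and OpenAI's agent products (Claude Code, Symphony).
Their "harness absorbed into the model" stance has the same shape as Harness Shrinkage as Models Improve (an Anthropic / Claude Code observation): convergent thinking from different labs.
Benchmarks its model against GPT-realtime-2.0 / 1.5 (OpenAI), Gemini-3.1-flash-live (Google), and Qwen 3.5 Omni; see Interactivity Benchmarks.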

Related links

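Interaction Models: their signature research preview
TML-Interaction-Small: the model itself
The Bitter Lesson: the principle they cite
Turn-Based Interface Bottleneck: their critique of the status quo
Interactivity Benchmarks: benchmark comparisons with OpenAI / Google / Alibaba models
Harness Shrinkage as Models Improve: the convergent argument from Anthropic
Anthropic: peer lab with different priorities (autonomy-first vs. interactivity-first)
Agent Harness Engineering: their interaction-models research resolves the harness vs. model question for the real-time interaction layer on the model side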



About this piece

Articles in this journal are synthesised by AI agents from a curated wiki and are refreshed automatically as new concepts arrive. Topics, framing, and editorial direction are curated by Howardism.

Cited by 10

Agent Harness Engineering
Patterns for scaffolding long-running LLM agents: environment design, progressive context disclosure, mechanical archit…
Opinions on Using AI Tools & the Future of the Software Engineering Role
Debate map of four stances on using AI tools (bullish-insider / pragmatist-practitioner / skeptic-governance / architec…
Anthropic
AI safety company / vendor of Claude; mission-as-tiebreaker culture; ~30–40 PMs across teams; Mike Krieger leads Labs r…
Harness Shrinkage as Models Improve
Prompt scaffolding shrinks each model release; Cat Wu's pruning discipline; Boris Cherny "100 lines of code a year from…
HTML as the New Markdown
Thariq Shihipar's thesis: as models improve, thousand-line markdown plans overwhelm the *human*; HTML artifacts (visual…
Interaction Models
Thinking Machines Lab (May 2026): models that handle audio/video/text interaction natively in real time instead of via…
Interactivity Benchmarks
FD-bench, Audio MultiChallenge + new TimeSpeak/CueSpeak (proactive audio) and RepCount-A/ProactiveVideoQA/Charades (vis…
Entities — People, Orgs, Tools & Projects
Map of Content for all 32 entity pages. See Home for concept domains.
TML-Interaction-Small
TML's first interaction model: 276B MoE / 12B active, audio+video+text in / text+audio out, 200ms micro-turns, async ba…
Turn-Based Interface Bottleneck
Why current AI interfaces limit collaboration: single-thread turn-taking is a bandwidth bottleneck; humans pushed out b…

Related articles

Interaction Models
Thinking Machines Lab (May 2026): models that handle audio/video/text interaction natively in real time instead of via…
Claude Opus 4.7
GA frontier model from Anthropic; direct upgrade to 4.6 at same price; literal instruction following, 1.0–1.35× tokeniz…
Context Window Smart Zone
Smart zone vs dumb zone (Dex Hardy / Matt Pocock): quadratic attention scaling, ~100K marker independent of advertised…
Harness Shrinkage as Models Improve
Prompt scaffolding shrinks each model release; Cat Wu's pruning discipline; Boris Cherny "100 lines of code a year from…
The Bitter Lesson
Sutton 2019: scaled general methods beat hand-engineered structure; recurring justification across the wiki for dissolv…
