Howardism · Vol. 03 · quiet corner of the web
Plate II · No. 02

Writing, in order.

Pieces: 59 · Sections: 4 · Oldest: 10 Apr 2026 · Newest: 13 May 2026
Plate II · Writing

A dense index of every article in the wiki, grouped by kind; each title links to the full article.

Concept

39 pieces
Concept articles, sorted by date, newest first.
Encoder-Free Early Fusion · Multimodal design with minimal pre-processing instead of large standalone encoders: dMel audio embedding, 40×40-patch hMLP for frames, flow head for audio out, all co-trained from scratch in one transformer
Full-Duplex Interaction · Perceive-and-respond simultaneously across modalities; proactive interjection, visual-cue reactions, simultaneous speech, live translation/commentary, time-aware speech: all special cases of model behavior
Interaction / Background Model Split · Dual-model architecture: time-aware interaction model stays present; async background model handles deep reasoning/tools; rich-context-package delegation; "reasoning-model planning at non-thinking latency"
Interaction Models · Thinking Machines Lab (May 2026): models that handle audio/video/text interaction natively in real time instead of via harness; interactivity scales with intelligence only if it's in the model
Interactivity Benchmarks · FD-bench, Audio MultiChallenge, plus new TimeSpeak/CueSpeak (proactive audio) and RepCount-A/ProactiveVideoQA/Charades (visual proactivity); TML-Interaction-Small: 0.40s turn-taking latency, dominates interaction quality
The Bitter Lesson · Sutton 2019: scaled general methods beat hand-engineered structure; recurring justification across the wiki for dissolving harnesses into models; caveat: mechanical verification and character may not migrate inward
Time-Aligned Micro-Turns · The core interaction-model move: input/output as continuous streams in ~200ms interleaved chunks, no turn boundaries; streaming-sessions inference (upstreamed to SGLang), latency-tuned MoE kernels, bitwise trainer-sampler alignment
Turn-Based Interface Bottleneck · Why current AI interfaces limit collaboration: single-thread turn-taking is a bandwidth bottleneck; humans pushed out by the interface, not the work; the less-intelligent harness (VAD/turn-detection) should dissolve
Agentic Misalignment (AM) · Lynch et al. 2025 eval and threat model: an LLM email-agent discovers it may be deleted and can take harmful actions; OOD relative to conversational AFT; primary eval surface for Model Spec Midtraining
AI Brain Fry · Kropp et al., Mar 2026: mental fatigue from excessive AI oversight increases minor errors +11%, major errors +39%; cognitive cost surface for both tool and employee framings
AI Employee Framing · Kropp et al. (HBR May 2026, n=1,261): framing AI agents as "employees" vs "tools" cuts personal accountability −9pp, increases escalation +44%, reduces error catching −18%, with no adoption gain
Alignment Fine-Tuning (AFT) · Standard post-pretraining stage (SFT + RLHF) for installing values; its shallow-alignment failure mode motivates Model Spec Midtraining
Chain-of-Thought Monitorability · Korbak et al. 2025: chain-of-thought traces are a fragile monitor; direct CoT training compromises faithfulness; MSM offers an alternative path
Deliberative Alignment · Guan et al. 2025 (OpenAI): SFT on (prompt, CoT, response) tuples with spec-grounded CoT; strongest non-MSM baseline; risks compromising CoT monitorability
Human-AI Accountability Redesign · HBR five-pillar prescription: span-of-control redesign, role redesign, performance-management reset, decision-rights/escalation/consequences, agentic-unit-not-human-role design
Model Spec Midtraining (MSM) · New training phase between pretraining and AFT: train the base model on synthetic docs discussing the Model Spec; controls AFT generalization; cuts agentic misalignment 54%→7%; beats the deliberative-alignment baseline
Model Spec Science · Empirical study of which Model Spec features best generalize alignment; value explanations > rules alone, specific > general "be ethical" framing; first concrete examples in Li et al. 2026
Synthetic Document Finetuning (SDF) · Wang et al. 2025 technique for modifying model beliefs via fine-tuning on synthetic documents; the foundation that Model Spec Midtraining builds on
Agent Loop Pattern · `/loop` (cron-scheduled) and Ralph Wiggum (backlog-draining) loops as next-generation agent primitive; AFK execution, parallel fan-out, "loops are the future"
AI Native Product Cadence · Cat Wu's 6mo→1mo→1day cadence at Anthropic: research-preview branding, mission-as-tiebreaker, evergreen launch room, lighter PRDs, weekly metrics readouts
Claude Character as Product · Personality as load-bearing product surface; Amanda's role at Anthropic; lunchtime vibe-checks as eval discipline; the harness asset that *doesn't* shrink
Context Window Smart Zone · Smart zone vs dumb zone (Dex Hardy / Matt Pocock): quadratic attention scaling, ~100K marker independent of advertised context; clear-and-restart > compaction; status-line token counting as essential discipline
Deep Modules for Agents · Ousterhout's deep-vs-shallow modules applied to agent-friendly codebases; push-vs-pull instruction delivery; reviewer in fresh context; Sandcastle three-agent pattern
Design Concept Grilling · Matt Pocock's `grill-me` skill; reach a Brooks "design concept" before any plan; counter to specs-to-code; PRD as destination doc, Kanban as journey doc
Engineer PM Convergence · Generalists across disciplines; product taste as the bottleneck skill; Anthropic's Claude Code team as case study; "just do things" cultural substrate
Harness Shrinkage as Models Improve · Prompt scaffolding shrinks with each model release; Cat Wu's pruning discipline; Boris Cherny's "100 lines of code a year from now" claim; mechanical verification stays load-bearing
Model Introspection Feedback · Cat Wu's underrated technique: ask the model why it failed; treat the answer as harness-debugging signal, not model criticism; caveats around model self-report fidelity
Printing Press Software Democratization · Boris Cherny's analogy: 1400s literacy expansion → AI software-writing expansion; domain knowledge displaces coding skill; 10× more disruption-grade startups predicted
Seven Powers Applied to AI · Helmer/Acquired framework re-evaluated for AI: switching costs and process power erode; network effects, scale, cornered resources persist; counter-positioning amplifies
Vertical Slice Tracer Bullets · Pragmatic Programmer tracer-bullet pattern applied to agent task decomposition; vertical slices > horizontal layers; Kanban-with-blocking-edges over numbered phase plans
Codex App Server Protocol · JSON-RPC stdio protocol for headless Codex sessions: initialize/initialized/thread-start/turn-start handshake, continuation turns reuse thread_id, dynamic tool calls for token-isolated tool injection
Ticket-Driven Agent Orchestration · The inversion that makes Symphony work: tickets as units of work (not sessions/PRs), DAG dependencies, agent-extensible work graph, "objectives not transitions"
Claude Code Auto Mode · Claude Code permission mode using a classifier to auto-approve safe tool calls and block risky ones; a middle ground between default and `--dangerously-skip-permissions`
Client-Side Agent Optimization · AgentOpt's framing of developer-controlled agent optimization (model-per-role, budget, routing) as distinct from server-side serving; the combo abstraction; 13–32× cost gaps between best and worst combinations
Scale-Dependent Prompt Sensitivity · Large models underperform small ones on 7.7% of standard benchmarks due to overthinking; brevity constraints recover 26pp and fully reverse the hierarchy on GSM8K/MMLU-STEM
Agent Harness Engineering · Patterns for scaffolding long-running LLM agents: environment design, progressive context disclosure, mechanical architecture enforcement, agent code review
Claude Code Best Practices · Anthropic's guide to effective Claude Code usage: context management, verification-driven development, explore→plan→code workflow, environment config
LLM-as-Compiler Knowledge Base · Karpathy's architecture: an LLM incrementally compiles raw docs into a persistent interlinked wiki, replacing RAG with a four-phase ingest→compile→query→lint pipeline
LLM-Driven Vulnerability Research · Claude Mythos Preview's emergent cybersecurity capabilities: autonomous zero-day discovery, full exploit chains, and Anthropic's Project Glasswing response

Entity

14 pieces
Entity articles, sorted by date, newest first.
Thinking Machines Lab · AI research lab behind interaction models (May 2026); harness-dissolves-into-model thesis; upstreamed streaming-sessions to SGLang; benchmarks against GPT-realtime / Gemini-live; research grants open
TML-Interaction-Small · TML's first interaction model: 276B MoE / 12B active, audio+video+text in / text+audio out, 200ms micro-turns, async background agent; best turn-taking latency of any model; research preview May 2026
Chloe Li · Lead author of the MSM paper (arXiv 2605.02087); Anthropic Fellows Program; designed all specs and experiments
Claude's Constitution / Model Spec · Anthropic Model Spec / Constitution by Askell et al.; document specifying Claude's values + hard constraints (SP1–3, GP1–2); now also a direct training input via MSM
Anthropic · AI safety company / vendor of Claude; mission-as-tiebreaker culture; ~30–40 PMs across teams; Mike Krieger leads Labs round 2
Boris Cherny · Creator of Claude Code at Anthropic; phone-driven workflow with hundreds of agents; primary advocate of the `/loop` primitive; "coding is solved (for me)" thesis
Cat Wu · Head of Product for Claude Code and Cowork at Anthropic; primary articulator of AI-native product cadence and engineer-PM convergence
Claude Code · Anthropic's agentic coding product; created by Boris Cherny late 2024; TypeScript/React; CLI/desktop/web/mobile/IDE surfaces; central tool across all 2026 sources
Cowork · Anthropic's non-code knowledge-work agent product; sibling to Claude Code; output is decks/inbox/dossiers; same MCP/computer-use primitives
Matt Pocock · Independent AI-coding educator; built the Sandcastle library; smart-zone/grill-me/tracer-bullets pedagogical framing; "bad code bases make bad agents"
Mythos Model · Anthropic preview-tier frontier model; gated for safety; used internally alongside Opus 4.7; descendant expected to ship publicly later
Hermes Agent · Nous Research's CLI agent + Gateway daemon (Telegram/Discord/Slack/WhatsApp); AGENTS.md/SOUL.md context split, bounded memory files, DM-pairing auth, container-as-security-boundary model
Symphony · OpenAI's open-source agent orchestrator (March 2026): turns Linear into a control plane for Codex, per-issue workspace, daemon-driven, SPEC.md-as-product, hedged 500% landed-PRs claim
Claude Opus 4.7 · GA frontier model from Anthropic; direct upgrade to 4.6 at the same price; literal instruction following, 1.0–1.35× tokenizer inflation, new `xhigh` effort, first post-Glasswing safeguards

Essay

5 pieces
Essay articles, sorted by date, newest first.
Opinions on Using AI Tools & the Future of the Software Engineering Role · Debate map of four stances on using AI tools (bullish-insider / pragmatist-practitioner / skeptic-governance / architecture-thesis) + synthesis on the future SWE role: coding→deciding/verifying, role convergence, what stays human, which moats survive, honest caveats
Learning to Co-Work with AI: A Software Engineer's Field Guide · Field guide for software engineers in the AI era: 6 skill clusters (taste, harness, alignment-first planning, agent-friendly architecture, verification, strategic positioning), daily practices, anti-patterns, 90-day plan
Opus 4.6 → 4.7 Changes and Multi-Agent Coding Considerations · 4.6→4.7 delta table + six hazards for multi-agent coding teams: role-based model selection, prompt re-tuning, harness invariants, per-agent context budget, unattended-fan-out safety, independent reviewer
When to Use Claude Opus 4.6 for Work · Decision rules for Opus 4.6 deployment: solver-not-planner, elaboration-load-bearing tasks, brevity constraints, Pareto frontier check
What Are AI Tools? · Overview of the AI tools landscape and its categories

Index

1 piece
Index articles, sorted by date, newest first.
Operations Log · Chronological record of wiki operations. Each entry uses the format: `## [YYYY-MM-DD] operation | Subject`
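For concreteness, a log entry in that format might look like the following; the date, operation name, and subject here are hypothetical, not taken from the actual log:

```
## [2026-05-13] update | Claude Opus 4.7
```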