Howardism · vol. 03 · quiet corner of the web
Howardism · Vol. 03 · Plate II · No. 02

Training, tagged.

Notes: 4 · Tag: Training · Oldest: 8 May 2026 · Newest: 8 May 2026

Every article tagged training, newest first.

| Title | Summary | Date |
| --- | --- | --- |
| Alignment Fine-Tuning (AFT) | Standard post-pretraining stage (SFT + RLHF) for installing values; its shallow-alignment failure mode motivates Model Spec Midtraining | 8 May 2026 |
| Deliberative Alignment | Guan et al. 2025 (OpenAI): SFT on (prompt, CoT, response) tuples with spec-grounded CoT; strongest non-MSM baseline; risks compromising CoT Monitorability | 8 May 2026 |
| Model Spec Midtraining (MSM) | New training phase between pretraining and AFT: train the base model on synthetic documents discussing the Model Spec; controls AFT generalization; cuts agentic misalignment 54%→7%; beats the deliberative alignment baseline | 8 May 2026 |
| Synthetic Document Finetuning (SDF) | Wang et al. 2025 technique for modifying model beliefs via fine-tuning on synthetic documents; the foundation that Model Spec Midtraining builds on | 8 May 2026 |