Free Idea: Train on Changes, Not on Code

Train AI coding assistants on code changes, not static snapshots

LLMs trained on static code snapshots learn to reproduce the complexity those snapshots contain: factories, registries, and backwards-compatibility shims that made sense in large systems but only obscure small projects. Kent Beck proposes training on diffs and change sequences instead, so the model learns safe, incremental refactorings and behavior changes. Working in small steps keeps context windows manageable and keeps AI assistants from getting stuck in complexity of their own making.
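A minimal sketch of what a change-based training example might look like: instead of storing only the final file, each example pairs a "before" state with the diff that transforms it and the intent behind the change. The essay does not prescribe a data format; the field names and use of Python's `difflib` here are illustrative assumptions.

```python
import difflib

def make_training_example(before: str, after: str, message: str) -> dict:
    """Pair a code snapshot with the change applied to it, rather than
    keeping only the final state. Field names are hypothetical."""
    diff = "".join(difflib.unified_diff(
        before.splitlines(keepends=True),
        after.splitlines(keepends=True),
        fromfile="before", tofile="after",
    ))
    return {"context": before, "change": diff, "intent": message}

# One small, safe rename refactoring becomes one training step.
before = "def calc(x):\n    return x * 2\n"
after = "def double(x):\n    return x * 2\n"
example = make_training_example(before, after, "rename calc to double")
print(example["change"])
```

A sequence of such examples, mined from commit history, would show the model how a codebase evolved one safe step at a time rather than presenting the finished design as a fait accompli.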

Read the full essay on Substack

Questions this essay answers

  • Why do AI coding assistants insert unnecessary design patterns like factories and registries?
  • How should we train LLMs to make small, safe changes instead of complex rewrites?
  • Can syntax-tree transforms, rather than text-based edits, help AI modify code more safely?
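To illustrate the last question: a syntax-tree transform edits the parsed structure of the program, so the result is syntactically valid by construction, unlike a text patch that can land on the wrong line. This rename refactoring using Python's `ast` module is a hedged sketch of the general idea, not an implementation from the essay (`ast.unparse` requires Python 3.9+).

```python
import ast

class RenameFunction(ast.NodeTransformer):
    """Rename a function definition and its call sites by rewriting
    the syntax tree instead of patching text."""
    def __init__(self, old: str, new: str):
        self.old, self.new = old, new

    def visit_FunctionDef(self, node):
        if node.name == self.old:
            node.name = self.new
        self.generic_visit(node)
        return node

    def visit_Name(self, node):
        if node.id == self.old:
            node.id = self.new
        return node

source = "def calc(x):\n    return x * 2\n\nprint(calc(21))\n"
tree = ast.fix_missing_locations(
    RenameFunction("calc", "double").visit(ast.parse(source)))
print(ast.unparse(tree))
```

Because the transform operates on nodes, it renames the definition and every call site together, which is exactly the kind of small, safe, whole-program change the essay argues models should learn.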