Papyros

Archive / essay

The Bitter Lesson

The biggest lesson that can be read from 70 years of AI research is that general methods that leverage computation are ultimately the most effective.

Knowledge loses to compute

Sutton argues researchers keep embedding human insight into systems (chess heuristics, linguistic features, vision pipelines) and keep getting overtaken by search and learning at scale. The bitter part: our cleverness wastes time.

History as evidence

Chess, Go, speech, vision, language: the arc repeats. Methods that use more data and compute win. Methods that encode expert structure plateau.

After transformers

Read in 2019 it sounded like reinforcement learning's manifesto. After GPT and scaling laws it reads like prophecy. Whether it stays true forever is another question Sutton would probably leave to compute.