“Everything else is just efficiency. I cannot simplify this any further.”
Karpathy stripped GPT down to about 200 lines of pure Python with zero dependencies. The whole thing trains on a list of names and generates new ones. It is the best explanation of what LLMs actually are because it removes every excuse not to understand them. The billion-dollar infrastructure, the massive GPU clusters, the endless hype: all of it is just scaling this. The algorithmic core fits in a file shorter than most README files.
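To make the "train on names, generate new ones" loop concrete: the sketch below is not Karpathy's file (his implements a real transformer); it is a deliberately far simpler character-level bigram counter. But it runs the same end-to-end loop in pure Python with zero dependencies, so the shape of the idea is visible. The name list and all function names here are illustrative.

```python
import random

# Toy training data; Karpathy's version trains on a much larger names dataset.
names = ["emma", "olivia", "ava", "isabella", "sophia",
         "mia", "charlotte", "amelia", "harper", "evelyn"]

# "Training": count character-to-character transitions.
# "." marks both the start and the end of a name.
counts = {}
for name in names:
    chars = ["."] + list(name) + ["."]
    for a, b in zip(chars, chars[1:]):
        row = counts.setdefault(a, {})
        row[b] = row.get(b, 0) + 1

def sample_name(rng, max_len=20):
    """Generate one new name by walking the transition counts."""
    out, ch = [], "."
    while True:
        row = counts[ch]
        options = list(row)
        weights = [row[c] for c in options]
        ch = rng.choices(options, weights=weights)[0]
        # Stop at the end marker (or a safety cap on length).
        if ch == "." or len(out) >= max_len:
            return "".join(out)
        out.append(ch)

if __name__ == "__main__":
    rng = random.Random(42)
    for _ in range(5):
        print(sample_name(rng))
```

A real GPT replaces the count table with a neural network that conditions on the whole preceding context instead of one character, but the interface is identical: fit a next-token distribution, then sample from it. Everything else, as the quote says, is efficiency.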