“A first-principles walkthrough of TurboQuant, a fast quantization technique for neural networks.”
Worth reading even if you do not care about quantization. The author derives the technique from scratch instead of jumping straight to the optimized implementation, which is the right way to teach an algorithm. More technical writing should look like this.