Software Bits Newsletter • 257 implied HN points • 29 Dec 25
- Associativity is the key property that lets you split work, combine partial results, and safely parallelize or stream computations without changing the answer.
- Softmax has a hidden associative state — tracking a local max and a scaled sum lets you correct and merge chunked results, which is the math behind FlashAttention’s memory- and time-saving trick.
- When optimizing a global computation, look for a small combinable state and an associative combine rule; if it exists you can chunk and parallelize, and if it doesn’t (for example, median) you need a different algorithmic approach.