The hottest Systems Substack posts right now

And their main takeaways
Software Bits Newsletter • 0 implied HN points • 07 Jan 26
  1. Sparsity means many weights or activations are zero so you can skip their multiplications, but random/unstructured zeros usually don't make GPUs faster because irregular memory access and load imbalance kill performance.
  2. Hardware-friendly patterns like 2:4 sparsity and block sparsity let accelerators actually speed up computation, while pruning and ReLU-driven activation sparsity often need structure or predictive gating to become efficient.
  3. Conditional computation (Mixture of Experts) is the most powerful practical form of sparsity: only a few experts run per input, giving huge model capacity with much less active compute, backed by strong empirical results.
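
Two of the ideas above can be illustrated with a minimal pure-Python sketch. It is not taken from the post. The function names `prune_2_of_4` and `top_k_route` are hypothetical, magnitude-based pruning is just one common way to reach a 2:4 pattern, and the softmax over only the selected experts is an assumed gating choice.

```python
import math


def prune_2_of_4(weights):
    """Keep the two largest-magnitude values in each group of four, zero the rest."""
    if len(weights) % 4 != 0:
        raise ValueError("length must be a multiple of 4")
    pruned = []
    for start in range(0, len(weights), 4):
        group = weights[start:start + 4]
        # Indices of the two largest magnitudes in this group of four.
        keep = sorted(range(4), key=lambda i: abs(group[i]), reverse=True)[:2]
        pruned.extend(v if i in keep else 0.0 for i, v in enumerate(group))
    return pruned


def top_k_route(gate_scores, k=2):
    """Return (expert_index, weight) pairs for the k highest gate scores."""
    order = sorted(range(len(gate_scores)), key=lambda i: gate_scores[i], reverse=True)
    chosen = order[:k]
    # Softmax over the selected scores only (assumption), so the weights sum to 1.
    top = max(gate_scores[i] for i in chosen)
    exps = [math.exp(gate_scores[i] - top) for i in chosen]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(chosen, exps)]


if __name__ == "__main__":
    print(prune_2_of_4([0.1, -0.9, 0.5, 0.2, 1.0, 0.0, -0.3, 0.05]))
    print(top_k_route([0.1, 2.0, -1.0, 2.0], k=2))
```

Every group of four ends up with exactly two nonzeros, which is the regularity that lets hardware skip work predictably. In the routing function, only the `k` selected experts would be evaluated for a given input; the others contribute no compute.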