The hottest Substack posts of The End of Reckoning

And their main takeaways
157 implied HN points 07 Feb 23
  1. Large language models push their inputs through huge, inscrutable matrices of numbers, yet produce sensible outputs.
  2. Internal representations in LLMs are not clearly understood.
  3. Transformers can predict sequences beyond language, and their performance scales with more parameters, computation, and data (a toy scaling-curve sketch follows this list).
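The scaling point can be made concrete with a toy curve. The sketch below assumes the general power-law form popularized by the scaling-laws literature, where loss falls roughly as (N_c / N)^alpha in the parameter count N; the constants loosely echo published fits but are illustrative stand-ins, not figures from the post.

```python
# Toy illustration of power-law scaling, loss(N) ~ (N_c / N) ** alpha.
# The constants loosely echo published scaling-law fits but are NOT values
# taken from the post; treat the whole curve as illustrative.

def predicted_loss(n_params: float, n_critical: float = 8.8e13, alpha: float = 0.076) -> float:
    """Toy estimate of pre-training loss as the parameter count grows."""
    return (n_critical / n_params) ** alpha

for n_params in (1e8, 1e9, 1e10, 1e11):
    print(f"{n_params:.0e} params -> predicted loss {predicted_loss(n_params):.3f}")
```

The only point of the printout is the trend: as the parameter count grows by orders of magnitude, the predicted loss keeps falling, just more slowly each time.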
39 implied HN points 06 Feb 23
  1. Transformer training starts from what are essentially random guesses, which are refined tweak by tweak as the model reads a very large amount of text.
  2. Those tweaks adjust an enormous number of parameters using stochastic gradient descent and backpropagation (a minimal training-loop sketch follows this list).
  3. After pre-training, models like ChatGPT are further fine-tuned with reinforcement learning from human feedback (RLHF) to improve the quality of their responses (a reward-model sketch also follows).
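To make the pre-training description concrete, here is a minimal sketch of a next-token training loop driven by stochastic gradient descent and backpropagation. The toy PyTorch model, the sizes, and the random "text" are stand-ins for illustration, not the post's setup.

```python
# Minimal sketch (not the post's code): next-token prediction trained with
# stochastic gradient descent and backpropagation on a toy PyTorch model.
import torch
import torch.nn as nn

vocab_size, d_model, seq_len, batch = 100, 32, 16, 8  # toy sizes

# A tiny stand-in for a transformer: embedding -> linear "next token" head.
model = nn.Sequential(nn.Embedding(vocab_size, d_model), nn.Linear(d_model, vocab_size))
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = nn.CrossEntropyLoss()

for step in range(100):
    tokens = torch.randint(0, vocab_size, (batch, seq_len + 1))  # fake "text"
    inputs, targets = tokens[:, :-1], tokens[:, 1:]              # predict the next token
    logits = model(inputs)                                       # (batch, seq_len, vocab)
    loss = loss_fn(logits.reshape(-1, vocab_size), targets.reshape(-1))
    optimizer.zero_grad()
    loss.backward()   # backpropagation computes the gradients
    optimizer.step()  # SGD nudges every parameter a little
```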
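The RLHF step can be sketched too. One common formulation from published RLHF recipes (the exact details behind ChatGPT are not public) first trains a reward model on human preference comparisons; the snippet below shows that preference loss with made-up scalar rewards standing in for the reward model's outputs.

```python
# Sketch of the reward-modelling step commonly used in RLHF; the rewards
# below are made-up scalars, not outputs of a real reward model.
import torch
import torch.nn.functional as F

def preference_loss(reward_chosen: torch.Tensor, reward_rejected: torch.Tensor) -> torch.Tensor:
    """Bradley-Terry style loss: push the reward model to score the
    human-preferred response above the rejected one."""
    return -F.logsigmoid(reward_chosen - reward_rejected).mean()

# Toy example: scalar rewards for a batch of 4 (chosen, rejected) response pairs.
r_chosen = torch.tensor([0.9, 0.2, 1.3, 0.4], requires_grad=True)
r_rejected = torch.tensor([0.1, 0.5, 0.2, 0.3], requires_grad=True)
loss = preference_loss(r_chosen, r_rejected)
loss.backward()  # gradients would then update the reward model's parameters
```

The trained reward model then scores the language model's responses during the reinforcement-learning stage, which is what steers it toward answers humans prefer.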
19 implied HN points 21 Feb 23
  1. Transformer models such as LLMs are often treated as black boxes, but recent work is shedding light on their internal processes and making them more interpretable.
  2. Induction heads in transformer models support in-context learning: they predict a token by finding where it appeared earlier in the context and copying what followed it.
  3. By analyzing hidden states and running memory-based experiments, researchers are beginning to understand how transformer models store and manipulate information, offering early insights into how these models may represent truth internally (a minimal probing sketch follows this list).
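Here is a minimal sketch of the hidden-state probing idea: pull a model's hidden states for a few labeled statements and fit a simple linear classifier on top. GPT-2, the example sentences, and the true/false labels are stand-ins chosen for illustration, not the experiments discussed in the post.

```python
# Minimal probing sketch (stand-in model and toy labels, not the post's setup):
# read out GPT-2's hidden states for a few statements and fit a linear probe.
import numpy as np
import torch
from transformers import AutoModel, AutoTokenizer
from sklearn.linear_model import LogisticRegression

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModel.from_pretrained("gpt2", output_hidden_states=True)

statements = ["Paris is the capital of France.", "Paris is the capital of Spain.",
              "Two plus two equals four.", "Two plus two equals five."]
labels = np.array([1, 0, 1, 0])  # toy true/false labels

features = []
with torch.no_grad():
    for text in statements:
        inputs = tokenizer(text, return_tensors="pt")
        hidden = model(**inputs).hidden_states[-1]  # last layer, (1, seq, d_model)
        features.append(hidden[0, -1].numpy())      # hidden state at the final token

probe = LogisticRegression(max_iter=1000).fit(np.stack(features), labels)
print("probe accuracy on its own training data:", probe.score(np.stack(features), labels))
```

A real probing study would use far more statements and evaluate on held-out data; the sketch only shows the mechanics of reading hidden states out of the model and fitting a linear readout on them.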