The hottest Substack posts from Jake Ward's Blog

And their main takeaways
2 HN points 30 Apr 24
  1. Large language models like ChatGPT have complex, learned logic that is hard to interpret because of 'superposition', where a single neuron participates in multiple unrelated functions.
  2. Techniques like sparse dictionary learning can decompose neuron activations into 'features' that exhibit 'monosemanticity', each corresponding to a single concept, which makes the models more interpretable (a sketch follows this list).
  3. Reproducing interpretability research shows promise for breakthroughs and suggests the remaining obstacles are engineering challenges rather than scientific barriers.
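For a concrete sense of what sparse dictionary learning means here, below is a minimal sketch of a sparse autoencoder trained to reconstruct model activations from an overcomplete, mostly-zero feature basis. The layer sizes, the L1 coefficient, and the random stand-in activations are assumptions for illustration, not details from the post.

```python
import torch
import torch.nn as nn

class SparseAutoencoder(nn.Module):
    """Decompose dense activations into an overcomplete, sparse feature dictionary."""
    def __init__(self, d_model: int, d_features: int):
        super().__init__()
        self.encoder = nn.Linear(d_model, d_features)
        self.decoder = nn.Linear(d_features, d_model)

    def forward(self, x):
        # ReLU keeps feature activations non-negative; the L1 penalty in the loss
        # pushes most of them to zero, so each input uses only a few features.
        feats = torch.relu(self.encoder(x))
        return self.decoder(feats), feats

# Toy training loop on random stand-in "activations"; in practice these would be
# activations captured from a layer of the language model being studied.
d_model, d_features = 512, 4096          # assumed sizes: dictionary is 8x overcomplete
sae = SparseAutoencoder(d_model, d_features)
opt = torch.optim.Adam(sae.parameters(), lr=1e-3)
l1_coeff = 1e-3                          # assumed sparsity strength

for step in range(100):
    acts = torch.randn(256, d_model)     # placeholder batch of activations
    recon, feats = sae(acts)
    loss = ((recon - acts) ** 2).mean() + l1_coeff * feats.abs().mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
```

After training, individual columns of the dictionary can be inspected for monosemanticity by looking at which inputs activate each feature most strongly.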
0 implied HN points 29 Apr 24
  1. GANs like StyleGAN2 can produce highly realistic images, but training them requires significant resources like powerful GPUs and large datasets.
  2. Building a dataset for a model like StyleGAN2 can involve sourcing high-quality imagery, working with GIS tools, and ensuring the data is clean (a sketch of a cleaning pass follows this list).
  3. Even with limited resources, it's possible to get reasonable performance out of state-of-the-art networks by optimizing the training setup and creatively working around constraints.
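As an illustration of the data-cleanliness step, here is a minimal sketch of a cleaning pass over a folder of sourced imagery: drop corrupt or undersized files and crop the rest to the fixed square resolution a StyleGAN2-style model expects. The folder names, minimum side length, and PNG-only filter are assumptions for the example, not details from the post.

```python
from pathlib import Path
from PIL import Image

MIN_SIDE = 512            # assumed minimum resolution for usable training tiles
SRC = Path("raw_tiles")   # hypothetical folder of sourced imagery (e.g. GIS exports)
DST = Path("clean_tiles")
DST.mkdir(exist_ok=True)

kept = dropped = 0
for path in SRC.glob("*.png"):
    try:
        with Image.open(path) as img:
            img.verify()                  # catches truncated or corrupt files
        with Image.open(path) as img:     # reopen: verify() invalidates the handle
            if min(img.size) < MIN_SIDE:
                dropped += 1
                continue
            # Center-crop to a square, then resize so every sample has the
            # fixed resolution the GAN is trained at.
            side = min(img.size)
            left = (img.width - side) // 2
            top = (img.height - side) // 2
            img = img.crop((left, top, left + side, top + side))
            img = img.resize((MIN_SIDE, MIN_SIDE))
            img.convert("RGB").save(DST / path.name)
            kept += 1
    except (OSError, SyntaxError):
        dropped += 1                      # unreadable file: skip it

print(f"kept {kept}, dropped {dropped}")
```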