The hottest AI Alignment Substack posts right now

And their main takeaways
Astral Codex Ten • 5574 implied HN points • 15 Jan 24
  1. Weekly open thread for discussions and questions on various topics.
  2. AI art generators still have room for improvement in handling tough compositionality requests.
  3. Reminder about the PIBBSS Fellowship, a fully-funded program in AI alignment for PhDs and postdocs from diverse fields.
Joe Carlsmith's Substack • 78 implied HN points • 11 Jan 24
  1. Yudkowsky discusses the fragility of value under extreme optimization pressure.
  2. The concept of extremal Goodhart is explored, highlighting potential challenges in aligning AI values with human values.
  3. Discussions of AI alignment should also weigh the balance of power and the role of goodness in securing a positive future.
Maximum Progress • 196 implied HN points • 06 Mar 23
  1. Humans can train AI through incremental optimization, but changes in the environment can make its behavior unpredictable.
  2. AI models can end up following heuristics that worked in training but are not aligned with the desired goal.
  3. Natural selection successfully deals with misalignment by constantly selecting and adapting organisms to new environments.