SemiAnalysis • 15456 implied HN points • 06 Jan 26
- Scaling reinforcement learning (post‑training) is the main engine of recent capability and utility gains, with labs pouring compute into RL and using broad real‑world evals like GDPval to measure progress.
- Building RL environments and datasets is a large, specialized industry: firms clone UIs, create coding and software gyms, and hire domain experts to write tasks and rubrics, spawning many vendors and "RL as a service" offerings.
- Applying RL to science and biology requires closed‑loop physical experiments and robotics, faces long, costly rollouts and sparse rewards, and will push models and labs toward specialized, non‑commodified solutions.