SemiAnalysis • 15456 implied HN points • 06 Jan 26
- Scaling reinforcement learning (post-training) is the main engine of recent capability and utility gains, with labs pouring compute into RL and using broad real-world evals like GDPval to measure progress.
- Building RL environments and datasets is a large, specialized industry: firms clone UIs, create coding and software gyms, and hire domain experts to write tasks and rubrics, spawning many vendors and "RL as a service" offerings.
- Applying RL to science and biology requires closed-loop physical experiments and robotics, faces long costly rollouts and sparse rewards, and will push models and labs toward specialized, non-commodified solutions.