The hottest Machine Learning Substack posts right now

And their main takeaways

AI #124: Grokless Interlude

Don't Worry About the Vase • 1792 implied HN points • 10 Jul 25

🕹 Technology Machine Learning

Language models can be very useful, but many people overestimate how much more productive the tools actually make them, and results in the workplace are mixed.
Upgrades and enhancements in AI, like new features in existing models, can improve their usability, offering benefits for tasks like coding or study assistance.
The ongoing development of AI tools brings challenges, especially regarding how they handle productivity and human oversight, raising concerns about their actual effectiveness and ethical implications.

2025 Open Models Year in Review

Democratizing Automation • 292 implied HN points • 14 Dec 25

🕹 Technology Machine Learning

Open models made a dramatic jump in 2025, matching closed models on many benchmarks and becoming realistic options for real-world deployments beyond just privacy or fine-tuning.
A few breakout releases — notably DeepSeek R1, Qwen 3, and Kimi K2 — had outsized influence, driving wider adoption and encouraging more open licensing from major labs, especially in China.
The ecosystem exploded in scale and variety, with thousands of new models uploaded monthly, clear specialist niches and a public tiering of makers, leaving open models established and poised for further growth in 2026.

Why Industry Leaders Are Betting on Mutually Exclusive Futures

The Algorithmic Bridge • 318 implied HN points • 15 Dec 25

🕹 Technology Machine Learning

Two leading AI figures are pursuing opposite goals: one is focused on building and containing a possible future superintelligence, while the other is building practical tutor-like agents for today’s use cases.
Their stark disagreement, despite similar training and prestige, shows that even top experts don’t agree on AI’s ultimate path or timeline.
That deep uncertainty extends across industry, academia, and investors, producing fragmented, independent bets instead of a coordinated plan for the future.

How to succeed as a Machine Learning Engineer

The ML Engineer Insights • 359 implied HN points • 22 Jun 24

🕹 Technology Machine Learning

Building a strong foundation in machine learning fundamentals and staying updated with the latest research are crucial for success as a Machine Learning Engineer.
Playing to your strengths, such as data and feature engineering, modeling, and deployment scalability, is key. Seek help in areas where you're less experienced.
Focus on aligning your work with business goals, understanding trade-offs, ROI, and embracing experimentation. Continuous learning, networking, and mentorship are invaluable.

AGI isn’t coming in 2025, and GPT-5 probably isn’t either.

Marcus on AI • 4189 implied HN points • 09 Jan 25

🕹 Technology Machine Learning

AGI, or artificial general intelligence, is not expected to arrive in 2025. This means that machines won't be as smart as humans anytime soon.
The release of GPT-5, a new AI model, is also uncertain. Even experts aren't sure if it will be out this year.
There is a trend of people making overly optimistic predictions about AI. It's important to be realistic about what technology can achieve right now.

LLMs – Part 3: Context Matters — Self Attention

Vasu’s Newsletter • 78 implied HN points • 25 Jan 26

🕹 Technology Machine Learning

Each token creates query, key, and value vectors so it can ask what it needs, match that against other tokens, and gather useful information.
Tokens compare their query to every key to get raw scores, convert those scores to attention weights with softmax, and use the weights to take a weighted sum of value vectors to produce a new contextual vector.
Self-attention makes token meanings contextual (helping with pronouns, disambiguation, and long-range links), and models use multiple attention heads plus feed-forward layers to capture different relation patterns and refine each token's representation.
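The query, key, and value steps described above can be sketched in a few lines of plain Python. This is a minimal single-head illustration, not the newsletter's own code: the function names, the toy projection matrices, and the division by the square root of the key dimension (the usual scaled dot-product convention) are assumptions added here.

```python
import math

def softmax(scores):
    # Subtract the max before exponentiating for numerical stability.
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def matvec(matrix, vector):
    # Multiply a matrix (list of rows) by a vector.
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]

def self_attention(tokens, w_q, w_k, w_v):
    # Each token produces its own query, key, and value vector.
    queries = [matvec(w_q, t) for t in tokens]
    keys = [matvec(w_k, t) for t in tokens]
    values = [matvec(w_v, t) for t in tokens]
    scale = math.sqrt(len(keys[0]))  # assumed scaling convention

    outputs, all_weights = [], []
    for q in queries:
        # Raw scores: this token's query against every key.
        scores = [sum(a * b for a, b in zip(q, k)) / scale for k in keys]
        # Softmax turns raw scores into attention weights that sum to 1.
        weights = softmax(scores)
        # New contextual vector: weighted sum of the value vectors.
        context = [
            sum(w * v[i] for w, v in zip(weights, values))
            for i in range(len(values[0]))
        ]
        outputs.append(context)
        all_weights.append(weights)
    return outputs, all_weights

# Toy input: three 2-dimensional token vectors and identity projections.
identity = [[1.0, 0.0], [0.0, 1.0]]
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, weights = self_attention(tokens, identity, identity, identity)
```

With identity projections, the first token attends equally to itself and to the third token, because both keys have the same dot product with its query. A real model would learn the three projection matrices and run several such heads in parallel.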

AI Links, 12/21/2025

In My Tribe • 197 implied HN points • 21 Dec 25

🕹 Technology Machine Learning

AI can run many human-like interviews and assessments cheaply and reliably, letting organizations collect richer open-ended responses at scale.
Even when AI succeeds technically, the firms that build models might not capture the value—competition can erode profits and create financial risks even as enterprise usage and integration grow.
Whoever controls the data, algorithms, and coordination networks gains real decision-making power, and AI’s fast adaptability could outpace human retraining and reshape many jobs.

The new AI scaling law shell game

Marcus on AI • 4663 implied HN points • 24 Nov 24

🕹 Technology Machine Learning

Scaling laws in AI aren't as reliable as people once thought. They're more like general ideas that can change, rather than hard rules.
The new approach to scaling, which focuses on how long you train a model, can be costly and doesn't always work better for all problems.
Instead of just making existing models bigger or running them for longer, the field needs fresh ideas and innovations to improve AI.

Latest open artifacts (#17): NVIDIA, Arcee, Minimax, DeepSeek, Z.ai and others close an eventful year on a high note

Democratizing Automation • 150 implied HN points • 05 Jan 26

🕹 Technology Machine Learning

Several major open models and updates landed at year-end — releases from NVIDIA, Arcee, LLM360, Zhipu and others noticeably pushed open-model capabilities higher.
The community trend is toward bigger and Mixture-of-Experts (MoE) architectures, multi-token prediction, and openly releasing training data and checkpoints, which should speed progress and reproducibility.
Important tradeoffs remain: some models excel on specific tasks like UI or coding but can be slower or weaker on very long-context workloads, and even larger, more capable variants are promised in 2026.

2026 AI Predictions

Alex Ghiculescu's Newsletter • 135 implied HN points • 19 Jan 26

🕹 Technology Machine Learning

AI labs will focus on coding agents, with most development effort and revenue moving toward models that write software.
Keeping up with rapidly improving AI coding tools will be the main challenge for software companies; engineering teams will need to learn new workflows and roll them out across people with different skills and enthusiasm.
New techniques will close agents' domain-knowledge gaps so models can understand real codebases and make decisions, and those same solutions will boost many other AI applications.

20 Predictions for AI in 2026

The Algorithmic Bridge • 286 implied HN points • 12 Dec 25

🕹 Technology Machine Learning

A clear set of twenty specific predictions about how AI will develop in 2026 is presented.
The piece reviews results from 2025 predictions and commits to being more specific and accountable to improve forecasting accuracy.
Full access to the detailed content is behind a subscription paywall, though a 7-day free trial is offered.

The Sequence Radar #795: The New Inference Kids

TheSequence • 112 implied HN points • 25 Jan 26

🕹 Technology Machine Learning

Serving models (inference) is now the main battleground, drawing huge funding as startups race to make model serving boring, reliable, and infinitely scalable.
New kernel-level tricks are cutting recomputation and memory waste: RadixAttention reuses KV cache blocks like an LRU to avoid recomputing prefixes, and PagedAttention pages KV memory so GPUs can pack many more requests without VRAM fragmentation.
Latency and per-turn cost now define product quality, causing a split in the stack between orchestration/hardware layers that manage scale and kernel teams that squeeze every FLOP to make models fast and cheap.

AlphaGeometry2: Impressive accomplishment, but still a long path ahead

Marcus on AI • 3161 implied HN points • 17 Feb 25

🕹 Technology Machine Learning

AlphaGeometry2 is a specialized AI designed specifically for solving tough geometry problems, unlike general chatbots that tackle various types of questions. This means it's really good at what it was built for, but not much else.
The system's impressive 84% success rate comes with a catch: it only achieves this after converting problems into a special math format first. Without this initial help, the success rate drops significantly.
While AlphaGeometry2 shows promising advancements in AI problem-solving, it still struggles with many basic geometry concepts, highlighting that there's a long way to go before it can match high school students' understanding in geometry.

Levels of greatness

next big thing • 37 implied HN points • 12 Feb 26

🕹 Technology Machine Learning

Greatness exists in distinct layers, and the gap between each level can be enormous — someone who’s great at one level can be thoroughly outclassed by the next.
Many systems follow a power-law pattern where a tiny number of people, companies, or places capture most of the attention, wealth, or returns.
AI, especially models that can help build and improve themselves, is accelerating that concentration, so a small set of firms is likely to pull much farther ahead.

GPT Agent Is Standing By

Don't Worry About the Vase • 1299 implied HN points • 23 Jul 25

🕹 Technology Machine Learning

OpenAI's ChatGPT Agent can now perform tasks like managing your calendar or shopping for groceries. It uses a combination of web browsing, research skills, and conversational abilities to help users with more complex requests.
Although the ChatGPT Agent shows promise and can do some tasks well, like spreadsheet work, it still faces limitations. For now, it feels more like a helpful assistant rather than a full replacement for humans in many tasks.
Safety is a top priority with the new capabilities of the ChatGPT Agent. OpenAI is taking steps to prevent misuse and ensure that the technology is used responsibly, especially in sensitive areas like biology and chemistry.

o3, Oh My

Don't Worry About the Vase • 3852 implied HN points • 30 Dec 24

🕹 Technology Machine Learning

OpenAI's new model, o3, shows amazing improvements in reasoning and programming skills. It's so good that it ranks among the top competitive programmers in the world.
o3 scored impressively on challenging math and coding tests, outperforming previous models significantly. This suggests we might be witnessing a breakthrough in AI capabilities.
Despite these advances, o3 isn't classified as AGI yet. While it excels in certain areas, there are still tasks where it struggles, keeping it short of true general intelligence.

Data Science Weekly - Issue 557

Data Science Weekly Newsletter • 159 implied HN points • 25 Jul 24

🕹 Technology Machine Learning

AI models can break down when trained on data that is generated by other models. This can cause problems in how well they work.
There is scientific research about the history of Italian filled pasta. It shows that most types likely came from a single area in northern Italy.
There are new resources and guides available for improving predictive modeling with tabular data. These can help you build better models by focusing on how data is represented.

Code Clinic | Orchestrating Transformers Agents 2.0 for Internet Search

Encyclopedia Autonomica • 19 implied HN points • 09 Oct 24

🕹 Technology Machine Learning

Transformers Agents 2.0 is a step up from traditional methods. Its agents handle multi-step tasks better and have memory to store information as they work.
Setting up and building a basic ReAct Agent is straightforward. You only need to install some packages and create the agent using selected models and tools.
You can orchestrate multiple agents together for more complex tasks. By combining different agents, you can enhance their capabilities and improve the results of your searches or queries.

Don’t Ride This Bike! Generative AI’s persistent trouble with compositionality and parts

Marcus on AI • 3952 implied HN points • 08 Dec 24

🕹 Technology Machine Learning

Generative AI struggles with understanding complex relationships between objects in images. It sometimes produces physically impossible results or gets details wrong when asked to create images from text.
Recent AI models, like DALL-E 3, show only slight progress in handling specifications related to parts of objects. They can still mislabel parts or fail to follow more complex requests.
AI systems need to improve their ability to check and confirm that generated images match the prompts given by users. This may require new technologies for better understanding between language and visuals.

From Theory to Practice: Inductive Biases in Machine Learning

Mindful Modeler • 639 implied HN points • 23 Apr 24

🕹 Technology Machine Learning

Different machine learning models exhibit varying behaviors when extrapolating features, influenced by their inductive biases.
Inductive biases in machine learning influence the learning algorithm's direction, excluding certain functions or preferring specific forms.
Understanding inductive biases can lead to more creative and data-friendly modeling practices in machine learning.

Data Science Weekly - Issue 530

Data Science Weekly Newsletter • 1418 implied HN points • 19 Jan 24

🕹 Technology Machine Learning

Good data visualization is important. Some types of graphs can be misleading, and it's better to avoid them.
In healthcare, it's not just about having advanced technology like AI. The real focus should be on getting effective results from these technologies.
Netflix released a lot of data about what people watched in 2023. Analyzing this can help us understand trends in streaming better.

DeepSeek Is Chinese But Its AI Models Are From Another Planet

The Algorithmic Bridge • 3344 implied HN points • 21 Jan 25

🕹 Technology Machine Learning

DeepSeek, a Chinese AI company, has quickly created competitive AI models that are open-source and cheap. This challenges the idea that the U.S. has a clear lead in AI technology.
Their new model, R1, is comparable to OpenAI's best models, showcasing that they can produce high-quality AI without the same resources. It suggests they might be using innovative methods to build these models efficiently.
DeepSeek’s approach also includes letting their model learn on its own without much human guidance, raising questions about what future AI could look like and how it might think differently than humans.

Meta Buys Manus: Shifting Currents at Meta

Enterprise AI Trends • 168 implied HN points • 30 Dec 25

🕹 Technology Machine Learning

Meta's acquisition of Manus rescues a fast-growing but unprofitable startup and rewards its founders and investors, while adding geopolitical and competitive implications.
Because Manus relied heavily on Anthropic's Claude, the deal creates strategic tension — Meta could replace Claude in Manus's agent loop and become a direct competitor to Anthropic.
The purchase highlights a bigger industry debate: Meta is betting that agent scaffolding and tools — not just foundational models — hold the most value, a stance that could reshape AI strategy and competition.

Statistical modeling seen through inductive biases

Mindful Modeler • 419 implied HN points • 28 May 24

🔬 Science Machine Learning

Statistical modeling involves modeling distributions and assuming relationships between features and the target with a few interpretable parameters.
The choice of distribution shapes the hypothesis space by restricting it to models compatible with that distribution, for example a zero-inflated Poisson.
Parameterization in statistical modeling simplifies estimation, interpretation, and inference of model parameters by making them more interpretable and allowing for confidence intervals.

Something Big Is Happening in Cybersecurity

The Security Industry • 35 implied HN points • 17 Feb 26

🕹 Technology Machine Learning

AI development is accelerating fast, with new models that feel like a qualitative leap and are even being used to build the next generation of models.
The AI security market has exploded into hundreds of companies, including many focused on automating SOC work, and it has attracted substantial venture funding.
AI security is becoming a standard part of organizational defenses, and soon it will no longer make sense to treat it as a separate category because every vendor will have AI-driven security features.

~70% of PHerc. 172 is now digitally unwrapped

Vesuvius Challenge • 98 implied HN points • 13 Jan 26

🕹 Technology Machine Learning

The team has digitally unwrapped about 70% of the lower region of PHerc. 172 using a new automated pipeline that's over 10× faster than fully manual methods, though humans still must fix sheet‑switch errors.
The unwrapped area covers roughly 7 meters by 14 cm and gives semi‑continuous surfaces with readable ink mainly on outer wraps and fragments; the upper ~30% is too mangled to unwrap reliably and the 7.9 µm scan resolution limits legibility compared with clearer 2.4 µm rescans.
Help is needed to improve surface extraction (to reduce sheet switches), strengthen ink detection in hard inner regions, and make the pipeline more scalable and user‑friendly—there's an ongoing Kaggle challenge for surface detection.

2025 Interconnects year in review

Democratizing Automation • 195 implied HN points • 18 Dec 25

🕹 Technology Machine Learning

The publication grew a lot this year and became a much more influential source of cutting‑edge AI analysis, reaching millions of pageviews and a much larger audience.
Reinforcement learning, reasoning models, and open‑model ecosystems were the central technical themes, and major initiatives were launched to advance American open models and research infrastructure.
Output hit practical limits after a year of high volume, so the focus is shifting to higher‑value work: prioritizing quality over quantity, investing in key projects, and using more open models going forward.

The next big thing in 2026 will be...

next big thing • 141 implied HN points • 01 Jan 26

🕹 Technology Machine Learning

Autonomous, end-to-end AI agents will move from being copilots to pilots, owning whole workflows and delivering outcomes rather than just answering prompts.
Persistent memory, proactive behavior, and on-device inference will make AI feel like a personal companion and unlock a wave of new consumer products, generative media, and personalized experiences.
AI will start showing up in the bottom line, driving real deployments, new pricing models, hardware launches, and a surge of IPOs and M&A, while human-heavy AI services get exposed if they can’t prove machine-driven margins.

Nobody is shipping your agent’s code (yet) | Predictions from LinearB’s Ori Keren

Dev Interrupted • 56 implied HN points • 03 Feb 26

🕹 Technology Machine Learning

AI has erased the blank-page problem and speeds up code generation, but those upstream gains are being lost to chaotic code reviews, testing, and integration unless teams build proper infrastructure.
Agentic tools that can control your local machine (like OpenClaw/Moltbot) show huge power but create major security and governance risks, so most organizations won’t give them autonomous control yet.
The economics of software are shifting: survival favors substrate-efficient tools and firms with unique data or "insight compression," and the current "dark flow" of vibe coding can make teams feel faster while actually introducing hidden bugs, so risk-aware pipelines and better testing are essential.

Recursive self-improvement

Metacritic Capital • 6 implied HN points • 10 Mar 26

🕹 Technology Machine Learning

AI training and inference costs are falling rapidly, with practical community optimizations already cutting costs by orders of magnitude.
Cheaper models let you run far more reasoning tokens, and that extra compute predictably improves performance; reinforcement learning with verifiable rewards can crystallize those gains.
Falling costs combined with inference-time scaling and agent swarms create a feedback loop that can drive recursive self-improvement, so investors should expect faster capability growth and significant economic and safety implications.

The Sequence Opinion #798: Inside the Most Important Paper in Agentic Reasoning: Why the Loop is the Logic

TheSequence • 84 implied HN points • 29 Jan 26

🕹 Technology Machine Learning

Reasoning comes from the interaction loop with the environment, not just from the model itself.
Current LLMs act like fast, shallow 'System 1' pattern matchers, so they need agentic feedback loops to produce real-world reasoning and agency.
The next frontier is designing the agentic loop and environment (the "new hidden layer") rather than only scaling model parameters.

Machine Learning From Zero, Chapter 01

The Palindrome • 4 implied HN points • 14 Mar 26

🕹 Technology Machine Learning

Machine learning means training predictive models from data. The core setup uses a dataset, a parametric model (a hypothesis), and a loss function to measure how well the model fits the data.
A model approximates the true input–output relation and depends on both its parameters and the training data (often written h(x; w, D)). Models can be deterministic or probabilistic and belong to different families like generative or discriminative.
Which learning paradigm you use depends on what inputs, outputs, and labels are available — the main paradigms are supervised, unsupervised, semi‑supervised, and reinforcement learning. In supervised learning you have input–label pairs and the goal is to learn the mapping from x to y.
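The dataset, parametric model, and loss function named above can be made concrete with a tiny supervised example. This is a sketch under stated assumptions: the linear model, the mean squared error loss, the toy data, and the gradient descent loop are illustrative choices added here, not taken from the chapter.

```python
def model(x, w):
    # Parametric hypothesis h(x; w): a line with slope and intercept.
    slope, intercept = w
    return slope * x + intercept

def mse_loss(w, dataset):
    # Mean squared error: how well the model with parameters w fits the data.
    return sum((model(x, w) - y) ** 2 for x, y in dataset) / len(dataset)

def fit(dataset, learning_rate=0.05, steps=2000):
    # Plain gradient descent on the MSE loss (an assumed training method).
    slope, intercept = 0.0, 0.0
    n = len(dataset)
    for _ in range(steps):
        errors = [(model(x, (slope, intercept)) - y, x) for x, y in dataset]
        grad_slope = sum(2 * err * x for err, x in errors) / n
        grad_intercept = sum(2 * err for err, _ in errors) / n
        slope -= learning_rate * grad_slope
        intercept -= learning_rate * grad_intercept
    return slope, intercept

# Supervised setup: input-label pairs generated by y = 2x + 1.
dataset = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]
w = fit(dataset)
```

The learned parameters depend on both the model family and the training data, which is the dependence the notation h(x; w, D) is meant to capture.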

Getting 50% (SoTA) on ARC-AGI with GPT-4o

Redwood Research blog • 285 HN points • 17 Jun 24

🕹 Technology Machine Learning

Achieving 50% accuracy on the ARC-AGI dataset using GPT-4o involved generating a large number of Python programs and selecting the correct ones based on examples.
Key approaches included meticulous step-by-step reasoning prompts, revision of program implementations, and feature engineering for better grid representations.
Performance could likely improve further by increasing runtime compute, which follows clear scaling laws, and by fine-tuning GPT models to better understand grid representations.
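The generate-then-select idea above can be sketched as a filter over candidate programs. In this toy version the candidates are a few hand-written grid transforms standing in for model-generated programs; all function names and the example grids are hypothetical, and the original pipeline is considerably more involved.

```python
def flip_horizontal(grid):
    return [row[::-1] for row in grid]

def flip_vertical(grid):
    return grid[::-1]

def transpose(grid):
    return [list(column) for column in zip(*grid)]

def select_programs(candidates, examples):
    # Keep only the candidates that reproduce every example output exactly.
    passing = []
    for program in candidates:
        try:
            if all(program(inp) == out for inp, out in examples):
                passing.append(program)
        except Exception:
            # A generated program may crash; treat that as a failed candidate.
            continue
    return passing

# Stand-in for a large pool of sampled programs.
candidates = [flip_vertical, transpose, flip_horizontal]

# Training examples for one task: (input grid, expected output grid).
examples = [
    ([[1, 2], [3, 4]], [[2, 1], [4, 3]]),
    ([[5, 6, 7]], [[7, 6, 5]]),
]

passing = select_programs(candidates, examples)
# Apply a surviving program to the held-out test input.
prediction = passing[0]([[0, 1, 2], [3, 4, 5]]) if passing else None
```

Sampling more candidates raises the chance that at least one passes every example, which is one way to read the post's point about spending more runtime compute.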

2024: Silicon Valley Tries to "Open-Source" AGI

AI Supremacy • 1257 implied HN points • 20 Jan 24

🕹 Technology Machine Learning

Silicon Valley aims to open-source AGI to benefit everyone.
Facebook and other companies are working on advancing AI technology.
There is a shift towards democratizing general intelligence through various AI devices like AR glasses.

The Sequence AI of the Week #797: The New Companies that can Change the Inference Landscape

TheSequence • 84 implied HN points • 28 Jan 26

🕹 Technology Machine Learning

Two new commercial companies from the vLLM and SGLang teams—Inferact and RadixArk—raised huge funding and are positioning themselves as major players in the inference stack.
The focus is shifting from building bigger models to improving inference unit economics, so the software that manages memory, scheduling, and kernels is now the main battleground.
Serving models efficiently is bottlenecked by scarce VRAM and the KV cache tax, because asynchronous and unpredictable inference patterns drive up cost and complexity.

The Sequence Opinion #806: The Emergence of the Agent-as-a-Judge: Why Evals Need a Reasoning Engine

TheSequence • 49 implied HN points • 12 Feb 26

🕹 Technology Machine Learning

Evaluation moved from informal "vibe checks" to using stronger LLMs to automatically grade weaker models' outputs.
That single-pass LLM-as-judge approach powered benchmarks like MT-Bench and Chatbot Arena, but simple intuitive judgments are becoming insufficient.
The field is shifting to agent-as-a-judge, where evaluations need multi-step reasoning engines and dynamic, agentic judging instead of static benchmarks.

4 Trillion Events Daily at LinkedIn

VuTrinh. • 319 implied HN points • 08 Jun 24

🕹 Technology Machine Learning

LinkedIn processes around 4 trillion events every day, using Apache Beam to unify their streaming and batch data processing. This helps them run pipelines more efficiently and save development time.
By switching to Apache Beam, LinkedIn significantly improved their performance metrics. For example, one pipeline's processing time went from over 7 hours to just 25 minutes.
Their anti-abuse systems became much faster with Beam, reducing the time taken to identify abusive actions from a day to just 5 minutes. This increase in efficiency greatly enhances user safety and experience.

Domain Transformations: The Art of Finding Easier Spaces

Software Bits Newsletter • 103 implied HN points • 05 Jan 26

🕹 Technology Machine Learning

Transform hard problems into easier ones by moving to a different domain, doing the simpler computation there, and (if needed) transforming the result back; this is worth it when the transform cost plus the easier computation is less than solving the original problem.
Use well-known transforms to fix numerical and computational issues: log-space turns tiny-product underflow into stable sums (use the log-sum-exp trick to add probabilities safely), Fourier turns convolution into cheap pointwise multiplication, and embeddings or kernels lift data so linear methods work.
Always check that a transform preserves what you need and that the round-trip cost is justified; the best algorithms exploit problem structure by finding the space where the computation becomes simple.
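The log-space transform mentioned above is easy to demonstrate. The sketch below uses only Python's math module; the function names are illustrative assumptions, not code from the newsletter.

```python
import math

def log_product(probabilities):
    # A product of tiny probabilities becomes a stable sum of logs.
    return sum(math.log(p) for p in probabilities)

def log_sum_exp(log_values):
    # Adds probabilities that are stored as logs, without leaving log space.
    # Factoring out the largest term keeps every exponent at or below zero.
    peak = max(log_values)
    if peak == float("-inf"):
        return peak  # every probability is zero
    return peak + math.log(sum(math.exp(v - peak) for v in log_values))

# The direct product underflows to exactly 0.0 in double precision.
naive = 1e-200 * 1e-200

# The same quantity in log space stays finite and usable.
stable = log_product([1e-200, 1e-200])

# Adding two probabilities of exp(-1000) each: exp(-1000) alone is 0.0,
# but the log of their sum is still representable.
total = log_sum_exp([-1000.0, -1000.0])
```

The transform pays off here because the forward map (taking logs) is cheap and the result is usually consumed in log space, so no round trip back to raw probabilities is needed.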

AI Agents: Exploring Agentic Applications

Cobus Greyling on LLMs, NLU, NLP, chatbots & voicebots • 119 implied HN points • 29 Jul 24

🕹 Technology Machine Learning

Agentic applications are AI systems that can perform tasks and make decisions on their own, using advanced models. They can adapt their actions based on user input and the environment.
OpenAgents is a platform designed to help regular users interact with AI agents easily. It includes different types of agents for data analysis, web browsing, and integrating daily tools.
For these AI agents to work well, they need to be user-friendly, quick, and handle mistakes gracefully. This is important to ensure that everyone can use them, not just tech experts.


Graphs For Science • 105 implied HN points • 10 Jan 26
