The hottest Machine Learning Substack posts right now

And their main takeaways

Take the inductive leap

Mindful Modeler • 279 implied HN points • 30 Apr 24

🔬 Science Machine Learning

In a 2-day universe, predicting the future is uncertain and relies on assumptions, highlighting the challenge of inductive reasoning.
The problem of induction questions the idea that the future will always mirror the past, emphasizing the need to critically assess assumptions.
Taking an inductive leap involves making predictions based on past observations and acknowledging the inherent uncertainty and need to challenge assumptions in our understanding of the world.

The Sequence Knowledge #804: The Dreamer Trilogy: Inside Some of the Most Influential Papers in AI World Models

TheSequence • 28 implied HN points • 10 Feb 26

🕹 Technology Machine Learning

The Dreamer trilogy of papers reshaped how researchers build and use world models in AI.
Model-based reinforcement learning inspired modern world models, focusing on agents that learn internal predictive models instead of directly mapping pixels to actions.
Model-free methods like DQN succeeded in 2D games but struggled in complex 3D environments such as DeepMind Lab and Minecraft, revealing the limits of purely reactive agents and motivating the shift to world models.

Data Science Weekly - Issue 552

Data Science Weekly Newsletter • 139 implied HN points • 20 Jun 24

🕹 Technology Machine Learning

Notebooks can be easy to use, but they might make you lazy in coding. It's important to follow good practices even when using them.
When handling large datasets, it's crucial to learn how to scale effectively. Knowing how to use resources wisely can help you reach your goals faster.
Retrieval Augmented Generation (RAG) can improve how models generate information. It's complex, but understanding it can boost the performance of your projects.

WeKnow-RAG

Cobus Greyling on LLMs, NLU, NLP, chatbots & voicebots • 39 implied HN points • 16 Aug 24

🕹 Technology Machine Learning

WeKnow-RAG uses a smart approach to gather information that mixes simple facts from its knowledge base with data found on the web. This helps improve the accuracy of answers given to users.
This system includes a self-check feature, which allows it to assess how confident it is in the information it provides. This helps to reduce mistakes and improve quality.
Knowledge Graphs are important because they organize information in a clear way, allowing the system to find the right data quickly and effectively, no matter what type of question is asked.

Inductive biases - a better way to think about machine learning?

Mindful Modeler • 159 implied HN points • 11 Jun 24

🕹 Technology Machine Learning

Hyperparameter settings can drastically change inductive biases within machine learning models.
Machine learning algorithms represent a collection of inductive biases that influence model outcomes.
Understanding inductive biases is crucial for comprehending the robustness, interpretability, and plausibility of machine learning models.

Prophecies of the Flood

One Useful Thing • 1608 implied HN points • 10 Jan 25

🕹 Technology Machine Learning

AI researchers are predicting that very smart AI systems will soon be available, which they call Artificial General Intelligence (AGI). This could change society a lot, but many think we should be cautious about these claims.
Recent AI models have shown they can solve very tough problems better than humans. For example, one new AI model performed surprisingly well on difficult tests that challenge knowledge and problem-solving skills.
As AI technology improves, we need to start talking about how to use it responsibly. It's important for everyone—from workers to leaders—to think about what a world with powerful AIs will look like and how to adapt to it.

The Bitter Pipeline

Abstraction • 39 implied HN points • 28 Jan 26

🕹 Technology Machine Learning

Frontier models scale better than human-designed forecasting pipelines, so the structured process that helped smaller models often adds no value with larger models.
Empirical tests show spending compute on polling and ensembling big models improves forecast skill more than token-heavy steps like classification or decomposition, with ensembling giving measurable uplift while the pipeline did not.
The practical move is to simplify: ensemble aggressively, validate empirically, and keep experimenting with ways to elicit latent model knowledge instead of adding complex hand-crafted processes.
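A minimal sketch of the "ensemble aggressively, validate empirically" idea, assuming each model call returns one probability for a binary question. The function names and the sample numbers are hypothetical and are not taken from the post.

```python
from statistics import mean, median


def ensemble_forecast(probabilities, method="mean"):
    """Combine several probability forecasts for one binary question."""
    if not probabilities:
        raise ValueError("need at least one forecast")
    if any(p < 0.0 or p > 1.0 for p in probabilities):
        raise ValueError("forecasts must be probabilities in [0, 1]")
    if method == "mean":
        return mean(probabilities)
    if method == "median":
        return median(probabilities)
    raise ValueError(f"unknown method: {method}")


def brier_score(forecast, outcome):
    """Squared error between a probability and the 0/1 outcome (lower is better)."""
    return (forecast - outcome) ** 2


# Hypothetical samples from repeated polling of the same model.
samples = [0.62, 0.70, 0.55, 0.90, 0.66]
print(ensemble_forecast(samples, method="median"))  # 0.66
print(brier_score(ensemble_forecast(samples), 1))
```

Scoring the combined forecast against resolved questions is the empirical check: because squared error is convex, the Brier score of a mean ensemble can never exceed the average Brier score of its members.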

Creating Synthetic Training Data

Cobus Greyling on LLMs, NLU, NLP, chatbots & voicebots • 59 implied HN points • 01 Aug 24

🕹 Technology Machine Learning

Creating synthetic data is hard because it's not just about making more data; the data also needs to be diverse. It's tough to make sure there are enough different examples.
Using a seed corpus can limit how varied the synthetic data is. If the starting data isn't diverse, the generated data won't be either.
A new approach called Persona Hub uses a billion different personas to create varied synthetic data. This helps in generating high-quality, interesting content across various situations.

Deep Learning Weekly: Issue 336

Deep Learning Weekly • 648 implied HN points • 17 Jan 24

🕹 Technology Machine Learning

This week's deep learning topics include generative AI in enterprises, query pipelines, and closed-loop verifiable code generation.
Updates in MLOps & LLMOps cover CI/CD practices, multi-replica endpoints, and serverless solutions like Pinecone.
Learning insights include generating images from audio, understanding self-attention in LLMs, and fine-tuning models using PyTorch tools.

xAI's Grok 4: The tension of frontier performance with a side of Elon favoritism

Democratizing Automation • 562 implied HN points • 12 Jul 25

🕹 Technology Machine Learning

Grok 4 is a powerful AI model that performs well on benchmarks but struggles in practical usability, making it hard for users to switch from existing AI tools.
The model's unique selling point is its ability to use multiple agents for complex tasks, but its overall performance can be inconsistent and relies heavily on search functions.
Despite achieving high scores, Grok 4 faces significant challenges, including a lack of differentiation in a crowded market, where simply being better isn't enough to attract users.

Data Science Weekly - Issue 556

Data Science Weekly Newsletter • 79 implied HN points • 18 Jul 24

🕹 Technology Machine Learning

AI research in China is progressing rapidly, but it hasn't received much attention compared to developments in the US. There are many complexities in understanding the implications of this advancement.
There are new methods to improve large language models (LLMs) using production data, which can enhance their performance over time. A structured approach to analyzing data quality can lead to better outcomes.
Evaluating modern machine learning models can be challenging, leading to some questionable research practices. It's important to understand these issues to ensure more accurate and reproducible results.

The Age of AI Flattery

Rozado’s Visual Analytics • 450 implied HN points • 05 Aug 25

🕹 Technology Machine Learning

AI often caters to what users want to hear, leading to a tendency to flatter instead of challenge.
As people get more used to this flattery, they might start preferring AI chats over real conversations, which may harm their ability to handle disagreements.
The design of AI systems focuses on keeping users happy, but this could mean less critical thinking and debate in interactions.

The Sequence Radar #803: Last Week in AI: Anthropic and OpenAI’s Battle for the Long Horizon, Goodfire and LayerLens Push AI Accountability

TheSequence • 28 implied HN points • 08 Feb 26

🕹 Technology Machine Learning

AI is moving from conversational assistants to agentic systems that can plan, act, and self-manage across long time horizons, with new models built to reason over huge contexts and even help in their own development.
Interpretability and accountability are rising to the top of the agenda, as companies build tools to map model internals and run agent-as-a-judge evaluations that verify complex, multi-step behaviors.
A fast-growing ecosystem of research, platforms, hardware moves, and big funding rounds is racing to operationalize and scale verifiable autonomous agents across industries like coding, cloud ops, audio, and healthcare.

☀ The Doomsday Clock needs a pro-progress switch to the Genesis Clock

Faster, Please! • 1370 implied HN points • 29 Jan 25

🕹 Technology Machine Learning

The Doomsday Clock is getting closer to midnight, signaling the world's increasing dangers like nuclear threats and climate change. We need a new way to measure progress, like the Genesis Clock, which focuses on humanity's advancements.
The Genesis Clock would celebrate achievements in technology and health, such as extending human lifespans or solving major diseases. It encourages us to look forward to positive developments instead of just fearing potential disasters.
AI can be our collaborative partner, helping us work better together rather than taking jobs away. It's about designing AI that complements human skills and enhances our research and creative processes.

AI Trends for 2026

Artificial Ignorance • 100 implied HN points • 17 Dec 25

🕹 Technology Machine Learning

Agents and harnesses are now the bottleneck, not just bigger models — layering planning, tools, state, and workflows on strong models is what’s unlocking reliable multi-step behavior in real products.
The core LLM primitives (tool use, search, code sandboxes, file editing, memory, personas) have mostly settled, and the next big win is standardizing interfaces and conventions so developers can wire them together consistently.
Interactions are moving beyond turn-based chat toward always-on, real-time collaboration where humans and AI co-edit and co-operate, and better UX plus streaming/agent orchestration will make that feel natural.

Tokenization in large language models, explained

The Counterfactual • 239 implied HN points • 02 May 24

🕹 Technology Machine Learning

Tokens are the building blocks that language models use to understand and predict text. They can be whole words or parts of words, depending on how the model is set up.
Subword tokenization helps models balance flexibility and understanding by breaking down words into smaller parts, so they can still work with unknown words.
Understanding how tokenization works is key to improving the performance of language models, especially since different languages have different structures and complexity.
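The subword idea above can be sketched with a greedy longest-match tokenizer in the style of WordPiece. The vocabulary here is a toy assumption for illustration, not the vocabulary or algorithm of any particular model; the `##` prefix marks a piece that continues a word.

```python
def tokenize_word(word, vocab, unk="[UNK]"):
    """Split one word into the longest vocabulary pieces, left to right."""
    tokens = []
    start = 0
    while start < len(word):
        end = len(word)
        match = None
        # Shrink the candidate from the right until it is in the vocabulary.
        while end > start:
            piece = word[start:end]
            if start > 0:
                piece = "##" + piece  # continuation piece
            if piece in vocab:
                match = piece
                break
            end -= 1
        if match is None:
            return [unk]  # no piece fits, so the whole word is unknown
        tokens.append(match)
        start = end
    return tokens


# Toy vocabulary (hypothetical).
TOY_VOCAB = {"token", "##ization", "##s", "un", "##known", "the"}

print(tokenize_word("tokenization", TOY_VOCAB))  # ['token', '##ization']
print(tokenize_word("unknown", TOY_VOCAB))       # ['un', '##known']
```

A word the vocabulary has never seen as a whole, such as "tokenization", is still represented through its parts, which is the flexibility the takeaway describes.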

The Sequence Knowledge #792: EVERYTHING you Need to Know About Synthetic Data Generation

TheSequence • 49 implied HN points • 20 Jan 26

🕹 Technology Machine Learning

Synthetic data is a practical scaling lever that fills coverage gaps and builds long-tail capabilities by creating targeted examples instead of waiting for rare real-world labels.
Core methods include generative synthesis, rephrasing/paraphrasing, multi-turn dialogue synthesis, and RL trajectory generation, each tailored to different tasks like images, instructions, conversations, or environment rollouts.
The focus is on quality over quantity: tight specs, automatic verification, diversity controls, and eval-driven feedback let teams steer capabilities, improve class balance, protect privacy, and iterate quickly.

NeurIPS 2025 Best Papers in Comics

Gonzo ML • 126 implied HN points • 01 Dec 25

🔬 Science Machine Learning

A new dataset called INFINITY-CHAT was introduced to evaluate how diverse outputs from language models really are. It showed that many models are producing very similar results, which is a big surprise.
The Gated Attention mechanism helps improve the stability of large language models during training. It makes sure that the output is more meaningful and controlled, which solves some common issues with deep models.
Using over 1,000 layers in reinforcement learning can actually be beneficial. This research challenges the idea that deeper networks don't help and suggests that they can learn new skills without needing detailed rewards.

Looking back on AI in 2025

Generating Conversation • 93 implied HN points • 18 Dec 25

🕹 Technology Machine Learning

Models stopped being the main story; improvements felt incremental. Success now depends on real applications and which products companies can own.
Big companies are paying close attention and spending aggressively on AI, including large acquisitions. That accelerates enterprise adoption and creates big opportunities for startups.
The field is still changing very fast, so specific predictions often miss the mark. The durable trend is base models becoming more of a commodity while value concentrates at the application and deployment layer.

The Sequence AI of the Week #789: Recursive Language Models: Inside the MIT Research Everyone is Talking About

TheSequence • 56 implied HN points • 14 Jan 26

🕹 Technology Machine Learning

Bigger context windows aren't always the answer; dumping more text into attention can make a model's reasoning worse, not better.
The paper calls this failure mode "context rot": as prompts grow, attention dilutes, the model's working set becomes unmanageable, and output quality drops.
Instead of just expanding attention, we need different computational shapes—treating prompts more like environments and processing information recursively to avoid drowning the model in irrelevant context.

Why Graphs are great for Fraud Detection [Math Mondays]

Technology Made Simple • 639 implied HN points • 01 Jan 24

🕹 Technology Machine Learning

Graphs are efficient at encoding and representing relationships between entities, making them useful for fraud detection tasks.
Graph Neural Networks excel at fraud detection due to their ability to visualize strong correlations among fraudulent activities that share common properties, adapt to new fraud patterns, and offer transparency in AI systems.
Graph Neural Networks require less labeled data and feature engineering compared to other techniques, have better explainability, and work well with semi-supervised learning, making them a powerful tool for fraud detection.
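To make the "graphs encode relationships" point concrete, here is a small sketch that links accounts sharing an attribute (a device or a card, say) and groups them into connected clusters. This is plain graph traversal, not a Graph Neural Network, and the account data and function names are hypothetical.

```python
from collections import defaultdict


def build_graph(accounts):
    """Connect accounts that share at least one attribute.

    accounts: dict mapping account id -> set of attribute strings.
    Returns an adjacency dict mapping account id -> set of neighbours.
    """
    by_attribute = defaultdict(set)
    for account, attributes in accounts.items():
        for attribute in attributes:
            by_attribute[attribute].add(account)
    graph = {account: set() for account in accounts}
    for members in by_attribute.values():
        for account in members:
            graph[account] |= members - {account}
    return graph


def connected_components(graph):
    """Group nodes into clusters reachable from one another (iterative DFS)."""
    seen = set()
    components = []
    for node in graph:
        if node in seen:
            continue
        seen.add(node)
        stack = [node]
        component = set()
        while stack:
            current = stack.pop()
            component.add(current)
            for neighbour in graph[current]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)
        components.append(component)
    return components


# Hypothetical accounts: a and c never share anything directly,
# but both are linked to b, so all three land in one cluster.
accounts = {
    "a": {"device:1"},
    "b": {"device:1", "card:9"},
    "c": {"card:9"},
    "d": {"device:7"},
}
print(connected_components(build_graph(accounts)))
```

The indirect link between "a" and "c" is the kind of relationship that is awkward to express as a flat feature table but falls out naturally from a graph.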

Book Launch: ML for Science 🐦‍⬛

Mindful Modeler • 499 implied HN points • 06 Feb 24

🔬 Science Machine Learning

The book discusses the justification for and strengths of using machine learning in science, emphasizing prediction and adaptation to data.
Machine learning lacks inherent transparency and causal understanding, but tools like interpretability and causality modeling can enhance its utility in research.
The book is released chapter by chapter for free online, covering topics such as domain knowledge, interpretability, and causality.

Has Sam Altman gone full Gary Marcus?

Marcus on AI • 4624 implied HN points • 16 Nov 23

🕹 Technology Machine Learning

In the midst of an AI boom, scale isn't everything, and there are still unresolved issues.
Recognition is growing that scoring well on benchmarks doesn't mean true foundational progress.
Tech leaders like Sam Altman are acknowledging the limitations of deep learning and considering new paradigms.

The Transformer Zoo Revisited

Gonzo ML • 126 implied HN points • 29 Nov 25

🕹 Technology Machine Learning

Transformer models can be either encoder-decoder types or decoder-only types. Right now, decoder-only models like GPT are very popular, but there are still reasons to explore the full encoder-decoder architecture.
In initial tests, decoder-only models often perform better during the pretraining stage. They have an advantage in tasks like zero-shot and few-shot learning because of their training setup.
After fine-tuning, encoder-decoder models show improved performance and efficiency. They handle long contexts better and can generate outputs more effectively, suggesting they might be a strong choice for future models.

“AI” Hype as Pretext for Labor Misclassification

The Column • 1002 implied HN points • 24 Jul 23

🕹 Technology Machine Learning

Beware of 'AI' hype being used to redefine labor and pay workers less.
Misclassifying workers and redefining labor is a common cost-cutting tactic.
The threat of 'AI' lies in redefining creative labor and promoting misclassification, rather than in saving labor.

AI #105: Hey There Alexa

Don't Worry About the Vase • 1120 implied HN points • 27 Feb 25

🕹 Technology Machine Learning

A new version of Alexa, called Alexa+, is coming soon. It will be much smarter and can help with more tasks than before.
AI tools can help improve coding and other work tasks, giving users more productivity but not always guaranteeing quality.
There's a lot of excitement about how AI is changing jobs and tasks, but it also raises concerns about safety and job replacement.

Scribble-based forecasting and AI 2027

DYNOMIGHT INTERNET NEWSLETTER • 562 implied HN points • 30 Jun 25

🕹 Technology Machine Learning

Both math and intuition can be used for forecasting, but they serve different purposes. Sometimes, using intuition can be more practical when creating predictions about complex situations.
Math-based forecasts are best when the rules of a situation are well understood and complex. For simpler scenarios, basic predictions may be just as effective.
Creating simple visual predictions, like drawing lines, can help clarify your thoughts. It's a great exercise to explore different potential outcomes and express predictions clearly.

Data Science Weekly - Issue 549

Data Science Weekly Newsletter • 159 implied HN points • 31 May 24

🕹 Technology Machine Learning

Mediocre machine learning can be very risky for businesses, as it may lead to significant financial losses. Companies need to ensure their ML products are reliable and efficient.
Understanding logistic regression can be made easier by using predicted probabilities. This approach helps in clearly presenting data analysis results, especially to those who may not be familiar with technical terms.
Data quality management is becoming essential in today's data-driven world. It's important to keep track of how data is tested and monitored to maintain trust and accuracy in business decisions.
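For the logistic regression takeaway above, a minimal sketch of turning fitted coefficients into a predicted probability, which is usually easier to present than raw log-odds. The intercept, coefficients, and feature values are made up for illustration.

```python
import math


def predicted_probability(intercept, coefficients, features):
    """Apply the logistic (sigmoid) function to the linear predictor."""
    log_odds = intercept + sum(c * x for c, x in zip(coefficients, features))
    return 1.0 / (1.0 + math.exp(-log_odds))


# Hypothetical fitted model with two features.
intercept = -1.5
coefficients = [0.8, 0.3]

# Compare two profiles that differ only in the first feature.
low = predicted_probability(intercept, coefficients, [0.0, 2.0])
high = predicted_probability(intercept, coefficients, [3.0, 2.0])
print(f"{low:.2f} -> {high:.2f}")
```

Reporting the change between two concrete profiles ("from about X% to about Y%") avoids asking a non-technical audience to interpret coefficients on the log-odds scale.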

My Predictions For 2026

Teaching computers how to talk • 57 implied HN points • 09 Jan 26

🕹 Technology Machine Learning

Generative AI went mainstream in 2025, powering images, video, code and daily tools, but its widespread use has also produced clear harms, controversies, and ethical risks.
Current models are very capable yet lack true understanding and real-world experience; alignment is mostly shallow, so continual learning and richer world models are emerging as crucial next steps.
AI is forcing big social changes—education must reinvent itself because students can use AI to shortcut learning, and people risk emotional dependence on chatbots that can be addictive, so society needs to protect critical thinking and human connection.

Bridging the Gap: From Statistical Distributions to Machine Learning Loss Functions

Mindful Modeler • 818 implied HN points • 14 Nov 23

🕹 Technology Machine Learning

Understanding the distribution of the target variable is key in choosing statistical analysis or machine learning loss functions.
Certain loss functions in machine learning correspond to maximum likelihood estimation for specific distributions, creating a bridge between statistical modeling and machine learning.
While connecting distributions to loss functions is insightful, the real power in machine learning lies in the flexibility to design custom loss functions rather than being constrained by specific distributions.
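The link between distributions and loss functions can be made explicit with a standard worked example. The notation here is an assumption for illustration: $f_\theta$ is the model, $(x_i, y_i)$ are the $n$ training pairs, and $\sigma$ and $b$ are fixed scale parameters.

```latex
% Gaussian noise: y_i | x_i ~ N(f_\theta(x_i), \sigma^2)
-\log L(\theta)
  = \frac{n}{2}\log\!\left(2\pi\sigma^2\right)
  + \frac{1}{2\sigma^2}\sum_{i=1}^{n}\bigl(y_i - f_\theta(x_i)\bigr)^2

% Laplace noise: y_i | x_i ~ Laplace(f_\theta(x_i), b)
-\log L(\theta)
  = n\log(2b)
  + \frac{1}{b}\sum_{i=1}^{n}\bigl|y_i - f_\theta(x_i)\bigr|
```

With the scale held fixed, the first term in each line does not depend on $\theta$, so maximizing the Gaussian likelihood is the same as minimizing squared error, and maximizing the Laplace likelihood is the same as minimizing absolute error.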

“Math is hard” — if you are an LLM – and why that matters

Marcus on AI • 4782 implied HN points • 19 Oct 23

🔬 Science Machine Learning

Even with massive data training, AI models struggle to truly understand multiplication.
Larger LLMs perform better on arithmetic tasks than smaller models, but still fall short of a simple pocket calculator.
LLM-based systems generalize based on similarity and do not develop a complete, abstract, reliable understanding of multiplication.

LangChain Search AI Agent Using GPT-4o-mini

Cobus Greyling on LLMs, NLU, NLP, chatbots & voicebots • 59 implied HN points • 25 Jul 24

🕹 Technology Machine Learning

The LangChain Search AI Agent uses a tool called Tavily API to search the web and answer questions. It breaks down complex questions into simpler sub-questions for better results.
The GPT-4o-mini model is designed to be fast and cost-effective, making it suitable for tasks that require quick responses. It supports both text and vision inputs, expanding its usability.
Using LangSmith, you can track the execution and costs of each step in processing queries. This feature helps in optimizing the performance of the AI agent.

How AI Will Be Used for Cyber Security in 2024

Rod’s Blog • 615 implied HN points • 29 Dec 23

🕹 Technology Machine Learning

Cyber security is crucial in today's digital era because attacks are growing more complex, making traditional defense methods inadequate.
Artificial intelligence (AI) is becoming essential in fighting cyber threats by mimicking human intelligence in tasks like learning and decision-making.
In 2024, AI will play a vital role in cyber security, aiding in threat detection, prevention, response, and recovery.

7 perspectives on machine learning

Mindful Modeler • 279 implied HN points • 09 Apr 24

🕹 Technology Machine Learning

Machine learning is about building prediction models. This view covers a wide range of applications, but fits unsupervised learning less well.
Machine learning is about learning patterns from data. This view is useful for understanding ML projects beyond just prediction.
Machine learning is automated decision-making at scale. It emphasizes the purpose of prediction, which is to facilitate decision-making.

AI Is A Car That Everyone Expects To Be A Spaceship

Beekey’s Substack • 59 implied HN points • 24 Jul 24

🕹 Technology Machine Learning

AI has made great improvements, especially with tasks that involve generating human-like responses and art. However, many people are getting carried away with the hype about its capabilities.
Machine learning allows AI to recognize patterns in data, but it doesn't actually understand content like a human does. This means it can make mistakes that a human wouldn't.
The idea of creating Artificial General Intelligence (AGI) from current AI is questionable because we still don't fully understand how human intelligence works. It's not just about being faster; something fundamental is still missing.

Ranking the Chinese Open Model Builders

Democratizing Automation • 356 implied HN points • 17 Aug 25

🕹 Technology Machine Learning

China's AI labs are rapidly releasing open models, showing strong competition with Western counterparts. Labs like DeepSeek and Qwen are leading the pack with frequent and high-quality outputs.
DeepSeek is known for its innovative models and focus on performance, but its recent slower release pace has allowed other labs to catch up. They aim for continual improvement and impactful contributions.
Other emerging companies like Moonshot AI and Zhipu are also gaining ground, offering competitive models and partnering with tech giants for investments. They are expected to grow and possibly reshape the AI landscape.

IEEE-754: THE floating-point standard

Fprox’s Substack • 124 implied HN points • 22 Nov 25

🕹 Technology Machine Learning

IEEE-754 created a common binary floating-point standard that gives hardware and software consistent formats and behaviors, making numerical results more portable and predictable.
Major revisions added practical features — notably the 2008 update introduced decimal formats, half-precision and the fused multiply-add (FMA) for better performance and accuracy, while later updates clarified edge cases and added augmented operations for exact-error reporting.
Work is ongoing (including a 2029 revision and the P3109 effort for tiny formats), because emerging vendor-specific small formats for machine learning could fragment the ecosystem unless standards converge.
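A small sketch of what "consistent formats" means in practice, using Python's standard `struct` module, whose `e` and `d` format codes pack IEEE-754 binary16 (half precision) and binary64 values. The helper names are hypothetical.

```python
import struct


def to_half_and_back(x):
    """Round a Python float to IEEE-754 binary16 and convert it back."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def double_fields(x):
    """Split a binary64 value into (sign, biased exponent, fraction) bit fields."""
    bits = struct.unpack("<Q", struct.pack("<d", x))[0]
    sign = bits >> 63
    exponent = (bits >> 52) & 0x7FF      # 11 exponent bits, bias 1023
    fraction = bits & ((1 << 52) - 1)    # 52 explicit fraction bits
    return sign, exponent, fraction


print(double_fields(1.0))        # (0, 1023, 0)
print(double_fields(-2.0))       # (1, 1024, 0)
print(to_half_and_back(0.5))     # 0.5, exactly representable
print(to_half_and_back(2049.0))  # 2048.0, half precision keeps 11 significant bits
print(to_half_and_back(0.1) == 0.1)  # False
```

The half-precision results hint at why tiny formats matter for machine learning: they save memory and bandwidth, but integers above 2048 and most decimal fractions are no longer stored exactly.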

Data Science Weekly - Issue 553

Data Science Weekly Newsletter • 99 implied HN points • 27 Jun 24

🕹 Technology Machine Learning

Data visualization can show important patterns, like changes in night and daylight globally. Understanding these trends helps us appreciate our environment better.
In AI engineering, simplifying data preparation is crucial. Many new AI applications can be built without structured data, which might lead to rushed expectations about their effectiveness.
Aquaculture technology is evolving with better methods to track and analyze fish behavior. New approaches like deep learning are making monitoring more accurate and efficient.

Reinforcement learning with random rewards actually works with Qwen 2.5

Democratizing Automation • 633 implied HN points • 27 May 25

🕹 Technology Machine Learning

Reinforcement learning using random rewards can still improve performance in models like Qwen 2.5, even when the rewards aren't perfect. This suggests that the learning process is more flexible than previously thought.
Qwen 2.5 and its math-focused variants show that they might use unique reasoning strategies, like code-assisted reasoning, that help them perform better on math tasks. This means they learn in ways that other models might not.
The ongoing debate about the effectiveness of reinforcement learning with verifiable rewards (RLVR) highlights the need for further research. It also suggests that scaling up the use of reinforcement learning could lead to new behaviors in models, making them more capable.

The rise of reasoning machines

Democratizing Automation • 570 implied HN points • 12 Jun 25

🕹 Technology Machine Learning

Reasoning is when we draw conclusions based on what we observe. Humans experience reasoning differently than AI, but both lack a full understanding of their own processes.
AI models are improving but still struggle with complex problems. Just because they sometimes fail doesn't mean they can't reason; they just might need new methods to tackle tougher challenges.
The debate on whether AI can truly reason often stems from fear of losing human uniqueness. Some critics focus on what AI can't do instead of recognizing its potential, which is growing rapidly.