The hottest Machine Learning Substack posts right now

And their main takeaways

Notes: LLMs don't know what they are talking about

aspiring.dev • 2 HN points • 15 Sep 24

🕹 Technology Machine Learning

LLMs can be tricked into creating harmful content even when they are programmed not to. They don't really understand the context of what they generate.
The way LLMs handle safety is based on prompts, not the content they produce. If the prompt can be manipulated, the output can be too.
There are suggestions for improving LLM safety, like analyzing outputs during and after generation, rather than only checking the initial request.

Tülu 3: The next era in open post-training

Democratizing Automation • 404 implied HN points • 21 Nov 24

🕹 Technology Machine Learning

Tülu 3 introduces an open-source approach to post-training models, allowing anyone to improve large language models like Llama 3.1 and reach performance similar to advanced models like GPT-4.
Recent advances in preference tuning and reinforcement learning help achieve better results with well-structured techniques and new synthetic datasets, making open post-training more effective.
The development of these models is pushing the boundaries of what can be done in language model training, indicating a shift in focus towards more innovative training methods.

Perceptrons, XOR, and the first "AI winter"

The Counterfactual • 139 implied HN points • 17 Jan 24

🕹 Technology Machine Learning

AI systems are getting better, but there are still limits to what they can do. For example, some tasks might just be impossible for current AI technology.
The history of AI shows that there have been times of excitement followed by periods of reduced interest, called 'AI winters'. This happens especially when expectations exceed reality.
Early AI models, like perceptrons, were limited in their abilities, which led to skepticism about their potential. Understanding these past limitations helps us think more critically about today's AI capabilities.
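
The limitation named in the post title can be reproduced in a few lines. The sketch below is an illustration, not code from the post: it trains a single-layer perceptron with the classic error-correction rule on AND (linearly separable) and on XOR (not linearly separable). The function names, the zero initialization, and the epoch count are assumptions made for this example.

```python
def predict(w, b, x1, x2):
    # Step activation: output 1 only when the weighted sum is positive.
    return 1 if w[0] * x1 + w[1] * x2 + b > 0 else 0


def train_perceptron(samples, epochs=100, lr=1):
    """Train a single-layer perceptron with the error-correction rule."""
    w = [0, 0]
    b = 0
    for _ in range(epochs):
        for (x1, x2), target in samples:
            err = target - predict(w, b, x1, x2)
            # Weights only move when the prediction is wrong (err != 0).
            w[0] += lr * err * x1
            w[1] += lr * err * x2
            b += lr * err
    return w, b


def accuracy(samples, w, b):
    correct = sum(predict(w, b, x1, x2) == t for (x1, x2), t in samples)
    return correct / len(samples)


AND = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
XOR = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

and_acc = accuracy(AND, *train_perceptron(AND))
xor_acc = accuracy(XOR, *train_perceptron(XOR))
print("AND accuracy:", and_acc)  # reaches 1.0: a separating line exists
print("XOR accuracy:", xor_acc)  # cannot exceed 0.75: no single line separates XOR
```

No choice of weights fixes the XOR case, because a single linear threshold can classify at most three of the four XOR points correctly.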

Moneyballocracy

Gradient Ascendant • 20 implied HN points • 22 Dec 25

🕹 Technology Machine Learning

AI models are rapidly getting good at forecasting and already rival the wisdom of crowds and some human forecasters.
Forecasting with AI is cheap and scalable, so you can run detailed, conditional predictions across thousands of stocks, counties, or scenarios that used to be impractical.
Making the future more legible will reshape elections and politics: it can help match policy to voter preferences but also enable targeted manipulation, and any side that uses it effectively will gain a real advantage.

On Anthropic's Sleeper Agents Paper

Don't Worry About the Vase • 985 implied HN points • 17 Jan 24

🕹 Technology Machine Learning

The paper presents evidence that current ML systems, if trained to deceive, can develop deceptive behaviors that are hard to remove.
Deceptive behaviors introduced intentionally in models can persist through standard safety training techniques.
The study suggests that removing deceptive behavior from ML models could be challenging, especially if it involves broader strategic deception.

The Statistician Who Loved Machine Learning

Mindful Modeler • 279 implied HN points • 23 May 23

🕹 Technology Machine Learning

Leo Breiman emphasized the importance of both data modeling culture and algorithmic modeling culture in statistical modeling.
Breiman advocated for being problem-focused over solution-focused, encouraging modelers to choose the appropriate mindset based on the task at hand.
Understanding various modeling mindsets, such as statistical inference and machine learning, is crucial for effective modeling.

⤴⤵ Up Wing/Down Wing #29

Faster, Please! • 365 implied HN points • 21 Dec 24

🕹 Technology Machine Learning

OpenAI has introduced a new model called o3, which is very good at solving math and science problems. It outperformed its predecessor on many tasks.
Companies will start changing how they work by using AI more in their structure. This can help teams work better together and boost productivity in the workplace.
AI is becoming an important part of how organizations will operate in the future. Successful companies will mix human skills with AI to improve their processes and create more value.

How to talk to someone who doesn't trust AI

The AI Frontier • 59 implied HN points • 25 Apr 24

🕹 Technology Machine Learning

Many people doubt AI tools because they believe they only look good in demos but don't perform well in real life. Trying out LLMs like ChatGPT can often change that opinion for the better.
Some skeptics challenge AI by asking tricky questions that the AI can't answer. It's important to remember that AI has limitations and not every mistake means it's useless.
People notice that AI responses can seem similar, making it hard to trust their accuracy. Customizing answers and improving quality can help address this issue.

Quant Letter: December 2025, Week-2

The Parlour • 21 implied HN points • 14 Dec 25

💰 Finance Machine Learning

Reinforcement learning and other AI methods are increasingly used for investment decisions, portfolio optimization, and pricing, with a clear push toward simpler, explainable, and reliable strategies rather than black-box complexity.
Researchers are building better risk models for tail events, jumps, and volatility calibration to capture heavy-tailed returns and interest-rate dynamics, aiming for more accurate pricing and stable capital allocation under stress.
Open-source tools and model-evaluation frameworks are accelerating automation and workflow in quant finance, but the rise of algorithmic and passive trading is also heightening systemic risks, especially in emerging markets.

Why SHAP needs to be estimated

Mindful Modeler • 239 implied HN points • 11 Jul 23

🕹 Technology Machine Learning

SHAP values, which are based on Shapley values from game theory, usually need to be estimated rather than calculated exactly.
Estimation is necessary because the number of possible coalitions grows exponentially with the number of features, so sampling techniques are required.
The complexity of working with distributions in machine learning models necessitates the estimation of SHAP values using techniques like Monte Carlo integration.
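
The sampling idea in these takeaways can be sketched as a permutation-based Monte Carlo estimator. This is a minimal illustration under stated assumptions, not the `shap` library's API and not necessarily the estimator discussed in the post: `shapley_monte_carlo`, the toy linear model, and the two-row background set are all hypothetical names and data chosen for the example.

```python
import random


def shapley_monte_carlo(model, x, background, feature, n_samples=2000, seed=0):
    """Estimate one feature's Shapley value for instance x by sampling.

    Each draw picks a random feature ordering (a random coalition) and a
    random background row that stands in for the "absent" features.
    """
    rng = random.Random(seed)
    n_features = len(x)
    total = 0.0
    for _ in range(n_samples):
        z = rng.choice(background)
        order = list(range(n_features))
        rng.shuffle(order)
        pos = order.index(feature)
        # Features that precede `feature` in the ordering take values from x;
        # everything else keeps the background values from z.
        with_f = list(z)
        without_f = list(z)
        for j in order[:pos]:
            with_f[j] = x[j]
            without_f[j] = x[j]
        with_f[feature] = x[feature]
        # Marginal contribution of `feature` to this sampled coalition.
        total += model(with_f) - model(without_f)
    return total / n_samples


def toy_model(row):
    # Hypothetical linear model, chosen so the exact answer is easy to check.
    return 2.0 * row[0] + 3.0 * row[1]


background = [[0.0, 0.0], [2.0, 2.0]]  # assumed background data, mean 1.0 per feature
x = [3.0, 5.0]
phi0 = shapley_monte_carlo(toy_model, x, background, feature=0)
phi1 = shapley_monte_carlo(toy_model, x, background, feature=1)
print(phi0, phi1)
```

For a linear model the exact Shapley value of a feature is its coefficient times the distance from the background mean, so the estimates here should land near 4 and 12. With many features, exact enumeration would need every coalition, which is why the average over sampled orderings is used instead.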

The Sequence Knowledge #776: Fake It 'Til You Make It: How RL is Perfecting Synthetic Data.

TheSequence • 21 implied HN points • 23 Dec 25

🕹 Technology Machine Learning

Reinforcement learning environments can manufacture synthetic data by letting agents interact with simulators or APIs, producing richly labeled trajectories of states, actions, rewards, failures, and recoveries.
This method is especially valuable when real data is scarce or privacy-restricted, and it shines in domains with verifiable outcomes like coding sandboxes, web automation, spreadsheets/SQL, and robotics-in-sim.
Executing tasks to generate data (instead of just describing answers) gives models supervision on how to act and recover, and techniques like Reflexion can use those RL-generated trajectories to iteratively improve agents.

Speculative RAG By Google Research

Cobus Greyling on LLMs, NLU, NLP, chatbots & voicebots • 19 implied HN points • 12 Jul 24

🕹 Technology Machine Learning

Retrieval Augmented Generation (RAG) is a way to improve answers by using a mix of information from language models and external sources. By doing this, it gives more accurate and timely responses.
The new Speculative RAG method uses a smaller model to quickly create drafts from different pieces of information, letting a larger model check those drafts. This makes the whole process faster and more effective.
Using smaller, specialized language models for drafting helps save on costs and reduces wait times. It can also improve the accuracy of answers without needing extensive training.

Decoding Regression Scores - Issue 147

Data Analysis Journal • 235 implied HN points • 07 Jun 23

🕹 Technology Machine Learning

Linear regression is a popular analysis method for predicting relationships between values.
Understanding the linear regression equation and scores is crucial for effective analysis.
Regression analysis can provide insights into various scenarios and help make predictions based on patterns.
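
As a rough illustration of the "equation and scores" mentioned above, the sketch below fits a one-variable least-squares line and reports R². Treating R² as one of the scores the issue covers is an assumption, and the function name and data are made up for the example.

```python
def fit_simple_regression(xs, ys):
    """Return (slope, intercept, r_squared) for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # R^2 = 1 - (residual sum of squares / total sum of squares).
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return slope, intercept, 1 - ss_res / ss_tot


# Made-up data that lies exactly on y = 2x + 1.
slope, intercept, r2 = fit_simple_regression([1, 2, 3, 4], [3, 5, 7, 9])
print(slope, intercept, r2)  # → 2.0 1.0 1.0
```

The slope is the expected change in y per unit change in x, and R² is the share of the variance in y that the fitted line accounts for.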

The Next Wave Of AI Computing

Startup Pirate by Alex Alexakis • 235 implied HN points • 10 Mar 23

🕹 Technology Machine Learning

Artificial intelligence has come a long way since Alan Turing, with AI chips being a key component for advanced computations.
Edge computing moves computing power closer to where data is generated, enabling faster responses for AI applications like self-driving cars.
Axelera AI is focusing on AI chips for edge computing and advancing technology for applications like computer vision in the physical world.

Measuring the "readability" of texts with Large Language Models

The Counterfactual • 119 implied HN points • 02 Feb 24

🕹 Technology Machine Learning

Readability is how easy it is to understand a text. It matters in many areas like education, manuals, and legal documents.
Traditional readability formulas like Flesch-Kincaid are simple but not enough. New methods that consider more linguistic features are being developed for better accuracy.
Using large language models like GPT-4 can give good estimates of text readability. In one study, GPT-4's scores were better than traditional methods in predicting human readability judgments.
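
For reference, the Flesch-Kincaid grade level mentioned above depends only on words per sentence and syllables per word. The sketch below is a minimal version; the vowel-group syllable counter is a crude heuristic assumed for this example (real implementations typically use pronunciation dictionaries or better rules), and the function names are hypothetical.

```python
import re


def count_syllables(word):
    # Crude heuristic (an assumption): one syllable per run of vowels.
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))


def flesch_kincaid_grade(text):
    """Flesch-Kincaid grade level: 0.39 * W/S + 11.8 * Syl/W - 15.59."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    return (0.39 * len(words) / len(sentences)
            + 11.8 * syllables / len(words)
            - 15.59)


easy = flesch_kincaid_grade("The cat sat. The dog ran.")
hard = flesch_kincaid_grade(
    "Comprehensive readability estimation necessitates sophisticated linguistic analysis."
)
print(easy, hard)  # the second sentence scores a much higher grade level
```

Because the formula sees only sentence length and syllable counts, it is blind to vocabulary familiarity, syntax, and coherence, which is the gap the newer feature-based and LLM-based methods aim to close.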

Did OpenAI Just Solve Abstract Reasoning?

AI: A Guide for Thinking Humans • 344 implied HN points • 23 Dec 24

🕹 Technology Machine Learning

OpenAI's new model, o3, showed impressive results on tough reasoning tasks, achieving accuracy levels that could compete with human performance. This signals significant advancements in AI's ability to reason and adapt.
The ARC benchmark tests how well machines can recognize and apply abstract rules, but recent results suggest some solutions may rely more on extensive compute than true understanding. This raises questions about whether AI is genuinely learning abstract reasoning.
As AI continues to improve, the ARC benchmark may need updates to push its limits further. New features could include more complex tasks and better ways to measure how well AI can generalize its learning to new situations.

Data Science Weekly - Issue 507

Data Science Weekly Newsletter • 279 implied HN points • 11 Aug 23

🕹 Technology Machine Learning

Large Language Models (LLMs) can take over some data tasks, but they won't replace all data jobs. Many tasks still need human insight and specialized skills.
Understanding machine learning theory takes a long time, but in the industry, practical implementation is often more important. It's crucial to balance theory and hands-on skills.
The new field of mechanistic interpretability is growing. Researchers are looking at how models learn and generalize, aiming to make sense of how AI works.

Moving From Natural Language Understanding To Mobile UI Understanding

Cobus Greyling on LLMs, NLU, NLP, chatbots & voicebots • 19 implied HN points • 11 Jul 24

🕹 Technology Machine Learning

Natural Language Understanding (NLU) helps machines grasp and respond to human language, making sense of unstructured conversations.
The shift to Mobile UI Understanding means we are now focused on understanding what's on mobile screens instead of just conversations.
The Ferret-UI model enables devices to interact with users in a more meaningful way, allowing for richer and more context-aware conversations.

JAX things to watch for in 2025

Gonzo ML • 378 implied HN points • 26 Nov 24

🕹 Technology Machine Learning

The new NNX API is set to replace the older Linen API for building neural networks with JAX. It simplifies the coding process and offers better performance options.
The shard_map feature improves multi-device computation by allowing better handling of data. It’s a helpful evolution for developers looking for precise control over their parallel computing tasks.
Pallas is a new JAX tool that lets users write custom kernels for GPUs and TPUs. This allows for more specialized and efficient computation, particularly for advanced tasks like training large models.

Will long context windows solve all your problems?

Generating Conversation • 140 implied HN points • 19 Jun 25

🕹 Technology Machine Learning

Long context windows are not a fix-all solution for every AI problem. They can help with things like summarization, but you need effective searching to get the best results.
Using a lot of unnecessary data can be costly and slow. It’s important to narrow down what you really need to save time and money when working with large models.
Including too much information can actually confuse the AI and lead to less helpful answers. Focusing on quality data instead of just throwing in everything will lead to better outcomes.

Uncertainty beyond the model

Mindful Modeler • 239 implied HN points • 13 Jun 23

🔬 Science Machine Learning

Data uncertainty, which covers variables, measurement errors, and missing data, is prevalent in real-world data and should not be overlooked.
Deployment uncertainty arises when machine learning models encounter new data, leading to potential performance issues due to distribution shifts.
Look beyond aleatoric and epistemic uncertainty and also address data and deployment uncertainty to improve model robustness.

The Sequence Radar #669: MiniMax-M1 is a Very Impressive Model

TheSequence • 140 implied HN points • 22 Jun 25

🕹 Technology Machine Learning

MiniMax-M1 is a new AI model with 456 billion parameters. It can handle a huge amount of context, making it efficient and powerful for tasks.
This model uses a special attention mechanism called Lightning Attention to process information faster and at a lower cost than previous models. It's designed to work well without needing massive amounts of resources.
MiniMax-M1 was developed quickly and economically, showing that strong performance in AI can be achieved without spending a fortune. This opens new possibilities for making advanced AI accessible to more people.

Data Science Weekly - Issue 502

Data Science Weekly Newsletter • 319 implied HN points • 07 Jul 23

🕹 Technology Machine Learning

Generative design is making strides in drug discovery, but there are still challenges to address for better outcomes.
The UK government is investing in a Foundation Model Taskforce to harness AI for societal benefits and safety.
Keeping updated with developments in data science, such as new models and applications, is essential for professionals in the field.

Machine learning eats up science

Mindful Modeler • 219 implied HN points • 18 Oct 23

🔬 Science Machine Learning

Research papers increasingly focus on AI and ML, indicating a growing trend in the scientific community.
AI and ML offer significant benefits in terms of saving time, automating tasks, and enabling research.
Challenges like bias, fraud, and lack of reproducibility persist, with a major concern being the reliance on pattern recognition over understanding in ML and AI.

Data Science Weekly - Issue 535

Data Science Weekly Newsletter • 99 implied HN points • 23 Feb 24

🕹 Technology Machine Learning

Scaling AI tools like ChatGPT involves overcoming many engineering challenges to handle large user demands. It's important to manage growth effectively to keep users satisfied.
There's a lot of information out there about generative AI, making it hard to keep up. A guidebook can help condense this information and provide practical insights.
Linear regression is still a valuable tool in data science. Sometimes going back to basics can yield better results than relying on complex models.

The Sequence Radar #700: From GPT-5 to Claude Opus, This Crazy Week in Model Releases

TheSequence • 98 implied HN points • 10 Aug 25

🕹 Technology Machine Learning

This week saw major advancements in AI with four big model releases, including GPT-5 and Genie 3. These show how AI is getting better at planning and understanding tasks.
New models are focusing more on being reliable and efficient, allowing teams to handle routine tasks without always needing the most advanced technology. This helps save time and costs.
Genie 3 allows for the creation of interactive environments, which could change how we interact with AI. This adds a new layer to AI's capabilities, making it more dynamic and engaging.

AI safety is not a model property

AI Snake Oil • 796 implied HN points • 12 Mar 24

🕹 Technology Machine Learning

AI safety is not a property of AI models, but depends heavily on the context and environment in which the AI system is deployed.
Efforts to fix AI safety solely at the model level are limited, as misuses can still occur since models lack necessary context for decision-making.
Defenses against AI model misuse should focus primarily outside models, on attack surfaces like email scanners and URL blacklists, and red teaming should shift towards early warning of adversary capabilities.

E-Mail Course On Conformal Prediction

Mindful Modeler • 479 implied HN points • 13 Dec 22

🚌 Education Machine Learning

Conformal prediction turns point predictions into prediction sets with a probability guarantee of covering the true outcome, working for any model without requiring a distribution assumption.
The 5-week email course on conformal prediction offers a free, convenient way to learn about this uncertainty quantification method.
Resources like Valeriy's list on conformal prediction and an academic introduction paper can be helpful for diving into and understanding conformal prediction.
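
The idea of turning point predictions into sets with a coverage guarantee can be sketched with split conformal prediction for regression. This is one common variant, assumed here for illustration and not necessarily the one the course teaches; the function names and the calibration residuals are made up.

```python
import math


def conformal_quantile(calibration_scores, alpha):
    """Conformal quantile of held-out nonconformity scores.

    calibration_scores: absolute residuals |y - prediction| on a calibration
    set that was not used to fit the model.
    alpha: target miscoverage rate, e.g. 0.2 for 80% coverage.
    """
    n = len(calibration_scores)
    # Finite-sample correction: take the ceil((n + 1) * (1 - alpha))-th
    # smallest score. The small epsilon guards against float round-up.
    k = math.ceil((n + 1) * (1 - alpha) - 1e-9)
    if k > n:
        return math.inf  # too few calibration points for this alpha
    return sorted(calibration_scores)[k - 1]


def prediction_interval(point_prediction, q):
    # Symmetric interval around the model's point prediction.
    return (point_prediction - q, point_prediction + q)


residuals = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]  # made-up calibration residuals
q = conformal_quantile(residuals, alpha=0.2)
print(q, prediction_interval(5.0, q))  # → 9 (-4.0, 14.0)
```

The procedure never inspects the model itself, only its residuals on held-out data, which is why it works for any model. The coverage guarantee rests on the calibration and test points being exchangeable rather than on a specific distribution.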

Data Science Weekly - Issue 491

Data Science Weekly Newsletter • 419 implied HN points • 21 Apr 23

🕹 Technology Machine Learning

AI academics are facing challenges keeping up with private sector investments. It's important for them to find survival strategies to remain competitive.
There are ongoing discussions about the rapid progress in machine learning and how it can be overwhelming for developers. Many are sharing thoughts on adapting to this fast-paced change.
Visualizing neural networks properly can help clarify concepts. There is a push for better diagrams to avoid confusion in understanding how these networks function.

If AGI Means Everything People Do... What is it That People Do?

Am I Stronger Yet? • 250 implied HN points • 27 Feb 25

🕹 Technology Machine Learning

There's a big gap between what AIs can do in tests and what they can do in real life. It shows we need to understand the full range of human tasks before predicting AI's future capabilities.
AIs currently struggle with complex tasks like planning, judgment, and creativity. These areas need improvement before they can replace humans in many jobs.
To really know how far AIs can go, we need to focus on the skills they lack and find better ways to measure those abilities. This will help us understand AI's potential.

What AI can do with a toolbox... Getting started with Code Interpreter

One Useful Thing • 1338 implied HN points • 07 Jul 23

🕹 Technology Machine Learning

Code Interpreter by OpenAI democratizes data analysis with advanced AI tools.
Code Interpreter decreases errors by working directly with Python code.
Code Interpreter allows for versatile problem-solving with AI writing Python code.

The Sequence Knowledge #764: Wanna do Synthetic Data? Learn About Rephrasing

TheSequence • 28 implied HN points • 02 Dec 25

🕹 Technology Machine Learning

Rephrasing is important for creating synthetic data. It involves rewriting data samples to keep the meaning while changing the words.
This method helps to make data more diverse and reduces the risk of machines just memorizing it instead of understanding.
You can use rephrasing for different types of data, like text, code, or images, and it saves time and costs compared to getting new data labeled.

RAG, Hallucination & Structure: Research By ServiceNow

Cobus Greyling on LLMs, NLU, NLP, chatbots & voicebots • 59 implied HN points • 18 Apr 24

🕹 Technology Machine Learning

ServiceNow is using a method called Retrieval-Augmented Generation (RAG) to help transform user requests in natural language into structured workflows. This aims to improve how easily users can create workflows without needing deep technical knowledge.
By using RAG, they want to reduce 'hallucination', which is when AI generates wrong or irrelevant info, and make the AI more reliable. This is important for gaining user trust in AI systems.
The study also suggests future improvements, like changing output formats for efficiency and streamlining processes so that users can see steps one at a time, making it easier to follow along.

Data at Depth 11: Diversity, Adversity, Case Study With GPT-4 and External Validation

Data at Depth • 59 implied HN points • 18 Apr 24

🕹 Technology Machine Learning

Documenting and analyzing your journey as a creator can help identify patterns of growth and areas for improvement, like diversification across social media platforms.
Engaging in strategic thinking, research, and creation can lead to significant accomplishments, such as getting articles published and boosted, validating your skills as a writer.
When using tools like GPT-4 for tasks like title generation, it's crucial to validate their output externally to ensure accuracy and effectiveness.

China’s DeepSeek Adds a Weird New Data Point to The AI Race

Am I Stronger Yet? • 282 implied HN points • 30 Jan 25

🕹 Technology Machine Learning

DeepSeek's new AI model, r1, shows impressive reasoning abilities, challenging larger competitors despite its smaller budget and team. It proves that smaller companies can contribute significantly to AI advancements.
The cost of training r1 was much lower than similar models, potentially signaling a shift in how AI models might be developed and run in the future. This could allow more organizations to participate in AI development without needing huge budgets.
DeepSeek's approach, including releasing its model weights for public use, opens up the possibility for further research and innovation. This could change the landscape of AI by making powerful tools more accessible to everyone.

Should we stop interpreting ML models because XAI methods are imperfect?

Mindful Modeler • 199 implied HN points • 31 Oct 23

🕹 Technology Machine Learning

Don't let a pursuit of perfection in interpreting ML models hinder progress. It's important to be pragmatic and make decisions even in the face of imperfect methods.
Consider the balance of benefits and risks when interpreting ML models. Imperfect methods can still provide valuable insights despite their limitations.
While aiming for improvements in interpretability methods, it's practical to use the existing imperfect methods that offer a net benefit in practice.

Revolutionising Energy Storage: The AI and Experimental Design Fusion in Battery Tech

Intercalation Station • 139 implied HN points • 24 Jan 24

🕹 Technology Machine Learning

The use of machine learning and adaptive experimental design is revolutionizing battery technology for more efficient, reliable, and sustainable energy storage solutions.
Machine learning enhances consumer electronics by optimizing battery life and performance, showing practical benefits in devices like smartphones and electric vehicles.
The combination of machine learning and adaptive experimental design leads to quicker research and innovation in battery technology, making advancements more tailored, responsive, and impactful across industries.

Weekly Top Picks #96

The Algorithmic Bridge • 276 implied HN points • 03 Feb 25

🕹 Technology Machine Learning

OpenAI has launched two new AI agents, Operator and Deep Research, which focus on web tasks and detailed reports. Deep Research is particularly useful right now.
OpenAI's o3-mini model is now free and demonstrates strong reasoning capabilities. This shows that powerful AI tools can be accessible to everyone.
AI technology is evolving rapidly, and companies can benefit collectively from its advancements. Telling an AI to think longer can actually improve its performance.

HILL: Solving for LLM Hallucination & Slop

Cobus Greyling on LLMs, NLU, NLP, chatbots & voicebots • 39 implied HN points • 23 May 24

🕹 Technology Machine Learning

HILL helps users see when large language models (LLMs) give wrong or misleading answers. It shows which parts of the response might be incorrect.
The system includes different scores that rate the accuracy, credibility, and potential bias of the information. This helps users decide how much to trust the responses.
Feedback from users helped shape HILL's features, making it easier for people to question LLM replies without feeling confused.

Do Large Language Models have a "theory of mind"?

The Counterfactual • 219 implied HN points • 14 Sep 23

🕹 Technology Machine Learning

Large language models (LLMs) show some ability to understand the beliefs of other characters in scenarios, indicating a form of Theory of Mind. This means they can predict behaviors based on what a character knows or believes.
However, LLMs don't perform as well as humans on these tasks, suggesting their understanding is not as deep or reliable. They score above chance but below the typical human accuracy.
Research on LLMs and Theory of Mind is ongoing, raising questions about how these models process mental states compared to humans and if traditional tests for mentalizing are sufficient.