The hottest Data science Substack posts right now

And their main takeaways

Some ideas for what comes next

Democratizing Automation • 529 implied HN points • 23 Jun 25

🕹 Technology AI Models Machine Learning Data science Software Development Tech Trends

OpenAI's new model, o3, is really good at finding information quickly, like a determined search dog. It's unique compared to other models, and many are curious if others will match its capabilities soon.
AI agents, like Claude Code, are improving quickly and can solve complex tasks. Many small changes have added up to a noticeable boost in performance, which is exciting for users.
Growth in model size is slowing while efficiency keeps improving. Instead of just making bigger models, companies are focusing on optimizing what they already have.

Five quick updates about that Apple reasoning paper that people can’t stop talking about

Marcus on AI • 9485 implied HN points • 17 Jun 25

🕹 Technology Artificial Intelligence Machine Learning Software Engineering Data science Computational linguistics

A recent paper questions if large language models can really reason deeply, suggesting they struggle with even moderate complexity. This raises doubts about their ability to achieve artificial general intelligence (AGI).
Some responses to this paper have been criticized as weak or even jokes, yet many continue to share them as if they are serious arguments. This shows confusion in the debate surrounding AI reasoning capabilities.
New research supports the idea that AI systems perform poorly on unfamiliar challenges and do well mainly on problems they are already good at solving.

Seven replies to the viral Apple reasoning paper – and why they fall short

Marcus on AI • 16836 implied HN points • 12 Jun 25

🕹 Technology AI Machine Learning Data science Computing Software

Large reasoning models (LRMs) struggle with complex tasks, and while it's true that humans also make mistakes, we expect machines to perform better. The Apple paper highlights that LLMs can't be trusted for more complicated problems.
Some rebuttals argue that bigger models might perform better, but we can't predict which models will succeed in various tasks. This leads to uncertainty about how reliable any model really is.
Despite prior knowledge that these models generalize poorly, the Apple paper emphasizes the seriousness of the issue and shows that more people are finally recognizing the limitations of current AI technology.

Google Search, SearchGPT, and SerpAPI

Encyclopedia Autonomica • 19 implied HN points • 02 Nov 24

🕹 Technology Search Engines AI Information Retrieval Web Development Data science

Google Search is becoming less reliable due to junk content and SEO tricks, making it harder to find accurate information.
SearchGPT and similar tools are different from traditional search engines. They retrieve information and summarize it instead of just showing ranked results.
There's a risk that new search tools might not always provide neutral information. It's important to ensure that users can still find quality sources without bias.

AI #105: Hey There Alexa

Don't Worry About the Vase • 1120 implied HN points • 27 Feb 25

🕹 Technology AI Machine Learning Automation Robotics Data science

A new version of Alexa, called Alexa+, is coming soon. It will be much smarter and can help with more tasks than before.
AI tools can help improve coding and other work tasks, giving users more productivity but not always guaranteeing quality.
There's a lot of excitement about how AI is changing jobs and tasks, but it also raises concerns about safety and job replacement.

The Weekly Kaitchup #65

The Kaitchup – AI on a Budget • 59 implied HN points • 01 Nov 24

🕹 Technology AI Models Machine Learning Natural Language Text-to-Speech Data science

SmolLM2 offers alternatives to popular models like Qwen2.5 and Llama 3.2, showing good performance with various versions available.
The Layer Skip method improves the speed and efficiency of Llama models by processing some layers selectively, making them faster without losing accuracy.
MaskGCT is a new text-to-speech model that generates high-quality speech without needing text alignment, providing better results across different benchmarks.

All paths point downhill

arg min • 218 implied HN points • 31 Oct 24

🕹 Technology Algorithms Optimization Data science Machine Learning Mathematics

In optimization, there are three main approaches: local search, global optimization, and a method that combines both. They all aim to find the best solution to minimize a function.
Gradient descent is a popular method in optimization that works like local search, by following the path of steepest descent to improve the solution. It can also be viewed as a way to solve equations or approximate values.
Newton's method, another optimization technique, is efficient because it converges quickly but requires more computation. Like gradient descent, it can be interpreted in various ways, emphasizing the interconnectedness of optimization strategies.
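
To make the comparison in this summary concrete, here is a minimal Python sketch that runs gradient descent and Newton's method on the same one-dimensional problem. The test function f(x) = x^2 + e^x, the step size, and the tolerances are assumptions chosen for illustration; none of them come from the post.

```python
import math

# Illustrative convex function (an assumption, not from the post):
# f(x) = x^2 + exp(x), so f'(x) = 2x + exp(x) and f''(x) = 2 + exp(x).
def grad(x):
    return 2 * x + math.exp(x)

def hess(x):
    return 2 + math.exp(x)

def gradient_descent(x, step=0.1, tol=1e-10, max_iter=10_000):
    """Take fixed-size steps along the negative gradient."""
    for i in range(max_iter):
        g = grad(x)
        if abs(g) < tol:
            return x, i
        x -= step * g
    return x, max_iter

def newton(x, tol=1e-10, max_iter=100):
    """Rescale each step by the second derivative (more work per step)."""
    for i in range(max_iter):
        g = grad(x)
        if abs(g) < tol:
            return x, i
        x -= g / hess(x)
    return x, max_iter

x_gd, iters_gd = gradient_descent(1.0)
x_nt, iters_nt = newton(1.0)
print(f"gradient descent: x = {x_gd:.8f} after {iters_gd} iterations")
print(f"Newton's method:  x = {x_nt:.8f} after {iters_nt} iterations")
```

Both loops stop when the derivative is close to zero, which is the "solving an equation" view of minimization mentioned above. Newton's method needs the second derivative at every step but reaches the tolerance in far fewer iterations on this example.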

If You’re New to Data, Read This Before You Build Anything

SeattleDataGuy’s Newsletter • 365 implied HN points • 19 Jun 25

🕹 Technology Data science Communication Career growth Technical Skills

It's better to work with other experienced engineers early in your career. This way, you can learn from their decisions and improve your skills more quickly.
Don't get distracted by flashy tech trends or buzzwords. Focus on solving real business problems instead of getting caught up in the hype.
Communication is key in data roles. Make sure you understand your audience and always lead with the main point when sharing your work.

Hallucinations Are Fine, Actually

Artificial Ignorance • 92 implied HN points • 04 Mar 25

🕹 Technology AI Machine Learning Software Development Data science Automation

AI models often make mistakes or 'hallucinate', confidently providing wrong information. Humans need to check AI output, especially for important tasks.
Even though AI hallucinations are a challenge, they're seen as something we can work to improve rather than an insurmountable problem.
Instead of aiming for AI to do everything on its own, we should use it as a tool to help us do our jobs better, understanding that we need to collaborate with it.

The Sequence Research #663: The Illusion of Thinking, Inside the Most Controversial AI Paper of Recent Weeks

TheSequence • 105 implied HN points • 13 Jun 25

🕹 Technology AI Research Innovation Computing Data science

Large Reasoning Models (LRMs) can show improved performance by simulating thinking steps, but their ability to truly reason is questioned.
Current tests for LLMs often miss the mark: flaws such as data contamination mean they don't really measure how well the models think.
New puzzle environments are being introduced to better evaluate these models by challenging them in a structured way while keeping the logic clear.

GPT-4.5 Feels Like a Letdown But It’s OpenAI’s Biggest Bet Yet

The Algorithmic Bridge • 605 implied HN points • 28 Feb 25

🕹 Technology AI Machine Learning Software Innovation Data science

GPT-4.5 is not as impressive as expected, but it's part of a plan for bigger advancements in the future. OpenAI is using this model to build a better foundation for what's to come.
Despite being larger and more expensive, GPT-4.5 isn't leading in new capabilities compared to older models. It's more focused on creativity and communication, which might not appeal to all users.
OpenAI wants to improve the basic skills of AI rather than just aiming for high scores in tests. This step back is meant to ensure future models are smarter and more capable overall.

Build AI or Be Buried By Those Who Do

Contemplations on the Tree of Woe • 3574 implied HN points • 30 May 25

🕹 Technology AI Machine Learning Innovation Automation Data science

There are three main views on AI: believers who think it will change everything for the better, skeptics who see it as just fancy technology, and doomers who worry it could end badly for humanity. Each group has different ideas about what AI will mean for the future.
Believers expect AI to become a big part of our lives, doing many tasks better than humans and reshaping many industries. They see it as a revolutionary change that will be everywhere.
Many think that if we don’t build our own AI, the narrative and values that shape AI will be dominated by one ideology, which could be harmful. The idea is that we need balanced development of AI, representing different views to ensure freedom and diversity in thought.

How to Think About ChatGPT

Holly’s Newsletter • 2916 implied HN points • 18 Oct 24

🕹 Technology Artificial Intelligence Machine Learning Software Development Computing Data science

ChatGPT and similar models are not thinking or reasoning. They are just very good at predicting the next word based on patterns in data.
These models can provide useful information but shouldn't be trusted as knowledge sources. They reflect training data biases and simply mimic language patterns.
Using ChatGPT can be fun and helpful for brainstorming or getting starting points, but remember, it's just a tool and doesn't understand the information it presents.

Basic Linear Algebra Subprogramming

arg min • 178 implied HN points • 29 Oct 24

🕹 Technology Computing Mathematics Optimization Data science Algorithms

Understanding how optimization solvers work can save time and improve efficiency. Knowing a bit about the tools helps you avoid mistakes and make smarter choices.
Nonlinear equations are harder to solve than linear ones, and methods like Newton's help us get approximate solutions. Iteratively solving these systems is key to finding optimal results in optimization problems.
The speed and efficiency of solving linear systems can greatly affect computational performance. Organizing your model in a smart way can lead to significant time savings during optimization.
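
The link between nonlinear equations and linear solves can be sketched in a few lines of Python. The two-equation system, the starting point, and the helper names below are assumptions made up for illustration; a real solver would hand the linear step to an optimized linear algebra routine instead of the hand-written 2x2 solve used here.

```python
# Hypothetical nonlinear system (not from the post):
#   x^2 + y^2 = 4
#   x * y     = 1
def residual(x, y):
    return (x * x + y * y - 4.0, x * y - 1.0)

def jacobian(x, y):
    # Partial derivatives of the two residuals with respect to x and y.
    return ((2 * x, 2 * y), (y, x))

def solve_2x2(a, b):
    """Solve the linear system a @ d = b with Cramer's rule."""
    (a11, a12), (a21, a22) = a
    det = a11 * a22 - a12 * a21
    if det == 0:
        raise ZeroDivisionError("singular Jacobian")
    return ((b[0] * a22 - a12 * b[1]) / det,
            (a11 * b[1] - a21 * b[0]) / det)

def newton_system(x, y, tol=1e-12, max_iter=50):
    """Each Newton iteration linearizes the system and solves J d = -F."""
    for _ in range(max_iter):
        f1, f2 = residual(x, y)
        if max(abs(f1), abs(f2)) < tol:
            break
        dx, dy = solve_2x2(jacobian(x, y), (-f1, -f2))
        x, y = x + dx, y + dy
    return x, y

x_sol, y_sol = newton_system(2.0, 0.5)
print(x_sol, y_sol, residual(x_sol, y_sol))
```

Almost all of the work per iteration sits in the linear solve, which is why the speed of that step tends to dominate the overall running time as the system grows.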

HOISTED FROM COMMENTS: RAFAEL KAUFMANN: Carving Nature at the Joints: Faithful Representation, the Platonic Dream, & the Unreasonable Near-Success of GPT LLM MAMLMs

Brad DeLong's Grasping Reality • 69 implied HN points • 25 Jun 25

🕹 Technology AI Machine Learning Philosophy Data science Automation

Machines, like large language models, can imitate human language because they find patterns hidden in how we express ourselves. They simplify the chaos of our words into something easier to understand.
Even though these models are good at predicting responses, they struggle with truly understanding the world. They can replicate language well, but grasping the deeper meaning remains a challenge.
The hope is that with better training and understanding causal relationships, these models could evolve to not only imitate but truly comprehend the world around them.

The Impact of the Calibration Dataset for AutoRound and AWQ Quantization

The Kaitchup – AI on a Budget • 39 implied HN points • 31 Oct 24

🕹 Technology AI Data science Machine Learning Quantization Model optimization

Quantization helps reduce the size of large language models, making them easier to run, especially on consumer GPUs. For instance, 4-bit quantization can shrink a model to roughly a third of its original size.
Calibration datasets are crucial for improving the accuracy of quantization methods like AWQ and AutoRound. The choice of the dataset impacts how well the quantization performs.
Most quantization tools use a default English-language dataset, but results can vary with different languages and datasets. Testing various options can lead to better outcomes.

A Visual Guide to Mixture of Experts (MoE)

Exploring Language Models • 3289 implied HN points • 07 Oct 24

🕹 Technology Artificial Intelligence Machine Learning Data science Neural Networks Computational Models

Mixture of Experts (MoE) uses multiple smaller models, called experts, to help improve the performance of large language models. This way, only the most relevant experts are chosen to handle specific tasks.
A router or gate network decides which experts are best for each input. This selection process makes the model more efficient by activating only the necessary parts of the system.
Load balancing is critical in MoE because it ensures all experts are trained equally, preventing any one expert from becoming too dominant. This helps the model to learn better and work faster.
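
The routing step described in this summary can be sketched in plain Python. Everything below is a toy under stated assumptions: the router weights, the four scaling "experts", and the choice of k = 2 are invented for illustration, and the load-balancing term used during training is left out entirely.

```python
import math

def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def route(token, router_weights, k=2):
    """Return [(expert_index, gate_weight)] for the k highest-scoring experts."""
    scores = [sum(w * t for w, t in zip(row, token)) for row in router_weights]
    top = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
    gates = softmax([scores[i] for i in top])  # renormalize over the chosen experts
    return list(zip(top, gates))

def moe_layer(token, router_weights, experts, k=2):
    """Run only the selected experts and mix their outputs by gate weight."""
    out = [0.0] * len(token)
    for idx, gate in route(token, router_weights, k):
        expert_out = experts[idx](token)
        out = [o + gate * e for o, e in zip(out, expert_out)]
    return out

# Toy experts: each one just scales its input by a different factor.
experts = [lambda v, s=s: [s * x for x in v] for s in (1.0, 2.0, 3.0, 4.0)]
router_weights = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.5, 0.5]]

token = [2.0, 1.0]
chosen = route(token, router_weights)
output = moe_layer(token, router_weights, experts)
print("selected experts and gates:", chosen)
print("layer output:", output)
```

Only two of the four experts run for this token, which is where the efficiency gain comes from: the layer holds many parameters but activates a small subset per input.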

bitnet.cpp: Efficient Inference with 1-Bit LLMs on your CPU

The Kaitchup – AI on a Budget • 179 implied HN points • 28 Oct 24

🕹 Technology Artificial Intelligence Software Development Machine Learning Open Source Data science

BitNet is a new type of AI model that uses very little memory by representing each parameter with just three values. This means it uses only 1.58 bits instead of the usual 16 bits.
Despite using lower precision, these '1-bit LLMs' still work well and can compete with more traditional models, which is pretty impressive.
The software called 'bitnet.cpp' allows users to run these AI models on normal computers easily, making advanced AI technology more accessible to everyone.
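
The 1.58-bit figure follows from simple arithmetic: a parameter that takes one of three values carries log2(3) bits of information. The Python sketch below checks that number and shows one way to round weights to three values. The scaling rule (divide by the mean absolute weight, then round and clip) and the sample weights are assumptions for illustration, not details taken from the post or from bitnet.cpp.

```python
import math

# Three possible values per parameter carry log2(3) bits of information.
bits_per_weight = math.log2(3)

def ternary_quantize(weights, eps=1e-8):
    """Map each weight to -1, 0, or +1 after scaling by the mean absolute value.

    The scaling rule is an assumption for this sketch.
    """
    scale = sum(abs(w) for w in weights) / len(weights) + eps
    q = [max(-1, min(1, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    return [v * scale for v in q]

weights = [0.42, -0.07, -0.95, 0.18, 0.61, -0.33]  # made-up example values
q, scale = ternary_quantize(weights)
print(f"bits per weight: {bits_per_weight:.3f}")
print("ternary weights:", q)
print("reconstruction:", [round(v, 3) for v in dequantize(q, scale)])
```

The reconstruction is coarse for any single weight, so the sketch says nothing about model quality; it only illustrates the storage format.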

Which way from here?

benn.substack • 1048 implied HN points • 06 Jun 25

🕹 Technology Data science Analytics Software Artificial Intelligence Business Intelligence

Data tools are getting more advanced, but many people still struggle with knowing how to use them effectively. This means that having the right tools isn't enough if users lack direction.
The industry is shifting focus from traditional analytics towards building AI systems and infrastructure. Companies are now adapting their technologies to support AI applications instead of just analyzing data.
Self-serve BI tools aren't being used as intended because people often don't know what questions to ask. Providing clearer direction and goals might help users make better use of available data.

The rise of reasoning machines

Democratizing Automation • 538 implied HN points • 12 Jun 25

🕹 Technology AI Machine Learning Computing Software Development Data science

Reasoning is when we draw conclusions based on what we observe. Humans experience reasoning differently than AI, but both lack a full understanding of their own processes.
AI models are improving but still struggle with complex problems. Just because they sometimes fail doesn't mean they can't reason; they just might need new methods to tackle tougher challenges.
The debate on whether AI can truly reason often stems from fear of losing human uniqueness. Some critics focus on what AI can't do instead of recognizing its potential, which is growing rapidly.

This Rumor About GPT-5 Changes Everything

The Algorithmic Bridge • 4788 implied HN points • 16 Jan 25

🕹 Technology AI Machine Learning Software Development Data science Tech industry

There's a belief that GPT-5 might already exist but isn't being released to the public. The idea is that OpenAI may be using it internally because it's more valuable that way.
AI labs are focusing on creating smaller and cheaper models that still perform well. This new approach aims to reduce costs while improving efficiency, which is crucial given the rising demand for AI.
The situation is similar across major AI companies like OpenAI and Anthropic, with many facing challenges in producing new models. Instead, they might be opting to train powerful models internally and use them to enhance smaller models for public use.

Weekly Top Picks #99

The Algorithmic Bridge • 191 implied HN points • 24 Feb 25

🕹 Technology AI Research Tech Policy Software Development Data science Innovation

AI labs need to find the right balance between scaling their systems and efficiency in their processes.
There's an AI model that criticized famous figures like Elon Musk and Donald Trump, showing it might lean towards leftist views.
Tyler Cowen believes the slow integration of AI into our society is due to human limitations, not the technology itself.

What comes next with reinforcement learning

Democratizing Automation • 435 implied HN points • 09 Jun 25

🕹 Technology AI Machine Learning Reinforcement Learning Data science Software Development

Reinforcement learning (RL) is getting better at solving tougher tasks, but it's not easy. There's a need for new discoveries and improvements to make these complex tasks manageable.
Continual learning is important for AI, but it raises concerns about safety and can lead to unintended consequences. We need to approach this carefully to ensure the technology is beneficial.
Using RL in sparser domains presents challenges, as the lack of clear reward signals makes improvement harder. Simple methods have worked before, but it’s uncertain if they will work for more complex tasks.

The Sequence Knowledge #560: The Amazing World of Agentic Benchmarks

TheSequence • 49 implied HN points • 10 Jun 25

🕹 Technology AI Machine Learning Software Data science Automation

Agentic benchmarks are new ways to evaluate AI that focus on decision-making rather than just answering questions. They look at how well AI can plan and adapt to different tasks.
Traditional evaluation methods aren't enough for AI that acts like agents. We need tests that measure how AI can handle complex situations and multi-step processes.
One exciting example of these benchmarks is WebArena, which helps assess AI's ability to perform tasks on the web. This includes how well agents interact with online tools and environments.

Analyze research papers with Gemini 2.0

Gonzo ML • 126 implied HN points • 23 Feb 25

🕹 Technology Artificial Intelligence Machine Learning Natural Language Processing Data science Cloud Computing

Gemini 2.0 models can analyze research papers quickly and accurately, supporting large amounts of text. This means they can handle complex documents like academic papers effectively.
The DeepSeek-R1 model shows that strong reasoning abilities can be developed in AI without the need for extensive human guidance. This could change how future models are trained and developed.
Distilling knowledge from larger models into smaller ones allows for efficient and accessible AI that can perform well on various tasks, which is useful for many applications.

AGI Is Already Here—It’s Just Not Evenly Distributed

The Algorithmic Bridge • 1104 implied HN points • 05 Feb 25

🕹 Technology AI Machine Learning Data science Computing Software

Understanding how to create good prompts is really important. If you learn to ask questions better, you'll get much better answers from AI.
Even though AI models are getting better, good prompting skills are becoming more important. It's like having a smart friend; you need to know how to ask the right questions to get the best help.
The better your prompting skills, the more you'll be able to take advantage of AI. It's not just about the AI's capabilities but also about how you interact with it.

Claude 3.7 and the banality of reasoning

Artificial Ignorance • 117 implied HN points • 25 Feb 25

🕹 Technology Artificial Intelligence Software Development Machine Learning Data science Computer Science

Claude 3.7 introduces a new way to control reasoning, letting users choose how much reasoning power they want. This makes it easier to tailor the AI’s responses to fit different needs.
The competition in AI models is heating up, with many companies launching similar features. This means users can expect similar quality and capabilities regardless of which AI they choose.
Anthropic is focusing on making Claude better for real-world tasks, rather than just excelling in benchmarks. This is important for businesses looking to use AI effectively.

o3, Oh My

Don't Worry About the Vase • 3852 implied HN points • 30 Dec 24

🕹 Technology AI Models Machine Learning Data science Computing Software Engineering

OpenAI's new model, o3, shows amazing improvements in reasoning and programming skills. It's so good that it ranks among the top competitive programmers in the world.
o3 scored impressively on challenging math and coding tests, outperforming previous models significantly. This suggests we might be witnessing a breakthrough in AI capabilities.
Despite these advances, o3 isn't classified as AGI yet. While it excels in certain areas, there are still tasks where it struggles, keeping it short of true general intelligence.

✨ Begun, the US vs. China AI race has

Faster, Please! • 1462 implied HN points • 27 Jan 25

🕹 Technology AI Innovation Investments Data science Market Trends

The AI race between the US and China is heating up, with China's DeepSeek making significant advancements. This situation is causing a lot of nervousness in the stock market.
DeepSeek's new AI model is impressive because it can learn effectively with less hardware investment than previously thought. This could change how companies and investors view AI development costs.
Some experts believe DeepSeek's achievements may signal a big shift in the AI field, showing that the competitive landscape is more unpredictable than it seemed before.

AI & Python #28: The Notebook Used for Data Science and AI Projects

Artificial Corner • 158 implied HN points • 23 Oct 24

🕹 Technology AI Programming Data science Software Tools

Jupyter Notebook is a popular tool for data science that combines live code with visualizations and text. It helps users organize their projects in a single place.
Jupyter Notebook can be improved with extensions, which can add features like code autocompletion and easier cell movement. These tools make coding more efficient and user-friendly.
To install these extensions, you can use specific commands in the command prompt. Once installed, you'll find new options that can help increase your productivity.

Hype Is Not A Data Strategy

SeattleDataGuy’s Newsletter • 365 implied HN points • 05 Jun 25

💼 Business Data Strategy Marketing Tech Trends Data science

Hype around data and AI can distract companies from their real goals. It's important to focus on what data can actually do for your business, instead of getting lost in the trend.
Most businesses don't rely on data as their main product. Even if data can improve their operations, it’s not their primary focus, so the challenge is making data truly useful.
Companies often look up to big tech for data strategies, but they have different resources. Chasing after their methods without understanding your own needs can lead to a misguided strategy.

AI progress has plateaued below GPT-5 level

The Intrinsic Perspective • 31460 implied HN points • 14 Nov 24

🕹 Technology AI Machine Learning Innovation Data science Computing

AI development seems to have slowed down, with newer models not showing a big leap in intelligence compared to older versions. It feels like many recent upgrades are just small tweaks rather than revolutionary changes.
Researchers believe that the improvements we see are often due to better search techniques rather than smarter algorithms. This suggests we may be returning to methods that dominated AI in earlier decades.
There's still a lot of uncertainty about the future of AI, especially regarding risks and safety. The plateau in advancements might delay the timeline for achieving more advanced AI capabilities.

A taxonomy for next-generation reasoning models

Democratizing Automation • 467 implied HN points • 04 Jun 25

🕹 Technology AI Machine Learning Computing Data science Automation

Next-gen reasoning models will focus on skills, calibration, strategy, and abstraction. These abilities help the models solve complex problems more effectively.
Calibrating how difficult a problem is will help models avoid overthinking and make solutions faster and more enjoyable for users.
Planning is crucial for future models. They need to break down complex tasks into smaller parts and manage context effectively to improve their problem-solving abilities.

Nearly Headless BI

davidj.substack • 59 implied HN points • 25 Jun 25

🕹 Technology Data science Business Intelligence Artificial Intelligence Analytics Semantics

Snowflake and Databricks are using a semantic layer, which helps make data easier to understand and access. This is a shift from older methods that relied heavily on text-based commands.
The rise of AI has changed what businesses need from their analytics tools. Now, having a semantic layer is a must for companies that want to stay competitive in agentic analytics.
Headless business intelligence is fading away as companies now blend traditional analytics with smarter, AI-driven tools. This could change how data warehouses and BI tools work together in the future.

Fixing Faulty Gradient Accumulation: Understanding the Issue and Its Resolution

The Kaitchup – AI on a Budget • 159 implied HN points • 21 Oct 24

🕹 Technology AI Machine Learning Data science Model Training Computing

Gradient accumulation helps train large models on limited GPU memory. It simulates larger batch sizes by summing gradients from several smaller batches before updating model weights.
There has been a problem with how gradients were summed during gradient accumulation, leading to worse model performance. This was due to incorrect normalization in the calculation of loss, especially when varying sequence lengths were involved.
Hugging Face and Unsloth AI have fixed the gradient accumulation issue. With this fix, training results are more consistent and effective, which might improve the performance of future models built using this technique.
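
The normalization problem is easy to reproduce with plain numbers. The Python sketch below uses made-up per-token losses in two micro-batches of different lengths; the function names are hypothetical and the code is not taken from Hugging Face or Unsloth. It compares the loss of one large batch with two ways of accumulating over micro-batches.

```python
# Made-up per-token losses for two micro-batches with different token counts.
micro_batches = [
    [2.0, 2.0, 2.0, 2.0, 2.0, 2.0],  # 6 tokens
    [8.0, 8.0],                      # 2 tokens
]

def full_batch_loss(batches):
    """Reference: average over every token, as if it were one large batch."""
    tokens = [loss for batch in batches for loss in batch]
    return sum(tokens) / len(tokens)

def naive_accumulated_loss(batches):
    """Average each micro-batch on its own, then average those averages."""
    return sum(sum(b) / len(b) for b in batches) / len(batches)

def fixed_accumulated_loss(batches):
    """Divide every micro-batch sum by the total token count across all of them."""
    total_tokens = sum(len(b) for b in batches)
    return sum(sum(b) / total_tokens for b in batches)

print("full batch:", full_batch_loss(micro_batches))
print("naive accumulation:", naive_accumulated_loss(micro_batches))
print("fixed accumulation:", fixed_accumulated_loss(micro_batches))
```

The naive version gives the short micro-batch the same weight as the long one, so its tokens count three times as much as they should. Because gradients are linear in the loss, the same mismatch carries over to the accumulated gradient.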

Grok 3 Beta in Shambles

Marcus on AI • 10750 implied HN points • 19 Feb 25

🕹 Technology Artificial Intelligence Machine Learning Computing Data science Software Development

The new Grok 3 AI isn't living up to its hype. It initially answers some questions correctly but quickly starts making mistakes.
When tested, Grok 3 struggles with basic facts and leaves out important details, like missing cities in geographical queries.
Even with huge investments in AI, many problems remain unsolved, suggesting that scaling alone isn't the answer to improving AI performance.

DeepSeek v3: The Six Million Dollar Model

Don't Worry About the Vase • 2777 implied HN points • 31 Dec 24

🕹 Technology AI Models Machine Learning Data science Computing Tech industry

DeepSeek v3 is a powerful and cost-effective AI model with a good balance between performance and price. It can compete with top models but might not always outperform them.
The model has a unique structure that allows it to run efficiently with fewer active parameters. However, this optimization can lead to challenges in performance across various tasks.
Reports suggest that while DeepSeek v3 is impressive in some areas, it still falls short in aspects like instruction following and output diversity compared to competitors.

AI #97: 4

Don't Worry About the Vase • 2419 implied HN points • 02 Jan 25

🕹 Technology AI Machine Learning Data science Automation Software Development

AI is becoming more common in everyday tasks, helping people manage their lives better. For example, using AI to analyze mood data can lead to better mental health tips.
As AI technology advances, there are concerns about job displacement. Jobs in fields like science and engineering may change significantly as AI takes over routine tasks.
The shift of AI companies from non-profit to for-profit models could change how AI is developed and used. It raises questions about safety, governance, and the mission of these organizations.

AI #98: World Ends With Six Word Story

Don't Worry About the Vase • 1881 implied HN points • 09 Jan 25

🕹 Technology Artificial Intelligence Machine Learning Data science Automation Digital Transformation

AI can help with many useful tasks, but many people still don't see its value or know how to use it effectively. It's important to change that mindset.
Companies are realizing that fixed subscription prices for AI services might not be sustainable because usage varies greatly among users.
Many folks are worried about AI despite not fully understanding it. It's crucial to communicate AI's potential benefits and reduce fears around job loss and other concerns.

I spent 6 hours learning how Apache Spark plans the execution for us

VuTrinh. • 659 implied HN points • 10 Sep 24

🕹 Technology Data science Software Engineering Big Data Cloud Computing Machine Learning

Apache Spark uses a system called Catalyst to plan and optimize how data is processed. This system helps make sure that queries run as efficiently as possible.
In Spark 3, a feature called Adaptive Query Execution (AQE) was added. It allows Spark to change its plan while a query is running, based on statistics collected at runtime.
Airbnb uses this AQE feature to improve how they handle large amounts of data. This lets them dynamically adjust the way data is processed, which leads to better performance.