The hottest Data Science Substack posts right now

And their main takeaways
ChinaTalk 2075 implied HN points 28 Jan 25
  1. DeepSeek is gaining attention in the AI community for its strong performance and efficient use of computing power. Many believe it showcases China’s growing capabilities in AI technology.
  2. The culture at DeepSeek focuses on innovation without immediate monetization, emphasizing the importance of young talent in AI advancements. This approach has differentiated them from larger tech firms.
  3. Despite initial success, there are still concerns about the long-term sustainability of AI business models. Demand for computing power is high, and no single company has enough capacity to meet future needs.
TheSequence 119 implied HN points 16 May 25
  1. Leaderboards in AI help direct research by showing who is doing well, but they can also create problems. They might not show the whole picture of how models really perform.
  2. The Chatbot Arena is a way to judge AI models based on user choices, but it has issues that make it unfair. Some big labs can take advantage of the system more than smaller ones.
  3. To make AI evaluations better, there need to be rules that ensure fairness and transparency. This way, everyone gets a fair chance in the AI race.
Vesuvius Challenge 9 implied HN points 13 Jun 25
  1. The Vesuvius Challenge team is improving their tools for handling scroll data. They're making it easier for people to process large datasets without needing advanced tech skills.
  2. Philip Allgaier made significant updates to the VC3D tool, including fixing memory issues and making it easier to install and use. This will help users have a smoother experience.
  3. New features like freehand drawing and better options for data analysis have been added, which will boost productivity for those working with the VC3D tool.
Data Science Weekly Newsletter 139 implied HN points 05 Sep 24
  1. AI prompt engineering is becoming more important, and experts share helpful tips on how to improve your skill in this area.
  2. Researchers in AI should focus on making an impact through their work by creating open-source resources and better benchmarks.
  3. Data quality is a common concern in many organizations, yet many leaders struggle to prioritize it properly and invest in solutions.
Data Science Weekly Newsletter 179 implied HN points 29 Aug 24
  1. Distributed systems are changing a lot. This affects how we operate and program these systems, making them more secure and easier to manage.
  2. Statistics matter in everyday life, even if we don't notice them. Talks this year aim to inspire students to understand and appreciate statistics better.
  3. Understanding how AI models work internally is a growing field. Many AI systems are complex, and researchers want to learn how they make decisions and produce outputs.
The Data Ecosystem 659 implied HN points 14 Jul 24
  1. Data modeling is like a blueprint for organizing information. It helps people and machines understand data, making it easier for businesses to make decisions.
  2. There are different types of data models, including conceptual, logical, and physical models. Each type serves a specific purpose and helps bridge business needs with data organization (a toy sketch of the logical/physical split follows this list).
  3. Not having a structured data model can lead to confusion and problems. It's important for organizations to invest in good data modeling to improve data quality and business outcomes.
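To make the second point concrete, here is a toy sketch (ours, not the post's) of the logical/physical split: a Python dataclass standing in for the logical model and a SQL DDL string for the physical one. The conceptual model would simply be the plain-language statement "customers place orders".

```python
# Toy illustration of data-model layers (not from the post).
from dataclasses import dataclass

@dataclass
class Order:           # logical model: what an "order" is and how it relates
    order_id: int
    customer_id: int   # references a Customer entity
    total_cents: int   # money as integer cents -- already a design decision

# Physical model: the same entity bound to one database's types and constraints.
ORDERS_DDL = """
CREATE TABLE orders (
    order_id    BIGINT PRIMARY KEY,
    customer_id BIGINT NOT NULL REFERENCES customers (customer_id),
    total_cents BIGINT NOT NULL
);
"""
```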
benn.substack 1534 implied HN points 31 Jan 25
  1. DeepSeek's rapid impact shows that new AI models can quickly disrupt industries. It proves that creating advanced AI is no longer just for big companies with lots of resources.
  2. Consumers want more than just better technology; they want a range of AI tools that can do different tasks and integrate with their daily lives. People are looking for a single place to access various AI models.
  3. The rise of many unique AI models means we don't know how they will change our world. Just as social media transformed society in unexpected ways, AI could lead to surprising new possibilities and challenges.
Democratizing Automation 1717 implied HN points 21 Jan 25
  1. DeepSeek R1 is a new reasoning language model that can be used openly by researchers and companies. This opens up opportunities for faster improvements in AI reasoning.
  2. The training process for DeepSeek R1 included four main stages, emphasizing reinforcement learning to enhance reasoning skills. This approach could lead to better performance in solving complex problems.
  3. Price competition in reasoning models is heating up, with DeepSeek R1 offering lower rates compared to existing options like OpenAI's model. This could make advanced AI more accessible and encourage further innovations.
DYNOMIGHT INTERNET NEWSLETTER 796 implied HN points 21 Nov 24
  1. LLMs like `gpt-3.5-turbo-instruct` can play chess well, but most other models struggle. Using specific prompts can improve their performance.
  2. Providing legal moves to LLMs can actually confuse them. Instead, repeating the game before making a move helps them make better decisions (see the prompt sketch after this list).
  3. Fine-tuning and giving examples both improve chess performance for LLMs, but combining them may not always yield the best results.
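A minimal sketch of the "repeat the game" prompting trick, using the legacy OpenAI completions API; the header names and exact formatting are our assumption, not dynomight's verbatim prompt.

```python
# Sketch: present the game as a PGN-style transcript and let the model
# continue it. Ending the prompt mid-movelist invites the model to emit
# its next move as a continuation. Assumes OPENAI_API_KEY is set.
from openai import OpenAI

client = OpenAI()

moves = "1. e4 e5 2. Nf3 Nc6 3. Bb5 a6 4. Ba4 Nf6 5. O-O"
prompt = (
    '[White "Magnus Carlsen"]\n'   # hypothetical headers; strong player
    '[Black "Garry Kasparov"]\n'   # names are a common prompting trick
    '[Result "*"]\n\n'
    f"{moves}"
)

resp = client.completions.create(
    model="gpt-3.5-turbo-instruct",
    prompt=prompt,
    max_tokens=6,     # room for one move, e.g. " Be7"
    temperature=0.0,
)
print(resp.choices[0].text)
```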
Democratizing Automation 1535 implied HN points 28 Jan 25
  1. Reasoning models are designed to break down complex problems into smaller steps, helping them solve tasks more accurately, especially in coding and math. This approach makes it easier for the models to manage difficult questions.
  2. As reasoning models develop, they show promise in various areas beyond their initial focus, including creative tasks and safety-related situations. This flexibility allows them to perform better in a wider range of applications.
  3. Future reasoning models will likely not be perfect for every task but will improve over time. Users may pay more for models that deliver better performance, making them more valuable in many sectors.
The Algorithmic Bridge 424 implied HN points 23 Dec 24
  1. OpenAI's new model, o3, has demonstrated impressive abilities in math, coding, and science, surpassing even specialists. This is a rare and significant leap in AI capability.
  2. There are many questions about the implications of o3, including its impact on jobs and AI accessibility. Understanding these questions is crucial for navigating the future of AI.
  3. The landscape of AI is shifting, with some competitors likely to catch up, while many will struggle. It's important to stay informed to see where things are headed.
Exploring Language Models 3942 implied HN points 19 Feb 24
  1. Mamba is a new modeling technique that aims to improve language processing by using state space models instead of the traditional transformer approach. It focuses on keeping essential information while being efficient in handling sequences.
  2. Unlike transformers, Mamba allows for selective attention, meaning it can choose which parts of the input to focus on. This makes it potentially better at retaining context and relevant information (a toy version of the selective recurrence follows this list).
  3. The architecture of Mamba is designed to be hardware-friendly, helping it to perform well without excessive resource use. It uses techniques like kernel fusion and recomputation to optimize speed and memory use.
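For intuition, here is a deliberately naive loop over the selective state-space recurrence. All shapes and initializations are our own toy choices, and the real model fuses this scan into a single hardware kernel rather than running it step by step in Python.

```python
# Toy selective SSM: B, C, and the step size delta are computed from the
# input itself, which is what lets the model "choose" what to keep.
import numpy as np

rng = np.random.default_rng(0)
d_model, d_state, seq_len = 4, 8, 16

A = -np.exp(rng.standard_normal((d_model, d_state)))       # stable (negative) dynamics
W_B, W_C = rng.standard_normal((2, d_model, d_state)) * 0.1
W_delta = rng.standard_normal(d_model) * 0.1

x = rng.standard_normal((seq_len, d_model))
h = np.zeros((d_model, d_state))
ys = []
for x_t in x:
    # Input-dependent ("selective") parameters for this step.
    delta = np.log1p(np.exp(x_t * W_delta))[:, None]        # softplus, per channel
    B_t = x_t @ W_B
    C_t = x_t @ W_C
    # Discretize and step the linear state-space model: h' = A_bar * h + B_bar * x.
    A_bar = np.exp(delta * A)
    h = A_bar * h + delta * B_t[None, :] * x_t[:, None]
    ys.append((h * C_t[None, :]).sum(axis=1))               # readout y_t = C h
y = np.stack(ys)                                            # (seq_len, d_model)
```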
AI Research & Strategy 297 implied HN points 01 Sep 24
  1. People often find AI research ideas by reading papers, talking to experts, or browsing online platforms like Twitter and GitHub. These are effective ways to spark inspiration.
  2. There are various strategies for generating AI research ideas, such as inventing new tasks, improving existing methods, or exploring gaps in current research. Each approach can lead to publishing valuable findings.
  3. Building better AI research assistants can involve encoding these idea-generation strategies into their programming. This could make them more effective in supporting researchers.
The Kaitchup – AI on a Budget 79 implied HN points 03 Oct 24
  1. Gradient checkpointing can cut memory usage during fine-tuning of large language models by up to 70%. This matters because activation memory is hard to manage with big models (a minimal sketch of enabling it follows this list).
  2. Activations, which must be stored for the backward pass, can account for over 90% of the memory needed for training; they are required to compute the gradients that update the model's weights.
  3. Even though gradient checkpointing helps save memory, it might slow down training a bit since some activations need to be recalculated. It's a trade-off to consider when choosing methods for model training.
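Enabling this in Hugging Face Transformers is a one-liner; a minimal sketch, with the model name as a placeholder:

```python
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("gpt2")  # placeholder; any causal LM works
model.gradient_checkpointing_enable()  # drop activations, recompute them in backward
model.config.use_cache = False         # the generation cache conflicts with checkpointing
```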
Am I Stronger Yet? 313 implied HN points 27 Dec 24
  1. Large Language Models (LLMs) like o3 are becoming better at solving complex math and coding problems, showing impressive performance compared to human competitors. They can tackle hard tasks with many attempts, which is different from how humans might solve them.
  2. Despite their advances, LLMs struggle with tasks that require visual reasoning or creativity. They often fail to understand spatial relationships in images because they process information in a linear way, making it hard to work with visual puzzles.
  3. LLMs rely heavily on the knowledge stored in their weights and lack access to external, real-world information. When they gain access to more external tools, their performance could improve significantly, potentially changing how they solve various problems.
AI: A Guide for Thinking Humans 344 implied HN points 23 Dec 24
  1. OpenAI's new model, o3, showed impressive results on tough reasoning tasks, achieving accuracy levels that could compete with human performance. This signals significant advancements in AI's ability to reason and adapt.
  2. The ARC benchmark tests how well machines can recognize and apply abstract rules, but recent results suggest some solutions may rely more on extensive compute than true understanding. This raises questions about whether AI is genuinely learning abstract reasoning.
  3. As AI continues to improve, the ARC benchmark may need updates to push its limits further. New features could include more complex tasks and better ways to measure how well AI can generalize its learning to new situations.
Engineering Ideas 39 implied HN points 12 Oct 24
  1. Not all AI technologies are harmful. Some can help produce good knowledge that supports a sustainable future, while others might exploit flaws in society.
  2. Good knowledge helps connect and understand well-being, which is crucial for a sustainable civilization. It's important to have interconnected knowledge about all moral patients.
  3. AI capabilities that promote this interconnected knowledge are likely beneficial. However, there's a risk of technology dehumanizing society if not handled carefully.
The Algorithmic Bridge 573 implied HN points 22 Nov 24
  1. OpenAI has spent a lot of money trying to fix an issue with counting the letter R in the word 'strawberry.' This problem has caused a lot of confusion among users.
  2. The CEO of OpenAI thinks the problem is silly but feels it's important to address because users are concerned. They are also looking into redesigning how their models handle letter counting.
  3. Some employees joked about extreme solutions like eliminating red fruits to avoid the R issue. They are also thinking of patches to improve letter counting, but it's clear they have more work to do.
benn.substack 1713 implied HN points 13 Dec 24
  1. Getting good at something often just takes a little focused effort over time. Many people don't actively try to improve, so they stay at a decent skill level rather than reaching their full potential.
  2. In fields like data analytics, it's essential to specialize to truly excel. Being a generalist might keep you busy, but it can lead to a career without a clear direction or growth.
  3. To stand out and achieve more in their careers, people need to identify a specific area of expertise and commit to it. Relying on being 'good at data' isn't usually enough to make a significant impact.
The Algorithmic Bridge 647 implied HN points 11 Nov 24
  1. AI companies are hitting limits with current models. Simply making AI bigger isn't creating better results like it used to.
  2. The upcoming models, like Orion, may not meet the high expectations set by previous versions. Users want more dramatic improvements and are getting frustrated.
  3. A new approach in AI may focus on real-time thinking, allowing models to give better answers by taking a bit more time, though this could test users' patience.
Gonzo ML 63 implied HN points 31 Jan 25
  1. Not every layer in a neural network is equally important. Some layers play a bigger role in getting the right results, while others have less impact.
  2. Studying how information travels through different layers can reveal interesting patterns. It turns out layers often work together to make sense of data, rather than just acting alone.
  3. Using methods like mechanistic interpretability can help us understand neural networks better. By looking closely at what's happening inside the model, we can learn which parts are doing what.
Data Science Weekly Newsletter 219 implied HN points 08 Aug 24
  1. Camera calibration is crucial in sports analysis. It helps track players' movements accurately by mapping video frame positions to real field locations.
  2. Understanding the context of data is important for responsible data work. Datasets need good documentation and stories to highlight their historical and social backgrounds.
  3. There's a new, free encyclopedia for learning about cognitive science. It offers easy-to-read articles on various topics for students and researchers.
Data Science Weekly Newsletter 139 implied HN points 22 Aug 24
  1. When building web applications, using Postgres for data storage is a good default choice. It's reliable and widely used.
  2. A new study shows that agents can learn useful skills without rewards or guidance. They can explore and develop abilities just from observing a goal.
  3. The list of important books and resources in Bayesian statistics is being compiled. It's a way to recognize influential ideas in this field.
Gonzo ML 189 implied HN points 04 Jan 25
  1. The Large Concept Model (LCM) aims to improve how we understand and process language by operating on concepts instead of individual words. This means thinking at a higher level about the ideas and meanings being conveyed (a schematic sketch follows this list).
  2. LCM uses a system called SONAR to convert sentences into a stable representation that can be processed and then translated back into different languages or forms without losing the original meaning. This creates flexibility in how we communicate.
  3. This approach can handle long documents more efficiently because it represents ideas as concepts, making processing easier. This could improve applications like summarization and translation, making them more effective.
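Schematically, the pipeline looks like the sketch below, with stand-in components throughout: the random encoder is a placeholder for a SONAR-style sentence encoder, and the small transformer stands in for the concept-level model.

```python
# Toy "concept-level" modeling: encode each sentence to one fixed vector,
# then model the sequence of sentence vectors instead of tokens.
import torch
import torch.nn as nn

d_concept = 256

def encode_sentence(sentence: str) -> torch.Tensor:
    # Placeholder for a real sentence encoder: any map sentence -> vector.
    torch.manual_seed(hash(sentence) % (2**31))
    return torch.randn(d_concept)

sentences = ["Vesuvius erupted.", "The scrolls burned.", "They are being read."]
concepts = torch.stack([encode_sentence(s) for s in sentences])  # (3, d_concept)

# A small causal transformer over the sequence of concept vectors.
layer = nn.TransformerEncoderLayer(d_model=d_concept, nhead=4, batch_first=True)
concept_lm = nn.TransformerEncoder(layer, num_layers=2)

mask = nn.Transformer.generate_square_subsequent_mask(len(sentences))
next_concepts = concept_lm(concepts.unsqueeze(0), mask=mask)  # causal over sentences
```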
Basta’s Notes 122 implied HN points 13 Jan 25
  1. Machine learning models are good at spotting patterns that humans might miss. This means they can make predictions and organize data in ways that are impressive and often very useful.
  2. However, machine learning can struggle with unclear or messy data. This fuzziness can lead to mistakes, like misidentifying objects or giving unexpected results.
  3. Not every problem needs a machine learning solution; simpler methods are often more effective. It's important to think carefully about whether machine learning is truly the best tool for the job.
Data Science Weekly Newsletter 219 implied HN points 01 Aug 24
  1. Data science and AI are rapidly evolving fields with plenty of interesting developments. Staying updated with the latest articles and news can really help you understand these changes better.
  2. Effective communication is key in data science. Using intuitive methods and visuals can make complex concepts easier to grasp for everyone.
  3. Using tools and methods like quantization can help make large models more accessible. It's important to find efficient ways to work with vast amounts of data to improve performance.
Gonzo ML 63 implied HN points 27 Jan 25
  1. Transformer^2 uses a new method for adapting language models that makes it simpler and more efficient than fine-tuning. Instead of retraining the whole model, it adjusts specific parts, which saves time and resources.
  2. The approach breaks down weight matrices through Singular Value Decomposition (SVD), allowing the model to identify and enhance its existing strengths for various tasks (see the sketch after this list).
  3. At test time, Transformer^2 can adapt to new tasks in two passes, first assessing the situation and then applying the best adjustments. This method shows improvements over existing techniques like LoRA in both performance and parameter efficiency.
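The core of the SVD trick fits in a few lines; a sketch with assumed shapes, not the paper's code: decompose a frozen weight matrix once, then train only a per-singular-value scaling vector for each task.

```python
import torch

W = torch.randn(768, 768)                    # a frozen pretrained weight matrix
U, S, Vh = torch.linalg.svd(W, full_matrices=False)

z = torch.nn.Parameter(torch.ones_like(S))   # one learned scalar per singular value

def adapted_forward(x: torch.Tensor) -> torch.Tensor:
    # W' = U diag(S * z) V^T -- only z is trained, so the task-specific
    # parameter count is just min(d_in, d_out), far below a LoRA adapter.
    W_adapted = U @ torch.diag(S * z) @ Vh
    return x @ W_adapted.T

y = adapted_forward(torch.randn(2, 768))     # (2, 768)
```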
Data Science Weekly Newsletter 139 implied HN points 15 Aug 24
  1. The Turing Test raises questions about what it means for a computer to think, suggesting that if a computer behaves like a human, we might consider it intelligent too.
  2. Creating a multimodal language model involves understanding different components like transformers, attention mechanisms, and learning techniques, which are essential for advanced AI systems.
  3. A recent study tested if astrologers can really analyze people's lives using astrology, addressing the ongoing debate about the legitimacy of astrology among the public.
The Algorithmic Bridge 318 implied HN points 07 Dec 24
  1. OpenAI's new model, o1, is not AGI; it's just another step in AI development that might not lead us closer to true general intelligence.
  2. AGI should have consistent intelligence across tasks, unlike current AI, which can sometimes perform poorly on simple tasks and excel on complex ones.
  3. As we approach AGI, we might feel smaller or less significant, reflecting how humans will react to advanced AI like o1, even if it isn’t AGI itself.
Gonzo ML 378 implied HN points 26 Nov 24
  1. The new NNX API is set to replace the older Linen API for building neural networks with JAX. It simplifies the coding process and offers better performance options (a minimal NNX module is sketched after this list).
  2. The shard_map feature improves multi-device computation by allowing better handling of data. It’s a helpful evolution for developers looking for precise control over their parallel computing tasks.
  3. Pallas is a new JAX tool that lets users write custom kernels for GPUs and TPUs. This allows for more specialized and efficient computation, particularly for advanced tasks like training large models.
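A minimal NNX module, following the pattern in the Flax documentation (assuming a recent flax release): unlike Linen's functional params dict, NNX modules hold their state directly as Python attributes.

```python
import jax
import jax.numpy as jnp
from flax import nnx

class MLP(nnx.Module):
    def __init__(self, din: int, dhidden: int, dout: int, *, rngs: nnx.Rngs):
        self.fc1 = nnx.Linear(din, dhidden, rngs=rngs)
        self.fc2 = nnx.Linear(dhidden, dout, rngs=rngs)

    def __call__(self, x: jax.Array) -> jax.Array:
        return self.fc2(jax.nn.relu(self.fc1(x)))

model = MLP(4, 32, 2, rngs=nnx.Rngs(0))  # parameters live on the object itself
y = model(jnp.ones((1, 4)))              # call it like a normal function
```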
The Algorithmic Bridge 329 implied HN points 05 Dec 24
  1. OpenAI has launched a new AI model called o1, which is designed to think and reason better than previous models. It can now solve questions more accurately and is faster at responding to simpler problems.
  2. ChatGPT Pro is a new subscription tier that costs $200 a month. It provides unlimited access to advanced models and special features, although it might not be worth it for average users.
  3. o1 is not just focused on math and coding; it's also designed for everyday tasks like writing. OpenAI claims it's safer and more compliant with their policies than earlier models.
Confessions of a Code Addict 529 implied HN points 29 Oct 24
  1. Clustering algorithms can never be perfect and always require trade-offs. You can't have everything, so you have to choose what matters most for your project.
  2. There are three key properties that clustering should ideally have: scale-invariance, richness, and consistency, but no algorithm can achieve all three simultaneously (a small demonstration follows this list).
  3. Understanding these sacrifices helps in making better decisions when using clustering methods. Knowing what to prioritize can lead to more effective data analysis.
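One horn of that impossibility result is easy to demonstrate: cutting a dendrogram at a fixed distance threshold satisfies richness and consistency but not scale-invariance, so merely rescaling the data changes the answer. An illustrative experiment (ours, not the post's):

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage

rng = np.random.default_rng(42)
# Two well-separated blobs of 10 points each.
X = np.vstack([rng.normal(0, 0.3, (10, 2)), rng.normal(3, 0.3, (10, 2))])

def threshold_clusters(points: np.ndarray, t: float) -> np.ndarray:
    Z = linkage(points, method="single")
    return fcluster(Z, t=t, criterion="distance")   # cut at a fixed distance

print(len(set(threshold_clusters(X, t=1.0))))       # 2 clusters
print(len(set(threshold_clusters(X * 10, t=1.0))))  # many clusters after rescaling
```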
Democratizing Automation 554 implied HN points 18 Feb 25
  1. Grok 3 is a new AI model that's designed to compete with existing top models. It aims to improve quickly, with updates happening daily.
  2. There's increasing competition in the AI field, which is pushing companies to release their models faster, leading to more powerful AI becoming available to users sooner.
  3. Current evaluations of AI models might not be very practical or useful for everyday life. It's important for companies to share more about their evaluation processes to help users understand AI advancements.
Vesuvius Challenge 38 implied HN points 23 May 25
  1. New techniques for analyzing scroll shapes are improving the way we handle and segment data. This means we can understand and work with historical documents much better.
  2. There have been exciting updates in scroll deformation methods, which can help in restoring the original shapes of ancient scrolls. This makes analyzing them easier and more accurate.
  3. The new developments in fiber analysis provide important information that can help reconstruct ancient writing surfaces. This can lead to better ways to unroll and study papyrus materials.
Gonzo ML 441 implied HN points 09 Nov 24
  1. Diffusion models and evolutionary algorithms both involve changing data over time through processes like selection and mutation, which can lead to new and improved results.
  2. The new algorithm called Diffusion Evolution can find multiple good solutions at once, unlike traditional methods that often focus on one single best solution.
  3. There are exciting connections between learning and evolution, hinting that they may fundamentally operate in similar ways, which opens up many questions about future AI developments.
LLMs for Engineers 120 HN points 15 Aug 24
  1. Using latent space techniques can improve the accuracy of evaluations for AI applications without requiring a lot of human feedback. This approach saves time and resources.
  2. Latent space readout (LSR) helps in detecting issues like hallucinations in AI outputs by allowing users to adjust the sensitivity of detection. This means it can catch more errors if needed, even if that results in some false alarms (an illustrative probe is sketched after this list).
  3. Creating customized evaluation rubrics for AI applications is essential. By gathering targeted feedback from users, developers can create more effective evaluation systems that align with specific needs.
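As a rough illustration of the readout idea, here is a linear probe on synthetic "hidden states" with an adjustable decision threshold; the post's actual method and features are more involved.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
d = 64                                   # hidden-state dimensionality (assumed)
H_ok = rng.normal(0.0, 1.0, (200, d))    # activations from grounded outputs
H_bad = rng.normal(0.4, 1.0, (200, d))   # activations from hallucinated outputs

X = np.vstack([H_ok, H_bad])
y = np.array([0] * 200 + [1] * 200)

probe = LogisticRegression(max_iter=1000).fit(X, y)

# Lowering the threshold below 0.5 catches more hallucinations at the cost
# of more false positives -- the sensitivity knob described above.
scores = probe.predict_proba(X)[:, 1]
print((scores > 0.5).sum(), (scores > 0.2).sum())
```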
SeattleDataGuy’s Newsletter 494 implied HN points 19 Feb 25
  1. Always focus on the real problem behind a request, not just what is being asked. This helps you deliver better solutions that actually meet the business needs.
  2. Using clear frameworks can help organize your thoughts and make complex investigations easier. A structured approach leads to clearer communication and better results.
  3. Keep your communication simple and focused on what matters to your stakeholders. This helps everyone stay on the same page and reduces confusion.
TheSequence 28 implied HN points 20 May 25
  1. Multimodal benchmarks are tools to evaluate AI systems that use different types of data like text, images, and audio. They help ensure that AI can handle complex tasks that combine these inputs effectively.
  2. One important benchmark in this area is MMMU, which tests AI on 11,500 questions across various subjects. It requires models to reason over text and visuals together, promoting deeper understanding rather than shortcut solutions.
  3. The design of these benchmarks, like MMMU, helps reveal how well AI understands different topics and where it may struggle. This can lead to improvements in AI technology.
TheSequence 546 implied HN points 26 Jan 25
  1. DeepSeek-R1 is a new AI model that shows it can perform as well or better than big-name AI models but at a much lower cost. This means smaller companies can now compete in AI innovation without needing huge budgets.
  2. The way DeepSeek-R1 is trained departs from traditional methods: it relies heavily on reinforcement learning, which helps the model develop reasoning skills without needing large amounts of supervised data.
  3. The open-source nature of DeepSeek-R1 means anyone can access and use the code for free. This encourages collaboration and allows more people to innovate in AI, making technology more accessible to everyone.
Monthly Python Data Engineering 179 implied HN points 25 Jul 24
  1. The Python Data Engineering newsletter focuses on key updates and tools for building data engineering projects, rather than just data science.
  2. This month showcased rapid development in projects like Narwhals and Polars, with Narwhals making 26 releases and Polars reaching version 1.0.0 (a tiny Polars example follows this list).
  3. Several other libraries, such as Great Tables and Dask, also had important updates, making it a busy month for Python data engineering tools.
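For flavor, a tiny illustrative example of the Polars 1.0 expression API mentioned above:

```python
import polars as pl

df = pl.DataFrame({"user": ["a", "b", "a"], "amount": [10, 25, 5]})
summary = (
    df.group_by("user")                              # lazy-style expressions,
      .agg(pl.col("amount").sum().alias("total"))    # evaluated in Rust
      .sort("total", descending=True)
)
print(summary)
```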