The hottest data science Substack posts right now

And their main takeaways
The Healthtech Initiative 0 implied HN points 13 Dec 24
  1. Vadim Fedotov turned his experience as a basketball player and entrepreneur into a passion for health optimization. He realized that traditional medicine often focuses on treating illness rather than promoting better health.
  2. His company, Bioniq, offers personalized health solutions based on data and user feedback. The goal is to create effective supplements that meet individual needs without unnecessary complexity.
  3. Vadim highlighted the importance of focusing on the first 1,000 customers who believe in your product. These early advocates can be crucial for a startup's success and help build a strong community.
Squirrel Squadron Substack 0 implied HN points 17 Dec 24
  1. Graphs can help visualize motion and speed, making concepts like calculus easier to understand. It's fun to relate math to real-life activities, like driving a car.
  2. Machine learning improves by tweaking weights to reduce errors, similar to adjusting software for better performance. It's like steering a computer program to make it better.
  3. To build successful software, focus on small, frequent changes and measure how well they improve things. This method can lead to big wins in product development.
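As a rough illustration of "tweaking weights to reduce errors", here is a minimal gradient-descent sketch that fits a single weight. The function name, data, and learning rate are illustrative assumptions, not taken from the post.

```python
def fit_weight(xs, ys, lr=0.01, steps=500):
    """Fit y ~ w * x by repeatedly nudging w against the error gradient."""
    w = 0.0
    n = len(xs)
    for _ in range(steps):
        # Gradient of the mean squared error with respect to w.
        grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / n
        # Small, frequent change: step a little in the direction that lowers the error.
        w -= lr * grad
    return w


# Toy data generated from y = 2 * x (an assumption for illustration).
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]
w = fit_weight(xs, ys)
print(f"learned weight: {w:.4f}")
```

Each pass measures the error and makes one small correction, which mirrors the "small changes, then measure" loop the post recommends for software.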
Squirrel Squadron Substack 0 implied HN points 17 Dec 24
  1. When looking at CVs, it's important to see what candidates did and why it mattered. Focus on real impact instead of fancy buzzwords.
  2. Many candidates use vague phrases that sound good but don't tell you anything meaningful. Look for specific results they achieved and how they benefited customers.
  3. A strong CV should show clear business results, like increasing sales or cutting costs. If it doesn’t do that, it might not be worth considering.
philsiarri 0 implied HN points 26 Dec 24
  1. OpenAI's new o3 AI model scored 85% on the ARC-AGI benchmark, which suggests it can solve some unfamiliar problems about as well as a human. This score is higher than the previous best AI score of 55%.
  2. The ARC-AGI test checks how well an AI can handle new challenges using little information, which is important for general intelligence. This breakthrough raises questions about how close AI is to being as smart as humans.
  3. Although the o3 model shows great promise, there are still doubts. Not enough details have been shared, and scientists want to test it more to see how well it can adapt in different situations.
What The Heck 0 implied HN points 15 Jan 25
  1. An algorithm can help guide LLM reasoning to generate correct answers more often. It uses a method similar to Monte Carlo Tree Search to improve outcomes.
  2. By sampling different reasoning steps and keeping track of which ones lead to correct answers, we can better inform the LLMs on how to approach problems.
  3. Having a feedback model to suggest better reasoning steps can enhance the overall performance of LLMs, making them more effective in generating accurate answers.
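The idea of sampling reasoning steps and tracking which ones lead to correct answers can be sketched as a flat Monte Carlo estimate. This is much simpler than a full tree search and is not the post's algorithm. The step names, the success rates, and the `rollout` callback are hypothetical stand-ins for "run the LLM from this step and grade the final answer".

```python
import random
from collections import defaultdict


def estimate_step_values(candidate_steps, rollout, n_samples=300, seed=0):
    """Estimate how often each candidate first step leads to a correct answer."""
    rng = random.Random(seed)
    visits = defaultdict(int)
    wins = defaultdict(int)
    for i in range(n_samples):
        # Cycle through the candidates so each one gets an equal share of rollouts.
        step = candidate_steps[i % len(candidate_steps)]
        visits[step] += 1
        if rollout(step, rng):
            wins[step] += 1
    # Empirical success rate per step.
    return {step: wins[step] / visits[step] for step in visits}


# Hypothetical success rates; a real system would call the model and a grader.
TOY_SUCCESS = {"restate the problem": 0.5, "work backwards": 0.9, "guess": 0.1}


def toy_rollout(step, rng):
    return rng.random() < TOY_SUCCESS[step]


values = estimate_step_values(list(TOY_SUCCESS), toy_rollout)
best_step = max(values, key=values.get)
print(values, best_step)
```

The resulting success rates are the kind of signal a feedback model could use to suggest better reasoning steps.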
domsteil 0 implied HN points 27 Jan 25
  1. Intelligence grows through a system of rewards and lessons learned over time. It’s not just about finding the one right answer but refining our understanding step by step.
  2. Using principles like blame and reward helps us learn better, whether it's cooking, driving lessons, or training AI. This process shows us how to improve and adapt in different situations.
  3. AI can become more flexible and powerful by training with specific tasks. By experimenting and learning from mistakes, we can develop smarter AI systems that can tackle a variety of tasks.
RSS DS+AI Section 0 implied HN points 18 Jan 25
  1. The next newsletter for AI and Data Science will come out in early February. It’s a good chance to stay updated.
  2. You can contribute to the newsletter if you have announcements, meetups, or jobs to share. Just reach out directly instead of replying to the email.
  3. Make sure to send your contributions to the specified email to ensure they are included.
Nano Thoughts 0 implied HN points 27 Jan 25
  1. AI can struggle with memorization instead of understanding, similar to how students might remember specific math problems without grasping the general concept. When AI memorizes examples too closely, it can't apply knowledge to new situations.
  2. Techniques like regularization help AI focus on important patterns rather than get lost in details. This is like training athletes under various conditions to build real skills instead of just practicing one way.
  3. Understanding how to forget unimportant information is crucial for both AI and human intelligence. The best learning doesn't come from remembering everything, but from knowing which patterns are worth keeping.
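One common form of regularization is an L2 (ridge) penalty, which shrinks weights toward zero so the model leans less on any single detail. The sketch below uses the one-weight closed form. The data and penalty values are assumptions for illustration, and the post may have other techniques in mind.

```python
def ridge_weight(xs, ys, lam):
    """Closed-form weight minimizing sum((w*x - y)^2) + lam * w^2."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    # The penalty lam enlarges the denominator, pulling w toward zero.
    return sxy / (sxx + lam)


xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]

plain = ridge_weight(xs, ys, lam=0.0)    # no regularization
shrunk = ridge_weight(xs, ys, lam=30.0)  # strong regularization
print(plain, shrunk)
```

With no penalty the weight fits the toy data exactly, while a large penalty deliberately underfits it. In practice the penalty strength is tuned on held-out data.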
The Parlour 0 implied HN points 19 Feb 25
  1. Using data from US corporate bond holdings can help predict credit risk better than traditional ratings. It means more real-time information for making investment decisions.
  2. A new investment strategy called Betting Against Bad Beta is introduced. This strategy aims to improve how investors can bet against stocks with poor performance.
  3. Machine learning is becoming more important in finance, especially for analysis and predicting risks. This technology helps make smarter investment choices.
Gonzo ML 0 implied HN points 24 Feb 25
  1. Researchers successfully created AI agents that can simulate 1,052 real people with about 85% accuracy. This means the AI can closely mimic how real people would respond in various situations.
  2. The study highlights the importance of interviews over surveys, as they provide deeper insights into people’s behaviors and thoughts, allowing the AI to generate better follow-up questions and responses.
  3. These AI agents have potential uses in social science research. They could help predict public reactions to policy changes or simulate behavioral responses, leading to new methods of understanding human decision-making.
Gonzo ML 0 implied HN points 12 Feb 25
  1. A new model called s1-32B was created by using a small dataset of 1,000 question-answer pairs focused on reasoning. This cost about $25 to train, which is quite affordable.
  2. Controlling how much the model thinks at test time improves performance. The authors used a strategy called budget forcing to keep the amount of generated reasoning within a chosen budget.
  3. This approach showed that it's possible to achieve high-quality results with less data and resources, suggesting a promising path for future AI developments.
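A minimal sketch of what budget forcing could look like, assuming it works by cutting thinking off at a maximum length and by replacing a premature stop with a nudge word. The `next_token` callback, the `</think>` delimiter, and the `Wait` nudge are assumptions for illustration and are not taken from the post.

```python
def budget_forced_thinking(next_token, min_tokens, max_tokens,
                           end="</think>", nudge="Wait"):
    """Generate thinking tokens while enforcing a lower and an upper budget."""
    tokens = []
    while len(tokens) < max_tokens:
        tok = next_token(tokens)
        if tok == end:
            if len(tokens) >= min_tokens:
                break  # enough thinking: accept the stop
            tok = nudge  # too early: swap the stop for a nudge to keep going
        tokens.append(tok)
    # Always close the thinking section, even when the upper budget cut it off.
    return tokens + [end]


# Hypothetical toy "models": one stops immediately, one never stops.
eager = budget_forced_thinking(lambda toks: "</think>", min_tokens=3, max_tokens=10)
rambling = budget_forced_thinking(lambda toks: "x", min_tokens=0, max_tokens=5)
print(eager)
print(rambling)
```

The eager toy model is pushed to produce at least `min_tokens` tokens, while the rambling one is cut off at `max_tokens`.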
ppdispatch 0 implied HN points 06 Jun 25
  1. Reasoning Gym offers new ways to train models so they can get better at logic and math. It's like a gym for AI where they can practice and improve their skills.
  2. New techniques are helping us understand how large language models work in finance. This makes it easier to spot problems and ensure they follow rules.
  3. Research shows that language models like GPT memorize data before they start to understand it better. They can store a certain amount of information before they have to generalize.
RSS DS+AI Section 0 implied HN points 09 Jun 25
  1. There's an online talk about federated learning happening this Wednesday at 4 PM. It's a great chance to learn from experts in the field.
  2. The talk will explain how federated learning is different from traditional analysis. You'll find out what it means for the future of data science.
  3. Participants will also discuss the challenges of federated analytics and how it works today. It's a good opportunity to think about new possibilities in data analysis.
Expand Mapping with Mike Morrow 0 implied HN points 14 Jul 25
  1. You can choose how SQL query results are stored in Hex, either in memory or in the database. This affects how quickly you can run follow-up queries.
  2. There are two types of SQL commands in Hex: one that queries directly from the database and another that queries from a local in-memory dataframe. This choice can impact how your data is used.
  3. Hex allows you to chain SQL queries, which makes handling complex tasks easier. However, you need to be aware of where each query pulls data from to avoid surprises.
RSS DS+AI Section 0 implied HN points 01 Dec 25
  1. Data science and AI are constantly evolving, with new technologies and tools emerging regularly. Keeping up with these changes is important for anyone interested in the field.
  2. Ethics in AI is a major topic right now. It's essential to discuss bias, regulation, and the moral implications of using AI in our lives.
  3. There are many opportunities to get involved in data science communities, whether through volunteering or participating in discussions. Joining these groups can help shape the future of data science.
The Parlour 0 implied HN points 04 Dec 25
  1. Open-source satellite imagery can be used to create a global census of residential buildings to better measure climate risk and its impacts on housing and financial stability.
  2. Recent quantitative research is applying remote sensing and data-driven techniques to map built environments and inform climate and risk modeling.
  3. Full articles and curated analyses are often behind a subscription paywall, but short free trials can give temporary access to the full archives.
Brad DeLong's Grasping Reality 0 implied HN points 02 Jan 26
  1. The course is a quantitative, long-run global economic history class that teaches data-science literacy (including Python) to analyze population and income trends.
  2. Grades are intentionally generous but contingent on showing up, doing pre-class work, and participating—skip or zone out and you lose that privilege.
  3. Expect weekly short writing assignments, background readings, small data exercises, and optional Thursday Zoom sessions, with all logistics and materials posted on the course site.
The Healthtech Initiative 0 implied HN points 02 Mar 26
  1. Small, autonomous teams that own their entire stack unlocked velocity and scale, while splitting functions (like mobile and backend) slowed delivery.
  2. Only use AI when it truly outperforms simple rules: reserve models for cycle prediction, symptom analysis, and personalization, and fine-tune them on women’s health data to reduce bias and improve safety.
  3. Build the core competitive advantage (the health AI and data flywheel) and buy everything else, using wearable time-series models to proactively predict conditions and power growth.