The hottest data science Substack posts right now

And their main takeaways
Rémi Ounadjela 6 implied HN points 06 Aug 25
  1. AI can be useful in many areas of a company, but it's important to choose the right tools carefully. Think about the problems you want to solve first.
  2. There are different levels of AI tools, ranging from basic productivity helpers to complex systems that can perform tasks on their own. Each level comes with its own benefits and risks.
  3. As you use more advanced AI tools, remember that higher risks come with higher rewards. Make sure to set up good guardrails and track how well things are working.
The Future Does Not Fit In The Containers Of The Past 20 implied HN points 15 Dec 24
  1. Data is important, but focusing too much on it can harm the long-term success of both businesses and people. It's crucial to balance numbers with human emotions and culture.
  2. Leaders should encourage open discussions about tough topics and avoid wasting time in unnecessary meetings. This helps create a culture where everyone feels comfortable sharing their thoughts.
  3. Successful companies need to remember that their employees are not just numbers. Investing in their development and well-being leads to a more motivated and productive workforce.
LatchBio 15 implied HN points 27 Feb 25
  1. Spatial RNA technology helps us see how cells interact in their natural environment. It gives a clearer picture than traditional methods, which show gene activity without showing where in the tissue it occurs.
  2. There are many ways to capture and analyze spatial gene data, like using specially barcoded slides or microfluidic methods. Each approach has its pros and cons depending on what researchers want to study.
  3. Advancements in technology are making it possible to analyze tiny details, like individual cells or even parts of cells. This opens new doors for understanding biology and diseases.
Jay's Data Stream 23 implied HN points 30 Oct 24
  1. The concert ticket market is built on false pricing, where tickets are initially sold below their actual market value. This means people often pay much more on resale markets.
  2. Making money by reselling tickets is much harder than it seems. Success requires understanding a lot about the market and using technology to navigate tough ticketing systems.
  3. Creating a startup in this space is complicated and needs more than just good ideas. It's about having the right infrastructure to turn those ideas into profitable actions.
RSS DS+AI Section 53 implied HN points 31 Dec 23
  1. The focus for the year was 'Effective and Efficient Data Science' to highlight the critical aspects of the field beyond hype.
  2. Various events and discussions were held throughout the year to promote best practices in Data Science.
  3. Engagement with the community through events, surveys, and articles was emphasized to ensure diverse voices are heard in influencing policy.
Gradient Flow 59 implied HN points 27 Jan 22
  1. The role of 'machine learning engineer' has emerged as a key position for implementing data science in production, bridging the gap between data products and machine learning models.
  2. Geographically, machine learning engineers are distributed across various regions, with companies and industries in different locations employing them.
  3. Advances in computer hardware design, coupled with improvements in models and algorithms, are expected to significantly enhance model training efficiency.
RSS DS+AI Section 17 implied HN points 01 Jan 25
  1. Data science and AI are rapidly evolving fields, with 2024 being a particularly exciting year for advancements. As we move into 2025, the trends and stories from last year will continue to shape the future.
  2. Ethics in AI is a crucial topic that remains relevant, especially around issues like bias and safety. The way AI is developed and used needs careful consideration to align with human interests.
  3. There are many practical applications and resources available for learning about data science and AI. From tutorials to real-world examples, there are plenty of opportunities to get involved and apply AI technologies.
The Palindrome 1 implied HN point 23 Dec 25
  1. The most-read posts emphasize math and foundational CS for machine learning, covering topics like a mathematics roadmap, algorithmic analysis, graph theory, and practical skills such as coding on paper and representing graphs.
  2. A holiday promotion offers a 30% lifetime discount on the annual paid subscription, which unlocks paid-only content and helps fund more math and machine learning material for the community.
  3. Subscriber-count milestones will unlock community perks (mini-courses, a dedicated Manim animator, and a full-time writer), and the publication invites feedback while planning to expand and reinvest in 2026.
Decoding Coding 19 implied HN points 30 Mar 23
  1. Zero-shot prompting lets a model answer questions without examples. It's useful when there's no data to guide the model.
  2. Few-shot prompting gives the model a few examples to improve its answers. This helps the model understand the context better.
  3. Chain-of-thought prompting breaks down complex problems into steps. It helps the model reason through tasks more effectively.
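
The three prompting styles in the takeaways above differ only in how the prompt text is assembled, which a small sketch can make concrete. The function names, the `Q:`/`A:` layout, and the "Let's think step by step" cue below are illustrative assumptions, not taken from the original post, and no model is called.

```python
from typing import List, Tuple


def zero_shot(question: str) -> str:
    """Zero-shot: the question alone, with no worked examples."""
    return f"Q: {question}\nA:"


def few_shot(question: str, examples: List[Tuple[str, str]]) -> str:
    """Few-shot: prepend a handful of (question, answer) pairs as context."""
    shots = "\n\n".join(f"Q: {q}\nA: {a}" for q, a in examples)
    return f"{shots}\n\nQ: {question}\nA:"


def chain_of_thought(question: str) -> str:
    """Chain-of-thought: ask the model to reason in steps before answering."""
    return f"Q: {question}\nA: Let's think step by step."


if __name__ == "__main__":
    demo_examples = [("2 + 2?", "4"), ("3 + 5?", "8")]  # hypothetical examples
    print(zero_shot("7 + 6?"))
    print("---")
    print(few_shot("7 + 6?", demo_examples))
    print("---")
    print(chain_of_thought("7 + 6?"))
```

The resulting strings would then be passed to whichever model API is in use; that step is left out here because it depends on the provider.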
The Counterfactual 1 HN point 08 Jul 24
  1. Mechanistic interpretability helps us understand how large language models (LLMs) like ChatGPT work, breaking down their 'black box' nature. This understanding is important because we need to predict and control their behavior.
  2. Different research methods, like classifier probes and activation patching, are used to explore how components in LLMs contribute to their predictions. These techniques help researchers pinpoint which parts of the model are responsible for specific tasks.
  3. There's a growing interest in this field, as researchers believe that knowing more about LLMs can lead to safer and more effective AI systems. Understanding how they work can help prevent issues like bias and deception.
Sector 6 | The Newsletter of AIM 19 implied HN points 04 Apr 23
  1. Vicuna-13B, a new model based on Meta's LLaMA, was recently launched by a team of researchers from UC Berkeley, CMU, Stanford, and UC San Diego. It was created at a very low cost compared to similar models.
  2. Stanford University's Alpaca was another recent launch based on LLaMA, also developed affordably. It shows that advanced AI can be accessible to more people now.
  3. The new chatbot using Vicuna-13B is performing really well, matching ChatGPT and Bard in quality. It's also beating many other models in most tests, showing its high capability.
serious web3 analysis 26 implied HN points 15 Aug 24
  1. FetchFox is an AI-powered Chrome extension that makes web scraping easy for everyone, even if you can't code. Just a few clicks allow you to gather useful data from any website.
  2. Traditional web scraping requires programming skills and can be time-consuming. FetchFox simplifies the process, letting anyone scrape data in minutes rather than hours.
  3. FetchFox is designed to work like a human visitor, which helps it avoid being blocked by websites. This means it can extract data more effectively than traditional methods.
Year 2049 15 implied HN points 16 Jan 25
  1. AI comes in different types, and it's good to know what they are. Understanding the types helps us see how AI works in our daily lives.
  2. Machines learn to become intelligent over time, which is fascinating. This process is important to understand how AI evolves.
  3. It's helpful to share knowledge about AI with others. Teaching friends and family can make everyone more aware of how AI impacts us.
Vesuvius Challenge 14 implied HN points 23 Jan 25
  1. Community members contributed a lot to the Vesuvius Challenge, earning prizes for their work. This shows how teamwork can lead to great progress!
  2. Some projects focused on improving how we visualize 3D scrolls and extracting data from images. These tools could really help researchers understand ancient texts better.
  3. Awards are given for various types of contributions, encouraging creativity and technical skills. It’s exciting to see different approaches being recognized in the community.
Data Science Weekly Newsletter 19 implied HN points 04 May 23
  1. There's a Slack group for those who subscribe to Data Science Weekly. It's a great place to connect and learn together.
  2. The invite link for the Slack group is exclusive to paid subscribers, so make sure to keep it private.
  3. The group aims to help members interact, learn, and support each other in the field of data science.
Nick’s Substack 1 HN point 03 Jul 24
  1. Sparse autoencoders are tools that help us understand how language models work by breaking down their process into simpler parts. They help identify important features in the model that contribute to its outputs.
  2. The idea of sparsity means only a few features are needed to describe something, while superposition lets a lot of different features exist in a small space. This makes learning and processing more efficient for the model.
  3. Using sparse autoencoders opens up new ways to interact with language models. Instead of just inputting text and getting answers, we can manipulate features and explore the model's internal workings more creatively.
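
As a rough illustration of the sparsity idea in the takeaways above, here is a minimal sketch of a sparse autoencoder's forward pass in plain Python. The weights, the L1 coefficient, and the function names are hypothetical and hand-picked for readability; a real sparse autoencoder learns its weights from a language model's activations, which this sketch does not attempt.

```python
from typing import List, Tuple

Vector = List[float]
Matrix = List[List[float]]


def matvec(m: Matrix, v: Vector) -> Vector:
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, v)) for row in m]


def relu(v: Vector) -> Vector:
    """Zero out negative entries; this is what leaves most features inactive."""
    return [max(0.0, x) for x in v]


def sae_forward(
    x: Vector, w_enc: Matrix, b_enc: Vector, w_dec: Matrix, l1_coeff: float = 0.01
) -> Tuple[Vector, Vector, float]:
    """Encode x into more features than it has dimensions, then decode it.

    Returns (features, reconstruction, loss), where the loss is the
    reconstruction error plus an L1 penalty that rewards sparse features.
    """
    pre_activation = [a + b for a, b in zip(matvec(w_enc, x), b_enc)]
    features = relu(pre_activation)
    reconstruction = matvec(w_dec, features)
    mse = sum((r - t) ** 2 for r, t in zip(reconstruction, x)) / len(x)
    l1_penalty = l1_coeff * sum(abs(f) for f in features)
    return features, reconstruction, mse + l1_penalty


# Hypothetical hand-set weights: a 2-dimensional input mapped to 4 features.
W_ENC = [[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0], [0.0, -1.0]]
B_ENC = [0.0, 0.0, 0.0, 0.0]
W_DEC = [[1.0, -1.0, 0.0, 0.0], [0.0, 0.0, 1.0, -1.0]]

if __name__ == "__main__":
    features, reconstruction, loss = sae_forward([2.0, -3.0], W_ENC, B_ENC, W_DEC)
    print(features, reconstruction, loss)
```

With these weights only two of the four features are active for the input `[2.0, -3.0]`, yet the input is reconstructed exactly. Manipulating a feature, as the third takeaway describes, would correspond to editing an entry of `features` before decoding.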
Sector 6 | The Newsletter of AIM 19 implied HN points 01 Mar 23
  1. ChatGPT has performed well in various exams, including MBA and medical tests, showing that it can answer many questions correctly.
  2. However, when tested on the UPSC Prelims, ChatGPT only answered 54 out of 100 questions correctly, demonstrating its limitations.
  3. This highlights that while AI can be smart, it might still struggle with complex and diverse challenges like tough civil service exams.
Sector 6 | The Newsletter of AIM 19 implied HN points 21 Feb 23
  1. Indian IT companies failed to automate their operations before the pandemic, but now they have a new chance with advanced AI tools. This could help them become more innovative and efficient.
  2. The introduction of large language models, like ChatGPT, could improve how IT companies operate and serve their customers. There's a lot of potential for better efficiency.
  3. Experts believe that using AI in IT could change many processes for the better, making companies more focused on customer needs and improving their overall performance.
Sector 6 | The Newsletter of AIM 39 implied HN points 06 Jun 22
  1. Edtech companies like BYJU'S and upGrad are buying smaller firms to strengthen their position in data science education. This shows a trend of growth and consolidation in the industry.
  2. Traditional training institutions like NIIT and Aptech are struggling to keep up with these changes. They seem to be losing relevance in the fast-paced education market.
  3. BYJU'S made a big impact last year by acquiring ten companies for $2.5 billion. This highlights the scale of investment happening in the education sector, particularly in data science.
do clouds feel vertigo? 19 implied HN points 20 Mar 23
  1. AI training costs are dropping significantly, which makes it easier for more people to create their own AI models.
  2. AI models can become more common and even borrowed from others, which leads to questions about ownership and competition.
  3. Companies now face a choice between buying AI capabilities or building their own, affecting how they manage privacy and efficiency.
Year 2049 13 implied HN points 17 Jan 25
  1. AI systems learn from data, so the quality of that data is really important. Better data means smarter machines.
  2. Machines can become biased if they are trained on biased data. It's important to watch out for this when developing AI.
  3. This is just one part of a series explaining AI. More episodes will cover different aspects of how machines learn and behave.
The Product Channel By Sid Saladi 16 implied HN points 17 Nov 24
  1. Large language models (LLMs) are special AI systems that understand and generate human language. They can do things like summarize texts, translate languages, and even write code.
  2. LLMs are changing many industries by powering chatbots, helping create content, and giving personalized product recommendations. This makes services smarter and more helpful.
  3. Building custom LLMs requires a lot of money and data. Companies must invest millions and gather vast amounts of information to develop effective models.
The Palindrome 5 implied HN points 05 Jul 25
  1. There are many ways to get into machine learning. You don't need to follow strict rules or have a specific background.
  2. You can start with just basic math skills. High school math is enough to begin your journey in machine learning.
  3. Whether you want to be a generalist or a specialist in machine learning, both paths are valid. Choose what fits your goals best.
The Product Channel By Sid Saladi 16 implied HN points 10 Nov 24
  1. AI is changing how products are made and used. Product managers need to understand AI to stay ahead in their industry.
  2. There are many AI applications, like chatbots and recommendation systems, that can improve user experience. Learning about these tools can help product managers create better products.
  3. While AI has benefits, it also brings risks like bias and job losses. It's important for product managers to think about these issues and apply AI responsibly.
Am I Stronger Yet? 15 implied HN points 12 Nov 24
  1. AI is making rapid progress, but it is not close to achieving artificial general intelligence (AGI). Many tasks still require human capabilities, showing that there is still a long way to go.
  2. Current AIs excel at specific tasks but struggle with complex, nuanced tasks that require extensive context or emotional intelligence, like managing a classroom or writing a novel.
  3. While there are exciting advancements happening with AI, the journey towards true intelligence is more like crossing a vast ocean than a quick sprint, suggesting that there are many challenges ahead.
AI Brews 12 implied HN points 10 Jan 25
  1. Stability AI has released a new tool called Stable Point Aware 3D, which lets you edit 3D objects from just one image really quickly. It's free to use for everyone.
  2. Microsoft has made its Phi-4 model open-source and introduced rStar-Math, a new technique that improves math solving in smaller language models.
  3. Qwen Chat is a new web app allowing users to interact with various Qwen models, making it easy to compare their capabilities all in one place.
Steve Kirsch's newsletter 6 implied HN points 18 May 25
  1. The KCOR method is a new, simple technique to analyze how different interventions, like vaccines, affect outcomes such as mortality. It uses basic data like date of birth, date of death, and vaccination date to provide clear results.
  2. The analysis suggests that COVID vaccines may have increased mortality rates, indicating the vaccines could be more harmful than helpful. This counters many previous claims about the vaccines saving lives.
  3. KCOR is designed to be objective and straightforward, allowing for accurate comparisons without needing complex data adjustments, making it a powerful tool for understanding health interventions.
Never Met a Science 55 implied HN points 31 May 23
  1. TikTok's algorithm shapes content creators' behavior based on feedback and viral success.
  2. The algorithm aims to keep both creators and consumers engaged, but risks leading to repetitive content.
  3. Data science and algorithms in platforms like TikTok create simplified simulations of reality for optimization, focusing on subjective metrics.
Jake Ward's Blog 2 HN points 30 Apr 24
  1. Large language models like ChatGPT have complex, learned logic that is difficult to interpret due to 'superposition' - where single neurons correspond to multiple functions.
  2. Techniques like sparse dictionary learning can decompose artificial neurons into 'features' that exhibit 'monosemanticity', making the models more interpretable.
  3. Reproducing research on model interpretability shows promise for breakthroughs and indicates a shift towards engineering challenges over scientific barriers.
RSS DS+AI Section 35 implied HN points 02 Jan 24
  1. Continuing work on expanding accreditation for data science professionals
  2. Hot topics include bias, ethics, and regulation in data science and AI
  3. Exciting developments in research, generative AI, and real world applications
AI Brews 15 implied HN points 08 Nov 24
  1. Tencent has released Hunyuan-Large, a powerful AI model with lots of parameters that can outperform some existing models. It's good news for open-source projects in AI.
  2. Decart and Etched introduced Oasis, a unique AI that can generate open-world games in real-time. It uses keyboard and mouse inputs instead of just text to create gameplay.
  3. Microsoft's Magentic-One is a new system that helps solve complex tasks online. It's aimed at improving how we manage jobs across different domains.
Intuitive AI 19 implied HN points 22 Aug 24
  1. Tech companies are paying a lot for training data because it helps them improve their AI models. As AI use grows, high-quality data has become very valuable.
  2. Having diverse and rich training data is crucial for AI to learn well. Just like a student needs various books to understand different subjects, AI needs various data to perform better.
  3. Quality of the data matters even more than quantity. Rich, informative data leads to better AI outcomes, which is why companies are willing to spend big bucks on it.
Arkid’s Newsletter 17 HN points 30 Sep 24
  1. AI and machine learning are creating a lot of hype, but it's important to separate the noise from the real value. Just like in the dot-com boom, there will be winners, but it won't be easy to find them.
  2. Many companies are wasting money on consultants who offer little help without delivering real results. To succeed in AI, businesses need to focus on building intelligent products that can learn and iterate based on user feedback.
  3. There's concern about AI taking over jobs in software and machine learning, but skilled professionals will still be needed. It’s crucial for entry-level workers to build solid expertise in their field and adapt to new developments in AI.
HackerPulse Dispatch 5 implied HN points 20 Jun 25
  1. Language models can now learn on their own by creating their own training data, which means they get better without needing human help.
  2. There are new benchmarks to measure how well models understand music, making it easier to compare their performance on different tasks.
  3. A new method allows for better code translation between different programming languages, outpacing older systems in speed and accuracy.
Counting Stuff 54 implied HN points 02 May 23
  1. Teams are often created to fill niche use cases, leading to specialized roles and organizational politics.
  2. Being type-cast into a specific role can limit opportunities for growth and variety in work tasks.
  3. To break out of being type-cast, showcase your ability to do different kinds of work and actively seek out diverse opportunities.