The hottest Data Analysis Substack posts right now

And their main takeaways
Cremieux Recueil • 477 implied HN points • 25 Mar 26
  1. Researchers often use between-person comparisons that aren’t causally informative even when within-person or sibling designs are possible, so their estimates can be biased by unmeasured confounders.
  2. When you run within-family or within-person analyses, many headline associations (for example, claims that more social media use lowers cognition) disappear, suggesting those original results were artifacts of confounding.
  3. The field routinely skips basic robustness checks and measurement-invariance tests; empowering methodologists, providing better tools, and enforcing stricter editorial standards would greatly improve research reliability.
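The contrast between between-person and within-family estimates can be sketched with a small simulation. This is an illustrative toy, not the post's analysis: the data-generating model, the two-sibling families, and the names `simulate` and `ols_slope` are all assumptions. The exposure has no true effect on the outcome; a shared family-level confounder drives both.

```python
import random
import statistics


def ols_slope(x, y):
    """Slope of the least-squares line of y on x."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def simulate(n_families=2000, seed=0):
    """Return (between-person slope, within-family slope) for toy data."""
    rng = random.Random(seed)
    x_all, y_all, x_within, y_within = [], [], [], []
    for _ in range(n_families):
        u = rng.gauss(0, 1)  # unmeasured confounder shared by both siblings
        xs = [u + rng.gauss(0, 1) for _ in range(2)]  # exposure
        ys = [u + rng.gauss(0, 1) for _ in range(2)]  # outcome; x has no true effect
        x_all += xs
        y_all += ys
        # Subtracting the family mean removes everything siblings share.
        mx, my = sum(xs) / 2, sum(ys) / 2
        x_within += [v - mx for v in xs]
        y_within += [v - my for v in ys]
    return ols_slope(x_all, y_all), ols_slope(x_within, y_within)


between, within = simulate()
print(f"between-person slope: {between:.2f}")
print(f"within-family slope:  {within:.2f}")
```

In this toy setup the between-person slope is clearly positive even though the true effect is zero, while the within-family slope sits near zero, because demeaning within families cancels the shared confounder.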
HackerNews blogs newsletter • 59 implied HN points • 02 Nov 24
  1. Measuring technical debt is crucial for leaders, especially CTOs. It helps in understanding and managing the challenges in software development.
  2. Freezing CEO salaries during layoffs can create a fairer work environment. It shows accountability and may protect jobs for regular employees.
  3. Life shouldn't solely be based on statistics. Everyone's experiences are unique and can't be fully represented by numbers.
Astral Codex Ten • 19959 implied HN points • 24 Feb 26
  1. There are two deceptive moves to watch for: using related-but-different facts to dismiss real complaints (the malicious streetlight effect) and overstating results to be “directionally correct” when the evidence doesn’t support it.
  2. Accurate counting matters — major crime has generally fallen, and explanations like reporting bias or better medical care don’t fully negate that trend, so it’s important to correct false claims about crime rates.
  3. Fixing misleading crime claims can feel like dismissing people’s everyday experiences of disorder, so it’s best to treat major crime statistics and local disorder (e.g., open-air drug markets, tent encampments) as separate issues and address each directly.
Steve Kirsch's newsletter • 4 implied HN points • 13 Mar 26
  1. A statistical analysis of several Australian regions found excess deaths began right after COVID vaccine rollouts, and the timing and age patterns are said to not match the official explanations.
  2. Analyses of other national records claim there was no clear mortality or hospitalization benefit from the vaccines, and frailty-matched comparisons reportedly show similar death rates for vaccinated and unvaccinated groups.
  3. Public health authorities and official reports largely avoided treating vaccines as a possible cause or quantifying lives saved or lost, while only a few officials publicly raised these concerns.
Knowingless • 1566 implied HN points • 12 Mar 26
  1. Scales are groups of survey items found with factor analysis that let you measure hidden traits efficiently, but they need lots of questions and many respondents to be reliable, and metrics like Cronbach’s alpha can be gamed by redundant items.
  2. Which items you include strongly shapes what factors you find, so a narrow or biased question set will miss whole traits; crowdsourcing a huge swath of questions can reveal unexpected dimensions but doesn’t eliminate sampling or submission bias.
  3. When you open up question-space widely, the biggest stable dimensions that tend to pop out are political left–right, belief/mysticism versus rationality, and a happy-versus-sad emotional axis, with many smaller subfactors depending on how finely you break the data.
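The point about redundant items inflating Cronbach's alpha can be checked directly. A minimal sketch using the standard formula alpha = k/(k-1) * (1 - sum of item variances / variance of total score); the response matrix is made up for illustration and `cronbach_alpha` is a hypothetical helper, not code from the post.

```python
import statistics


def cronbach_alpha(rows):
    """Cronbach's alpha; rows holds one list of item scores per respondent."""
    k = len(rows[0])
    items = list(zip(*rows))  # transpose: one tuple of scores per item
    item_var_sum = sum(statistics.variance(col) for col in items)
    total_var = statistics.variance([sum(r) for r in rows])
    return k / (k - 1) * (1 - item_var_sum / total_var)


# Hypothetical answers from six respondents to three survey items.
responses = [
    [4, 3, 5],
    [2, 3, 1],
    [5, 4, 4],
    [1, 2, 3],
    [3, 5, 2],
    [4, 4, 5],
]

# "Pad" the scale by asking every item twice (perfectly redundant items).
padded = [row + row for row in responses]

alpha_original = cronbach_alpha(responses)
alpha_padded = cronbach_alpha(padded)
print(f"alpha, 3 items:           {alpha_original:.2f}")
print(f"alpha, 3 items + 3 dupes: {alpha_padded:.2f}")
```

The padded scale measures nothing new, yet its alpha is higher, which is why alpha alone is a weak indicator of scale quality.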
arg min • 436 implied HN points • 24 Oct 24
  1. Statistical tests are designed to help separate real signals from random noise. It's not just about understanding what they mean, but what they can do in practical situations.
  2. Many people misuse statistical tests, which can lead to misunderstandings about their purpose. Communities should establish clear guidelines on how to use these tests correctly.
  3. The main function of statistical tests is to regulate opinions and decisions in various fields like tech and medicine. They help ensure that important standards are met, rather than just preventing errors.
Construction Physics • 21504 implied HN points • 11 Dec 25
  1. Many countries, especially in Western Europe, have improved construction productivity over the years, but the US has seen a decline since the 1970s.
  2. Since the 1990s, some Eastern European and Latin American countries have shown productivity growth, but many wealthy countries, including those with advanced technologies like Japan and Sweden, have flat or declining productivity.
  3. Belgium stands out as a nation with consistent construction productivity growth, but it's unclear if this is due to real efficiency gains or just how the data is reported.
arg min • 734 implied HN points • 14 Oct 24
  1. Statistics should help us test claims by measuring how surprising the results are. However, there's doubt about whether our current statistical tests actually do this well.
  2. Randomized trials are important because they help us learn about treatments that may not always work. They focus on safety as much as they do on finding effective solutions.
  3. The field of statistics needs to be clear about its purpose. We should distinguish between using statistics for proving theories and for practical decision-making like quality control.
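One common way to put a number on "how surprising" a result is: compute the probability of seeing something at least as extreme if the claim under test were true. The sketch below does this for a hypothetical example (60 heads in 100 tosses of a coin claimed to be fair); the scenario and the helper `upper_tail` are assumptions for illustration, not taken from the post.

```python
from math import comb


def upper_tail(n, k, p=0.5):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Hypothetical data: 60 heads in 100 tosses, tested against a fair coin.
p_value = upper_tail(100, 60)
print(f"P(at least 60 heads | fair coin) = {p_value:.4f}")
```

A small tail probability says the data would be surprising under the claim; whether that number should drive a decision is exactly the question the posts raise.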
Conspirador Norteño • 28 implied HN points • 22 Mar 26
  1. Buying followers is common on TikTok, with accounts openly advertising follower sales and often showing thousands of suspicious followers.
  2. Fake follower networks show clear patterns — identical or machine-like usernames, few or no real posts, following many accounts but having few followers, and reused or AI-generated profile images — which make them relatively easy to spot.
  3. SMM panels sell massive follower packages and offer APIs to automate orders, so these fake networks can scale quickly; buying followers is a poor investment and just fuels the problem.
arg min • 634 implied HN points • 10 Oct 24
  1. Statistics often involves optimizing methods to get the best results. Many statistical techniques can actually be viewed as optimization problems.
  2. Choosing a statistical method isn't just about the math—it's also based on beliefs about reality. This philosophical side is important but often overlooked.
  3. There's a danger in relying too much on tools and models we can solve. Sometimes, we force the data to fit our preferred methods instead of being open to the actual complexities.
Handy AI • 19 implied HN points • 29 Oct 24
  1. ChatGPT performed better than Claude in analyzing a Spotify dataset, providing accurate insights without errors and displaying clear visualizations.
  2. Claude encountered issues with text extraction and made mistakes in data interpretation, like incorrectly assigning genre labels where they didn't exist in the dataset.
  3. Overall, ChatGPT offered a smoother user experience, allowing users to follow along with the analysis while Claude's process was less straightforward.
Odds and Ends of History • 2010 implied HN points • 27 Jan 26
  1. Pre-sale ticketing at Vue across 878 screenings (70,765 seats) shows just 1,160 bookings: roughly 1.6% of available seats overall, with the average screening about 1.8% full.
  2. Most bookings are concentrated in the opening weekend with sales trailing off sharply after, indicating limited broader interest.
  3. Some seat markings may be system quirks or reserved wheelchair seats, so the true number sold could be even lower, and overall the film looks unlikely to be a UK box-office hit.
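The headline percentages can be rechecked from the three totals quoted above. The second half of the sketch uses two made-up screenings to show why an average of per-screening fill rates need not equal the pooled rate; that is one possible reason the 1.6% and 1.8% figures differ, offered as an assumption rather than something the post states.

```python
# Totals quoted in the summary.
bookings = 1_160
seats = 70_765
screenings = 878

pooled_fill = bookings / seats  # share of all seats that were booked
bookings_per_screening = bookings / screenings
seats_per_screening = seats / screenings

print(f"pooled fill rate:       {pooled_fill:.2%}")
print(f"bookings per screening: {bookings_per_screening:.2f}")
print(f"seats per screening:    {seats_per_screening:.1f}")

# Hypothetical screenings as (booked, capacity) pairs, illustration only.
toy = [(2, 50), (1, 150)]
toy_pooled = sum(b for b, _ in toy) / sum(c for _, c in toy)
toy_mean_of_rates = sum(b / c for b, c in toy) / len(toy)

print(f"toy pooled rate:   {toy_pooled:.2%}")
print(f"toy mean of rates: {toy_mean_of_rates:.2%}")
```

The mean of rates gives a small screen the same weight as a large one, so it drifts above the pooled rate whenever smaller screens are relatively fuller.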
Don't Worry About the Vase • 4211 implied HN points • 24 Nov 25
  1. Gemini 3 Pro is really smart and performs well in many tasks, especially when you want accurate answers. It's great for creative writing and technical tasks.
  2. However, it often makes up answers instead of admitting it doesn't know something. This can lead to confusion and mistakes.
  3. While it's fast and efficient in many respects, it sometimes lacks depth and may over-simplify complex problems, making its outputs less trustworthy.
beyondrevenueoperations • 19 implied HN points • 27 Oct 24
  1. Combining SQL and Python makes data management much easier. SQL helps you access and pull data, while Python helps analyze it and create reports.
  2. Using SQL, you can break down data silos from different systems to get a complete view of your customers and performance. This is crucial for making smart, data-driven decisions.
  3. With Python, you can automate tasks, build predictive models, and visualize data, which saves time and enhances your ability to understand trends and insights.
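The division of labor described above (SQL to pull and join data, Python to summarize it) can be sketched with the standard-library `sqlite3` module. The two tables stand in for separate systems; their names, columns, and rows are hypothetical.

```python
import sqlite3
from collections import defaultdict

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Two hypothetical 'silos': a CRM table and a billing table.
    CREATE TABLE crm_accounts (account_id INTEGER PRIMARY KEY, name TEXT, segment TEXT);
    CREATE TABLE billing_invoices (account_id INTEGER, amount REAL);
    INSERT INTO crm_accounts VALUES
        (1, 'Acme', 'enterprise'), (2, 'Birch', 'smb'), (3, 'Cedar', 'smb');
    INSERT INTO billing_invoices VALUES
        (1, 1200.0), (1, 800.0), (2, 150.0), (3, 250.0);
    """
)

# SQL step: join the two sources and total revenue per account.
rows = conn.execute(
    """
    SELECT a.segment, a.name, SUM(i.amount) AS revenue
    FROM crm_accounts AS a
    JOIN billing_invoices AS i ON i.account_id = a.account_id
    GROUP BY a.account_id, a.segment, a.name
    ORDER BY a.account_id
    """
).fetchall()
conn.close()

# Python step: roll the joined rows up into a per-segment report.
revenue_by_segment = defaultdict(float)
for segment, _name, revenue in rows:
    revenue_by_segment[segment] += revenue

for segment, revenue in sorted(revenue_by_segment.items()):
    print(f"{segment}: {revenue:.2f}")
```

In practice the query would run against a real warehouse connection and the Python step might feed a chart or a model, but the split between the two stages stays the same.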
ChinaTalk • 770 implied HN points • 26 Jan 26
  1. Claude Code is excellent at writing code and analyzing clean, structured data, so tasks like scraping, sentiment analysis, and extracting insights become fast and practical. It produces usable results and handles internet slang and comment-level nuance well.
  2. When left to search the web on its own, it leans on the most accessible sources and can cite unreliable outlets or make factual mistakes, especially when paywalled reputable sources are unavailable. It needs explicit instructions on where to look and close supervision to ensure source quality.
  3. The tool is popular with developers and non-technical users who value its productivity, but access barriers and subscription costs limit broader use. Effective results require careful prompting, oversight, and feeding it original or vetted data.
Marcus on AI • 12370 implied HN points • 10 Jul 25
  1. A new study shows that AI coding tools might actually slow down experienced developers instead of speeding them up. The developers thought these tools would make them faster, but the reality was quite the opposite.
  2. Developers expected a 24% increase in their speed with AI tools, but were in fact 19% slower when using them. This is surprising and suggests that the benefits of using AI for coding may not be as great as believed.
  3. The study focused on experienced developers with complex projects, so AI tools could still be helpful for beginners or simpler tasks. Time will tell if this trend changes in the future.
After Babel • 448 implied HN points • 05 Feb 26
  1. A free, research-informed toolkit gives schools ready-made surveys and measures to track how phone policies affect students, teachers, administrators, and parents.
  2. It works for both single-school evaluations and large, rigorous studies—Qualtrics formats and optional collaboration with the Stanford Social Media Lab support longitudinal tracking and advanced analysis.
  3. The toolkit adds practical analysis help (a manual scoring guide, a customizable survey builder, and a coming Data Dashboard), but it doesn’t by itself establish definitive causality without stronger study designs.
Marcus on AI • 7825 implied HN points • 09 Jul 25
  1. Generative AI has shown some progress in handling specific prompts, which is a win for some, but it doesn't mean it has mastered complex tasks like compositionality. Success on easy tasks doesn't prove overall ability.
  2. There are still many cases where AI fails at tasks that involve understanding parts and wholes, suggesting that its understanding is not as robust as claimed.
  3. Judging the AI's overall capabilities based on a few successes can be misleading; it's important to look at a broader range of performance to get a realistic picture.
Silver Bulletin • 379 implied HN points • 04 Feb 26
  1. Democrats hold a modest lead of about D +5.5 on the generic congressional ballot, up from roughly D +3 between June and November.
  2. Individual polls vary a lot — results this week ranged from about D +1 to D +9 — but the average smooths those swings and weights polls by pollster quality, sample size, recency, and frequency while preferring likely-voter samples.
  3. Many of the polls in the average were conducted before the Jan. 24 killing of Alex Pretti, so subsequent public reaction could push the generic ballot further toward Democrats, and paid subscribers can access state benchmarks and historical generic-ballot averages back to 1994.
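A weighted polling average of the kind described can be sketched as follows. Everything specific here is an assumption for illustration: the four polls are invented (only the D +1 to D +9 range echoes the summary), and the weighting function (pollster rating times square root of sample size times an exponential recency decay) is a toy, not Silver Bulletin's actual formula. Poll frequency and likely-voter adjustments are left out.

```python
import math

# Hypothetical polls: margin is Democratic lead in points.
polls = [
    {"margin": 1.0, "n": 800, "days_old": 6, "rating": 0.6},
    {"margin": 5.0, "n": 1500, "days_old": 2, "rating": 0.9},
    {"margin": 9.0, "n": 1000, "days_old": 4, "rating": 0.7},
    {"margin": 6.0, "n": 1200, "days_old": 1, "rating": 0.8},
]


def poll_weight(poll, half_life_days=14):
    """Toy weight: better-rated, larger, and fresher polls count for more."""
    recency = 0.5 ** (poll["days_old"] / half_life_days)
    return poll["rating"] * math.sqrt(poll["n"]) * recency


weights = [poll_weight(p) for p in polls]
weighted_avg = sum(p["margin"] * w for p, w in zip(polls, weights)) / sum(weights)
simple_avg = sum(p["margin"] for p in polls) / len(polls)

print(f"simple average:   D +{simple_avg:.1f}")
print(f"weighted average: D +{weighted_avg:.1f}")
```

Because the result is a convex combination of the individual margins, it always lies between the lowest and highest poll, which is how the average smooths week-to-week swings.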
Don't Worry About the Vase • 1612 implied HN points • 20 Nov 25
  1. AI models can be categorized into tools, minds, and weapons. Tools help us accomplish tasks, minds interact with us more meaningfully, and weapons can manipulate and direct our actions.
  2. As AI technology evolves, companies are racing to create and enhance models, but regulations are becoming crucial to ensure safety and prevent misuse, especially given the growing concerns about AI's impact on society.
  3. The competition between the US and China in AI development highlights differing approaches, with the US focusing on leading advancements while China is leveraging open-source models to catch up quickly.
Marcus on AI • 13161 implied HN points • 04 Feb 25
  1. ChatGPT still has major reliability issues, often providing incomplete or incorrect information, like missing U.S. states in tables.
  2. Despite being advanced, AI can still make basic mistakes, such as counting vowels incorrectly or misunderstanding simple tasks.
  3. Many claims about rapid progress in AI may be overstated, as even simple functions like creating tables can lead to errors.
Ground Truths • 15921 implied HN points • 14 Dec 24
  1. Lab results such as the Complete Blood Count (CBC) can vary a lot between people but stay stable within one person over time. This means your own baseline can give more accurate insights than the population-average values used for everyone.
  2. Personalized reference values from CBC tests can help predict health risks better than conventional methods. They show clearer connections to potential diseases and can indicate specific health issues.
  3. Using advanced technology like AI to analyze these personal health metrics could help doctors spot risks early. This approach can enhance patient care by identifying high-risk individuals for proactive health management.
Cremieux Recueil • 211 implied HN points • 11 Feb 26
  1. Longstanding score gaps between well‑identified demographic groups remain essentially unchanged and are at levels seen for decades.
  2. Most racial/ethnic groups show similar score variability, but Asian students have much higher variance, possibly because the category is more diverse or because high performers are more spread out.
  3. Male scores are slightly higher and more variable at the national level, but that male advantage disappears in Michigan — where all students take the SAT — highlighting that selective test participation shapes national patterns.
RESCUE with Michael Capuzzo • 9787 implied HN points • 08 Jun 23
  1. John Berndsen's heart complications after receiving the Pfizer vaccine illustrate a potential link to myocarditis and the importance of questioning vaccine safety.
  2. Many adverse reactions to COVID-19 vaccines are not being reported in the media, and the numbers show a significant impact on health, including deaths.
  3. John Berndsen's experience highlights the importance of critically examining the safety and necessity of additional vaccine doses, especially for vulnerable individuals.
Astral Codex Ten • 4817 implied HN points • 02 Jul 25
  1. AI can be really useful for research, especially in complex topics like genetics. It helps to gather and analyze a lot of information quickly.
  2. However, we need to be careful because AI can also provide misleading information. It's important to cross-check facts and not trust everything it says.
  3. Balancing the benefits and risks of AI is key. We should use its tools but also stay critical of the results it produces.
Tim Culpan’s Position • 119 implied HN points • 05 Sep 24
  1. TSMC and Intel are two major players in the semiconductor industry. Their performance and strategies have crucial implications for technology.
  2. Visual data can highlight important differences in the technical and financial health of these companies. Charts can make complex information easier to understand.
  3. Recent reports show that Intel is facing significant challenges, while TSMC continues to lead in production and technology advancements. This could shape the future of the tech industry.
Sustainability by numbers • 615 implied HN points • 22 Dec 25
  1. The newsletter will broaden its focus beyond environmental topics to include demographics, technology, global health, and development while keeping a data-led approach to analyze problems and solutions.
  2. The newsletter is being renamed to "By the Numbers" to reflect the wider scope, and the change will happen automatically; some subscribers may leave, but the aim is to reach a broader set of global issues.
  3. The publication will remain free, written unpaid in the author's spare time to keep it enjoyable, with plans to continue publishing data-driven posts into 2026.
Encyclopedia Autonomica • 39 implied HN points • 13 Oct 24
  1. Transformers agents express their commands in a structured JSON format. This makes it easier to describe actions clearly and unambiguously.
  2. The system prompt includes rules that the agent must follow, like focusing on one action at a time and using the correct values for inputs.
  3. The design also emphasizes iterative reasoning, where the agent can build on previous observations to make better decisions in tasks.
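The three points above (JSON-formatted actions, one action per step with valid inputs, and building on earlier observations) can be sketched as a small validation loop. The key names `action` and `action_input`, the tool table, and the scripted model outputs are assumptions for illustration; the post's actual prompt and schema may differ.

```python
import json

# Hypothetical tools and the input names each one accepts.
ALLOWED_TOOLS = {"search": {"query"}, "final_answer": {"answer"}}


def parse_action(blob: str) -> dict:
    """Parse one JSON action and check it against the rules above."""
    action = json.loads(blob)
    if not isinstance(action, dict):
        raise ValueError("expected a single JSON object (one action at a time)")
    if set(action) != {"action", "action_input"}:
        raise ValueError("expected exactly the keys 'action' and 'action_input'")
    tool, args = action["action"], action["action_input"]
    if not isinstance(tool, str) or tool not in ALLOWED_TOOLS:
        raise ValueError(f"unknown tool: {tool!r}")
    if not isinstance(args, dict) or set(args) != ALLOWED_TOOLS[tool]:
        raise ValueError(f"wrong inputs for tool {tool!r}")
    return action


# Scripted stand-ins for successive model outputs.
scripted_outputs = [
    '{"action": "search", "action_input": {"query": "capital of France"}}',
    '{"action": "final_answer", "action_input": {"answer": "Paris"}}',
]

observations = []  # grows each step so later steps can build on earlier ones
result = None
for blob in scripted_outputs:
    step = parse_action(blob)
    if step["action"] == "final_answer":
        result = step["action_input"]["answer"]
        break
    observations.append(f"ran {step['action']} with {step['action_input']}")

print(result)
```

Rejecting anything that is not a single well-formed object keeps the agent to one action per step, and the `observations` list is the piece a real loop would feed back into the next prompt.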
Uncharted Territories • 5149 implied HN points • 28 Feb 23
  1. The debate around mask efficacy is contentious and the science is complex.
  2. Properly worn masks can reduce infection rates, especially when used in community settings.
  3. Some studies in the meta-analysis may have been weighted inaccurately, resulting in misleading conclusions.
Public Universal Friend • 79 implied HN points • 02 Sep 24
  1. Using a customer engagement platform like Customer.io can help marketers improve their targeting and maximize growth. It offers better data management and less need for technical support.
  2. Spring is a great time for businesses to focus on improving conversions through digital marketing strategies. Real-time data can help companies get more return on their investment.
  3. Personal connections and genuine interactions are valuable, even in business communication. Taking the time to show real interest can make a difference.
Don't Worry About the Vase • 2553 implied HN points • 24 Jun 25
  1. Critiques are important for improving forecasts. It's good to get feedback and adjust predictions based on detailed analysis.
  2. Modeling progress in AI is tricky and uncertain. It's not easy to predict how quickly AI will advance, and different methods can give very different results.
  3. Forecasts should be communicated clearly, without overly negative language. Clear messaging helps everyone understand the importance and limitations of the predictions.
Platformer • 2476 implied HN points • 10 Jan 24
  1. Meta announced new measures to protect users under 18 from harmful content on its platforms.
  2. There is a growing focus on child safety in social media regulations, shifting away from speech-related issues.
  3. Lawmakers and social networks need to find common ground to make real progress in improving teen mental health.
Independent SAGE continues • 1418 implied HN points • 20 Mar 24
  1. Independent SAGE has launched a Substack to share insights about Covid research and data. They aim to provide valuable information directly from experts to the public.
  2. They plan to post updates roughly every two weeks, including responses to important new research and news. This helps keep everyone informed about the ongoing situation.
  3. The Substack will remain free for subscribers, encouraging more people to stay updated on Covid developments and public health measures.
The Product Channel By Sid Saladi • 13 implied HN points • 11 Mar 26
  1. Manus is an autonomous AI agent that plans, executes, and delivers multi-step workflows so you can give a goal, walk away, and get a finished deliverable.
  2. It combines a cloud virtual computer, a local Browser Operator, and built-in tools like slides, design, website builder, data analysis, and scheduled tasks to handle research, development, and content end-to-end.
  3. Reusable Skills plus Connectors let you package procedures and link your apps to automate recurring work and share workflows across projects and teams, with different plans and credit tiers for more power.
Marcus on AI • 4228 implied HN points • 27 Jan 25
  1. Nvidia's stock might be facing a big drop, which is a concern for investors. A decline of more than 10% indicates that something significant is going on in the market.
  2. The market can behave in unpredictable ways, and this uncertainty can be tough for investors to manage. Today might be a key moment in the stock market.
  3. Overall, the economics of generative AI can lead to unexpected changes, making it a wild area to watch for investors and tech enthusiasts.
SeattleDataGuy’s Newsletter • 353 implied HN points • 28 Nov 25
  1. Excel remains a key tool for many teams, despite the availability of advanced data platforms. It's easy to use and allows quick edits without messing with permanent data sources.
  2. When teams prefer Excel over dashboards, it usually signals a deeper issue, like dashboards not meeting their needs or users needing more flexibility.
  3. Instead of trying to eliminate Excel, it's more effective to incorporate it into data strategies, allowing users to access and manipulate data in familiar ways.
Don't Worry About the Vase • 1881 implied HN points • 17 Jun 25
  1. o3-pro can handle bigger problems but can be slow, which might disrupt your workflow. It’s often better to queue up questions for later use instead of waiting for immediate answers.
  2. Many users see o3-pro as slightly better than o3 but still not perfect, especially in areas like coding where its performance can be inconsistent. It works well for in-depth analysis, but may not be the best for all tasks.
  3. The significant price drop for o3 makes it a more appealing choice for general use compared to o3-pro, which is seen as special-case only. This change could lead to more ambitious AI projects with the same budget.