The hottest Statistics Substack posts right now

And their main takeaways
Daily Chartbook • 1467 implied HN points • 01 Sep 23
  1. Median home sale price increased by 4.8% in the four weeks ending August 27, the biggest jump since October.
  2. Active listings saw an 18.7% drop from a year earlier, the largest decline since February 2022.
  3. Employers cut 75k jobs in August, marking a 267% increase from a year ago.
The Better Letter • 196 implied HN points • 08 Dec 23
  1. Baseball's analytics revolution owes its existence to a smart security guard who made statistical analysis accessible and interesting.
  2. The success of 'Moneyball' accelerated the statistical disruption in baseball and led to the widespread use of advanced statistical measures in MLB.
  3. The Bill James approach transformed baseball analysis to be more objective, relevant, and useful, impacting team strategies and decision-making.
Cremieux Recueil • 392 implied HN points • 18 Dec 24
  1. Senator Chris Murphy made strong claims about healthcare causing deaths in the U.S. but lacked accurate data to back them up. It's important for public officials to use correct statistics when discussing serious issues.
  2. Many deaths in America are unrelated to insurance denials, especially for people over age 65 who are mostly covered by Medicare. This shows that the healthcare system isn't as profit-driven in these cases as Senator Murphy suggested.
  3. Studies have shown that expanding access to healthcare has only small effects on overall mortality. Claims about thousands of deaths caused by lack of care might be greatly exaggerated.
Daily Chartbook • 1388 implied HN points • 25 Aug 23
  1. Affordability for homebuyers has decreased significantly since August 2022.
  2. Homebuilder backlogs have decreased from 10.7k homes to 7.3k homes from Q2 2022 to Q2 2023.
  3. Household deposit balances are at least 30% higher in July 2023 compared to 2019.
Of Boys and Men • 495 implied HN points • 10 Oct 24
  1. Many reports on suicide focus too much on girls, giving the impression that they are at a higher risk, which is misleading. In fact, most suicides among teenagers involve boys.
  2. The media often discusses the feelings of sadness and suicidal thoughts in girls but fails to provide clear statistics on the actual suicide rates by gender. This can create confusion about who is really most at risk.
  3. It's essential to acknowledge the growing suicide crisis among young men and include accurate data in discussions to better address mental health issues for everyone. We need to talk about both boys and girls honestly.
Mindful Modeler • 279 implied HN points • 23 May 23
  1. Leo Breiman emphasized the importance of both data modeling culture and algorithmic modeling culture in statistical modeling.
  2. Breiman advocated for being problem-focused over solution-focused, encouraging modelers to choose the appropriate mindset based on the task at hand.
  3. Understanding various modeling mindsets, such as statistical inference and machine learning, is crucial for effective modeling.
coldhealing • 235 implied HN points • 13 Mar 23
  1. The Dark Statistician Guide recommends using contrarian and statistical strategies in office March Madness pools.
  2. Focus on maximizing your odds of winning rather than college basketball knowledge when making your bracket.
  3. Consider decentralizing your picks, betting against popular choices, and adjusting strategies based on pool size for better chances of success.
Scott's Substack • 117 implied HN points • 31 Jan 24
  1. No anticipation means the baseline period is equal to Y(0) not Y(1)
  2. Difference-in-differences coefficient equals ATT in the post period for the treatment group plus parallel trends bias minus ATT in the incorrectly specified baseline period
  3. Difference-in-differences always requires three assumptions to point identify the ATT: SUTVA, Parallel trends, and No Anticipation
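As a rough illustration of the estimator these takeaways discuss, the canonical 2x2 difference-in-differences can be sketched as the change in the treated group's mean minus the change in the control group's mean. The function name `did_2x2` and all outcome values below are hypothetical and made up for illustration. The result only identifies the ATT if SUTVA, parallel trends, and no anticipation hold.

```python
from statistics import mean


def did_2x2(treated_pre, treated_post, control_pre, control_post):
    """Canonical 2x2 difference-in-differences from four groups of outcomes."""
    treated_change = mean(treated_post) - mean(treated_pre)
    control_change = mean(control_post) - mean(control_pre)
    # The control group's change stands in for the treated group's
    # counterfactual change (this is the parallel trends assumption).
    return treated_change - control_change


# Hypothetical outcomes, invented for this sketch.
treated_pre = [10.0, 12.0, 11.0]
treated_post = [15.0, 17.0, 16.0]
control_pre = [8.0, 9.0, 10.0]
control_post = [10.0, 11.0, 12.0]

estimate = did_2x2(treated_pre, treated_post, control_pre, control_post)
print(estimate)  # treated change 5.0 minus control change 2.0 -> 3.0
```

If the baseline period were contaminated by anticipation, `treated_pre` would already contain some treatment effect, and that amount would be subtracted from the estimate.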
Heterodox STEM • 113 implied HN points • 09 Jul 25
  1. Many top professional basketball and football players in the U.S. are Black, which shows a shift away from racial discrimination in these sports. This situation raises questions about claims that no group has inherent advantages without discrimination.
  2. There are noticeable performance differences in sports between Black athletes and Asian athletes, with statistical advantages for Black athletes. This suggests that athletic success can come from a mix of natural talent and environmental support.
  3. The significant gaps in performance statistics across different racial groups show that not all disparities are due to discrimination. These differences can impact educational and career opportunities, like in STEM fields, leading to discussions about unfair practices like limiting Asian admissions at some colleges.
House of Strauss • 22 implied HN points • 12 Dec 25
  1. Interceptions get blown up by social media and highlight culture, so mistakes feel much bigger now and push players and teams toward avoiding visible errors.
  2. Modern efficiency stats (like passer ratings and QBR) overweight completions and punish interceptions, which incentivizes safer, shorter throws and can reduce overall offensive production.
  3. Offenses should balance efficiency with productivity by accepting some risk: more air yards, deeper targets, and occasional interceptions can lead to more yards and points than a purely conservative approach.
Points And Figures • 719 implied HN points • 11 Mar 24
  1. Government-reported economic numbers can be misleading, especially in non-democratic countries where they may be fake.
  2. Statistical revisions are common in economic releases, but major revisions like a 35% drop raise concerns about accuracy.
  3. Unemployment numbers from the US Department of Labor under President Biden have seen significant and questionable revisions, impacting predictions and planning based on them.
Logging the World • 199 implied HN points • 28 Sep 23
  1. The book 'Four Ways of Thinking' by David Sumpter discusses four philosophies that map onto the four types of cellular automata identified by Stephen Wolfram, with historical anecdotes and life lessons.
  2. The book explores statistical, interactive, chaotic, and complex ways of thinking, connecting topics like cellular automata, chaos theory, and modern statistics with practical applications.
  3. David Sumpter's book introduces the complexity of modern mathematical research, showcasing the emergence of complicated behavior from simple rules and the fascinating concept of quantifying complexity in patterns.
A Biologist's Guide to Life • 15 implied HN points • 27 Dec 25
  1. Ecological patterns depend on the spatial, temporal, and evolutionary scale you examine; changing the scale can reveal or hide important patterns.
  2. Phylofactorization is an algorithm that finds edges or clades in a phylogenetic tree that best explain differences in traits or ecological patterns, letting you partition life at the scales that matter for a given question.
  3. There is no single correct species or taxonomic scale; instead choose or infer the lineage-level scales that match your question, and tree-based partitioning can also reveal relevant scales in non-biological hierarchical systems.
Mindful Modeler • 379 implied HN points • 27 Dec 22
  1. Conformal prediction for classification works by ordering predictions from certain to uncertain, dividing them based on a user-defined confidence level.
  2. Conformal prediction consists of three main steps: training, calibration, and prediction, following a similar recipe across different algorithms.
  3. Different resampling strategies like k-fold cross-splitting and jackknife are used in conformal prediction, offering a balance between computation cost and prediction accuracy.
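The calibration and prediction steps can be sketched for classification as follows. This is a minimal split-conformal example that assumes the nonconformity score is 1 minus the predicted probability of the true class, which is one common choice and not necessarily the one used in the post. The training step is skipped: the probability vectors stand in for the output of an already trained model, and all numbers are made up.

```python
import math


def conformal_threshold(cal_probs, cal_labels, alpha):
    """Calibration: finite-sample quantile of the scores 1 - p(true class)."""
    scores = sorted(1.0 - probs[label] for probs, label in zip(cal_probs, cal_labels))
    n = len(scores)
    rank = math.ceil((n + 1) * (1 - alpha))  # conservative, 1-indexed rank
    if rank > n:
        return float("inf")  # too few calibration points: include every class
    return scores[rank - 1]


def prediction_set(probs, threshold):
    """Prediction: keep every class whose score does not exceed the threshold."""
    return [k for k, p in enumerate(probs) if 1.0 - p <= threshold]


# Hypothetical calibration data: class probabilities and true labels.
cal_probs = [[0.9, 0.05, 0.05], [0.5, 0.4, 0.1], [0.2, 0.7, 0.1]]
cal_labels = [0, 1, 1]
threshold = conformal_threshold(cal_probs, cal_labels, alpha=0.25)

print(prediction_set([0.9, 0.05, 0.05], threshold))  # confident case -> [0]
print(prediction_set([0.5, 0.45, 0.05], threshold))  # uncertain case -> [0, 1]
```

Uncertain inputs get larger sets, which is how the method expresses uncertainty. A real calibration set would need far more than three points.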
Silver Bulletin • 232 implied HN points • 06 Jan 25
  1. The Hall of Fame should consider many factors, not just one statistic like Wins Above Replacement (WAR). This means looking at achievements, player talent, and character too.
  2. Players might have high WAR scores but lack the greatness often associated with Hall of Fame status. For example, a consistent but average player shouldn't necessarily be in the Hall over a standout who had fewer career years.
  3. Voters for the Hall of Fame are required to consider a player's overall impact, including postseason performances and fan appeal. This makes it a more complex decision than just focusing on statistics.
The Better Letter • 157 implied HN points • 20 Oct 23
  1. Baseball analytics have revolutionized the sport, but interpreting data on human behavior is complex.
  2. Clutch hitting in baseball is a controversial topic with no solid evidence of its existence as a repeatable skill.
  3. Combining traditional scouting with statistical analysis in sports management is often more effective than choosing one over the other.
Technology Made Simple • 159 implied HN points • 23 May 23
  1. The Normal Distribution is a probability distribution used to model real-world data, with a bell-shaped curve and key points located at the center.
  2. The Normal Distribution is essential as it is commonly used in various fields to model real-world phenomena, calculate probabilities, and make informed decisions in software development.
  3. Understanding and using the Normal Distribution in software can help in making approximations for performance, making the right sacrifices, and optimizing solutions based on real-world data.
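A small sketch of the kind of approximation these takeaways describe, using Python's standard-library `statistics.NormalDist`. The response-time model (mean 200 ms, standard deviation 25 ms) is a hypothetical assumption for illustration, not a figure from the post.

```python
from statistics import NormalDist

# Hypothetical response-time model: mean 200 ms, standard deviation 25 ms.
latency = NormalDist(mu=200.0, sigma=25.0)

# Share of requests expected within one standard deviation of the mean
# (about 68% for any normal distribution).
within_one_sigma = latency.cdf(225.0) - latency.cdf(175.0)

# Approximate 99th-percentile latency under the normal assumption.
p99 = latency.inv_cdf(0.99)

print(round(within_one_sigma, 4))
print(round(p99, 1))
```

Real latency data is often skewed, so an estimate like `p99` is only as good as the normality assumption behind it.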
Outlandish Claims • 19 implied HN points • 12 Jun 24
  1. Berkson's Paradox applies to various situations where multiple factors influence outcomes, leading to counterintuitive results.
  2. Applying Berkson's Paradox to different scenarios can reveal hidden correlations and insights, such as in medical studies, card games, or economic policies.
  3. The essence of Berkson's Paradox lies in understanding that when focusing on a specific subcategory, the causes of membership in that category can be more negatively correlated than in the broader category.
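The selection effect can be reproduced with a short simulation. In this sketch two traits are drawn independently, and membership in the subcategory is granted when their sum exceeds a cutoff. The traits, the cutoff of 1.0, and the sample size are all assumptions chosen for illustration.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


rng = random.Random(0)  # fixed seed so the run is repeatable

# Two independent traits, each uniform on [0, 1).
population = [(rng.random(), rng.random()) for _ in range(20000)]

# Selection: only cases whose combined score clears the cutoff are observed.
selected = [(x, y) for x, y in population if x + y > 1.0]

full_corr = pearson(*zip(*population))
selected_corr = pearson(*zip(*selected))

print(f"whole population: {full_corr:+.3f}")   # close to zero
print(f"selected subset:  {selected_corr:+.3f}")  # clearly negative
```

Within the selected group, a low value of one trait implies a high value of the other, so the traits look negatively correlated even though they are independent in the full population.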
Technology Made Simple • 139 implied HN points • 25 Apr 23
  1. Statistics can be misleading if affected by bias, which is a flaw in experiment design or data collection process.
  2. Biases affect everyone and can be exploited by manipulative individuals like politicians and salespeople.
  3. Common statistical biases include selection bias, recall bias, and observer bias, which can all be combated by slowing down and evaluating claims carefully.
The Dossier • 490 implied HN points • 06 Mar 24
  1. 40 Covid vaccine candidates worldwide were claimed to be highly effective, but none of them actually worked.
  2. Pharmaceutical companies and governments globally falsely advertised Covid vaccines as the ultimate protection.
  3. The Covid-19 vaccine situation highlights the importance of scrutinizing statistics and not letting a crisis be exploited.
Silver Bulletin • 679 implied HN points • 01 Oct 23
  1. State partisanship and COVID vaccination rates strongly predict COVID death rates even after considering age.
  2. Simplicity in statistical analysis can help in avoiding overfitting models and focusing on robust, true facts.
  3. Vaccination rates are more predictive of COVID death rates than state partisanship once age is controlled for.
Trench Warfare • 138 implied HN points • 24 Oct 23
  1. True Pressure Score (TPS) is a metric used to evaluate pass-rushers based on different types of pressures.
  2. The Pressure Quality Ratio (PQR) measures the efficiency of pass-rushers by comparing high quality pressures to low quality pressures.
  3. Players like Crosby and Hutchinson lead in TPS, while Garrett and Carter stand out in PQR with high efficiency ratios.
Unconfusion • 39 implied HN points • 31 Mar 24
  1. Using silly examples to teach correlation and causation can let students off too easily. It's important to challenge them with examples that make them think.
  2. Most teaching examples use time-series data, but many real-world correlations don't fit this model. We should focus on typical variations found in research.
  3. Mixing random correlations with spurious connections creates confusion. Teaching should clearly explain how confounders can lead to false relationships.
Logging the World • 179 implied HN points • 11 Dec 22
  1. In a raffle with a large number of tickets, the biggest number drawn out starts to show some structure as more tickets are selected.
  2. By looking at the maximum value drawn in a raffle, one can estimate the total number of tickets, a concept applied in statistics like the German tank problem.
  3. Sequential numbering schemes can reveal interesting insights, as seen in situations like the Skripal poisonings and Novak Djokovic's COVID test, highlighting the importance of careful numbering practices.
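The estimate from the maximum can be sketched with the standard frequentist estimator for the German tank problem, N = m + m/k - 1, where m is the largest number seen and k is the number of tickets drawn. It assumes tickets are numbered 1 to N and drawn without replacement. The ticket numbers below are made up.

```python
def estimate_total(observed):
    """Estimate the total count N from distinct serial numbers drawn from 1..N."""
    k = len(observed)  # sample size
    m = max(observed)  # largest number seen
    # The maximum plus the average gap between observed numbers.
    return m + m / k - 1


# Hypothetical raffle tickets drawn so far.
tickets = [19, 40, 42, 60]
print(estimate_total(tickets))  # 60 + 60/4 - 1 -> 74.0
```

The estimate is always at least as large as the maximum seen, and the correction term shrinks as more tickets are drawn.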
Dreams in the Which House • 117 implied HN points • 28 Jun 23
  1. The numbers around 'Cancel Culture' incidents are a topic of debate, especially in academia.
  2. In evaluating these numbers, it's crucial to consider the context of the data and how it's presented.
  3. Comparing modern 'Cancel Culture' scenarios with historical events like McCarthyism reveals nuances and complexities.
Mindful Modeler • 319 implied HN points • 08 Sep 22
  1. Focus on better machine learning by thinking like a statistician
  2. Prioritize model interpretation, paying attention to data, and maintaining a critical mindset
  3. Stay tuned for more updates and insights on mindfulmodeler.substack.com
Splitting Infinity • 59 implied HN points • 28 Jan 24
  1. The type of income distribution models used like Pareto or lognormal can impact total utility calculations in economics
  2. There is an interesting relationship observed where the degree of inequality doesn't directly correlate with total utility in certain scenarios
  3. Introducing more risk-averse utility functions can bring the focus back on the importance of inequality in calculations
Mindful Modeler • 139 implied HN points • 25 Apr 23
  1. Log odds are additive; probabilities are multiplicative. Some interpretation methods, like expressing predictions as a linear sum, may benefit from log odds.
  2. Edge transitions, like from 0.001 to 0.01, may sometimes be more significant than middle transitions, like 0.5 to 0.6.
  3. Probabilities are more intuitive for decision-making and cost calculations, and are more familiar to most people than log odds.
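The contrast between edge and middle transitions can be checked numerically. This sketch uses the standard logit (log odds) transform and its inverse. The helper names `logit` and `inv_logit` are chosen here for illustration.

```python
import math


def logit(p):
    """Log odds of a probability p, for 0 < p < 1."""
    return math.log(p / (1.0 - p))


def inv_logit(z):
    """Inverse of logit: map log odds back to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


edge_step = logit(0.01) - logit(0.001)  # a jump near the edge
middle_step = logit(0.6) - logit(0.5)   # a jump in the middle

print(round(edge_step, 3))    # roughly 2.31 on the log-odds scale
print(round(middle_step, 3))  # log(1.5), roughly 0.405
```

On the probability scale the middle step is larger (0.1 versus 0.009), while on the log-odds scale the edge step is more than five times larger.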
A Piece of the Pi: mathematics explained • 163 implied HN points • 16 Dec 24
  1. The number e, around 2.718, plays a big role in math, especially in combinatorial problems like derangements. This is when items are arranged so that none are in their original position.
  2. In chess, setting up nonattacking rooks can be related to derangements. The chance that none of them land on the main diagonal equals about 36.8%, which links back to the number e.
  3. Recent studies have also looked at how many safe squares remain on a chessboard when placing random pieces. As more pieces are added, the proportion of safe squares follows certain patterns connected to e.
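The 36.8% figure can be checked directly. A placement of n nonattacking rooks on an n-by-n board corresponds to a permutation, and a placement with no rook on the main diagonal corresponds to a derangement. This sketch counts derangements with the standard recurrence D(n) = (n - 1)(D(n - 1) + D(n - 2)) and compares the share for an 8-by-8 board with 1/e.

```python
import math


def derangements(n):
    """Number of permutations of n items with no item in its original position."""
    if n == 0:
        return 1
    prev, curr = 1, 0  # D(0) = 1, D(1) = 0
    for k in range(2, n + 1):
        prev, curr = curr, (k - 1) * (curr + prev)
    return curr


n = 8  # an 8x8 chessboard
share = derangements(n) / math.factorial(n)

print(derangements(n))                   # 14833 of the 40320 placements
print(round(share, 4), round(1 / math.e, 4))  # both about 0.3679
```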
The DisInformation Chronicle • 375 implied HN points • 15 Feb 24
  1. A German newspaper forced Science Magazine to correct a study about the pandemic origin, while American science writers ignored new research questioning the study's validity.
  2. The Science Magazine study, claiming the pandemic began in a wet market, was criticized for its statistical methodology by experts from Germany and Hong Kong, raising doubts about its conclusions.
  3. Independent experts confirmed the criticism of the study, highlighting flaws in the statistical analysis and describing Science Magazine's handling of the methodology as careless and unprofessional.
Mindful Modeler • 159 implied HN points • 07 Mar 23
  1. Conformal prediction quantifies uncertainty in machine learning models by producing prediction sets or intervals.
  2. Conformal prediction offers a way to get reliable uncertainty quantification by calibrating the uncertainty score of ML models.
  3. The book 'Introduction to Conformal Prediction With Python' serves as a practical and easy-to-understand resource to learn about this uncertainty quantification method.
Brad DeLong's Grasping Reality • 115 implied HN points • 16 Feb 25
  1. The Gini Coefficient measures income inequality, where 0 means everyone is equal and 1 means one person has everything. It helps us understand how wealth is distributed in a society.
  2. Intermediate Gini values can be tricky to interpret. It's hard to know what a score like 0.25 or 0.62 really means in terms of real-life inequality.
  3. Understanding historical Gini scores can give insight into how different societies experience inequality, but the differences might not always feel significant or clear.
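To make intermediate values more concrete, the coefficient can be computed on small made-up income lists. This sketch uses the standard sorted-rank formula for a finite sample. The function name `gini` and all incomes are hypothetical. With n people, the "one person has everything" case gives (n - 1)/n, which approaches 1 only as n grows.

```python
def gini(incomes):
    """Gini coefficient of non-negative incomes (sorted-rank formula)."""
    xs = sorted(incomes)
    n = len(xs)
    total = sum(xs)
    # Weight each income by its 1-indexed rank in ascending order.
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2.0 * weighted / (n * total) - (n + 1) / n


print(gini([1, 1, 1, 1]))   # everyone equal -> 0.0
print(gini([0, 0, 0, 10]))  # one of four people has everything -> 0.75
print(round(gini([1, 2, 3, 4]), 3))  # a moderate spread -> 0.25
```

The last example gives one feel for a score of 0.25: the richest of four people earns four times what the poorest does, with incomes evenly spaced in between.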