The hottest Statistics Substack posts right now

And their main takeaways

Free Friday: Choose the Hall of Famer

JoeBlogs • 2044 implied HN points • 02 Feb 24

The game 'Choose the Hall of Famer' challenges perceptions about player value based on stats and accomplishments.
Comparison between players like Jim Plunkett and Joe Namath shows that stats alone may not dictate Hall of Fame worthiness.
Analyzing players like Scott Rolen and Jim Edmonds reveals how defensive contributions can impact Hall of Fame considerations.

Perils of Flawed Meta-Analytic Methodology

The Shores of Academia • 39 implied HN points • 03 Oct 24

🔬 Science Health Research Statistics Psychology Methodology

Flawed meta-analysis can mix different studies that aren't similar, making it hard to draw clear conclusions about their effects on things like mental health.
It’s important for researchers to look at specific impacts and not just assume that a random-effects model explains everything. Understanding the differences in outcomes can lead to better insights.
Proper analysis in studies is really important, especially when people's health is at risk. Ignoring negative findings can mislead people about the safety of products like drugs.

Case-Shiller: National House Price Index Up 2.7% year-over-year in April

CalculatedRisk Newsletter • 9 implied HN points • 24 Jun 25

💰 Finance Real Estate Economic Trends Housing Market Statistics Market Analysis

The national house price index is up by 2.7% over the past year, showing a general increase in home prices.
However, there was a month-to-month decrease of 0.4% in home prices in April, indicating some fluctuation in the market.
Certain regions are seeing lower gains or even declines, suggesting a shift in real estate trends across the country.

Chess and the number e

A Piece of the Pi: mathematics explained • 163 implied HN points • 16 Dec 24

🔬 Science Mathematics Combinatorics Statistics Game Theory Logic

The number e, around 2.718, plays a big role in math, especially in combinatorial problems like derangements. This is when items are arranged so that none are in their original position.
In chess, setting up nonattacking rooks can be related to derangements. The chance that none of them land on the main diagonal equals about 36.8%, which links back to the number e.
Recent studies have also looked at how many safe squares remain on a chessboard when placing random pieces. As more pieces are added, the proportion of safe squares follows certain patterns connected to e.
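
The link between nonattacking rooks and e can be checked by brute force. The sketch below is an illustration, not code from the post: it treats each placement of n nonattacking rooks as a permutation (one rook per row and column) and counts the placements that avoid the main diagonal. The function name `derangement_fraction` and the board size 8 are assumptions chosen for the example.

```python
import math
from itertools import permutations


def derangement_fraction(n: int) -> float:
    """Fraction of permutations of n items that leave no item in place."""
    total = 0
    deranged = 0
    for perm in permutations(range(n)):
        total += 1
        # perm[i] is the column of the rook in row i; a rook sits on the
        # main diagonal exactly when perm[i] == i.
        if all(perm[i] != i for i in range(n)):
            deranged += 1
    return deranged / total


fraction = derangement_fraction(8)
print(f"fraction with no rook on the diagonal: {fraction:.4f}")
print(f"1/e:                                   {1 / math.e:.4f}")
```

For an 8-by-8 board the two printed values agree to about four decimal places, which is where the "about 36.8%" figure comes from.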

Tormented by an Urn

rachaelmeager • 535 implied HN points • 04 Jun 24

🚌 Education Teaching Learning Mathematics Statistics Philosophy

The Polya urn model, though simple at first glance, reveals the complexity of statistics and emphasizes the importance of understanding problems deeply before attempting to solve them.
Teaching and learning in math are not just about facts; they require creativity and passion to engage students, much like how poets perceive deeper meanings in their art.
There is a strong connection between the arts and sciences, where both disciplines can benefit from understanding each other, and students should learn foundational concepts in both to grasp the complexities of the world.


Willful ignorance of the male suicide crisis

Of Boys and Men • 495 implied HN points • 10 Oct 24

🏥 Health Politics Mental health Public Policy Gender Issues Statistics

Many reports on suicide focus too much on girls, giving the impression that they are at a higher risk, which is misleading. In fact, most suicides among teenagers involve boys.
The media often discusses the feelings of sadness and suicidal thoughts in girls but fails to provide clear statistics on the actual suicide rates by gender. This can create confusion about who is really most at risk.
It's essential to acknowledge the growing suicide crisis among young men and include accurate data in discussions to better address mental health issues for everyone. We need to talk about both boys and girls honestly.

Statistical modeling seen through inductive biases

Mindful Modeler • 419 implied HN points • 28 May 24

🔬 Science Statistics Modeling Machine Learning

Statistical modeling involves modeling distributions and assuming relationships between features and the target with a few interpretable parameters.
Distributions shape the hypothesis space by restricting the range of models compatible with specific distributions like a zero-inflated Poisson distribution.
Parameterization in statistical modeling simplifies estimation, interpretation, and inference of model parameters by making them more interpretable and allowing for confidence intervals.

Momentum and the Gambler's Fallacy

Admired Leadership Field Notes • 1022 implied HN points • 11 Feb 24

🎾 Sports Analysis Performance Strategy Trends Statistics

Momentum in sports can lead to a shift in energy and positivity, affecting the outcome of a game.
Even though statistical experts claim that momentum is not real and link belief in it to the gambler's fallacy, it is a common occurrence in sports that can impact a team's performance.
Teams that effectively harness momentum by maintaining a streak of positive outcomes have a higher probability of winning, as seen in data analysis of NFL games.

How to evaluate statistical claims

The Counterfactual • 199 implied HN points • 27 Jun 24

🚌 Education Statistics Research Methods Data Analysis Critical Thinking

Always look at the whole distribution of data, not just the average. The average can be affected by extreme values, so it's crucial to see the bigger picture to understand what the data really tells us.
Consider the baseline or reference point when evaluating numbers. Knowing how a number compares to others helps us understand if it's large or small, which gives us better context.
Understand the story behind the data-generating process. This means recognizing the factors that led to the results we see, which helps in identifying possible biases or alternative explanations.

All Bayes Everything

Holodoxa • 239 implied HN points • 14 Jun 24

🔬 Science Statistics Probability Neurobiology Consciousness Research

Bayes' Theorem is a powerful concept in probability theory that helps update beliefs based on new evidence, highlighting the importance of combining prior knowledge and new data.
Bayesian methods can offer valuable improvements to scientific research practices by emphasizing uncertainty, effect magnitude, and probability distributions over traditional p-values and null hypothesis testing.
The concept of the brain functioning as a prediction machine aligns with Bayesian principles, suggesting that the brain uses prior knowledge and new sensory inputs to make predictions and construct conscious experiences.
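
The belief update described in the first takeaway can be written as a one-line calculation. This is a minimal sketch with hypothetical numbers (a 1% prior, evidence seen 90% of the time when the hypothesis is true and 5% of the time when it is false); the function name `posterior` and all inputs are assumptions, not figures from the post.

```python
def posterior(prior: float, p_evidence_if_true: float, p_evidence_if_false: float) -> float:
    """Bayes' theorem: P(hypothesis | evidence) for a yes/no hypothesis."""
    numerator = prior * p_evidence_if_true
    # Total probability of seeing the evidence under either possibility.
    evidence = numerator + (1 - prior) * p_evidence_if_false
    return numerator / evidence


# Hypothetical inputs, chosen only to illustrate the update.
updated = posterior(prior=0.01, p_evidence_if_true=0.9, p_evidence_if_false=0.05)
print(f"belief after seeing the evidence: {updated:.3f}")
```

With these made-up inputs the belief rises from 1% to roughly 15%: the evidence matters, but the low prior still dominates.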

Beware the Univariate Fallacy

Reality's Last Stand • 1474 implied HN points • 27 Mar 23

🔬 Science Statistics Biology Gender Activism

The Univariate Fallacy uses a focus on a single variable to distort reality and push agendas.
There are two versions: one exaggerates group differences, and the other minimizes them.
This fallacy is used to justify false depictions of reality, especially regarding sex and gender.

The Data Dam Break of May 2023: How Twitter Challenged the Racial Industrial Complex

The Rabbit Hole • 1395 implied HN points • 11 May 23

🇺🇸 U.S. Politics Race Media Statistics Social media

Data is not racist, but there is a stigma around discussing data on certain topics.
Challenging dominant narratives and spreading reliable information is important.
Engaging with data, asking questions, and using platforms like Twitter can lead to expanding the discourse and challenging establishment ideologies.

How I made peace with quantile regression

Mindful Modeler • 778 implied HN points • 16 Jan 24

🔬 Science Statistics Machine Learning Estimation Modeling

Quantile regression can be understood through the lens of loss optimization, specifically with the pinball loss function.
In machine learning, quantile regression is essentially regression with the unique pinball loss function that emphasizes absolute differences between actual and predicted values.
The asymmetry of the pinball loss function, controlled by the parameter tau, dictates how models should handle under- and over-predictions, making quantile regression a tool to optimize different quantiles of a distribution.
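
The pinball loss mentioned above is short enough to write out. The sketch below uses the standard definition (weight tau on under-predictions, 1 - tau on over-predictions); the function name `pinball_loss` and the sample numbers are assumptions for illustration, not taken from the post.

```python
def pinball_loss(y_true, y_pred, tau):
    """Mean pinball (quantile) loss for a quantile level tau in (0, 1)."""
    total = 0.0
    for actual, predicted in zip(y_true, y_pred):
        diff = actual - predicted
        if diff >= 0:
            # Under-prediction: penalised with weight tau.
            total += tau * diff
        else:
            # Over-prediction: penalised with weight (1 - tau).
            total += (tau - 1) * diff
    return total / len(y_true)


# With tau = 0.9, missing low costs nine times as much as missing high.
print(pinball_loss([10.0], [8.0], tau=0.9))   # under-prediction by 2
print(pinball_loss([10.0], [12.0], tau=0.9))  # over-prediction by 2
```

At tau = 0.5 the two penalties are equal and the loss reduces to half the mean absolute error, which is why the median is the special case in the middle.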

Reverse Genetic Systems

A Biologist's Guide to Life • 58 implied HN points • 23 Dec 24

🔬 Science Biology Virology Genetics Statistics

There are two main theories about the origin of SARS-CoV-2: one is that it came from animal trade, and the other is that it originated in a lab. Each theory has its own set of details that scientists are still investigating.
Understanding the origins of the virus requires knowledge of both biology and complex statistical methods. These methods help researchers weigh the evidence carefully, which is crucial for arriving at the most likely explanation.
The evidence increasingly suggests that the virus may have come from a lab, especially noting the features like the furin cleavage site that were put into a reverse genetic system. This raises important questions about how we study viruses and their potential risks.

The Effects of Immigration in Denmark

Patterns in Humanity • 1159 implied HN points • 17 Feb 23

🌍 World Politics Immigration Crime Statistics Analysis Social Impact

First, there is a detailed analysis of the financial impact of immigration in Denmark based on a government report.
Second, the analysis explores the rates of violent crime convictions by nation of origin, showing disparities between groups.
Lastly, the importance of adjusting for age and sex in understanding the differences in financial contributions and crime rates among immigrants is highlighted.

Common Terminology and Statistics Issues- Part 2

Weight and Healthcare • 738 implied HN points • 27 Dec 23

🏥 Health & Wellness Statistics Obesity Interventions Research Terminology

Percentages without proper context can be misleading; a full picture is needed for accurate interpretation.
Understanding the difference between relative and absolute risk in statistics can prevent manipulation and provide a clearer view of the data.
Different methods for handling dropouts in trials, like LOCF and BOCF, can impact outcomes significantly and need careful consideration in research.
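
The gap between relative and absolute risk is easiest to see with numbers. The sketch below uses a hypothetical trial (2 events per 1,000 people in the control group, 1 per 1,000 in the treated group); the counts and the function name `risk_summary` are assumptions for illustration only.

```python
def risk_summary(events_control, n_control, events_treated, n_treated):
    """Compare two groups using absolute and relative risk measures."""
    risk_control = events_control / n_control
    risk_treated = events_treated / n_treated
    return {
        # Difference in risk, in probability units.
        "absolute_risk_reduction": risk_control - risk_treated,
        # Ratio of the two risks.
        "relative_risk": risk_treated / risk_control,
        # Share of the control-group risk that was removed.
        "relative_risk_reduction": 1 - risk_treated / risk_control,
    }


# Hypothetical counts: 2 in 1,000 versus 1 in 1,000.
summary = risk_summary(2, 1000, 1, 1000)
for name, value in summary.items():
    print(f"{name}: {value:.4f}")
```

The same made-up result can be reported as "risk cut in half" (relative) or as "one fewer event per 1,000 people" (absolute), which is the contrast the takeaway warns about.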

Some historically bad basketball

Marc Stein • 668 implied HN points • 09 Jan 24

🎾 Sports Basketball NBA Teams Statistics History

Several NBA teams are performing historically poorly this season, being outscored by at least 10 points per game.
Ja Morant's season-ending injury adds to the struggles faced by the Memphis Grizzlies, impacting their performance in the league.
The list of NBA teams with significant negative point differentials this season is unprecedented, with four teams facing double-digit losing margins.

Numbers Game: The NBA Midpoint

Marc Stein • 589 implied HN points • 24 Jan 24

🎾 Sports Basketball NBA Statistics Coaching Players

There has been only one in-season coaching change in the NBA so far this season.
Joel Embiid of the Philadelphia 76ers has been scoring impressively with 1,156 points in 32 games.
The Boston Celtics were the first team to reach 30 wins this season.

The NBA's Baseball Series Chronicles

Marc Stein • 589 implied HN points • 17 Jan 24

🎾 Sports Basketball Statistics Interviews Analysis Podcasts

The NBA has been implementing two-game baseball series for scheduling efficiency.
Historical data shows that splits are the most likely outcomes in these series.
Home teams in the NBA have historically had an average winning percentage of .586.

The 100,000. Why isn't the Interracial Murder rate higher?

datahazard • 943 implied HN points • 22 Mar 23

🇺🇸 U.S. Politics Crime Race Statistics Violence Government

Blacks are 9.8x more likely to commit inter-racial murder than Whites
Whites have had nearly 100,000 more victims of inter-racial murder from 1968-2021
Understanding 2021 rates shows a Black person is 3.1x more likely to commit inter-racial murder than a White person

Despite Biden's claims, Gaza health ministry death toll is accurate, say peer-reviewed scientific studies

Geopolitical Economy Report • 637 implied HN points • 20 Dec 23

🌍 World Politics Gaza Health Statistics Israel US

Peer-reviewed scientific studies confirm the accuracy of the Gaza health ministry's death toll statistics after criticism from US President Biden.
The Gaza health ministry has a history of reporting reliable figures, crucial for international organizations' use in understanding the situation.
Experts from Johns Hopkins University and the London School of Hygiene & Tropical Medicine found no evidence of inflated mortality reporting and confirmed the validity of the data provided by the Palestinian MoH.

Chutes, ladders, and Markov chains

A Piece of the Pi: mathematics explained • 72 implied HN points • 04 Dec 24

🚌 Education Mathematics Statistics Game Theory Probability Teaching

The game of Chutes and Ladders is a fun example of a Markov chain. It shows how the next move depends only on where you are now, not on how you got there.
There are different types of game boards: some allow for winning, while others can trap players forever. Ultimately winnable boards guarantee that a player can reach the end if they keep playing.
On average, players need about 39 spins to win the game, and surprisingly, most random boards created will still offer a winning chance.
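
The Markov property makes the expected number of spins computable without simulating games. The sketch below is a toy version under stated assumptions: a hypothetical 10-square board with a 3-value spinner, one ladder and one chute, and a rule that a spin overshooting the last square leaves the player in place. None of these details come from the post, and the sketch does not reproduce the 39-spin figure for the real board.

```python
def expected_spins(last_square, spinner_max, jumps, sweeps=10_000):
    """Expected spins to reach last_square from square 0.

    jumps maps a landing square to the square a chute or ladder sends
    the player to. Assumes the board is winnable, so the repeated
    updates below converge.
    """
    expected = [0.0] * (last_square + 1)
    for _ in range(sweeps):
        updated = [0.0] * (last_square + 1)
        for square in range(last_square):
            total = 0.0
            for roll in range(1, spinner_max + 1):
                target = square + roll
                if target > last_square:
                    target = square  # assumed rule: overshoot means no move
                target = jumps.get(target, target)
                total += expected[target]
            # One spin now, plus the average remaining spins afterwards.
            updated[square] = 1.0 + total / spinner_max
        expected = updated
    return expected[0]


# Hypothetical toy board: ladder from 2 to 6, chute from 8 to 4.
toy_board = {2: 6, 8: 4}
print(f"expected spins on the toy board: {expected_spins(10, 3, toy_board):.2f}")
```

Each square's value depends only on the squares reachable from it, never on the path taken to get there, which is exactly the "depends only on where you are now" property.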

What I'm reading (April 2024 edition)

Alberto Cairo's The Art of Insight • 199 implied HN points • 27 Apr 24

📚 Literature Books Reading Non-fiction Science Statistics

Statistics and probability have a complex history that affects many sciences today. It's important to understand that probability is more about uncertainty than just measuring how often something happens.
Books like 'Normality' explore how the idea of normal has been used to marginalize certain groups of people. The meanings of normal have changed over time and can be harmful.
The connection between different thinkers and ideas can help us understand reality better. Books like 'The Rigor of Angels' look at these links and ask important questions about what we truly know.

The Hall of Fame is about more than WAR

Silver Bulletin • 232 implied HN points • 06 Jan 25

🎾 Sports Baseball Analytics Hall of Fame Player evaluation Statistics

The Hall of Fame should consider many factors, not just one statistic like Wins Above Replacement (WAR). This means looking at achievements, player talent, and character too.
Players might have high WAR scores but lack the greatness often associated with Hall of Fame status. For example, a consistent but average player shouldn't necessarily be in the Hall over a standout who had fewer career years.
Voters for the Hall of Fame are required to consider a player's overall impact, including postseason performances and fan appeal. This makes it a more complex decision than just focusing on statistics.

Using the Binomial Effect Size Display (BESD) to understand correlations

Just Emil Kirkegaard Things • 373 implied HN points • 05 Feb 24

🔬 Science Statistics Correlations Research Methods Medical Science

Interpreting size of correlations using Cohen guidelines - small, medium, large
Comparing effect sizes to others in the literature for context
Understanding correlations using Binomial Effect Size Display (BESD) - practical applications
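
The BESD turns a correlation r into a pair of hypothetical success rates, 0.5 + r/2 and 0.5 - r/2, for two equal-sized groups. The sketch below shows that conversion; the function name `besd` and the example value r = 0.2 are assumptions for illustration.

```python
def besd(r: float) -> tuple[float, float]:
    """Binomial Effect Size Display: success rates implied by correlation r.

    Returns (rate in the higher group, rate in the lower group), assuming
    two equal-sized groups and an overall success rate of 50%.
    """
    return 0.5 + r / 2, 0.5 - r / 2


high, low = besd(0.2)
print(f"r = 0.2 -> {high:.0%} versus {low:.0%}")
```

Seen this way, a correlation that sounds small corresponds to a 60% versus 40% split, which is the practical point of the display.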

Cooking The Books

Points And Figures • 719 implied HN points • 11 Mar 24

🇺🇸 U.S. Politics Economy Government Statistics Unemployment

Government-reported economic numbers can be misleading, especially in non-democratic countries where they may be fake.
Statistic revisions are common in economic releases, but major revisions like a 35% drop raise concerns about accuracy.
Unemployment numbers from the US Department of Labor under President Biden have seen significant and questionable revisions, impacting predictions and planning based on them.

Stuff I found interesting in December

Samstack • 1537 implied HN points • 31 Dec 23

🔬 Science Research Statistics Psychology Neuroscience Genetics

Be cautious of assuming correlation implies causation, as the sign can be opposite of the true effect
Income inequality in America may not have risen much since the 1960s, contrary to popular belief
Anti-immigration voters often consider the issue more important than pro-immigration voters, impacting public perception

Gini Coefficients...

Brad DeLong's Grasping Reality • 115 implied HN points • 16 Feb 25

🇺🇸 U.S. Politics Inequality Economics Statistics Social Policy Education

The Gini Coefficient measures income inequality, where 0 means everyone is equal and 1 means one person has everything. It helps us understand how income is distributed in a society.
Intermediate Gini values can be tricky to interpret. It's hard to know what a score like 0.25 or 0.62 really means in terms of real-life inequality.
Understanding historical Gini scores can give insight into how different societies experience inequality, but the differences might not always feel significant or clear.
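
One way to get a feel for intermediate Gini values is to compute them for small made-up populations. The sketch below uses the mean-absolute-difference form of the coefficient; the function name `gini` and the income lists are assumptions for illustration, not data from the post.

```python
def gini(incomes):
    """Gini coefficient from the mean absolute difference between all pairs."""
    n = len(incomes)
    mean = sum(incomes) / n
    if mean == 0:
        return 0.0  # nobody has anything, so there is no inequality to measure
    total_gap = sum(abs(a - b) for a in incomes for b in incomes)
    return total_gap / (2 * n * n * mean)


print(gini([5, 5, 5, 5]))    # everyone equal
print(gini([0, 0, 0, 10]))   # one person has everything
print(gini([1, 2, 3, 4]))    # something in between
```

With only four people, the "one person has everything" case comes out at 0.75 rather than 1, because the maximum for a population of n is (n - 1)/n and only approaches 1 as n grows.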

Film Room: How Auburn's defense has been so dominant so far in SEC play

The Auburn Observer • 373 implied HN points • 22 Jan 24

🎾 Sports Basketball Defense Statistics Efficiency Team Dynamics

Auburn's defense in SEC play has been dominant, holding most opponents to 65 points or fewer.
Bruce Pearl expressed early concerns about their defense, but now they are performing exceptionally well.
Auburn's defense leads the SEC in efficiency, field goal percentages, and turnover rate, showcasing a strong team commitment to defensive play.

Exactly what happens during the 2023-24 NBA season is exclusively revealed here today*

Marc Stein • 628 implied HN points • 24 Oct 23

🎾 Sports NBA Basketball Predictions Statistics Interview

The post reveals the results of a simulated 2023-24 NBA season.
The Strat-O Celtics win the championship in the simulation, with interesting outcomes and player performances.
The author also shares his predictions for MVP, Rookie of the Year, and other awards for the upcoming NBA season.

The Chicago Cubs Really Were Hurt by Playing Day Games

Something to Consider • 139 implied HN points • 09 May 24

🎾 Sports Baseball Statistics Team performance Game analysis Historical context

The Chicago Cubs had many daytime games which may have made them tired in the second half of the season. This could explain why they didn't perform as well later in the year.
The team only started playing night games in 1988, much later than other teams, which might have also hurt their performance.
Even today, the Cubs have fewer night games compared to other teams, and this could still affect their chances of winning.

Issues with Terminology and Statistics in Weight Science - Part 1

Weight and Healthcare • 459 implied HN points • 13 Dec 23

🏥 Health & Wellness Terminology Statistics Weight loss industry

The weight loss industry manipulates terminology to market weight loss as a treatment for obesity, leading to misconceptions and ineffective interventions.
The term 'weight-related conditions' is often used inaccurately to imply causation, ignoring confounding variables like weight stigma and healthcare disparities.
The concept of 'sustained weight loss' is sometimes misrepresented by the weight loss industry to imply success, when in reality, it often refers to temporary weight loss followed by regain.

Three Common Statistics Snafus in Weight Science

Weight and Healthcare • 718 implied HN points • 19 Apr 23

🏥 Health & Wellness Weight Science Statistics Research Body Weight

Repeated attempts at intentional weight loss can have decreasing odds of success, and weight cycling can lead to significant harm.
Just because a study result is statistically significant doesn't necessarily mean the effect is important or impactful.
Understanding the actual increase in risk percentage versus the absolute risk percentage is crucial in healthcare decision-making.

Machine learning never cheats but it may play flawed games

Mindful Modeler • 259 implied HN points • 27 Feb 24

🕹 Technology Machine Learning Data Analysis Statistics Data generation

Machine learning models may use shortcuts or exploit quirks in data, but it's important to consider them as playing the game according to the rules set by the data.
Detecting flaws in prediction games is crucial, as models can unintentionally learn and act on misleading information from the data.
Designing prediction games effectively requires a deep understanding of the data-generating process. Tools like sampling theory, design of experiments, and a statistical mindset can be valuable in shaping prediction tasks.

Nathan Yau's friendly voice

Alberto Cairo's The Art of Insight • 99 implied HN points • 29 May 24

🎨 Art & Illustration Data Visualization Infographics Statistics Graphic design Book Reviews

Nathan Yau is known for making data visualization fun and approachable, both in his blog and his book, 'Visualize This'.
The second edition of 'Visualize This' offers updated examples and tools, making it more cohesive than the first edition.
Reading Yau's work feels like getting hands-on help from an experienced designer, which makes learning enjoyable.

Self-indulgent anniversary post

Logging the World • 518 implied HN points • 04 Nov 23

📰 News COVID-19 Education Statistics Social media Government

The author reflects on their first year on Substack, the experience of a post going viral, and their content on COVID and other topics.
The post discusses the author's favorite non-COVID topics including a clever idea, an education policy, and the joys of walking.
The article highlights the impact of a post on Dominic Cummings boosting views, emphasizes the unpredictable nature of virality, and teases future discussions on the UK COVID Inquiry.

Natural selection: what's the effect size?

Wyclif's Dust • 2414 implied HN points • 07 Apr 23

🔬 Science Genetics Evolution Statistics

Many polygenic scores are significantly correlated with the number of children a person has, indicating a link between genetics and fertility.
The effect sizes of these correlations can be influenced by the accuracy of the polygenic scores, with noise potentially underestimating the true effects.
Improving polygenic scores and considering the impact of rare genetic variants are important for better understanding the relationship between genetics, fertility, and education.

Workshop announcement: Causal I

Scott's Substack • 334 implied HN points • 12 Jan 24

🚌 Education Causal Inference Statistics Teaching Learning

Workshop announcement for Causal Inference I starting on February 3rd.
Key topics covered in the workshop include potential outcomes and selection bias.
The importance of letting go of anger and bitterness and seeking human connection in New Year's resolutions.

America's Peacetime Conundrum

Max Meyer Blog • 569 implied HN points • 10 Sep 23

🏥 Health Politics Mental health Memorials Legislation Statistics

In 2022, there were no US soldier combat deaths but over 300 soldier suicides.
The number of veteran suicides has been consistently higher than the civilian rate, with rates increasing over time.
Efforts have been made to address veteran suicide, including anti-suicide legislation and national strategies, but the impact is still being observed.

The 9.3x Factor; Data Isn't Hate Speech

datahazard • 550 implied HN points • 12 May 23

🇺🇸 U.S. Politics Social Issues Crime Statistics Censorship Media

A Black person is 9.3x more likely to murder a White than a White person is to murder a Black.
Comparing murder rates between different population groups can lead to misleading conclusions.
It's important to consider more meaningful rates, like the 'Stereotype Rate', when analyzing murder statistics.
