The hottest Data interpretation Substack posts right now

And their main takeaways

Announcing My New Column in The Washington Post

Thinking in Bets • 138 implied HN points • 01 Nov 24

🇺🇸 U.S. Politics Data interpretation

Annie Duke is starting a new opinion column in The Washington Post, focusing on risk and decision-making. She'll share insights on how we interpret important data.
The column will discuss the misleading nature of data interpretation, particularly regarding Black voters' support in elections. Duke argues that misinterpretations can be more harmful than misinformation.
Annie's background as a decision scientist and former poker player helps her analyze how people make choices, which she'll explore in her writing.

A way to think about scientism

Wyclif's Dust • 1609 implied HN points • 05 Jun 25

📖 Philosophy Data interpretation

Scientism can happen when researchers make general claims about science without considering the limits of their studies. It's important for scientists to recognize when their findings may not apply broadly.
Social scientists often use big concepts that sound scientific, but they sometimes fail to acknowledge the unique context of their studies. This can lead to misleading conclusions about complex issues.
The way some researchers present their findings may resemble 'cargo cult science,' where they follow scientific methods superficially but miss the deeper understanding needed for true insights. It's essential to connect the rigor of research with the actual realities of the world.

ICE is not the Lizardman

a newsletter for infovores. • 91 implied HN points • 26 Jan 26

🇺🇸 U.S. Politics Data interpretation

Don’t automatically write off odd poll responses as random bad-faith answers; surprising percentages can represent real opinions that matter politically.
Nontrivial shares of people—even inside expected groups—can hold hawkish or conspiratorial views, so small percentages can still equal large, consequential numbers.
Before dismissing a result, check the question wording, pollster credibility, timing, survey method, and whether other sources corroborate it to judge if it’s noise or a real signal.

Learning Not to Trust the All-In Podcast in Ten Minutes

Passing Time • 3816 implied HN points • 06 Nov 24

🎙 Podcasts Data interpretation

A major claim about government spending's role in GDP growth was proven incorrect with simple research. It turns out only about 30% of recent GDP growth was due to government spending, not the 85% stated.
The podcast hosts did not provide critical analysis or challenge each other's claims during the discussion, which raises concerns about their credibility.
It's important to verify information from sources you trust, especially when it comes to economic data, to avoid being misled.

7 perspectives on machine learning

Mindful Modeler • 279 implied HN points • 09 Apr 24

🕹 Technology Data interpretation

Machine learning is about building prediction models. This view covers a wide range of applications, but it fits unsupervised learning less well.
Machine learning is about learning patterns from data. This view is useful for understanding ML projects beyond just prediction.
Machine learning is automated decision-making at scale. It emphasizes the purpose of prediction, which is to facilitate decision-making.

How to deal with non-i.i.d data in machine learning

Mindful Modeler • 479 implied HN points • 09 Jan 24

🕹 Technology Data interpretation

Handling non-i.i.d. data properly in machine learning can prevent data leakage, overfitting, and overly optimistic performance evaluation.
For modeling data with dependencies, classical statistical approaches like mixed-effects models can be used to correctly estimate coefficients.
In non-i.i.d. data situations, the data splitting setup must align with the real-world use case of the model to avoid issues like row-wise leakage and over-optimistic model performance.
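One way to picture the row-wise leakage mentioned above is a group-wise split, where all rows from the same group go to either train or test, never both. The sketch below is only an illustration: the patient IDs, the `group_split` helper, and the assumption that the model will be used on unseen patients are hypothetical and not taken from the post.

```python
import random

def group_split(groups, test_fraction=0.25, seed=0):
    """Split row indices so every group lands entirely in train or in test."""
    unique_groups = sorted(set(groups))
    rng = random.Random(seed)
    rng.shuffle(unique_groups)
    n_test = max(1, round(len(unique_groups) * test_fraction))
    test_groups = set(unique_groups[:n_test])
    train_idx = [i for i, g in enumerate(groups) if g not in test_groups]
    test_idx = [i for i, g in enumerate(groups) if g in test_groups]
    return train_idx, test_idx

# Hypothetical data: 4 patients with 3 repeated measurements each.
patient_ids = ["p1"] * 3 + ["p2"] * 3 + ["p3"] * 3 + ["p4"] * 3

train_idx, test_idx = group_split(patient_ids)
train_patients = {patient_ids[i] for i in train_idx}
test_patients = {patient_ids[i] for i in test_idx}

# No patient appears on both sides, so the test score reflects unseen patients.
print(train_patients.isdisjoint(test_patients))
```

A plain row-wise shuffle would usually put measurements from the same patient on both sides, which is the kind of setup that can make performance look better than it would be in the real use case.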

Which was warmer: the 1930s or the last 10 years

The Climate Brink • 550 implied HN points • 15 May 23

🌞 Climate & Environment Data interpretation

Cherry-picking data can distort reality and mislead people.
Climate contrarians use cherry-picked facts to cast doubt on climate change.
Global data shows that current temperatures are higher than in the 1930s.

Stealing Signals, Week 8, Part 1

Stealing Signals • 439 implied HN points • 31 Oct 23

🎾 Sports Data interpretation

NFL teams may not give 100% effort in every game, for strategic reasons.
Watching games can give a big advantage in fantasy football over just looking at stats.
The first-read targets dataset may not accurately reflect offensive intentions in play calling and should be analyzed cautiously.

The Galactic Guide to SHAP Values

Mindful Modeler • 279 implied HN points • 25 Jul 23

🕹 Technology Data interpretation

In the post's universe analogy, SHAP values are like forces acting on a planet, which helps explain machine learning model predictions.
Each feature in a machine learning model contributes as a force, with SHAP values showing how much each one pushes the prediction.
SHAP values consider all forces together to keep the prediction in equilibrium, revealing which features matter most.
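The "equilibrium" idea can be made concrete with exact Shapley values on a tiny model: the per-feature contributions add up to the gap between the prediction and a baseline prediction. This is a minimal sketch under simplifying assumptions. The `toy_model`, the feature names, and the single all-zero baseline are hypothetical, and practical SHAP implementations average over background data and use approximations rather than enumerating every ordering.

```python
from itertools import permutations

def shapley_values(predict, instance, baseline):
    """Exact Shapley values: average each feature's marginal contribution
    over all feature orderings. Features not yet added keep their baseline value."""
    names = list(instance)
    totals = {name: 0.0 for name in names}
    orderings = list(permutations(names))
    for order in orderings:
        current = dict(baseline)
        previous = predict(current)
        for name in order:
            current[name] = instance[name]
            updated = predict(current)
            totals[name] += updated - previous
            previous = updated
    return {name: total / len(orderings) for name, total in totals.items()}

def toy_model(x):
    # Hypothetical model with one interaction term between "a" and "c".
    return 2.0 * x["a"] + 1.0 * x["b"] + 0.5 * x["a"] * x["c"]

instance = {"a": 1.0, "b": 2.0, "c": 4.0}
baseline = {"a": 0.0, "b": 0.0, "c": 0.0}

phi = shapley_values(toy_model, instance, baseline)
gap = toy_model(instance) - toy_model(baseline)

# The "forces" balance: contributions sum to prediction minus baseline.
print(phi, sum(phi.values()), gap)
```

In this toy case the interaction between `a` and `c` is shared equally between the two features, which is one reason a single feature's value cannot be read in isolation.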

Correlation Can Ruin Interpretability

Mindful Modeler • 479 implied HN points • 20 Sep 22

🔬 Science Data interpretation

Correlation between features can significantly impact the interpretability of machine learning models, both technically and philosophically.
Identifying and addressing correlation issues is crucial for accurate model interpretation. Techniques include grouping correlated features, decorrelation methods like PCA, feature selection, causal modeling, and conditional interpretation.
Entanglement of interpretation due to correlation makes it challenging to isolate the impact of individual features in machine learning models.

Coming soon

Mindful Modeler • 319 implied HN points • 08 Sep 22

🕹 Technology Data interpretation

Focus on better machine learning by thinking like a statistician.
Prioritize model interpretation, attention to data, and a critical mindset.
Stay tuned for more updates and insights on mindfulmodeler.substack.com.

Video Lecture: Data-Reality Gaps

FILWD • 39 implied HN points • 30 Jan 24

🚌 Education Data interpretation

Data-reality gaps exist when there is a disconnect between data representation and reality.
A data generation model helps identify gaps such as selection bias and interpretation gaps.
Understanding different gaps in data can lead to more accurate visualization and interpretation.

Improve Post-Hoc Interpretation By Leveraging Background Data

Mindful Modeler • 99 implied HN points • 21 Mar 23

🔬 Science Data interpretation

Use background data creatively in analysis by treating it as more than a nuisance for estimation.
Leverage background data to explore different scenarios, such as distribution shifts, feature effects in various data groups, and the stability of model predictions.
Background data plays a crucial role in model-agnostic interpretation methods like Shapley values and permutation feature importance, so selecting it deliberately can enhance the analysis.
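As a rough illustration of choosing background data on purpose, the sketch below computes permutation feature importance separately on two subgroups. Everything here is hypothetical: the `toy_model`, the `age` and `dose` features, and the age-50 cutoff are invented for the example and do not come from the post.

```python
import random

def mean_abs_error(model, rows, targets):
    return sum(abs(model(r) - t) for r, t in zip(rows, targets)) / len(rows)

def permutation_importance(model, rows, targets, feature, n_repeats=10, seed=0):
    """Average increase in error after shuffling one feature within `rows`.
    The rows passed in act as the background data."""
    rng = random.Random(seed)
    base_error = mean_abs_error(model, rows, targets)
    increases = []
    for _ in range(n_repeats):
        column = [r[feature] for r in rows]
        rng.shuffle(column)
        permuted = [{**r, feature: v} for r, v in zip(rows, column)]
        increases.append(mean_abs_error(model, permuted, targets) - base_error)
    return sum(increases) / n_repeats

def toy_model(row):
    # Hypothetical model: dose only matters for people aged 50 or older.
    return 1.0 + (0.5 * row["dose"] if row["age"] >= 50 else 0.0)

pairs = [(30, 1.0), (35, 2.0), (40, 3.0), (45, 4.0),
         (55, 1.0), (60, 2.0), (65, 3.0), (70, 4.0)]
rows = [{"age": age, "dose": dose} for age, dose in pairs]
targets = [toy_model(r) for r in rows]  # noise-free targets, for simplicity

young = [r for r in rows if r["age"] < 50]
old = [r for r in rows if r["age"] >= 50]

importance_young = permutation_importance(
    toy_model, young, [toy_model(r) for r in young], "dose")
importance_old = permutation_importance(
    toy_model, old, [toy_model(r) for r in old], "dose")

print(importance_young, importance_old)
```

The same feature looks irrelevant on one background set and important on the other, which is the kind of group-specific feature effect that a single pooled estimate would blur.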

Hospitals are not hiding Covid cases. There really isn't much Covid around right now. Be wary of those who say otherwise.

Independent SAGE continues • 19 implied HN points • 04 Apr 24

🏥 Health Politics Data interpretation

Currently, there are low levels of Covid in hospitals and the community. The data suggest that the situation is better than many people think.
Some claims about high Covid cases and hospitalizations are misleading. It's important to examine the evidence and context behind those claims.
Overall, the chances of getting severely sick from Covid are much lower now than before, thanks largely to vaccinations and improved immunity.

Correlation is not a percentage

inexactscience • 39 implied HN points • 22 Jul 23

🚌 Education Data interpretation

Correlation does not imply causation: two things can be related without one causing the other.
Many people mistakenly think the correlation coefficient is a percentage. This can be misleading and lead to wrong conclusions.
To understand how much one thing explains another, use the coefficient of determination, not the correlation. Squaring the correlation coefficient gives a clearer picture of how much of the variation is explained.
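A small worked example shows why a correlation of 0.8 should not be read as "80%". The data points below are made up for illustration, and the `pearson_r` helper is a plain textbook implementation rather than anything from the post.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical data chosen so the arithmetic is easy to follow.
xs = [1, 2, 3, 4, 5]
ys = [2, 1, 4, 3, 5]

r = pearson_r(xs, ys)
r_squared = r ** 2  # coefficient of determination for a simple linear fit

print(f"r = {r:.2f}, r^2 = {r_squared:.2f}")  # r = 0.80, r^2 = 0.64
```

Here a correlation of 0.80 corresponds to about 64% of the variance in `ys` being accounted for by a straight-line fit on `xs`, not 80%.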

Nonobody's Mathematical Bio-Pianolas

Cybernetic Forests • 19 implied HN points • 09 Jul 23

🔬 Science Data interpretation

The story explores the disconnect between data produced by the body and how machines interpret it, highlighting the complexities in translating and calibrating data.
It questions the dangers of misinterpreting brain activity as a linear flow of information, emphasizing the importance of understanding gaps when reconstructing signals.
The narrative offers a prescient warning about the misuse of automated statistical analysis systems to determine societal control based on physical characteristics, urging critical examination of the tools and notions used.

A Tourist's Guide to Datasets

Cybernetic Forests • 59 implied HN points • 04 Jul 21

🕹 Technology Data interpretation

Machines understand models of reality through data, influenced by what is deemed significant, leading to gaps and potential misinterpretations.
Datasets are contextual and not universally applicable, emphasizing the importance of clear documentation and awareness of data limitations.
Creating a 'Tourist's Guide to Datasets' with annotations and personal insights can enhance understanding and avoid misuse when data is reused for different purposes.

GRANTED: Having more thoughtful arguments and being a more thoughtful mentee

Granted • 19 implied HN points • 04 Aug 19

🚌 Education Data interpretation

Strive to be better, not the best. The competition should be with your past and future self.
Data doesn't really talk; people interpret it. Question the competence and integrity of data interpreters.
To be a good mentee, value your mentor's time, seek clear guidance, be open to ideas, and reflect on your progress.

Mean Reversion: Gravitational Super Force or Dangerous Delusion?

Musings on Markets • 0 implied HN points • 31 Aug 16

💰 Finance Data interpretation

Mean reversion is the idea that extreme results will return to the average over time. This is seen in sports and investing, but it can lead us to make wrong assumptions about future performance.
There are two types of mean reversion: time series mean reversion, which looks at past average values over time, and cross-sectional mean reversion, which compares values against the average of similar items. Both have their own risks and assumptions.
Structural changes in the economy or companies can disrupt mean reversion, meaning trusting it too much could lead to poor investment decisions. It's important to stay aware of these changes and not just rely on historical data.
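The two kinds of mean reversion can point in opposite directions for the same number. The sketch below uses invented values (think of them as a valuation ratio for one company and its peers); the figures and function names are hypothetical and only illustrate the distinction.

```python
from statistics import mean

def time_series_gap(own_history, current):
    """How far the current value sits from the item's own historical average."""
    return current - mean(own_history)

def cross_sectional_gap(peer_values, current):
    """How far the current value sits from the average of similar items today."""
    return current - mean(peer_values)

# Hypothetical values for one company and three peers.
own_history = [12.0, 14.0, 16.0, 18.0]
peer_values = [20.0, 22.0, 24.0]
current = 20.0

ts_gap = time_series_gap(own_history, current)
cs_gap = cross_sectional_gap(peer_values, current)

print(ts_gap, cs_gap)  # 5.0 -2.0
```

Relative to its own past the value looks high, while relative to peers it looks low, so "reverting to the mean" depends on which mean is assumed. Neither gap accounts for a structural change that shifts the average itself.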