The hottest Statistics Substack posts right now

And their main takeaways
Chartography • 98 implied HN points • 05 Oct 23
  1. Malcolm Gladwell's podcast series about guns focuses on narratives, nuance, and statistics.
  2. The modern advancement in trauma medicine impacts our perception of gun violence data.
  3. Access to trauma centers and quality healthcare play key roles in addressing gun violence disparities.
Mindful Modeler • 179 implied HN points • 24 Jan 23
  1. Understanding the fundamental difference between Bayesian and frequentist interpretations of probability is crucial for grasping uncertainty quantification techniques.
  2. Conformal prediction offers prediction regions with a frequentist interpretation, similar to confidence intervals in linear regression models.
  3. Conformal prediction shares similarities with the evaluation requirements and mindset of supervised machine learning, emphasizing the importance of separate calibration and ground truth data.
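A minimal sketch of split conformal prediction for regression may make the role of the separate calibration set concrete. It is an illustration, not the post's own code: the `predict` function stands in for an already-fitted model, the calibration data is simulated, and the names `conformal_quantile` and `prediction_interval` are hypothetical.

```python
import math
import random


def conformal_quantile(scores, alpha):
    """Return the ceil((n + 1) * (1 - alpha))-th smallest calibration score."""
    n = len(scores)
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        # Too few calibration points for this alpha: the interval is unbounded.
        return math.inf
    return sorted(scores)[k - 1]


def predict(x):
    # Hypothetical fitted model (assumption): y is roughly 2 * x.
    return 2.0 * x


# Simulated calibration data, kept separate from whatever trained the model.
random.seed(0)
calibration = []
for _ in range(500):
    x = random.uniform(0, 10)
    calibration.append((x, 2.0 * x + random.gauss(0, 1)))

# Nonconformity score: absolute residual on each calibration point.
scores = [abs(y - predict(x)) for x, y in calibration]
q = conformal_quantile(scores, alpha=0.1)


def prediction_interval(x):
    """Interval that targets about 90% coverage for a new, exchangeable point."""
    center = predict(x)
    return center - q, center + q
```

The interval width comes entirely from held-out residuals, which is why the calibration data must not be reused for fitting.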
The Good Science Project • 122 implied HN points • 26 Jan 25
  1. Top scientific journals sometimes have trouble understanding basic statistics. This can lead to confusion and errors that affect research outcomes.
  2. A recent case showed that reviewing a paper could involve contradictory requests, like asking for a post-hoc power analysis, which is generally not helpful after results are already obtained.
  3. Researchers should not rely solely on journal editors for correct statistical advice. The system needs to improve how it addresses statistical issues in published studies.
Mindful Modeler • 219 implied HN points • 25 Oct 22
  1. The mindset of the modeler significantly influences the use and interpretation of models.
  2. There are various modeling mindsets such as frequentist inference, Bayesian inference, causal inference, and supervised machine learning, all of which can lead to the same final model.
  3. Different tasks require different modeling mindsets, and being well-versed in multiple mindsets can be beneficial for a data scientist.
Jeff-alytics • 78 implied HN points • 30 Jun 23
  1. Murder rates are decreasing in big and small cities across the US.
  2. Guesstimating the national murder trend is challenging due to lack of standardized reporting processes.
  3. Leading indicators, like the Gun Violence Archive, suggest a potential 8-10% decline in national murders for 2023.
The Software & Data Spectrum • 78 implied HN points • 13 Apr 23
  1. Bayesian Statistics is used in various fields like Machine Learning, Engineering, Data Science, and more.
  2. Bayesian Thinking involves observing data, holding prior beliefs, forming hypotheses, gathering evidence, and comparing hypotheses.
  3. Probability is a way to measure belief strength, and calculating probabilities involves counting outcomes and using ratios of beliefs.
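As a rough illustration of comparing hypotheses by ratios of belief, the sketch below applies Bayes' rule to two hypotheses about a coin. The hypotheses, the prior, and the likelihood values are made-up assumptions for the example, and `posterior` is a hypothetical helper name.

```python
from fractions import Fraction


def posterior(prior, likelihood):
    """Apply Bayes' rule over a discrete set of hypotheses."""
    unnormalized = {h: prior[h] * likelihood[h] for h in prior}
    total = sum(unnormalized.values())
    return {h: weight / total for h, weight in unnormalized.items()}


# Prior beliefs (assumption): both hypotheses are equally plausible.
prior = {"fair": Fraction(1, 2), "biased": Fraction(1, 2)}

# Evidence: one observed head. Assumed P(heads | fair) = 1/2, P(heads | biased) = 3/4.
likelihood_heads = {"fair": Fraction(1, 2), "biased": Fraction(3, 4)}

post = posterior(prior, likelihood_heads)
# Exact fractions: post["fair"] is 2/5 and post["biased"] is 3/5.
```

Using `Fraction` keeps the arithmetic exact, so the posterior can be read directly as a ratio of beliefs.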
Nerology • 142 implied HN points • 29 Oct 24
  1. The project turns election predictions into real newspaper headlines, making stats feel more concrete. Each data point in the simulations gets a corresponding news story.
  2. Using a script, detailed election results from states can be generated, summarizing victories and close races. This gives journalists useful info to write about.
  3. AI tools were utilized to create news articles and images, making the project visually appealing and engaging. The tech helps bring the election outcomes to life with visuals and compelling stories.
The DisInformation Chronicle • 270 implied HN points • 12 Feb 24
  1. A group of virologists, including Anthony Fauci, may have intentionally diverted attention away from a possible lab accident in Wuhan at the start of the pandemic.
  2. An analysis published in a British science journal has found that the Science Magazine study advocating for the market origin of COVID is based on flawed statistics, contradicting the claims made in the study.
  3. While American media has largely ignored the analysis questioning Science Magazine's study, German journalists, like those from the weekly science magazine Spektrum, have reported on it.
Mindful Modeler • 159 implied HN points • 29 Nov 22
  1. Causal inference can be hard to get started with because of obstacles such as the diversity of approaches and neglected education on the topic.
  2. Understanding causal inference involves adjusting your modeling mindset to view it as a unique approach rather than just adding a new model.
  3. Key insights for causal inference include the importance of directed acyclic graphs, starting from a causal model, and the challenges of estimating causal effects from observational data.
12challenges • 257 implied HN points • 01 Mar 24
  1. The diagram shows how much social media has changed over the last 20 years, with a shift towards platforms like TikTok.
  2. The idea of using the diagram as a menu to choose preferred social media options is intriguing, revealing possible disparities in usage.
  3. The author seeks suggestions to improve the diagram's presentation and structure, anticipating future articles about social media platforms.
Ladyparts • 239 implied HN points • 15 Jun 22
  1. We need to openly discuss and destigmatize STDs like herpes and HPV to prevent further spread and promote honesty in relationships.
  2. Many people do not disclose their STD status to their partners, highlighting the importance of getting tested and being honest in relationships.
  3. Sexually transmitted infections are increasing among older adults, emphasizing the importance of prioritizing safe sex practices at any age.
David Friedman’s Substack • 260 implied HN points • 29 Jan 24
  1. Words like 'exponential' and 'organic' are commonly misused with meanings different from their actual definitions.
  2. Terms like 'guarantee' and 'literally' are often used incorrectly, causing confusion in communication.
  3. Understanding technical terms like 'statistically significant' is crucial to avoid misinterpretation in discussions.
Cybernetic Forests • 59 implied HN points • 02 Jul 23
  1. Language can be seen as a dynamic city, shaped by collective contributions that form its intricate structure.
  2. Generative AI models, like GPT-4, rely on statistics and random selection to produce text, often betraying a lack of true understanding.
  3. Human communication involves a choice between shallow, statistically-driven speech, like that of machines, and deeper, intent-driven speech that seeks to convey personal truths.
Chartography • 58 implied HN points • 18 Jul 23
  1. A seminar by RJ Andrews on data visualization is happening this Thursday at the American Statistical Association.
  2. Readers can join the virtual tour of spectacular information graphics by registering for the ASA seminar.
  3. The American Statistical Association has a rich history in data visualization, featuring leaders like Florence Nightingale.
Technology Made Simple • 59 implied HN points • 14 Mar 23
  1. Analyzing the distribution of your data is crucial for accurate results: it helps in choosing the right statistical tests, identifying outliers, and validating data collection systems.
  2. Common techniques to analyze data distribution include histograms, boxplots, quantile-quantile plots, descriptive statistics, and statistical tests like Shapiro-Wilk or Kolmogorov-Smirnov.
  3. Common mistakes in analyzing data distribution include ignoring or dropping outliers, using the wrong statistical test, and not visualizing data to identify patterns and trends.
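A small sketch of the descriptive-statistics step, using only Python's standard library. It flags outliers with the 1.5 × IQR rule that boxplots conventionally use, rather than dropping them. The sample values and the `describe` helper are hypothetical; formal tests such as Shapiro-Wilk would need a third-party library and are not shown.

```python
import statistics


def describe(data):
    """Summarize a sample and flag points outside the 1.5 * IQR fences."""
    q1, _median, q3 = statistics.quantiles(data, n=4)
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return {
        "mean": statistics.mean(data),
        "median": statistics.median(data),
        "stdev": statistics.stdev(data),
        # Outliers are reported for inspection, not silently removed.
        "outliers": [x for x in data if x < low or x > high],
    }


# Made-up sample with one extreme value.
sample = [12, 13, 13, 14, 15, 15, 16, 17, 18, 95]
summary = describe(sample)
```

A mean far above the median, as in this sample, is a quick hint that the distribution is skewed and that tests assuming normality may be a poor fit.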
Trench Warfare • 59 implied HN points • 05 Oct 23
  1. The author created True Pressure Score (TPS) and Pressure Quality Ratio (PQR) to analyze pass-rushers' effectiveness.
  2. The top 10 pass-rushers are ranked based on True Pressure Score (TPS) and Pressure Quality Ratio (PQR).
  3. Myles Garrett has an exceptional Pressure Quality Ratio (PQR) of 16.0, showing high-quality pressures.
inexactscience • 39 implied HN points • 16 Nov 23
  1. When people get more information, they often underreact instead of overreact. This means they might ignore new data instead of properly adjusting their predictions.
  2. Experiments showed that when faced with two variables, people made less accurate forecasts. Adding complexity actually made their predictions worse.
  3. Having clear instructions and understanding of the information really helps improve decision-making. If people are confused, they tend to ignore important details.
A Piece of the Pi: mathematics explained • 72 implied HN points • 04 Dec 24
  1. The game of Chutes and Ladders is a fun example of a Markov chain. It shows how the next move depends only on where you are now, not on how you got there.
  2. There are different types of game boards, some allow for winning while others can trap players forever. Ultimately winnable boards guarantee that a player can reach the end if they keep playing.
  3. On average, players need about 39 spins to win the game, and surprisingly, most random boards created will still offer a winning chance.
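The Markov property described above can be sketched with a short simulation. The 20-square board, its jumps, and the rule that an overshooting spin leaves the player in place are all assumptions for illustration. This is not the real game board, so the average here will not match the figure of about 39 spins.

```python
import random


def spins_to_finish(jumps, last_square, sides, rng):
    """Play one game; the next position depends only on the current one."""
    position, spins = 0, 0
    while position != last_square:
        spins += 1
        target = position + rng.randint(1, sides)
        if target <= last_square:  # assumption: overshooting means staying put
            # Follow a ladder or chute if the target square has one.
            position = jumps.get(target, target)
    return spins


# Hypothetical board: two ladders (3 -> 11, 6 -> 17) and two chutes (14 -> 4, 19 -> 8).
jumps = {3: 11, 6: 17, 14: 4, 19: 8}
rng = random.Random(42)
games = [spins_to_finish(jumps, 20, 6, rng) for _ in range(10_000)]
average_spins = sum(games) / len(games)
```

Because `spins_to_finish` keeps no history beyond `position`, each loop iteration is one step of a Markov chain whose states are the squares.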
Data Science Weekly Newsletter • 99 implied HN points • 27 Jan 23
  1. Exploratory programming is important for data teams. It helps them find insights rather than just building software.
  2. Most datasets are not normally distributed, and there are many tests to check this but they can be tricky to use.
  3. AI is gaining a lot of attention, similar to what crypto once had. People are questioning if it can keep that interest alive.
The Cosmopolitan Globalist • 17 implied HN points • 01 Aug 25
  1. In the early 1930s, Stalin attacked the idea of accurate data and statistics. If data showed problems, he blamed the people reporting it.
  2. Stalin's regime would punish statisticians who reported bad news, which led to fear and manipulation of information.
  3. The focus on false data meant that real issues, like famine and crop failures, were ignored or hidden, making it hard to understand the true state of the country.
SaaS Engineering • 39 implied HN points • 02 Mar 23
  1. Averages like mean, median, and mode help us summarize and understand groups of data.
  2. Using the correct type of average is important to accurately represent the data, like using median for ranking or mode for most common occurrences.
  3. In scenarios like evaluating investment portfolios, understanding the median progress and how it relates to the future mean outcome is crucial for decision-making.
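To see how the three averages can disagree, here is a brief sketch using Python's `statistics` module on a made-up, skewed sample. The numbers are purely illustrative.

```python
import statistics

# Hypothetical values (for example, deal sizes) with one very large entry.
values = [30, 32, 35, 35, 40, 45, 400]

mean_value = statistics.mean(values)      # pulled upward by the large entry
median_value = statistics.median(values)  # middle value of the sorted sample
mode_value = statistics.mode(values)      # most frequently occurring value
```

Here the median and mode both land on 35 while the mean is more than twice that, which is why the choice of average matters for skewed data.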
Magid and Co • 39 implied HN points • 24 Jul 23
  1. The post shares data on Series A deals done in the last week of July 2023.
  2. The summary stats provide information on Series A deals worldwide, excluding China, where the amount raised is over $5M, and the company is not focused on therapeutics.
  3. The post encourages readers to subscribe for free to receive new posts and support the author's work.
A Biologist's Guide to Life • 58 implied HN points • 23 Dec 24
  1. There are two main theories about the origin of SARS-CoV-2: one is that it came from animal trade, and the other is that it originated in a lab. Each theory has its own set of details that scientists are still investigating.
  2. Understanding the origins of the virus requires knowledge of both biology and complex statistical methods. These methods help researchers weigh the evidence carefully, which is crucial for arriving at the most likely explanation.
  3. The evidence increasingly suggests that the virus may have come from a lab, especially noting the features like the furin cleavage site that were put into a reverse genetic system. This raises important questions about how we study viruses and their potential risks.
A Piece of the Pi: mathematics explained • 48 implied HN points • 03 Feb 25
  1. Bottlenecks in networks are crucial points that can slow down communication or movement. Identifying these points helps understand how the entire network functions.
  2. Networks can be made up of different regions that are linked by these bottlenecks. Recognizing connections between these regions is important for overall analysis.
  3. Knowing where the bottlenecks are can help improve the efficiency of networks, whether in transportation or social connections. This can lead to better planning and resource allocation.
Pryor Questions • 186 implied HN points • 16 Sep 23
  1. The average number of sexual partners for men and women can vary depending on the type of average used, such as mean, median, or mode.
  2. Surveys on sexual partners may be influenced by social biases, leading to discrepancies in reported numbers between genders.
  3. Different studies and surveys show conflicting data on the average number of sexual partners for men and women, indicating the complexity of capturing such personal and varied experiences.
Mindful Modeler • 59 implied HN points • 14 Feb 23
  1. Conformal prediction can be combined with any uncertainty quantification method you already use, making it versatile and not restrictive.
  2. Conformal prediction is model-agnostic, meaning you can implement it without changing your existing models or user interface.
  3. One of the key advantages of conformal prediction is its coverage guarantee for the true outcome, making it a practical and useful addition to predictive modeling.
inexactscience • 39 implied HN points • 15 Jul 23
  1. Elo ratings are used to compare the strength of players, particularly in chess. They help predict the outcome of games based on the players' ratings.
  2. The formula for updating Elo ratings takes into account the expected score of a player and the actual outcome of a game. If the outcome is surprising, the rating changes more significantly.
  3. Elo ratings can also be applied beyond chess to other areas, like ranking items or comparing performance in various fields, showing their versatility as a simple yet effective system.
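The update described above can be sketched with the standard Elo formulas. The K-factor of 32 and the example ratings are assumptions (different organizations use different K values), and the function names are illustrative.

```python
def expected_score(rating_a, rating_b):
    """Expected score of player A against player B on the usual 400-point scale."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400))


def update_rating(rating_a, rating_b, score_a, k=32):
    """New rating for A after scoring score_a (1 win, 0.5 draw, 0 loss)."""
    return rating_a + k * (score_a - expected_score(rating_a, rating_b))


# Evenly matched players: a win is mildly surprising.
even_gain = update_rating(1500, 1500, 1) - 1500

# Underdog beats a much stronger player: a bigger surprise, so a bigger change.
upset_gain = update_rating(1300, 1700, 1) - 1300
```

The rating change is proportional to the gap between the actual and expected score, so an upset moves the rating more than an expected result does.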
A Piece of the Pi: mathematics explained • 36 implied HN points • 21 Feb 25
  1. Dimer tilings involve arranging domino-shaped pieces on grids, and how many ways you can arrange them can vary based on the layout. For example, on a 3x3 grid with one space empty, there are 18 different arrangements.
  2. If at least one dimension of a rectangle is even, it's possible to cover it completely with dimers. However, if both dimensions are odd, it's impossible to cover them without leaving gaps.
  3. There are mathematical patterns and theorems, like Gomory's Theorem, that help understand how to tile grids with dimers. These principles can show when tiling is possible based on the arrangement and color of squares.
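The counts above can be checked with a small brute-force sketch. It is a plain backtracking counter, not the method from the post, and `count_tilings` is a hypothetical name. Summing over every possible position of the single empty square on a 3x3 grid gives the 18 arrangements mentioned.

```python
def count_tilings(rows, cols, blocked=frozenset()):
    """Count the ways to cover all unblocked cells with 1x2 dominoes."""
    filled = [[(r, c) in blocked for c in range(cols)] for r in range(rows)]

    def fill():
        # Find the first uncovered cell in row-major order. Every cell before
        # it is covered, so its domino must extend right or down.
        for r in range(rows):
            for c in range(cols):
                if filled[r][c]:
                    continue
                total = 0
                filled[r][c] = True
                if c + 1 < cols and not filled[r][c + 1]:
                    filled[r][c + 1] = True
                    total += fill()
                    filled[r][c + 1] = False
                if r + 1 < rows and not filled[r + 1][c]:
                    filled[r + 1][c] = True
                    total += fill()
                    filled[r + 1][c] = False
                filled[r][c] = False
                return total
        return 1  # nothing left to cover: one complete tiling found

    return fill()


# One empty square on a 3x3 grid, summed over all nine positions of that square.
three_by_three_total = sum(
    count_tilings(3, 3, frozenset({(r, c)})) for r in range(3) for c in range(3)
)
```

Removing an edge-midpoint square gives zero tilings because the remaining squares are unbalanced in checkerboard color, which is the kind of coloring argument the third point refers to.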
startupdreams • 105 implied HN points • 08 Mar 24
  1. BLS job numbers are consistently revised downward after initial high estimates, indicating potential inaccuracies in reporting.
  2. Comparisons between BLS and ADP job reports show contrasting trends in new job creation, causing skepticism about the accuracy of BLS data.
  3. Analysis of full-time and part-time job numbers over a year reveals concerning trends, like growth in part-time jobs rather than full-time jobs.
Cremieux Recueil • 138 implied HN points • 05 Oct 23
  1. Italy is facing challenges with a wave of migrants, with significant numbers arriving.
  2. Different regions in Italy are receiving migrants at varying rates, with the south taking in fewer.
  3. The number of unaccompanied minor migrants has been substantial, and projections suggest a continuing increase in migrant arrivals.
Splattern • 19 implied HN points • 09 Dec 23
  1. 54% of Americans aged 16 to 74 read below a 6th grade level. This shows a big gap in literacy skills that affects people's ability to understand important documents.
  2. In 2023, there were almost 2.5 million encounters at the US-Mexico border. More migrants are coming from Central and South America than ever before.
  3. 70% of Jewish students at MIT feel they have to hide who they are due to fear. There's a lot of tension on campus, and it raises questions about how universities are handling such issues.
The Palindrome • 4 implied HN points • 11 Nov 25
  1. Using real data helps you understand the real-world quirks and problems that simulations can't show. It's like learning to drive in a car instead of a video game.
  2. Real data can reveal hidden patterns and insights about how things work, giving you a better chance to discover new information.
  3. Cleaning and transforming your data is crucial for accurate analysis. You need to tackle issues like outliers and non-normal distributions to get reliable results.
Technology Made Simple • 39 implied HN points • 06 Dec 22
  1. Understanding the Bias-Variance Tradeoff is crucial in Data Science and Machine Learning.
  2. Bias in a Machine Learning Model refers to systematic prediction error, while Variance accounts for the spread in predictions across different training sets.
  3. High Bias can lead to underfitting, where the model doesn't grasp the data pattern fully, while High Variance can result in overfitting, where the model learns noise in the data.
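A toy simulation can make the tradeoff tangible. Everything here is an assumption chosen for illustration: the true function, the noise level, and the two deliberately extreme models (a constant predictor that underfits and a 1-nearest-neighbor predictor that chases noise). Squared bias and variance are estimated at a single test point over many resampled training sets.

```python
import random
import statistics


def true_f(x):
    # Assumed ground-truth relationship for the simulation.
    return x * x


def sample_training_set(rng, n=30, noise=0.3):
    xs = [rng.random() for _ in range(n)]
    return [(x, true_f(x) + rng.gauss(0, noise)) for x in xs]


def constant_model(train, x0):
    # Ignores x entirely: too simple, so it underfits.
    return statistics.mean(y for _, y in train)


def nearest_neighbor_model(train, x0):
    # Copies the noisy label of the closest training point: it fits noise.
    return min(train, key=lambda point: abs(point[0] - x0))[1]


def bias_sq_and_variance(model, x0, rng, repeats=500):
    """Estimate squared bias and variance of a model's prediction at x0."""
    preds = [model(sample_training_set(rng), x0) for _ in range(repeats)]
    bias_sq = (statistics.mean(preds) - true_f(x0)) ** 2
    return bias_sq, statistics.pvariance(preds)


rng = random.Random(1)
const_bias_sq, const_var = bias_sq_and_variance(constant_model, 0.9, rng)
nn_bias_sq, nn_var = bias_sq_and_variance(nearest_neighbor_model, 0.9, rng)
```

Under these assumptions the constant model shows the larger squared bias and the nearest-neighbor model the larger variance, mirroring the underfitting and overfitting cases.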