The hottest Data Analysis Substack posts right now

And their main takeaways

Introducing the Model Memo

Artificial Ignorance • 25 implied HN points • 06 Mar 25

Several new advanced AI models have been released recently, improving reasoning and knowledge. These models, like OpenAI's GPT-4.5 and Google's Gemini 2.0, excel in different areas.
AI is becoming more interactive with features that let it browse the web and perform tasks for users. This shows a shift towards AI that can take action, not just chat.
The best AI models now cost more, with some requiring premium subscriptions. While powerful models like GPT-4.5 have high access fees, other new features may be available for free with some limits.

HN blogs - 1/11/24

HackerNews blogs newsletter • 59 implied HN points • 02 Nov 24

🕹 Technology Software Engineering Data Analysis Cybersecurity Programming

Measuring technical debt is crucial for leaders, especially CTOs. It helps in understanding and managing the challenges in software development.
Freezing CEO salaries during layoffs can create a fairer work environment. It shows accountability and may protect jobs for regular employees.
Life shouldn't solely be based on statistics. Everyone's experiences are unique and can't be fully represented by numbers.

MAMLM as a General Purpose Technology: The Ghost in the GDP Machine

Brad DeLong's Grasping Reality • 130 implied HN points • 24 Jun 25

🕹 Technology Artificial Intelligence Innovation Productivity Data Analysis Economic Impact

Big technology changes, like AI, often take longer to have an impact than we expect. History shows that these changes usually happen in small steps instead of all at once.
The way AI is being used in businesses is growing, with more companies starting to adopt these technologies. This can lead to higher productivity over time.
To really benefit from new technologies like AI, we need patience and creativity in our systems. The changes won't happen overnight, but it's important to stick with it.

BREAKING: The race is OFFICIALLY tied!

COVID Reason • 456 implied HN points • 25 Oct 24

🇺🇸 U.S. Politics Elections Polling Campaigns Public Opinion Data Analysis

The race between Harris and Trump is officially tied, with both having equal support in recent polls.
Polls show that results can vary slightly in different states but overall it's a close competition.
As the election approaches, these numbers highlight a very competitive environment for both candidates.

A use-theory of testing

arg min • 436 implied HN points • 24 Oct 24

🔬 Science Statistics Research Mathematics Data Analysis Theory

Statistical tests are designed to help separate real signals from random noise. It's not just about understanding what they mean, but what they can do in practical situations.
Many people misuse statistical tests, which can lead to misunderstandings about their purpose. Communities should establish clear guidelines on how to use these tests correctly.
The main function of statistical tests is to regulate opinions and decisions in various fields like tech and medicine. They help ensure that important standards are met, rather than just preventing errors.
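
As a rough illustration of separating signal from noise, the sketch below runs a permutation test on two small groups. The data, the function name `permutation_test`, and the choice of a permutation test are assumptions made for illustration; the post does not prescribe a specific test.

```python
import random

def permutation_test(a, b, n_perm=10_000, seed=0):
    """Two-sided permutation test for a difference in group means."""
    rng = random.Random(seed)
    observed = abs(sum(a) / len(a) - sum(b) / len(b))
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_perm):
        # Reassign group labels at random and recompute the difference.
        rng.shuffle(pooled)
        left, right = pooled[:len(a)], pooled[len(a):]
        diff = abs(sum(left) / len(left) - sum(right) / len(right))
        if diff >= observed:
            hits += 1
    # Add-one correction keeps the estimated p-value above zero.
    return (hits + 1) / (n_perm + 1)

# Hypothetical measurements, invented for this example.
treated = [5.1, 4.9, 5.6, 5.8, 6.0, 5.5]
control = [4.2, 4.0, 4.4, 4.1, 4.3, 3.9]
p_value = permutation_test(treated, control)
```

A small `p_value` means that random relabeling rarely produces a gap as large as the observed one. What a community does with that number is the regulatory question the post raises.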

Your Lab Tests

Ground Truths • 15921 implied HN points • 14 Dec 24

🏥 Health & Wellness Lab Tests Data Analysis

Your individual lab results, like the Complete Blood Count (CBC), can vary a lot between people but stay stable for you over time. This means your personal health data can give more accurate insights than just average values used for everyone.
Personalized reference values from CBC tests can help predict health risks better than conventional methods. They show clearer connections to potential diseases and can indicate specific health issues.
Using advanced technology like AI to analyze these personal health metrics could help doctors spot risks early. This approach can enhance patient care by identifying high-risk individuals for proactive health management.

Silver Bulletin pollster ratings, 2025 update

Silver Bulletin • 373 implied HN points • 17 Feb 25

🇺🇸 U.S. Politics Polling Elections Data Analysis Political trends Political Reporting

The latest pollster ratings show which pollsters are most accurate and transparent based on their past performances. This helps understand which ones might do well in future elections.
New data added to the ratings includes results from the 2024 presidential, congressional, and gubernatorial elections. Lots of new polls have shifted some ratings, but the top pollsters generally stayed the same.
They measure pollster accuracy using different ratings and scores that consider factors like bias toward political parties and how close their predictions were to actual results.
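
A minimal sketch of the two ideas named above, accuracy and partisan bias, might look like the following. The polls, pollster names, and the `score` function are hypothetical, and the actual Silver Bulletin ratings use a more involved method than this.

```python
from statistics import mean

# Hypothetical polls: (pollster, polled Dem-minus-Rep margin, actual margin), in points.
polls = [
    ("Pollster A", 3.0, 1.0),
    ("Pollster A", -2.0, -4.0),
    ("Pollster B", 1.0, 2.0),
    ("Pollster B", -3.0, -4.0),
]

def score(pollster):
    """Return average absolute error and signed bias for one pollster."""
    errors = [polled - actual for name, polled, actual in polls if name == pollster]
    return {
        # How far off the polls were, ignoring direction.
        "avg_abs_error": mean([abs(e) for e in errors]),
        # Positive means the polls leaned Democratic, negative Republican.
        "bias": mean(errors),
    }
```

In this made-up data, Pollster A misses by 2 points in the same direction each time, so its error and bias are both 2. Pollster B misses by 1 point in opposite directions, so its bias averages out to zero.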

What Is Statistics' Purpose?

arg min • 734 implied HN points • 14 Oct 24

🔬 Science Statistics Research Medicine Philosophy Data Analysis

Statistics should help us test claims by measuring how surprising the results are. However, there's doubt about whether our current statistical tests actually do this well.
Randomized trials are important because they help us learn about treatments that may not always work. They focus on safety as much as they do on finding effective solutions.
The field of statistics needs to be clear about its purpose. We should distinguish between using statistics for proving theories and for practical decision-making like quality control.

How did the polls do in 2024? It’s complicated.

Silver Bulletin • 312 implied HN points • 17 Feb 25

🇺🇸 U.S. Politics Polling Elections Public Opinion Data Analysis Political Bias

Polls in 2024 had a lower average error than in previous years, which shows improvement in their accuracy. However, most polls underestimated Republican candidates, particularly Trump.
There was a consistent bias in polls, leaning towards Democrats over the past three elections. This trend is concerning as it suggests a systematic issue with polling methods.
Polling accuracy in calling election winners was lower in 2024 compared to past years. Close races should be seen as uncertain, and small leads in polls don't mean much.

The Shape of Stats to Come

arg min • 634 implied HN points • 10 Oct 24

🚌 Education Statistics Optimization Philosophy Data Analysis Mathematics

Statistics often involves optimizing methods to get the best results. Many statistical techniques can actually be viewed as optimization problems.
Choosing a statistical method isn't just about the math—it's also based on beliefs about reality. This philosophical side is important but often overlooked.
There's a danger in relying too much on tools and models we can solve. Sometimes, we force the data to fit our preferred methods instead of being open to the actual complexities.
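
One standard illustration of statistics as optimization, not taken from the post itself: the mean is the value that minimizes squared error, and the median minimizes absolute error. The sketch below checks this with a simple grid search over made-up data.

```python
from statistics import mean, median

# Hypothetical sample with one outlier.
data = [1.0, 2.0, 2.0, 3.0, 10.0]

def squared_loss(c):
    return sum((x - c) ** 2 for x in data)

def absolute_loss(c):
    return sum(abs(x - c) for x in data)

# Candidate summary values 0.0, 0.1, ..., 12.0.
grid = [i / 10 for i in range(0, 121)]

best_squared = min(grid, key=squared_loss)    # lands on the mean
best_absolute = min(grid, key=absolute_loss)  # lands on the median
```

Choosing between the two losses is the kind of belief about reality the post describes: squared error lets the outlier pull the summary upward, while absolute error largely ignores it.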

ChatGPT in Shambles

Marcus on AI • 13161 implied HN points • 04 Feb 25

🕹 Technology Artificial Intelligence Machine Learning Natural Language Processing Data Analysis Software Development

ChatGPT still has major reliability issues, often providing incomplete or incorrect information, like missing U.S. states in tables.
Despite being advanced, AI can still make basic mistakes, such as counting vowels incorrectly or misunderstanding simple tasks.
Many claims about rapid progress in AI may be overstated, as even simple functions like creating tables can lead to errors.

Roche Nanopore: Accuracy

ASeq Newsletter • 7 implied HN points • 28 Feb 25

🔬 Science Biotechnology Genomics Data Analysis Medical Research

Roche's Q39 accuracy system is different from other platforms like Illumina and Oxford Nanopore. It's important to compare them carefully as each has unique metrics.
The average accuracy of different sequencing platforms varies, but Roche doesn't provide clear comparisons. They share limited data about their simplex accuracy.
Understanding the differences in data quality and error rates across platforms is crucial. Factors like read length and error filtering play a significant role in the accuracy of sequencing results.

AI analysis: ChatGPT vs Claude

Handy AI • 19 implied HN points • 29 Oct 24

🕹 Technology AI Data Analysis Machine Learning Software Development Information Technology

ChatGPT performed better in analyzing a Spotify dataset, providing accurate insights without errors, and displaying clear visualizations.
Claude encountered issues with text extraction and made mistakes in data interpretation, like incorrectly assigning genre labels where they didn't exist in the dataset.
Overall, ChatGPT offered a smoother user experience, allowing users to follow along with the analysis while Claude's process was less straightforward.

DeepSeek-V3: Technical Details

Gonzo ML • 252 implied HN points • 06 Feb 25

🕹 Technology Artificial Intelligence Machine Learning Computer Science Data Analysis Software Development

DeepSeek-V3 uses a technique called Multi-head Latent Attention, which saves memory and speeds up inference by compressing the attention key-value cache. This means it can handle longer contexts more efficiently.
The model incorporates an innovative approach called Multi-Token Prediction, allowing it to predict multiple tokens at once. This can improve its understanding of context and boost overall performance.
DeepSeek-V3 is trained using advanced hardware and new training techniques, including utilizing FP8 precision. This helps in reducing costs and increasing efficiency while still maintaining model quality.

SQL and Python Combined Change RevOps

beyondrevenueoperations • 19 implied HN points • 27 Oct 24

💼 Business Data Analysis Automation Technology Integration Strategic Planning

Combining SQL and Python makes data management much easier. SQL helps you access and pull data, while Python helps analyze it and create reports.
Using SQL, you can break down data silos from different systems to get a complete view of your customers and performance. This is crucial for making smart, data-driven decisions.
With Python, you can automate tasks, build predictive models, and visualize data, which saves time and enhances your ability to understand trends and insights.
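
The division of labor described above can be sketched with Python's built-in `sqlite3` module: SQL joins data from two systems, and Python summarizes it. The table names, columns, and figures are hypothetical stand-ins for a CRM and a billing system.

```python
import sqlite3
from statistics import mean

# In-memory database standing in for two separate systems.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE accounts (id INTEGER PRIMARY KEY, name TEXT, segment TEXT);
    CREATE TABLE invoices (account_id INTEGER, amount REAL);
    INSERT INTO accounts VALUES
        (1, 'Acme', 'enterprise'), (2, 'Birch', 'smb'), (3, 'Cedar', 'smb');
    INSERT INTO invoices VALUES
        (1, 1200.0), (1, 800.0), (2, 150.0), (3, 250.0);
""")

# SQL step: join the two sources and total revenue per account.
rows = conn.execute("""
    SELECT a.segment, a.name, SUM(i.amount) AS revenue
    FROM accounts AS a
    JOIN invoices AS i ON i.account_id = a.id
    GROUP BY a.id, a.segment, a.name
    ORDER BY a.id
""").fetchall()
conn.close()

# Python step: average account revenue within each segment.
by_segment = {}
for segment, name, revenue in rows:
    by_segment.setdefault(segment, []).append(revenue)

report = {segment: mean(values) for segment, values in by_segment.items()}
```

The same pattern should carry over to a production database by swapping the connection, with the Python half extended into automation or plotting as needed.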

OSD 330: The rise of sensorship

Open Source Defense • 28 implied HN points • 17 Jun 25

🕹 Technology Sensors Innovation Gadgets Measurement Data Analysis

Sensors help us understand and measure things better. The more accurate our sensors are, the more we can improve our products and practices.
In different fields, the use of sensors is at various stages. Some areas, like competition shooting, are advanced, while others, like non-lethal weapons, have much room for growth.
Using objective measurements can change our understanding of different situations. By having clear data, we can make better decisions and improve our overall knowledge.

China’s DeepSeek Adds a Weird New Data Point to The AI Race

Am I Stronger Yet? • 282 implied HN points • 30 Jan 25

🕹 Technology AI Models Machine Learning Data Analysis AI Research Competitor Analysis

DeepSeek's new AI model, r1, shows impressive reasoning abilities, challenging larger competitors despite its smaller budget and team. It proves that smaller companies can contribute significantly to AI advancements.
The cost of training r1 was much lower than similar models, potentially signaling a shift in how AI models might be developed and run in the future. This could allow more organizations to participate in AI development without needing huge budgets.
DeepSeek's approach, including releasing its model weights for public use, opens up the possibility for further research and innovation. This could change the landscape of AI by making powerful tools more accessible to everyone.

What Homes Are Missing, again?

Erdmann Housing Tracker • 231 implied HN points • 03 Feb 25

💼 Business Housing Economics Data Analysis Market Trends Population Growth

There is a significant shortage of homes in the U.S., estimated at around 15 million. This is due to various factors like vacancies and the rising number of adults per home.
Vacancies have dropped over the years, and we might be short about 5 million vacant units needed to keep rent inflation stable.
Population growth has slowed since 2008 and has likely affected housing demand, which adds pressure to the existing housing shortage.

“Nvidia could soon take a serious hit, too”

Marcus on AI • 4228 implied HN points • 27 Jan 25

🕹 Technology AI Market Trends Economics Tech Companies Data Analysis

Nvidia's stock might be facing a big drop, which is a concern for investors. A decline of more than 10% suggests that something significant is going on in the market.
The market can behave in unpredictable ways, and this uncertainty can be tough for investors to manage. Today might be a key moment in the stock market.
Overall, the economics of generative AI can lead to unexpected changes, making it a wild area to watch for investors and tech enthusiasts.

When will psychiatry invent the barcode?

Wood From Eden • 1344 implied HN points • 04 Dec 24

🏥 Health Politics Mental health Psychiatry Diagnosis Research Data Analysis

Psychiatry has a problem with labels. Many old labels have been removed without clear replacements, making research and understanding harder.
Using numbers instead of words could help describe a person's mental health better. A barcode-like system could show traits and abilities at a glance.
Psychology is subjective and changes over time. Collecting more data through tests can help improve understanding and research in mental health.

How many people died in disasters in 2024?

Sustainability by numbers • 211 implied HN points • 27 Jan 25

🌞 Climate & Environment Disasters Sustainability Data Analysis Public Health Environmental Impact

In 2024, fewer people died from disasters compared to previous years, thanks to fewer major earthquakes. The estimate was around 9,500 deaths, which is low compared to the high averages from past years.
Floods, wildfires, and storms were the main causes of deaths in 2024. Many fatalities came from extreme weather events, particularly flooding in Africa and wildfires in South America.
It's important to note that data on disaster deaths is often incomplete, especially for temperature-related deaths. Researchers have to estimate these numbers, leading to less reliable statistics overall.

The Most Major Hurricanes Ever

The Honest Broker Newsletter • 2973 implied HN points • 27 Jan 25

🌞 Climate & Environment Climate change Weather Patterns Natural Disasters Data Analysis

In 2024, there were a lot of major hurricanes, tying with 2015 for the highest since records began, which raises questions about climate patterns.
Despite the increase in hurricane landfalls, there hasn't been a clear trend showing that hurricanes are becoming more intense or frequent over time.
Experts believe that while human activity may influence hurricanes, detecting these changes amidst natural variability is very challenging.

COVID time series graphs show clearly the COVID vaccine kill people. That's why they keep the plots hidden from view.

Steve Kirsch's newsletter • 9 implied HN points • 11 Jun 25

🏥 Health Politics Public Health Vaccines Safety Data Analysis Medical Ethics

Time series graphs can show if a vaccine is safe or not by plotting daily deaths after vaccination. A safe vaccine should show a flat line after the initial period.
Current data for COVID vaccines shows increasing mortality rates after vaccination, which suggests they may not be safe. Many reports don’t show this data.
The medical community often ignores clear signs of vaccine risks, despite evidence appearing in graphs and reports, leading to frustration among those who analyze the data.

The new AI scaling law shell game

Marcus on AI • 4663 implied HN points • 24 Nov 24

🕹 Technology Artificial Intelligence Machine Learning Computing Data Analysis

Scaling laws in AI aren't as reliable as people once thought. They're more like general ideas that can change, rather than hard rules.
The new approach to scaling, which focuses on how long you train a model, can be costly and doesn't always work better for all problems.
Instead of just trying to make existing models bigger or longer-lasting, the field needs fresh ideas and innovations to improve AI.

A Pfizer shot. A failed heart. A transplant. Get vaxxed again?

RESCUE with Michael Capuzzo • 9787 implied HN points • 08 Jun 23

🏥 Health Politics Vaccination Data Analysis

John Berndsen's heart complications after receiving the Pfizer vaccine illustrate a potential link to myocarditis and the importance of questioning vaccine safety.
Many adverse reactions to COVID-19 vaccines are not being reported in the media, and the numbers show a significant impact on health, including deaths.
John Berndsen's experience highlights the importance of critically examining the safety and necessity of additional vaccine doses, especially for vulnerable individuals.

Polling is becoming more of an art than a science

Silver Bulletin • 214 implied HN points • 16 Jan 25

🇺🇸 U.S. Politics Polling Political trends Voter Behavior Data Analysis

Polling accuracy is becoming less predictable and more nuanced. Pollsters are feeling cautiously optimistic this time, although mistakes still happened in predicting election outcomes.
Pollsters are likely to stick with their current methods for 2026. Many have already adapted and believe the changes they've made are effective enough for now.
There is no single best way to conduct polls anymore. Different methods and tech are used by different polling organizations, which can lead to varied results.

TSMC v Intel in Two Charts

Tim Culpan’s Position • 119 implied HN points • 05 Sep 24

🕹 Technology Semiconductors Charts Data Analysis Market Trends

TSMC and Intel are two major players in the semiconductor industry. Their performance and strategies have crucial implications for technology.
Visual data can highlight important differences in the technical and financial health of these companies. Charts can make complex information easier to understand.
Recent reports show that Intel is facing significant challenges, while TSMC continues to lead in production and technology advancements. This could shape the future of the tech industry.

Deconstructing the Transformers ReAct JSON System Prompt

Encyclopedia Autonomica • 39 implied HN points • 13 Oct 24

🕹 Technology AI Software Machine Learning Data Analysis Development

The Transformers ReAct agent's system prompt tells the model to express its actions in JSON. This makes it easier to describe actions clearly and effectively.
The system prompt includes rules that the agent must follow, like focusing on one action at a time and using the correct values for inputs.
The design also emphasizes iterative reasoning, where the agent can build on previous observations to make better decisions in tasks.
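
The loop described above can be sketched in a few lines: parse one JSON action, run the matching tool, and keep the observation for the next step. The tool names, the `action` and `action_input` keys, and the `run_step` helper are assumptions for illustration, not the Transformers library's API.

```python
import json

# Hypothetical tools; a real agent would expose its own.
TOOLS = {
    "add": lambda a, b: a + b,
    "upper": lambda text: text.upper(),
}

def run_step(model_output, history):
    """Parse one JSON action, run the matching tool, and record the observation."""
    blob = json.loads(model_output)
    if not isinstance(blob, dict):
        raise ValueError("expected exactly one JSON action object")
    name = blob["action"]
    if name not in TOOLS:
        raise ValueError(f"unknown tool: {name}")
    # Input keys must match the tool's parameter names.
    observation = TOOLS[name](**blob["action_input"])
    history.append({"action": name, "observation": observation})
    return observation

history = []
first = run_step('{"action": "add", "action_input": {"a": 2, "b": 3}}', history)
# The next action builds on the earlier observation.
second = run_step(
    json.dumps({"action": "add", "action_input": {"a": first, "b": 10}}), history
)
```

Rejecting anything other than a single JSON object is one simple way to enforce the one-action-at-a-time rule.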

Conservatives Are Lying on Immigrant Crime

Richard Hanania's Newsletter • 3657 implied HN points • 07 Oct 24

🇺🇸 U.S. Politics Immigration Policy Crime Rates Public Discourse Political Rhetoric Data Analysis

Many people incorrectly believe that immigration leads to higher crime rates. In reality, data shows that most immigrants, especially legal ones, tend to commit less crime than native-born citizens.
Some politicians use scary language about immigrants increasing crime to push their agenda. This can create a false narrative that makes the public fearful and misinformed about the actual impact of immigration.
Immigrants often face more crime themselves and can actually help reduce crime rates in communities by starting businesses and contributing to the economy. So, they can serve as a buffer against crime rather than a cause of it.

Measuring Programmer Influence, Kinda Sorta

Software Design: Tidy First? • 1347 implied HN points • 27 Jan 25

🕹 Technology Software Development Data Analysis Programming Project management

Data can provide hints about a programmer's influence, but it can't give a clear answer. It's important to interpret the data with caution and avoid making strict decisions based solely on it.
Creating files is one way to measure initiation of influence, but it's not the only factor. The impact is also determined by how frequently those files are modified by others.
Using data for bonuses or promotions can lead to problems. It's better to focus on improvement and impact rather than just the numbers, to maintain a healthy team dynamic.
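
A rough sketch of the two signals mentioned above, files created and later modifications by other people, is shown below. The commit log and the `influence` function are hypothetical; the post's own analysis may differ in detail.

```python
from collections import Counter

# Hypothetical commit log in chronological order: (author, file touched).
commits = [
    ("ana", "core.py"),
    ("ben", "core.py"),
    ("cy", "core.py"),
    ("ben", "util.py"),
    ("ben", "util.py"),
    ("ana", "util.py"),
]

def influence(commits):
    """Map each file creator to (files created, later edits by other people)."""
    creator = {}
    created = Counter()
    touched_by_others = Counter()
    for author, path in commits:
        if path not in creator:
            # The first commit touching a file counts as creating it.
            creator[path] = author
            created[author] += 1
        elif author != creator[path]:
            touched_by_others[creator[path]] += 1
    return {a: (created[a], touched_by_others[a]) for a in created}

scores = influence(commits)
```

As the takeaways caution, numbers like these are hints at best, and this sketch ignores renames, file size, and why a file was changed.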

How to use ChatGPT in your PM work

Lenny's Newsletter • 5837 implied HN points • 11 Apr 23

🕹 Technology Product Management Artificial Intelligence Innovation Tools Data Analysis

Learning to work alongside AI will become necessary for knowledge work.
ChatGPT can be used for tasks like summarizing user feedback and coming up with product name suggestions.
Leveraging ChatGPT can help in strengthening arguments and inspiring roadmap ideas for product management.

Who Predicted 2023?

Astral Codex Ten • 8534 implied HN points • 05 Mar 24

🕹 Technology Forecasting Prediction Contest Algorithm Data Analysis

The Annual Forecasting Contest on astralcodexten.com asks participants to make predictions on a range of questions. The results help test whether identifiable individual forecasters or mathematically aggregated predictions are better at foreseeing the future.
The winners of the contest were both amateurs and seasoned forecasting veterans, showcasing a mix of skill and luck in predicting outcomes.
Metaculus outperformed prediction markets, superforecasters, and the wisdom of crowds in the contest, suggesting that consistent high performance might be rare but achievable with specific methods like those used by superforecaster Ezra Karger.

Do Masks Work?

Uncharted Territories • 5149 implied HN points • 28 Feb 23

🏥 Health & Wellness Public Health Research Masks Infections Data Analysis

The debate around mask efficacy is contentious and the science is complex.
Properly worn masks can reduce infection rates, especially when used in community settings.
Some studies in the meta-analysis may have been weighted inaccurately, resulting in misleading conclusions.

STFU

Klement on Investing • 4 implied HN points • 20 Jun 25

😂 Humor Social Commentary Anecdotes Data Analysis Gender differences

On average, women speak more words per day than men. Women use about 13,349 words while men use around 11,950 words daily.
As people age, how much they talk can change. Younger men and women talk similarly, but older men often become more talkative than older women.
Some people barely talk, while others can speak a ton, like 50,000 words a day. It's interesting to see such a big range in how much different people communicate.

🤘ACDC (not that one)

Gonzo ML • 63 implied HN points • 29 Jan 25

🕹 Technology Artificial Intelligence Machine Learning Neural Networks Data Analysis Automation

The paper introduces a method called ACDC that automates the process of finding important circuits in neural networks. This can help us better understand how these networks work.
Researchers follow a three-step workflow to study model behavior. ACDC fully automates the last step, which identifies the connections that matter for a specific task.
While ACDC shows promise, it isn't perfect. It may miss some important connections and needs adjustments for different tasks to improve its accuracy.

Public Universal Friend • 79 implied HN points • 02 Sep 24

💼 Business Marketing Customer Engagement Data Analysis Growth Strategies

Using a customer engagement platform like Customer.io can help marketers improve their targeting and maximize growth. It offers better data management and less need for technical support.
Spring is a great time for businesses to focus on improving conversions through digital marketing strategies. Real-time data can help companies get more return on their investment.
Personal connections and genuine interactions are valuable, even in business communication. Taking the time to show real interest can make a difference.

Meta seeks to hide harms from teens

Platformer • 2476 implied HN points • 10 Jan 24

🕹 Technology Social media Artificial Intelligence Regulation Data Analysis Digital Advertising

Meta announced new measures to protect users under 18 from harmful content on its platforms.
There is a growing focus on child safety in social media regulation, shifting away from speech-related issues.
Lawmakers and social networks need to find common ground to make real progress in improving teen mental health.

Independent SAGE has joined substack!

Independent SAGE continues • 1418 implied HN points • 20 Mar 24

🏥 Health Politics Public Health Pandemic response Vaccination Data Analysis

Independent SAGE has launched a Substack to share insights about Covid research and data. They aim to provide valuable information directly from experts to the public.
They plan to post updates roughly every two weeks, including responses to important new research and news. This helps keep everyone informed about the ongoing situation.
The Substack will remain free for subscribers, encouraging more people to stay updated on Covid developments and public health measures.

Measuring developer productivity: A clear-eyed view

Engineering Enablement • 21 implied HN points • 05 Feb 25

🕹 Technology Software Development Developer Experience Productivity Metrics Engineering Management Data Analysis

Metrics for developers should help improve their work experience, not just measure their output. Goodhart's Law reminds us that once metrics are tied to rewards, they can become misleading.
Developer experience is more about effectiveness than happiness. Measuring how developers feel needs to focus on the frustrations they face, and not just on making them comfortable.
Using benchmarks is important but context is key. Just like medical tests, numbers need interpretation to make sense; comparing different teams requires understanding their unique challenges.

Polygenic Risk Scores: Ready for Prime Time?

Ground Truths • 3980 implied HN points • 19 Feb 24

🏥 Health & Wellness Genetics Risk Assessment Prevention Data Analysis

Polygenic risk scores can provide valuable information on high genetic risk for diseases like heart disease and cancer, beyond traditional clinical risk factors.
The use of polygenic risk scores is advancing thanks to efforts like the eMERGE consortium, incorporating multi-ancestry data and rigorous validation.
Actionable polygenic risk scores have the potential to reduce health disparities and enhance preventive strategies in medical practice.