The hottest Data Analysis Substack posts right now

And their main takeaways
Artificial Ignorance 25 implied HN points 06 Mar 25
  1. Several new advanced AI models have been released recently, improving reasoning and knowledge. These models, like OpenAI's GPT-4.5 and Google's Gemini 2.0, excel in different areas.
  2. AI is becoming more interactive with features that let it browse the web and perform tasks for users. This shows a shift towards AI that can take action, not just chat.
  3. The best AI models now cost more, with some requiring premium subscriptions. While powerful models like GPT-4.5 have high access fees, other new features may be available for free with some limits.
HackerNews blogs newsletter 59 implied HN points 02 Nov 24
  1. Measuring technical debt is crucial for leaders, especially CTOs. It helps in understanding and managing the challenges in software development.
  2. Freezing CEO salaries during layoffs can create a fairer work environment. It shows accountability and may protect jobs for regular employees.
  3. Life shouldn't solely be based on statistics. Everyone's experiences are unique and can't be fully represented by numbers.
Brad DeLong's Grasping Reality 130 implied HN points 24 Jun 25
  1. Big technology changes, like AI, often take longer to have an impact than we expect. History shows that these changes usually happen in small steps instead of all at once.
  2. The way AI is being used in businesses is growing, with more companies starting to adopt these technologies. This can lead to higher productivity over time.
  3. To really benefit from new technologies like AI, we need patience and creativity in our systems. The changes won't happen overnight, but it's important to stick with it.
arg min 436 implied HN points 24 Oct 24
  1. Statistical tests are designed to help separate real signals from random noise. It's not just about understanding what they mean, but what they can do in practical situations.
  2. Many people misuse statistical tests, which can lead to misunderstandings about their purpose. Communities should establish clear guidelines on how to use these tests correctly.
  3. The main function of statistical tests is to regulate opinions and decisions in various fields like tech and medicine. They help ensure that important standards are met, rather than just preventing errors.
Ground Truths 15921 implied HN points 14 Dec 24
  1. Your individual lab results, like the Complete Blood Count (CBC), can vary a lot between people but stay stable for you over time. This means your personal health data can give more accurate insights than just average values used for everyone.
  2. Personalized reference values from CBC tests can help predict health risks better than conventional methods. They show clearer connections to potential diseases and can indicate specific health issues.
  3. Using advanced technology like AI to analyze these personal health metrics could help doctors spot risks early. This approach can enhance patient care by identifying high-risk individuals for proactive health management.
Silver Bulletin 373 implied HN points 17 Feb 25
  1. The latest pollster ratings show which pollsters are most accurate and transparent based on their past performances. This helps understand which ones might do well in future elections.
  2. New data added to the ratings includes results from the 2024 presidential, congressional, and gubernatorial elections. Lots of new polls have shifted some ratings, but the top pollsters generally stayed the same.
  3. They measure pollster accuracy using different ratings and scores that consider factors like bias toward political parties and how close their predictions were to actual results.
arg min 734 implied HN points 14 Oct 24
  1. Statistics should help us test claims by measuring how surprising the results are. However, there's doubt about whether our current statistical tests actually do this well.
  2. Randomized trials are important because they help us learn about treatments that may not always work. They focus on safety as much as they do on finding effective solutions.
  3. The field of statistics needs to be clear about its purpose. We should distinguish between using statistics for proving theories and for practical decision-making like quality control.
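One way to make "measuring how surprising the results are" concrete is an exact permutation test. This is a minimal sketch, not the method discussed in the post: the function name and the toy data are assumptions chosen for illustration. It asks what fraction of all possible regroupings of the pooled data produce a gap between group means at least as large as the observed one.

```python
from itertools import combinations

def permutation_p_value(a, b):
    """Exact two-sided permutation test on the difference in group means.

    Hypothetical helper for illustration; practical only for small samples,
    because it enumerates every way to split the pooled data.
    """
    pooled = list(a) + list(b)
    n_a, n_b = len(a), len(b)
    total = sum(pooled)
    observed = abs(sum(a) / n_a - sum(b) / n_b)

    splits = 0
    at_least_as_extreme = 0
    for idx in combinations(range(len(pooled)), n_a):
        sum_a = sum(pooled[i] for i in idx)
        diff = abs(sum_a / n_a - (total - sum_a) / n_b)
        splits += 1
        # Small tolerance so floating-point noise does not hide exact ties.
        if diff >= observed - 1e-12:
            at_least_as_extreme += 1
    return at_least_as_extreme / splits

# Toy data (made up): two clearly separated groups of three values each.
p = permutation_p_value([1, 2, 3], [10, 11, 12])
print(p)  # 2 of the 20 possible splits are this extreme
```

A small value means few regroupings look as extreme as the observed data. Whether that should settle a claim is exactly the question the post raises.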
Silver Bulletin 312 implied HN points 17 Feb 25
  1. Polls in 2024 had a lower average error than in previous years, which shows improvement in their accuracy. However, most polls underestimated Republican candidates, particularly Trump.
  2. There was a consistent bias in polls, leaning towards Democrats over the past three elections. This trend is concerning as it suggests a systematic issue with polling methods.
  3. Polling accuracy in calling election winners was lower in 2024 compared to past years. Close races should be seen as uncertain, and small leads in polls don't mean much.
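The three quantities in these takeaways (average error, partisan bias, and share of winners called) can be told apart with a short calculation. The sketch below uses made-up margins and hypothetical function names. It is not Silver Bulletin's rating formula, only the basic arithmetic behind the terms.

```python
# Hypothetical data: margins are Democrat minus Republican, in points.
# These numbers are illustrative only and do not come from real polls.
polls = [
    {"poll_margin": 2.0, "actual_margin": -1.0},
    {"poll_margin": -1.0, "actual_margin": -3.0},
    {"poll_margin": 4.0, "actual_margin": 5.0},
    {"poll_margin": 0.0, "actual_margin": -2.0},
]

def signed_errors(rows):
    """Poll margin minus actual margin; positive means the poll leaned Democratic."""
    return [r["poll_margin"] - r["actual_margin"] for r in rows]

def average_error(rows):
    """Mean absolute miss, regardless of direction."""
    errs = signed_errors(rows)
    return sum(abs(e) for e in errs) / len(errs)

def average_bias(rows):
    """Mean signed miss; misses in opposite directions cancel out."""
    errs = signed_errors(rows)
    return sum(errs) / len(errs)

def share_of_winners_called(rows):
    """Fraction of polls whose leader matched the actual winner (a tied poll counts as a miss)."""
    hits = sum(1 for r in rows if r["poll_margin"] * r["actual_margin"] > 0)
    return hits / len(rows)

print(average_error(polls), average_bias(polls), share_of_winners_called(polls))
```

In this toy data the average error is modest, yet the bias is positive and only half the winners are called. That is the pattern described above: small misses in a consistent direction still flip close races.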
arg min 634 implied HN points 10 Oct 24
  1. Statistics often involves optimizing methods to get the best results. Many statistical techniques can actually be viewed as optimization problems.
  2. Choosing a statistical method isn't just about the math—it's also based on beliefs about reality. This philosophical side is important but often overlooked.
  3. There's a danger in relying too much on tools and models we can solve. Sometimes, we force the data to fit our preferred methods instead of being open to the actual complexities.
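A standard illustration of the first takeaway is that familiar summaries are solutions to optimization problems: the mean minimizes total squared error and the median minimizes total absolute error. The sketch below checks this by brute force on made-up data. The helper names are hypothetical, and the grid search is there only to make the point visible, not as a recommended method.

```python
import statistics

data = [2.0, 3.0, 5.0, 7.0, 13.0]  # illustrative values, not from the post

def squared_loss(c, xs):
    return sum((x - c) ** 2 for x in xs)

def absolute_loss(c, xs):
    return sum(abs(x - c) for x in xs)

def grid_argmin(loss, xs, lo, hi, step=0.01):
    """Return the candidate on an evenly spaced grid that minimizes the loss."""
    n = int(round((hi - lo) / step))
    candidates = [lo + i * step for i in range(n + 1)]
    return min(candidates, key=lambda c: loss(c, xs))

best_sq = grid_argmin(squared_loss, data, min(data), max(data))
best_abs = grid_argmin(absolute_loss, data, min(data), max(data))

print(best_sq, statistics.mean(data))     # both close to 6.0
print(best_abs, statistics.median(data))  # both close to 5.0
```

Choosing between the two losses is the kind of decision the second takeaway points to: nothing in the math says which loss is right for a given problem.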
Marcus on AI 13161 implied HN points 04 Feb 25
  1. ChatGPT still has major reliability issues, often providing incomplete or incorrect information, like missing U.S. states in tables.
  2. Despite being advanced, AI can still make basic mistakes, such as counting vowels incorrectly or misunderstanding simple tasks.
  3. Many claims about rapid progress in AI may be overstated, as even simple functions like creating tables can lead to errors.
ASeq Newsletter 7 implied HN points 28 Feb 25
  1. Roche's Q39 accuracy system is different from other platforms like Illumina and Oxford Nanopore. It's important to compare them carefully as each has unique metrics.
  2. The average accuracy of different sequencing platforms varies, but Roche doesn't provide clear comparisons. They share limited data about their simplex accuracy.
  3. Understanding the differences in data quality and error rates across platforms is crucial. Factors like read length and error filtering play a significant role in the accuracy of sequencing results.
Handy AI 19 implied HN points 29 Oct 24
  1. ChatGPT performed better in analyzing a Spotify dataset, providing accurate insights without errors, and displaying clear visualizations.
  2. Claude encountered issues with text extraction and made mistakes in data interpretation, like incorrectly assigning genre labels where they didn't exist in the dataset.
  3. Overall, ChatGPT offered a smoother user experience, allowing users to follow along with the analysis while Claude's process was less straightforward.
Gonzo ML 252 implied HN points 06 Feb 25
  1. DeepSeek-V3 uses a new technique called Multi-head Latent Attention, which helps to save memory and speed up processing by compressing data more efficiently. This means it can handle larger datasets faster.
  2. The model incorporates an innovative approach called Multi-Token Prediction, allowing it to predict multiple tokens at once. This can improve its understanding of context and boost overall performance.
  3. DeepSeek-V3 is trained using advanced hardware and new training techniques, including utilizing FP8 precision. This helps in reducing costs and increasing efficiency while still maintaining model quality.
beyondrevenueoperations 19 implied HN points 27 Oct 24
  1. Combining SQL and Python makes data management much easier. SQL helps you access and pull data, while Python helps analyze it and create reports.
  2. Using SQL, you can break down data silos from different systems to get a complete view of your customers and performance. This is crucial for making smart, data-driven decisions.
  3. With Python, you can automate tasks, build predictive models, and visualize data, which saves time and enhances your ability to understand trends and insights.
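The division of labor described here, with SQL pulling and joining the data and Python analyzing it, can be sketched with Python's built-in sqlite3 module. The table names, columns, and rows are assumptions made up for the example. A real setup would connect to the actual CRM or billing databases.

```python
import sqlite3
from statistics import mean

# In-memory database standing in for two separate systems (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
INSERT INTO customers VALUES (1, 'Acme'), (2, 'Globex');
INSERT INTO orders VALUES (1, 1, 100.0), (2, 1, 50.0), (3, 2, 200.0);
""")

# SQL step: join the two tables and aggregate revenue per customer.
rows = conn.execute("""
SELECT c.name, SUM(o.amount) AS revenue
FROM customers AS c
JOIN orders AS o ON o.customer_id = c.id
GROUP BY c.name
ORDER BY c.name
""").fetchall()
conn.close()

# Python step: turn the result into a dict and compute a summary statistic.
revenue_by_customer = dict(rows)
average_revenue = mean(list(revenue_by_customer.values()))

print(revenue_by_customer)  # {'Acme': 150.0, 'Globex': 200.0}
print(average_revenue)      # 175.0
```

The same pattern scales up by replacing the in-memory connection with a production database driver and the summary statistic with a report or model.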
Open Source Defense 28 implied HN points 17 Jun 25
  1. Sensors help us understand and measure things better. The more accurate our sensors are, the more we can improve our products and practices.
  2. In different fields, the use of sensors is at various stages. Some areas, like competition shooting, are advanced, while others, like non-lethal weapons, have much room for growth.
  3. Using objective measurements can change our understanding of different situations. By having clear data, we can make better decisions and improve our overall knowledge.
Am I Stronger Yet? 282 implied HN points 30 Jan 25
  1. DeepSeek's new AI model, R1, shows impressive reasoning abilities, challenging larger competitors despite its smaller budget and team. It proves that smaller companies can contribute significantly to AI advancements.
  2. The cost of training R1 was much lower than similar models, potentially signaling a shift in how AI models might be developed and run in the future. This could allow more organizations to participate in AI development without needing huge budgets.
  3. DeepSeek's approach, including releasing its model weights for public use, opens up the possibility for further research and innovation. This could change the landscape of AI by making powerful tools more accessible to everyone.
Erdmann Housing Tracker 231 implied HN points 03 Feb 25
  1. There is a significant shortage of homes in the U.S., estimated at around 15 million. This is due to various factors like vacancies and the rising number of adults per home.
  2. Vacancies have dropped over the years, and we might be short about 5 million vacant units needed to keep rent inflation stable.
  3. Population growth has slowed since 2008 and has likely affected housing demand, which adds pressure to the existing housing shortage.
Marcus on AI 4228 implied HN points 27 Jan 25
  1. Nvidia's stock might be facing a big drop, which is a concern for investors. A decline of more than 10% indicates that something is going on in the market.
  2. The market can behave in unpredictable ways, and this uncertainty can be tough for investors to manage. Today might be a key moment in the stock market.
  3. Overall, the economics of generative AI can lead to unexpected changes, making it a wild area to watch for investors and tech enthusiasts.
Wood From Eden 1344 implied HN points 04 Dec 24
  1. Psychiatry has a problem with labels. Many old labels have been removed without clear replacements, making research and understanding harder.
  2. Using numbers instead of words could help describe a person's mental health better. A barcode-like system could show traits and abilities at a glance.
  3. Psychology is subjective and changes over time. Collecting more data through tests can help improve understanding and research in mental health.
Sustainability by numbers 211 implied HN points 27 Jan 25
  1. In 2024, fewer people died from disasters compared to previous years, thanks to fewer major earthquakes. The estimate was around 9,500 deaths, which is low compared to the high averages from past years.
  2. Floods, wildfires, and storms were the main causes of deaths in 2024. Many fatalities came from extreme weather events, particularly flooding in Africa and wildfires in South America.
  3. It's important to note that data on disaster deaths is often incomplete, especially for temperature-related deaths. Researchers have to estimate these numbers, leading to less reliable statistics overall.
The Honest Broker Newsletter 2973 implied HN points 27 Jan 25
  1. 2024 saw a large number of major hurricanes, tying 2015 for the most since records began, which raises questions about climate patterns.
  2. Despite the increase in hurricane landfalls, there hasn't been a clear trend showing that hurricanes are becoming more intense or frequent over time.
  3. Experts believe that while human activity may influence hurricanes, detecting these changes amidst natural variability is very challenging.
Steve Kirsch's newsletter 9 implied HN points 11 Jun 25
  1. Time series graphs can show if a vaccine is safe or not by plotting daily deaths after vaccination. A safe vaccine should show a flat line after the initial period.
  2. Current data for COVID vaccines shows increasing mortality rates after vaccination, which suggests they may not be safe. Many reports don’t show this data.
  3. The medical community often ignores clear signs of vaccine risks, despite evidence appearing in graphs and reports, leading to frustration among those who analyze the data.
Marcus on AI 4663 implied HN points 24 Nov 24
  1. Scaling laws in AI aren't as reliable as people once thought. They're more like general ideas that can change, rather than hard rules.
  2. The new approach to scaling, which focuses on how long you train a model, can be costly and doesn't always work better for all problems.
  3. Instead of just trying to make existing models bigger or longer-lasting, the field needs fresh ideas and innovations to improve AI.
RESCUE with Michael Capuzzo 9787 implied HN points 08 Jun 23
  1. John Berndsen's heart complications after receiving the Pfizer vaccine illustrate a potential link to myocarditis and the importance of questioning vaccine safety.
  2. Many adverse reactions to COVID-19 vaccines are not being reported in the media, and the numbers show a significant impact on health, including deaths.
  3. John Berndsen's experience highlights the importance of critically examining the safety and necessity of additional vaccine doses, especially for vulnerable individuals.
Silver Bulletin 214 implied HN points 16 Jan 25
  1. Polling accuracy is becoming less predictable and more nuanced. Pollsters are feeling cautiously optimistic this time, although mistakes still happened in predicting election outcomes.
  2. Pollsters are likely to stick with their current methods for 2026. Many have already adapted and believe the changes they've made are effective enough for now.
  3. There is no single best way to conduct polls anymore. Different methods and tech are used by different polling organizations, which can lead to varied results.
Tim Culpan’s Position 119 implied HN points 05 Sep 24
  1. TSMC and Intel are two major players in the semiconductor industry. Their performance and strategies have crucial implications for technology.
  2. Visual data can highlight important differences in the technical and financial health of these companies. Charts can make complex information easier to understand.
  3. Recent reports show that Intel is facing significant challenges, while TSMC continues to lead in production and technology advancements. This could shape the future of the tech industry.
Encyclopedia Autonomica 39 implied HN points 13 Oct 24
  1. Transformers use JSON as a specific structure for commands. This makes it easier to describe actions clearly and effectively.
  2. The system prompt includes rules that the agent must follow, like focusing on one action at a time and using the correct values for inputs.
  3. The design also emphasizes iterative reasoning, where the agent can build on previous observations to make better decisions in tasks.
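A JSON command format with rules such as "one action at a time" and "correct values for inputs" can be enforced with a small validator. This is a sketch under stated assumptions: the key names `action` and `action_input`, the tool names, and the function itself are hypothetical and are not taken from the post or from any library's API.

```python
import json

ALLOWED_TOOLS = {"search", "calculator"}  # hypothetical tool names

def parse_action(blob: str) -> dict:
    """Parse one JSON action and check it against simple prompt-style rules."""
    action = json.loads(blob)  # raises ValueError (JSONDecodeError) on bad JSON
    if not isinstance(action, dict):
        raise ValueError("expected a single JSON object: one action at a time")
    if set(action) != {"action", "action_input"}:
        raise ValueError("expected exactly the keys 'action' and 'action_input'")
    if action["action"] not in ALLOWED_TOOLS:
        raise ValueError(f"unknown tool: {action['action']!r}")
    if not isinstance(action["action_input"], dict):
        raise ValueError("'action_input' must be a JSON object of named inputs")
    return action

step = parse_action('{"action": "search", "action_input": {"query": "HN points"}}')
print(step["action"], step["action_input"])
```

Rejecting malformed commands early gives the agent a clear error to observe on its next step, which fits the iterative reasoning described above.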
Richard Hanania's Newsletter 3657 implied HN points 07 Oct 24
  1. Many people incorrectly believe that immigration leads to higher crime rates. In reality, data shows that most immigrants, especially legal ones, tend to commit less crime than native-born citizens.
  2. Some politicians use scary language about immigrants increasing crime to push their agenda. This can create a false narrative that makes the public fearful and misinformed about the actual impact of immigration.
  3. Immigrants often face more crime themselves and can actually help reduce crime rates in communities by starting businesses and contributing to the economy. So, they can serve as a buffer against crime rather than a cause of it.
Software Design: Tidy First? 1347 implied HN points 27 Jan 25
  1. Data can provide hints about a programmer's influence, but it can't give a clear answer. It's important to interpret the data with caution and avoid making strict decisions based solely on it.
  2. Creating files is one way to measure initiation of influence, but it's not the only factor. The impact is also determined by how frequently those files are modified by others.
  3. Using data for bonuses or promotions can lead to problems. It's better to focus on improvement and impact rather than just the numbers, to maintain a healthy team dynamic.
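The two signals mentioned here, who created a file and how often others modified it afterward, can be tallied from a commit log. The sketch below uses a made-up log and hypothetical names. It is not the author's analysis, and as the takeaways stress, the output is a hint to interpret rather than a score to act on.

```python
from collections import defaultdict

# Hypothetical commit log in chronological order: (author, file, is_creation).
commits = [
    ("alice", "parser.py", True),
    ("bob", "parser.py", False),
    ("carol", "parser.py", False),
    ("alice", "parser.py", False),
    ("bob", "cli.py", True),
    ("alice", "cli.py", False),
]

def influence_hints(log):
    """Count files each author created and later edits to those files by other people."""
    creator = {}                          # file -> author who created it
    created = defaultdict(int)            # author -> number of files created
    edits_by_others = defaultdict(int)    # creator -> edits made by someone else
    for author, path, is_creation in log:
        if is_creation:
            creator[path] = author
            created[author] += 1
        elif path in creator and creator[path] != author:
            edits_by_others[creator[path]] += 1
    return {
        a: {"files_created": created[a], "later_edits_by_others": edits_by_others[a]}
        for a in created
    }

hints = influence_hints(commits)
print(hints)
```

Authors who never created a file do not appear in the result at all, which is one concrete reason such counts cannot give a clear answer on their own.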
Astral Codex Ten 8534 implied HN points 05 Mar 24
  1. The Annual Forecasting Contest on astralcodexten.com asks participants to make predictions about various questions. The results help determine whether a single identifiable genius or an aggregate of mathematical predictions works best for foreseeing the future.
  2. The winners of the contest were both amateurs and seasoned forecasting veterans, showcasing a mix of skill and luck in predicting outcomes.
  3. Metaculus outperformed prediction markets, superforecasters, and the wisdom of crowds in the contest, suggesting that consistent high performance might be rare but achievable with specific methods like those used by superforecaster Ezra Karger.
Klement on Investing 4 implied HN points 20 Jun 25
  1. On average, women speak more words per day than men. Women use about 13,349 words while men use around 11,950 words daily.
  2. As people age, how much they talk can change. Younger men and women talk similarly, but older men often become more talkative than older women.
  3. Some people barely talk, while others can speak a ton, like 50,000 words a day. It's interesting to see such a big range in how much different people communicate.
Gonzo ML 63 implied HN points 29 Jan 25
  1. The paper introduces a method called ACDC that automates the process of finding important circuits in neural networks. This can help us better understand how these networks work.
  2. Researchers follow a three-step workflow to study model behavior, and ACDC fully automates the last step, which identifies the connections that matter for a specific task.
  3. While ACDC shows promise, it isn't perfect. It may miss some important connections and needs adjustments for different tasks to improve its accuracy.
Public Universal Friend 79 implied HN points 02 Sep 24
  1. Using a customer engagement platform like Customer.io can help marketers improve their targeting and maximize growth. It offers better data management and less need for technical support.
  2. Spring is a great time for businesses to focus on improving conversions through digital marketing strategies. Real-time data can help companies get more return on their investment.
  3. Personal connections and genuine interactions are valuable, even in business communication. Taking the time to show real interest can make a difference.
Independent SAGE continues 1418 implied HN points 20 Mar 24
  1. Independent SAGE has launched a Substack to share insights about Covid research and data. They aim to provide valuable information directly from experts to the public.
  2. They plan to post updates roughly every two weeks, including responses to important new research and news. This helps keep everyone informed about the ongoing situation.
  3. The Substack will remain free for subscribers, encouraging more people to stay updated on Covid developments and public health measures.
Engineering Enablement 21 implied HN points 05 Feb 25
  1. Metrics for developers should help improve their work experience, not just measure their output. Goodhart's Law reminds us that once metrics are tied to rewards, they can become misleading.
  2. Developer experience is more about effectiveness than happiness. Measuring how developers feel needs to focus on the frustrations they face, and not just on making them comfortable.
  3. Using benchmarks is important, but context is key. Just like medical tests, numbers need interpretation to make sense; comparing different teams requires understanding their unique challenges.
Ground Truths 3980 implied HN points 19 Feb 24
  1. Polygenic risk scores can provide valuable information on high genetic risk for diseases like heart disease and cancer, beyond traditional clinical risk factors.
  2. The use of polygenic risk scores is advancing thanks to efforts like the eMERGE consortium, incorporating multi-ancestry data and rigorous validation.
  3. Actionable polygenic risk scores have the potential to reduce health disparities and enhance preventive strategies in medical practice.