The hottest Data Analysis Substack posts right now

And their main takeaways
Sustainability by numbers 211 implied HN points 27 Jan 25
  1. In 2024, fewer people died from disasters compared to previous years, thanks to fewer major earthquakes. The estimate was around 9,500 deaths, which is low compared to the high averages from past years.
  2. Floods, wildfires, and storms were the main causes of deaths in 2024. Many fatalities came from extreme weather events, particularly flooding in Africa and wildfires in South America.
  3. It's important to note that data on disaster deaths is often incomplete, especially for temperature-related deaths. Researchers have to estimate these numbers, leading to less reliable statistics overall.
DeFi Education 1298 implied HN points 11 Jul 21
  1. There are major risks in DeFi farming like smart contract failures and rug pulls. It's important to be aware of these risks before investing.
  2. Fees can add up quickly when using DeFi projects, so timing your transactions wisely can help save money.
  3. Finding reliable data about DeFi projects is hard, and many sources might not give accurate information. It's crucial to do your own research before investing.
Silver Bulletin 214 implied HN points 16 Jan 25
  1. Polling accuracy is becoming less predictable and more nuanced. Pollsters are feeling cautiously optimistic this time, although mistakes still happened in predicting election outcomes.
  2. Pollsters are likely to stick with their current methods for 2026. Many have already adapted and believe the changes they've made are effective enough for now.
  3. There is no single best way to conduct polls anymore. Different methods and tech are used by different polling organizations, which can lead to varied results.
Technology Made Simple 159 implied HN points 08 Jul 23
  1. Understanding the difference between Vertical and Horizontal Integration is crucial in business. Horizontal Integration can offer leverage and streamline processes within an organization.
  2. Threads, Meta's new app, has the potential to tap into academic circles on Twitter by addressing its mobile-only flaw. This could change user engagement dynamics and impact monetization.
  3. Social media platforms like Threads can be powerful tools for controlling public discourse and information flow. Meta's investment in the Metaverse is seen as a strategic move for the future.
The Data Score 98 implied HN points 03 Jan 24
  1. Raw data is a cost; insights have value. The process of transforming raw data into insight-ready data is crucial for generating value.
  2. Assess the return on investment in data by considering how many decisions can be influenced and understanding the limitations of the data. Data that positively impacts decisions increases its value.
  3. Understand the cost of data investment, including sourcing, loading, and transforming data. Consider the ease of integrating data and the importance of insights generated over time.
theconnector 157 implied HN points 16 May 23
  1. Generative AI technologies are causing significant changes in our society.
  2. Companies like Google and OpenAI are disrupting existing economic models with AI services.
  3. Public engagement and oversight are crucial in shaping the future of AI regulation.
Vincos Newsletter 157 implied HN points 28 Jul 23
  1. A new version of Stable Diffusion, SDXL 1.0, was released and tested through DreamStudio.
  2. Twitter's branding changes and Elon Musk's ambitious transformation plans are generating discussions.
  3. Netflix and Disney are seeking machine learning experts for content production as actors express concerns about being replaced by digital simulations.
sebjenseb 157 implied HN points 03 Jul 23
  1. The average IQ of rationalists may not be as high as self-reported values suggest, with estimates pointing to an average IQ between 125 and 130.
  2. Analysis of SAT and IQ scores of rationalists indicates an estimated average IQ of about 133.6 after accounting for biases.
  3. Educational attainment and plausible assumptions suggest the average IQ of internet rationalists is between 125 and 130, once selection for educational attainment is considered.
42 Slash 157 implied HN points 24 Jul 23
  1. Having conversations with customers can help reflect the value of your product and attract new customers.
  2. Understanding why customers do things is just as important as knowing what they do.
  3. To truly be customer-centric, B2B companies should invest more in human-centric research.
benn.substack 997 implied HN points 14 Apr 23
  1. dbt Labs' success has had a significant impact on people's lives by providing better job opportunities and higher salaries in the data industry.
  2. Despite its success, dbt Labs may face increasing competition in the future from startups and other companies that are challenging its position in the market.
  3. dbt Labs could evolve its business strategy by focusing on its community, pursuing new product opportunities, or even selling the company to better align with market trends and potential challenges.
One Useful Thing 506 implied HN points 18 Mar 24
  1. There are three main GPT-4 class AI models dominating the field currently: GPT-4, Anthropic's Claude 3 Opus, and Google's Gemini Advanced.
  2. These AI models have impressive abilities like being multimodal, allowing them to 'see' images and work across a variety of tasks.
  3. The AI industry lacks clear instructions on how to use these advanced AI models, and users are encouraged to spend time learning to leverage their potential.
Outlandish Claims 19 implied HN points 12 Jun 24
  1. Berkson's Paradox applies to various situations where multiple factors influence outcomes, leading to counterintuitive results.
  2. Applying Berkson's Paradox to different scenarios can reveal hidden correlations and insights, such as in medical studies, card games, or economic policies.
  3. The essence of Berkson's Paradox lies in understanding that when focusing on a specific subcategory, the causes of membership in that category can be more negatively correlated than in the broader category.
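
The selection effect described above can be illustrated with a small simulation. This is a minimal sketch, not taken from the post: the two traits, the cutoff of 1.0, and the helper `pearson` are assumptions chosen for illustration. Two independent traits are drawn for a population, and only members whose combined score clears the cutoff are kept.

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

rng = random.Random(0)  # fixed seed so the run is repeatable

# Two traits drawn independently: uncorrelated in the full population.
population = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(20000)]

# Hypothetical selection rule: keep only members whose combined score is high.
selected = [(a, b) for a, b in population if a + b > 1.0]

corr_all = pearson(*zip(*population))
corr_selected = pearson(*zip(*selected))

print(f"full population: {corr_all:+.3f}")
print(f"selected subset: {corr_selected:+.3f}")
```

In the full population the correlation is close to zero, while inside the selected subset it is clearly negative: a member who got in with a low value on one trait must have a high value on the other.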
Polymathic Being 59 implied HN points 10 Aug 25
  1. AI can be a really helpful research tool. It can help you find good information and understand complex topics better.
  2. Using AI doesn't mean you stop thinking for yourself. You should work with AI to challenge your ideas and get different perspectives.
  3. AI is like a conversation partner for your research. It can help you explore ideas, ask questions, and keep you on track.
Math Meets Money 4 HN points 21 Aug 24
  1. The MECE principle helps organize complex data into clear categories that have no overlap. This is crucial for making sense of complicated systems, like businesses or markets.
  2. In business, customer demographics can be viewed as various sets that can show how different characteristics are related. Understanding these relationships can help companies better target their products.
  3. Using concepts from physics, like Hilbert spaces, can help refine how businesses analyze and transform customer data. This approach can lead to better insights into customer preferences and behaviors.
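
As a rough illustration of the MECE idea above, the check below tests whether a set of categories has no overlap and covers everything. The function name `is_mece` and the customer segments are hypothetical, introduced only for this sketch.

```python
def is_mece(universe, categories):
    """Return True if the categories are mutually exclusive and collectively exhaustive."""
    seen = set()
    for members in categories.values():
        members = set(members)
        if seen & members:  # an item appears in two categories: not mutually exclusive
            return False
        seen |= members
    return seen == set(universe)  # every item is covered: collectively exhaustive

customers = {"a", "b", "c", "d"}

clean_split = {"smb": {"a", "b"}, "enterprise": {"c", "d"}}
overlapping = {"smb": {"a", "b"}, "enterprise": {"b", "c", "d"}}
incomplete = {"smb": {"a"}, "enterprise": {"c", "d"}}

print(is_mece(customers, clean_split))
print(is_mece(customers, overlapping))
print(is_mece(customers, incomplete))
```

Only the first split passes: the second places customer "b" in two segments, and the third leaves "b" out entirely.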
CAUSL Effect 1 HN point 17 Sep 24
  1. Over half a million workers have faced layoffs in the tech industry, showing how tough the job market can be right now.
  2. The data suggests that roles in product management, design, and research faced much higher layoff rates compared to engineering positions.
  3. These layoffs were often driven by companies needing to cut costs quickly due to changing market conditions, not by the employee's performance.
Technology Made Simple 139 implied HN points 25 Apr 23
  1. Statistics can be misleading when affected by bias, which is a flaw in the experiment design or the data collection process.
  2. Biases affect everyone and can be exploited by manipulative individuals like politicians and salespeople.
  3. Common statistical biases include selection bias, recall bias, and observer bias, which can all be combated by slowing down and evaluating claims carefully.
Sarah's Newsletter 139 implied HN points 15 Aug 23
  1. Fully personalized user journeys rely on correlating anonymous source events to authenticated users.
  2. Identity resolution involves collecting anonymous visitor data, mapping it to authenticated users, and merging duplicate accounts.
  3. Implementing event tracking through cookies and URL parameters is crucial for resolving identities across applications and domains.
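
The mapping step described above might look roughly like the following. This is a simplified sketch under stated assumptions: the event fields `anonymous_id` and `user_id` and the function `resolve_identities` are hypothetical, and real systems also handle account merging and cross-domain tracking, which are omitted here.

```python
def resolve_identities(events):
    """Attach a resolved user to each event, including events recorded before login."""
    # First pass: learn which anonymous IDs were later seen with an authenticated user.
    anon_to_user = {}
    for event in events:
        if event.get("user_id"):
            anon_to_user[event["anonymous_id"]] = event["user_id"]

    # Second pass: back-fill the user onto every event from the same anonymous ID.
    return [
        {**event, "resolved_user": event.get("user_id") or anon_to_user.get(event["anonymous_id"])}
        for event in events
    ]

events = [
    {"anonymous_id": "a1", "user_id": None, "name": "page_view"},
    {"anonymous_id": "a1", "user_id": "u42", "name": "login"},
    {"anonymous_id": "a2", "user_id": None, "name": "page_view"},
]

resolved = resolve_identities(events)
for event in resolved:
    print(event["name"], event["anonymous_id"], event["resolved_user"])
```

The page view from visitor "a1" is attributed to user "u42" once that visitor logs in, while visitor "a2", who never authenticates, stays unresolved.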
School Shooting Data Analysis and Reports 79 implied HN points 16 Jan 24
  1. Aviation emphasizes near-miss reporting to enhance safety by openly sharing incidents that almost caused harm.
  2. Schools can learn from aviation by implementing a similar culture of prioritizing safety and reporting near misses, as demonstrated in the case of a school shooting incident in South Dakota.
  3. Defining near misses in the context of school shootings involves factors like detailed plans, multiple weapons, excessive ammunition, gun malfunctions, and successful interventions.
The Data Score 138 implied HN points 24 May 23
  1. Leveraging alternative data for revenue estimates goes beyond traditional transaction data, focusing on customer acquisition and retention insights.
  2. Applying the customer acquisition funnel framework to alternative data can help identify early trends and potential growth issues in a business.
  3. Monitoring the journey from awareness to loyalty using alternative data sets can offer valuable insights for predicting sustainable revenue growth beyond the short term.
Making Connections by Jax 137 implied HN points 12 Jun 23
  1. Marketing attribution is becoming more challenging due to privacy regulations like Apple's App Tracking Transparency.
  2. Marketing Mix Modelling (MMM) provides a top-down approach to understanding ROI by analyzing historical data and external influences.
  3. Lift tests offer a bottom-up method to prove causation in marketing effectiveness, requiring experimentation with test and control groups.
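
The core arithmetic of a lift test can be sketched as below. The function name `relative_lift` and the conversion counts are made up for illustration, and a real analysis would also test whether the difference is statistically significant, which this sketch leaves out.

```python
def relative_lift(test_conversions, test_size, control_conversions, control_size):
    """Relative lift of the test group's conversion rate over the control group's."""
    test_rate = test_conversions / test_size
    control_rate = control_conversions / control_size
    if control_rate == 0:
        raise ValueError("control conversion rate is zero; relative lift is undefined")
    return (test_rate - control_rate) / control_rate

# Hypothetical experiment: the test group saw the campaign, the control group did not.
lift = relative_lift(
    test_conversions=120, test_size=1000,
    control_conversions=100, control_size=1000,
)
print(f"relative lift: {lift:.1%}")
```

With these numbers the test group converts at 12% against 10% for control, a relative lift of about 20%.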
The Shake 137 implied HN points 26 Mar 23
  1. The Shake V2 is a brand new version of The Shake that has officially launched.
  2. The Shake is now more than just a newsletter and has evolved into a data provider, resource hub, and product lab.
  3. The Shake V2 will continue to offer on-chain analysis, interactive educational tools, and expand into the greater DWeb ecosystem.
VuTrinh. 39 implied HN points 09 Apr 24
  1. LedgerStore at Uber can handle trillions of indexes, making it a powerful tool for managing large-scale data efficiently.
  2. Apache Calcite helps build flexible data systems with strong query optimization features, which are vital for many data applications.
  3. Spotify's data platform plays a critical role in their operations, guiding how to build effective data systems in organizations.
Shrek's Substack 4 HN points 19 Aug 24
  1. The way you ask questions and set the model's temperature can really affect how well AI solves math problems. Clear prompts and specific instructions can help improve its accuracy.
  2. AI like GPT-4o struggles with big numbers and can make mistakes about half the time when calculating linear equations. It works better with smaller numbers.
  3. It's important to be careful when using AI for math, especially in education. Using other tools to double-check results can help avoid mistakes.
Rod’s Blog 99 implied HN points 04 Dec 23
  1. Jon and Sofia used KQL queries to identify and isolate an infected computer in the finance department.
  2. The malware was discovered disguised as a legitimate application, hidden in the Recycle Bin to avoid detection.
  3. Jon and Sofia's discovery of the global financial breach hints at a larger, more sinister threat by a group known as Night Princess.
Technology Made Simple 139 implied HN points 21 Mar 23
  1. Linear Algebra is crucial for software engineers, especially for work involving vector and matrix operations. Understanding the basics is key for most developers.
  2. Probability and Statistics play a significant role in analyzing data, and even non-AI professionals can benefit from grasping concepts like causal inference. Focus on foundational principles before diving deeper.
  3. Calculus, though important, may not be essential for all software engineers. Studying up to Calc-2 is generally adequate, as it appears in various other topics.
Substack 319 implied HN points 29 Jul 24
  1. Substack now allows users to embed prediction markets from Polymarket in their posts. This can make articles more engaging by providing real-time data on trending topics.
  2. The new feature is launching just in time for the 2024 Paris Olympics, letting writers easily add betting odds on various events. This could enhance the coverage of the Olympics on Substack.
  3. Polymarket is also starting a news site on Substack called 'The Oracle,' which will give insights and analyses based on active prediction markets. This aims to help readers understand global events better.
Technically 14 implied HN points 11 Dec 25
  1. Evals are software tests for AI that turn fuzzy model outputs into measurable metrics so you can find and fix errors instead of guessing.
  2. Look at your data first: analyze real outputs to spot where the model fails, because you can’t measure or fix problems you don’t identify.
  3. Start with simple keyword checks and assertions before building complex “LLM-as-judge” setups, and iterate: test, fix, measure, repeat; otherwise your system just feels like a slot machine.
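
A keyword-check eval of the kind mentioned above could be sketched as follows. Everything here is hypothetical: `fake_model` is a stand-in with canned answers where a real model call would go, and the test cases are invented for illustration.

```python
def keyword_check(output, required, forbidden=()):
    """Pass if every required keyword appears and no forbidden keyword does."""
    text = output.lower()
    has_required = all(word.lower() in text for word in required)
    has_forbidden = any(word.lower() in text for word in forbidden)
    return has_required and not has_forbidden

def run_evals(model, cases):
    """Run each case through the model and return (pass_rate, per-case results)."""
    results = [
        keyword_check(model(case["prompt"]), case["required"], case.get("forbidden", ()))
        for case in cases
    ]
    return sum(results) / len(results), results

def fake_model(prompt):
    # Stand-in for a real model call; returns canned answers for the demo.
    canned = {
        "capital of France?": "The capital of France is Paris.",
        "2 + 2?": "I think the answer is 5.",
    }
    return canned.get(prompt, "")

cases = [
    {"prompt": "capital of France?", "required": ["Paris"]},
    {"prompt": "2 + 2?", "required": ["4"], "forbidden": ["5"]},
]

pass_rate, results = run_evals(fake_model, cases)
print(f"pass rate: {pass_rate:.0%}, results: {results}")
```

The pass rate gives a number to track between iterations, and the per-case results point at which prompts to inspect first.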
benn.substack 485 implied HN points 09 Feb 24
  1. Dan Campbell and the Detroit Lions have been aggressive in going for it on fourth downs.
  2. Data can provide small advantages in decision-making, especially in frequent, low-leverage situations.
  3. It's more effective to focus on doing what you're naturally good at, and doing it consistently, than to constantly pursue big data-driven optimizations.
Rod’s Blog 99 implied HN points 27 Nov 23
  1. KQL's search operator is a powerful tool for finding potential threats in a company's data environment.
  2. Using specific queries like filtering by tables and applying operators like 'has' can help pinpoint suspicious activities in data.
  3. Collaborating with trusted teammates is crucial in verifying and responding to potential cybersecurity threats promptly.
Samstack 960 implied HN points 19 Feb 23
  1. Economic growth should not be the sole focus; quality matters too.
  2. Analogies for progress can overlook the importance of innovation.
  3. Consider the reliability and representation of data in discussions and surveys.
Rod’s Blog 59 implied HN points 12 Feb 24
  1. Spear phishing is a serious cyber-attack that targets specific individuals or organizations. Microsoft Sentinel's tools can help detect and prevent these types of threats.
  2. Microsoft Sentinel allows for the creation of custom analytics rules based on KQL queries to identify potential spear phishing activities. This helps in early detection of threats.
  3. Automation and playbooks in Microsoft Sentinel enable immediate responses like blocking URLs or initiating password resets upon detecting a spear phishing attempt.
Unconfusion 39 implied HN points 31 Mar 24
  1. Using silly examples to teach correlation and causation can let students off too easily. It's important to challenge them with examples that make them think.
  2. Most teaching examples use time-series data, but many real-world correlations don't fit this model. We should focus on typical variations found in research.
  3. Mixing random correlations with spurious connections creates confusion. Teaching should clearly explain how confounders can lead to false relationships.
Rozado’s Visual Analytics 150 implied HN points 28 Jan 25
  1. OpenAI's new o1 models are designed to solve problems better by thinking through their answers first. However, they are much slower and cost more to run than previous models.
  2. The political preferences of these new models are similar to earlier versions, despite the new reasoning abilities. This means they still lean left when answering political questions.
  3. Even with their advanced reasoning, these models didn't change their political views, which leads to questions about how reasoning and political bias work together in AI.
School Shooting Data Analysis and Reports 19 implied HN points 01 Jun 24
  1. The number of school shooting incidents in May 2024 continues a rising trend over the last 3 years, but the increase from 2023 to 2024 is not exponential.
  2. The number of victims in May 2024 is higher compared to 2023 but notably lower than in 2022, when a tragic incident in Uvalde involved multiple fatalities and injuries.
  3. In May 2024, shootings often occurred at night and during school events like graduations, emphasizing the importance of proactive policing, as incidents frequently happened during unauthorized post-graduation parties on campus.