The hottest Data Analysis Substack posts right now

And their main takeaways

Series B Activity: January 2024

Magid and Co • 39 implied HN points • 08 Feb 24

💼 Business Data Analysis

Series B deal volume increased significantly in January compared to December, which is positive news for founders seeking funding.
The data covers Series B deals globally (excluding China) that raised more than $5M, at companies not centered on therapeutics.
The post provides insights into recent Series B activity, highlighting key statistics and trends in the sector.

Spam in the firehose

Conspirador Norteño • 128 implied HN points • 06 Dec 24

🕹 Technology Data Analysis

Monitoring the Bluesky firehose can help quickly spot fake accounts. By looking for repeated names and profiles, it's easier to identify spam activity.
A large number of spam accounts often share similar biographies. One group had over a thousand accounts with variations of the same few phrases.
Many spam accounts use stolen images as profile pictures. This makes them look less authentic and easier to identify as spam.
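The repeated-biography signal described above can be sketched in a few lines. This is a minimal illustration, not the author's method: the `(handle, bio)` input shape, the normalization rules, and the `min_size` threshold are all assumptions, and the sample accounts are invented.

```python
import re
from collections import defaultdict


def normalize_bio(bio: str) -> str:
    """Lowercase the bio, drop everything except a-z and spaces, collapse whitespace."""
    text = re.sub(r"[^a-z\s]", " ", bio.lower())
    return " ".join(text.split())


def find_repeated_bios(accounts, min_size=3):
    """Group handles by normalized bio and keep groups with at least min_size members."""
    groups = defaultdict(list)
    for handle, bio in accounts:
        key = normalize_bio(bio)
        if key:  # skip bios that normalize to nothing
            groups[key].append(handle)
    return {key: handles for key, handles in groups.items() if len(handles) >= min_size}


# Invented sample data (hypothetical handles and bios).
sample = [
    ("a.example", "Crypto lover! DM me 💰"),
    ("b.example", "crypto lover, DM me"),
    ("c.example", "Crypto Lover. dm me 123"),
    ("d.example", "Birdwatcher and baker"),
]
clusters = find_repeated_bios(sample, min_size=3)
print(clusters)
```

Exact matching after normalization only catches near-identical bios; variations built from a shared pool of phrases would need fuzzier matching, such as shared n-grams.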

My talk at the CHD conference on what the data tells us

Steve Kirsch's newsletter • 4 implied HN points • 03 Jan 26

🏥 Health Politics Data Analysis

The central claim is that COVID vaccines offered no real benefit and instead caused net harm.
A conference presentation uses data and analysis to argue for that conclusion.
A video recording and slide deck of the talk are available online for people to review the evidence themselves.

Lawler: Single-Family Rent Trends at INVH and AMH

CalculatedRisk Newsletter • 14 implied HN points • 04 Nov 25

💰 Finance Data Analysis

Invitation Homes and American Homes 4 Rent are two big players in the single-family rental market. They're important to watch because they can show how rent prices are changing.
Recent trends indicate fluctuations in single-family rental prices. It's helpful to pay attention to these trends if you're interested in renting or investing in housing.
Understanding these rental trends can give you insights into the overall housing market. It can help you make better decisions about where to live or invest.

The cult of data has gone too far

Wednesday Wisdom • 113 implied HN points • 01 Jan 25

🕹 Technology Data Analysis

Relying too much on data can lead to wrong decisions because numbers don't always tell the full story. Sometimes, human judgment or understanding is needed.
Data can create a false sense of certainty, making people ignore the uncertainties and assumptions behind those numbers. It's important to be honest about what the data truly represents.
Setting goals based on numbers can make teams lose sight of the real-world processes they are supposed to improve. Chasing metrics blindly can lead to poor outcomes.

Microsoft Sentinel SOC 101: How to Detect and Mitigate Supply Chain Attacks with Microsoft Sentinel

Rod’s Blog • 79 implied HN points • 25 Sep 23

🕹 Technology Data Analysis

Supply chain attacks target vulnerabilities within the chain, aiming to compromise products or services before reaching end-users. They pose a significant threat due to their indirect nature, multi-stage process, and high impact potential.
Kusto Query Language (KQL) in Microsoft Sentinel is essential for detecting anomalies or patterns linked to supply chain attacks. By using KQL queries, organizations can identify unusual activities and potential threats.
Microsoft Sentinel's integration with various tools and automated response capabilities, such as Playbooks, enables swift detection, investigation, and mitigation of supply chain threats. Leveraging these features enhances security measures.

KQL Queries Behind the Microsoft Sentinel Overview Page

Rod’s Blog • 79 implied HN points • 15 Jun 23

🕹 Technology Data Analysis

Customers and partners often ask about the KQL queries powering the Microsoft Sentinel Overview page.
The KQL queries for Incident, Automation, and Data can be found in Rod Trent's Microsoft Sentinel GitHub repo.
The Analytics widget uses an API instead of KQL queries.

NVIDIA: How could alternative data be used to assess its long-term potential?

The Data Score • 79 implied HN points • 15 Jun 23

💰 Finance Data Analysis

Assessing a company's long-term potential requires more than just traditional financial metrics, and alternative data sources can provide valuable insights.
For NVIDIA, alternative data can illuminate aspects like market presence, evolving applications, competitive threats, and supply chain investments, aiding in making informed investment decisions.
Key questions to answer include evaluating demand, diverse use cases for GPU chips, potential competitors, and investment needs, all essential for understanding NVIDIA's future prospects.

Monitor Azure Open AI Deployments with Microsoft Sentinel

Rod’s Blog • 79 implied HN points • 20 Apr 23

🕹 Technology Data Analysis

Defender for Cloud Apps can now monitor Azure Open AI activity, making it easier to track and locate activity using Microsoft Sentinel.
Utilize KQL queries to identify Azure Open AI deployments and create a maintained Watchlist in Microsoft Sentinel for easy monitoring.
Automate the updating of the Watchlist with Logic Apps to ensure it always contains the most up-to-date information on Azure Open AI instances.

Mixtape Mailbag #8: Continuous Triple Differences

Scott's Substack • 39 implied HN points • 05 Feb 24

🔬 Science Data Analysis

A triple-difference design can be used with a continuous treatment by defining the parameters based on dosage levels.
When treatment is continuous, the target parameter shifts from the average treatment effect to the average causal response function.
Continuous treatments require careful definition of parameters to compare different dosages along a treatment curve.

More than 30 unique tracking events will cause you problems

timo's substack • 78 implied HN points • 12 Feb 23

🕹 Technology Data Analysis

Having more than 30 unique tracking events can hurt data adoption.
Too many unique events also make data exploration harder and reduce analyst productivity.
Implementing a lean event approach with a focus on good event design and ownership can help prevent issues caused by high event volumes.

Professor Robert Scragg tries to show the COVID vaccines are perfectly safe in order to help put Barry Young in jail

Steve Kirsch's newsletter • 7 implied HN points • 08 Dec 25

🏥 Health Politics Data Analysis

Scragg didn't provide evidence showing vaccines improve mortality rates. There was no clear proof that vaccinated people lived longer compared to unvaccinated in matched studies.
He failed to analyze important data that could help prove vaccine safety. The data was available but he chose not to use it, which is confusing since it's crucial for understanding the truth.
Health New Zealand hasn't analyzed their own data on vaccine safety, which raises questions about their reliability. They should openly share this information to help everyone understand the real impacts of the vaccines.

In good company

Tabletops • 78 implied HN points • 03 Jul 23

🕹 Technology Data Analysis

Apple Stores often choose locations near other popular brands like Victoria's Secret, Lululemon, and Sephora.
Most Apple Stores are located on the main level of the malls they are in.
Apple Store distribution seems to loosely correlate with mall operators like Simon and Brookfield.

Who wants to make a killing together?

Innovation Nation • 78 implied HN points • 09 Aug 23

💰 Finance Data Analysis

Identifying buildings likely to default can be done using AI and various data sources.
Banks could be potential counter-parties for this investment strategy.
There is potential for huge profits by betting against commercial real estate using a well-informed strategy.

March Madness Forecasts: FiveThirtyEight vs. Betting Markets

Mike’s Blog • 78 implied HN points • 07 Apr 23

💰 Finance Data Analysis

Betting markets slightly outperformed FiveThirtyEight in predicting NBA, NFL, and MLB games.
New data collected for March Madness shows both FiveThirtyEight and betting markets performed similarly, and neither significantly outperformed.
Hypothesis: Both betting markets and experts may have worse accuracy in playoffs and tournaments compared to regular season games.
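One common way to compare two sets of probabilistic forecasts is the Brier score (mean squared error between predicted probability and the 0/1 outcome; lower is better). The summary does not say which metric the post used, so this is a sketch under that assumption, with invented probabilities and outcomes.

```python
def brier_score(probs, outcomes):
    """Mean squared difference between forecast probabilities and 0/1 outcomes."""
    if not probs or len(probs) != len(outcomes):
        raise ValueError("probs and outcomes must be non-empty and the same length")
    return sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / len(probs)


# Invented example: win probabilities for the favorite in three games.
model_probs = [0.7, 0.6, 0.2]   # hypothetical model forecasts
market_probs = [0.8, 0.5, 0.3]  # hypothetical market-implied probabilities
outcomes = [1, 0, 0]            # 1 if the favorite won, else 0

model_brier = brier_score(model_probs, outcomes)
market_brier = brier_score(market_probs, outcomes)
print(f"model: {model_brier:.4f}, market: {market_brier:.4f}")
```

With only a few games, a gap like this is well within noise, which is consistent with the post's finding that neither source significantly outperformed.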

Are Carjackings Falling Because Cars Are Easier to Steal?

Jeff-alytics • 78 implied HN points • 17 Apr 23

🇺🇸 U.S. Politics Data Analysis

Auto thefts rising might be causing a decline in carjackings.
Stealing cars is easier and less risky than carjacking.
Available data indicates a potential trend of falling carjackings, but more data is needed for a conclusive answer.

Building Cyborgs

Condensing the Cloud • 78 implied HN points • 01 Mar 23

🕹 Technology Data Analysis

Identifying problems that need to be solved is crucial in building a successful business.
Leveraging generative AI like GPT in conjunction with human intelligence can create innovative solutions.
Bots and cyborgs represent two paradigms of AI businesses, with cyborgs showing more promise for startups due to their collaborative nature.

Board members as BDRs 🤝

The SaaS Baton • 78 implied HN points • 10 May 23

💼 Business Data Analysis

Board members can be valuable BDRs due to their connections and experience.
Data maturity progresses from gut feelings to data-driven decisions through central data platforms and data analysis.
Explaining the unique potential and market dynamics of emerging regions can help attract investors and growth opportunities.

How I made "The Probability Times"

Nerology • 142 implied HN points • 29 Oct 24

🇺🇸 U.S. Politics Data Analysis

The project turns election predictions into real newspaper headlines, making stats feel more concrete. Each data point in the simulations gets a corresponding news story.
Using a script, detailed election results from states can be generated, summarizing victories and close races. This gives journalists useful info to write about.
AI tools were utilized to create news articles and images, making the project visually appealing and engaging. The tech helps bring the election outcomes to life with visuals and compelling stories.

Weekly Notes

Rethinking Software • 149 implied HN points • 23 Sep 24

🕹 Technology Data Analysis

Story points are basically just hidden time estimates for tasks in software development. Understanding this can help with better planning and predicting when a project will be finished.
Product management should be like a party host, making sure developers and customers communicate and enjoy their time together. This creates a better experience for everyone involved.
There are ways for companies to run without traditional management, like the tomato processor Morning Star. This might be a model to explore for improving the software industry's workflow.

Why I’m Still Not Sick of ChatGPT

Ironic Sans • 354 implied HN points • 24 Oct 23

🕹 Technology Data Analysis

Writing about ChatGPT is becoming cliché and lazy.
Using ChatGPT to organize information can make life easier.
ChatGPT can assist in practical tasks, like mapping doctor locations.

The Impact of AI on the Legal System

Rod’s Blog • 39 implied HN points • 26 Jan 24

🕹 Technology Data Analysis

President Biden's Executive Order outlines key principles and guidelines for AI use in the US legal system.
Generative AI accelerates tasks like idea generation but struggles with intricate problem solving.
AI is transforming legal professions by automating tasks, assisting with legal research, and improving efficiency in legal work.

The KQL Mysteries: Prologue

Rod’s Blog • 59 implied HN points • 20 Nov 23

🕹 Technology Data Analysis

Jon Block, a top-tier security analyst, used KQL - Kusto Query Language, to tackle cyber threats. This powerful query language helped him root out elusive cyber threats and protect digital landscapes.
Jon's journey into cybersecurity began with self-taught programming and a determined spirit after being a victim of a cyber attack. His dedication led him to become a renowned cybersecurity professional using KQL.
KQL's elegance and power allowed Jon to shine in the cybersecurity realm, offering protection to clients from all levels of society. His mastery of KQL made him a formidable force against cybercriminals.

Hacking Hacker News

Bytewax • 39 implied HN points • 25 Jan 24

🕹 Technology Data Analysis

Combining Bytewax, Proton, and Grafana can create a customizable dashboard for personalized Hacker News stories.
Bytewax simplifies processing streaming data and allows for custom input connectors.
Proton, built on ClickHouse, provides a SQL engine for fast data processing and seamless integration with Grafana.

How Many Sexual Misconduct Allegations Are False?

Cremieux Recueil • 356 implied HN points • 17 Oct 23

🇺🇸 U.S. Politics Data Analysis

81% of women have experienced some form of sexual harassment at least once in their lifetime.
Studies show that only 5-6% of sexual misconduct accusations are false.
Many accusations of sexual misconduct are unsubstantiated, suggesting a need for further research.

Actions, no words

Interesting Data Gigs Weekly • 39 implied HN points • 22 Jan 24

🚌 Education Data Analysis

We are in the era of taking the initiative
Data Geek, it's time to take action
Actions speak louder than words

Case Study: Scaling customer intelligence

Artificial Ignorance • 117 implied HN points • 27 Nov 24

💼 Business Data Analysis

AI can help analyze a large number of sales calls quickly instead of relying on humans to do it manually. This makes it easier to understand customer behaviors and needs.
Choosing the right AI model is important. Higher quality models may cost more, but they can provide better and more accurate results over cheaper options.
It’s essential to make the data user-friendly. Organizing and making information accessible helps teams use insights from the analysis effectively.

Interpret Complex Pipelines By Drawing A Box

Mindful Modeler • 159 implied HN points • 22 Nov 22

🕹 Technology Data Analysis

Interpretation of complex pipelines can be challenging when model changes impact interpretability. Use model-agnostic interpretation methods to interpret arbitrary pipelines.
Think of predictive models as pipelines with various steps like transformations and model ensembles. View the entire pipeline as the model for better interpretation.
Draw the box around the entire pipeline in model-agnostic interpretation to gain insights into feature importance, prediction changes, and explanations, disregarding the specific models within the pipeline.
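To make "drawing the box around the entire pipeline" concrete, here is a minimal sketch of permutation feature importance that only ever calls the pipeline as a black box. The `pipeline` function, its coefficients, and the synthetic data are hypothetical stand-ins; the post may use different methods or tooling.

```python
import random


def pipeline(row):
    """Hypothetical full pipeline: a preprocessing step followed by a 'model'."""
    x0, x1, x2 = row
    scaled = x0 / 10.0               # preprocessing step inside the box
    return 3.0 * scaled + 0.5 * x1   # the model ignores x2 entirely


def mse(predict, X, y):
    """Mean squared error of predict over rows X against targets y."""
    return sum((predict(r) - t) ** 2 for r, t in zip(X, y)) / len(X)


def permutation_importance(predict, X, y, n_repeats=10, seed=0):
    """Average increase in MSE when each raw input column is shuffled."""
    rng = random.Random(seed)
    baseline = mse(predict, X, y)
    importances = []
    for j in range(len(X[0])):
        increases = []
        for _ in range(n_repeats):
            column = [r[j] for r in X]
            rng.shuffle(column)
            X_perm = [r[:j] + (v,) + r[j + 1:] for r, v in zip(X, column)]
            increases.append(mse(predict, X_perm, y) - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances


# Synthetic data: targets come from the pipeline itself, so baseline error is zero.
data_rng = random.Random(42)
X = [(data_rng.uniform(0, 100), data_rng.uniform(0, 10), data_rng.uniform(0, 1))
     for _ in range(200)]
y = [pipeline(r) for r in X]
scores = permutation_importance(pipeline, X, y)
print(scores)
```

Because the importance is computed on the raw inputs, the result stays comparable even if the scaling step or the model inside `pipeline` is swapped out.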

Microsoft Sentinel SOC 101: How to Detect and Mitigate Rare Domains Seen in Cloud Logs with Microsoft Sentinel

Rod’s Blog • 59 implied HN points • 06 Nov 23

🕹 Technology Data Analysis

Rare or malicious domains in cloud logs can be used by attackers for phishing, malware delivery, data exfiltration, and command and control.
Detecting and analyzing rare domains in cloud logs can help identify these threats early.
Microsoft Sentinel offers features like built-in hunting queries, automation rules, and playbooks to help detect, enrich, validate, and respond to rare domains in cloud logs.

Long run Long Covid post

Logging the World • 99 implied HN points • 18 Dec 22

🏥 Health & Wellness Data Analysis

The idea of COVID risks changing over time due to factors like vaccination and new variants must be understood.
The concept of Long COVID being like taking a risk with 'Russian roulette' might not accurately represent the real-world data.
Severe Long COVID conversion rates don't seem to be as high as initially expected, indicating the situation is different than a constant risk per infection.

The Myth of Nigerian Excellence

Cremieux Recueil • 277 implied HN points • 11 Jan 24

🔬 Science Data Analysis

Claims about Igbo intelligence lack empirical support.
Nigerians in America show major regression to the mean in education and income.
The evidence suggesting Nigerian immigrants to the U.S. are as bright as White Americans is minimal.

Prompt Engineering with GPT-4: Charting and Mapping European Tourism Trends

Data at Depth • 19 implied HN points • 11 Apr 24

🕹 Technology Data Analysis

Efficiency is highly sought after by coders and data analysts. GPT-4's Code Interpreter functionality significantly streamlines the process of transforming CSV data into data visualizations.
GPT-4 can generate Python code for various types of data visualizations like line charts, bar charts, and area charts. Simply prompting GPT-4 with specific information can quickly produce comprehensive visualizations.
GPT-4 can be used to filter datasets, analyze trends, and create innovative visual representations like choropleth maps. Incorporating GPT-4 into data analysis workflows can lead to faster and more efficient results.

Data at Depth Newsletter 4 - Consistency in Creating & GPT-4's Custom Instructions Tool

Data at Depth • 39 implied HN points • 11 Jan 24

🕹 Technology Data Analysis

Consistency is crucial for success, according to top creators. It's important to maintain consistency even during challenging times.
Data at Depth newsletter is reader-supported. Consider subscribing to receive new posts and support the author's work.
Get a 7-day free trial to access the full post archives of Data at Depth by subscribing.

Evaluating LLM Agents and Applications

LLMs for Engineers • 79 implied HN points • 11 Jul 23

🕹 Technology Data Analysis

Evaluating large language models (LLMs) is important because existing test suites don’t always fit real-world needs. So, developers often create their own tools to measure accuracy in specific applications.
There are four main types of evaluations for LLM applications: metric-based, tools-based, model-based, and involving human experts. Each method has its strengths and weaknesses depending on the context.
Understanding how well LLM applications are performing is essential for improving their quality. This allows for better fine-tuning, compiling smaller models, and creating systems that work efficiently together.

5 Lessons from Stanford's COVID Conference

Vinay Prasad's Observations and Thoughts • 129 implied HN points • 06 Oct 24

🏥 Health Politics Data Analysis

Closing elementary schools during the pandemic may have been a bad idea because kids were not significant spreaders of COVID-19. Some experts, like Anders Tegnell from Sweden, believed this from the start.
Many people now agree that long school closures were harmful, but some didn't speak up about it at the time. It shows the importance of questioning popular opinions instead of just following the crowd.
Countries that had less income inequality tended to handle the pandemic better than those with more inequality. Access to basic healthcare might have played a bigger role than strict lockdowns or border closures.

Intermarriage in America Post-Loving v. Virginia

Cremieux Recueil • 253 implied HN points • 02 Feb 24

🎭️ Culture Data Analysis

Before Loving v. Virginia in 1967, state laws banning interracial marriage were common in the U.S., stretching back to the 1600s.
Since the legalization of interracial marriage, the rates have increased over time, showing a more mixed ethnoracial composition in America.
Analysis of interracial marriage rates can provide insights into race relations, the impact of societal movements like the 'Great Awokening,' and patterns of intermixing across different races and sexes.

Science Fictions links for January 2024

Science Fictions • 248 implied HN points • 28 Jan 24

🔬 Science Data Analysis

Bad science continues to be published despite scandals and fraud being uncovered.
AI tools hold promise for scientific research but there are challenges in implementation and potential overclaiming.
Evidence of unethical practices like journal bribery and scientific fraud highlights ongoing issues in the scientific community.

eBook: Mastering AI Agents

TheSequence • 77 implied HN points • 07 Feb 25

🕹 Technology Data Analysis

You can learn to create effective AI agents with the right guidance. There's a helpful eBook that covers how these agents work and when to use them.
The book reviews three frameworks for developing AI agents, helping you choose what's best for your needs. It also shares case studies to show real-life applications.
It addresses common reasons AI agents fail and provides solutions to avoid these problems. This can help ensure your AI projects succeed.

$100K prize money to any epidemiologist or infectious disease doctor who will debate me live for 1 hour and show I got it wrong on COVID vaccine risk/benefit

Steve Kirsch's newsletter • 5 implied HN points • 12 Dec 25

🔬 Science Data Analysis

A $100,000 prize is offered to any US-based epidemiologist, infectious-disease specialist, or biostatistics professor with an h-index of 10+ to debate the mRNA COVID vaccine risk‑vs‑benefit live for one hour.
The challenge hinges on Czech KCOR data and asks the expert to show that the cumulative net mortality benefit of two or three mRNA doses in the first two years likely exceeds the mortality risk; the debate will have three mutually agreeable unbiased judges and 30 minutes per side.
Authorized employees of Pfizer or Moderna are explicitly invited to participate, framing the offer as a public call to prompt a real-time scientific dispute and draw attention to the vaccine safety question.
