The hottest Data Analysis Substack posts right now

And their main takeaways

BattGPT or AI bubble?

Intercalation Station • 119 implied HN points • 15 Feb 23

🕹 Technology Data Analysis

Successful AI applications require large quantities of easily interpretable input data
Applying AI to batteries faces challenges due to the complex and non-reproducible nature of battery data
Data availability and quality remain key bottlenecks in using AI for battery research and development

Mapping With GPT-4's Data Analysis Capabilities: A Comprehensive Example

Data at Depth • 19 implied HN points • 26 Feb 24

🕹 Technology Data Analysis

Data analysis transforms raw numbers into meaningful stories, which can be challenging.
AI tools can efficiently assist in the task of converting data into narratives.
Consider exploring available tools that utilize AI for quicker and more effective data analysis.

When Statistics Lie. Anscombe's Quartet [Math Mondays]

Technology Made Simple • 79 implied HN points • 14 Nov 22

🔬 Science Data Analysis

Data exploration is crucial in data analysis for gaining useful insights.
Anscombe's quartet showcases how data sets with similar simple stats can have very different distributions.
Visualization is key in spotting patterns, trends, and outliers in data analysis.
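
The second takeaway can be checked numerically. The sketch below is not taken from the post. It assumes the standard values from Anscombe's 1973 dataset, and the helper names (`mean`, `pearson_r`, `summarize`) are illustrative. It computes the mean of x, the mean of y, and the Pearson correlation for each of the four sets.

```python
from math import sqrt

# Anscombe's quartet (values assumed from the standard published dataset).
X123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
QUARTET = {
    "I": (X123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (X123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (X123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def mean(values):
    """Arithmetic mean of a non-empty sequence."""
    return sum(values) / len(values)


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def summarize():
    """Return {set name: (mean x, mean y, correlation)} for all four sets."""
    return {name: (mean(xs), mean(ys), pearson_r(xs, ys))
            for name, (xs, ys) in QUARTET.items()}


SUMMARY = summarize()
for name, (mx, my, r) in SUMMARY.items():
    print(f"{name}: mean x = {mx:.2f}, mean y = {my:.2f}, r = {r:.3f}")
```

The four rows print nearly identical numbers, yet a scatter plot of each set looks very different, which is why the summary stresses visualization.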

From Lionesses to Flying Wingers: Revealing the New Heroes of The Beautiful Game

Workforce Futurist by Andy Spence • 244 implied HN points • 16 Aug 23

🎾 Sports Data Analysis

Tracking data in football helps with performance improvement and injury prevention.
Analyzing skill ecosystems is crucial in talent scouting, even in the workplace.
Using data to empower individuals to analyze their performance can lead to better organizational outcomes.

The Froyo Data Shop

Sarah's Newsletter • 159 implied HN points • 22 Mar 22

🕹 Technology Data Analysis

Self-service is about making choices with clear explanations and options.
Raw data without context can lead to misinterpretation and flawed analysis.
Data democratization needs testing, context building, and ongoing data literacy.


Election Update #11: The race hasn't "narrowed"

Phillips’s Newsletter • 80 implied HN points • 25 Oct 24

🇺🇸 U.S. Politics Data Analysis

Trump's support may be increasing, or Harris is holding her lead steady. It's not clear which one is happening right now.
Polls show that despite some recent changes, Harris's overall lead is still solid according to longer-term trends.
Even though the numbers seem to be tightening, this election still has one of the most stable polling environments in US history.

Getting to the Root of Engineering Improvement with DORA Core

Dev Interrupted • 177 implied HN points • 04 Jan 24

🕹 Technology Data Analysis

DORA Core offers a concise framework of capabilities, metrics, and outcomes to help teams apply research findings.
DORA constantly updates its methodology to keep pace with technological changes and evolving practices.
The DORA Core model shows how capabilities predict performance, which then predicts outcomes, aiding in continuous improvement efforts.

GPT-4 captures judgments about semantic relatedness quite well

The Counterfactual • 59 implied HN points • 18 May 23

🕹 Technology Data Analysis

GPT-4 is really good at understanding word similarities. In tests, it matched human opinions better than many expected.
Sometimes GPT-4 rates certain words as more similar than people do. It tends to judge pairs like 'wife' and 'husband' as more alike than humans generally do.
Using GPT-4 for semantic questions could save time and money in research, but it's still important to include human input to avoid biases.

Creating a Security Posture Report for a Specific Azure Subscription

Rod’s Blog • 19 implied HN points • 13 Feb 24

🕹 Technology Data Analysis

Creating a security posture report for a specific Azure subscription provides enhanced visibility into the security state of assets and workloads, aiding in identifying potential vulnerabilities.
The report includes guidance for improvement with hardening recommendations to help efficiently enhance security posture.
Azure Secure Score assists in prioritizing security recommendations for effective triage to enhance security posture and align with compliance standards.

Using Microsoft Sentinel to Monitor, Detect and Alert Bad AI Content

Rod’s Blog • 39 implied HN points • 12 Oct 23

🕹 Technology Data Analysis

Microsoft Sentinel can be used to monitor and detect bad AI content, but it is important to consider whether it is the most efficient use of resources.
Organizations may choose to ingest AI data into Microsoft Sentinel, create a watchlist of bad content, and set up alerts to detect issues.
Responsibilities for handling AI content alerts can be appropriately assigned to HR or relevant teams, rather than overwhelming security teams.

Series B Activity: September 2023

Magid and Co • 39 implied HN points • 11 Oct 23

💰 Finance Data Analysis

Series B deals are trending smaller, with a few exceptions such as a $1.6B round raised by a steel company.
In September 2023, nine rounds in various sectors, from AI to defense, exceeded $100M.
Data on Series B deals worldwide (excluding China) above $5M is provided, excluding therapeutics-focused companies.

6 Tactics To Accelerate Your On-The-Job Product Management Learning Curve

Harmony • 39 implied HN points • 20 Jun 23

💼 Business Data Analysis

Seize opportunities to work on projects others avoid to gain hands-on experience.
Understand customer needs by talking to them regularly and building relationships with customer-facing roles.
Measure product success with data-driven decisions, and seek mentorship to guide professional growth.

Grading Your Crime Data Quiz

Jeff-alytics • 39 implied HN points • 05 May 23

🚌 Education Data Analysis

Readers struggled with a crime data quiz and need to do better on the final exam.
Questions on past crime trends and definitions were answered correctly by most readers.
Readers struggled most with questions about specific crime data facts and statistics.

How Do People Use Macs?

CIRP - Apple Report • 39 implied HN points • 28 Jun 23

🕹 Technology Data Analysis

About 80% of Mac buyers use their computers for personal activities.
Around half of Mac buyers use their computers for business purposes.
Almost 40% of Mac buyers use their computers for education.

Future of Digital health products with AI first strategy

healthviva • 39 implied HN points • 30 May 23

🏥 Health & Wellness Data Analysis

AI-powered digital health products can revolutionize healthcare by improving patient care and reducing costs.
Key trends in the future of digital health products include personalizing healthcare with AI and automating tasks to free up healthcare professionals.
Challenges in developing AI-powered digital health products include the lack of data and regulatory hurdles, despite opportunities for AI to enhance patient care, reduce costs, and improve healthcare delivery.

Must Learn KQL Part 4: Search for Fun and Profit

Rod’s Blog • 39 implied HN points • 31 May 23

🕹 Technology Data Analysis

The Kusto Query Language (KQL) search operator is a powerful tool for verifying the existence of certain elements within an environment.
Using KQL for security purposes involves answering questions like 'Does it exist?', 'Where does it exist?', and 'Why does it exist?'
KQL allows for detailed searches across specific tables in tools like Microsoft Office and Defender for Endpoint by leveraging wildcard characters.

Shortcut Way to Create XPath Queries for Microsoft Sentinel DCRs

Rod’s Blog • 39 implied HN points • 10 Apr 23

🕹 Technology Data Analysis

You can create XPath queries for use in Data Collection Rules (DCRs) for the Azure Monitor Agent.
A shortcut trick to create XPath queries is using Event Viewer on a Windows system.
In Event Viewer, filter log files, enter Event IDs, check the XML tab for the XPath query, and use it in your DCR.

My Script

The Heart Attack Diet • 39 implied HN points • 08 Aug 23

🕹 Technology Data Analysis

Open source is a development methodology, while free software is a social movement.
The content includes code for weight graphing using Python tools like matplotlib.
The post showcases historical weight data and visualizes it using color-coded regions in the graph.

Cross-workspace Query Best Practice for Microsoft Sentinel

Rod’s Blog • 39 implied HN points • 17 Apr 23

🕹 Technology Data Analysis

Cross-workspace queries in Microsoft Sentinel are crucial for managing multiple workspaces or customers.
When using cross-workspace queries, it is more efficient to use the workspace ID rather than names or fully qualified names.
Workspace IDs can be found in the Overview pane of the Log Analytics workspace or using a KQL query in Azure Resource Graph Explorer.

Accelerating genetic design

The Century of Biology • 272 implied HN points • 26 Mar 23

🔬 Science Data Analysis

Multiple important technological paradigms are converging in the life sciences, impacting life on various scales.
Synthetic biology focuses on designing new genetic circuits to program cells for new tasks.
Using a platform like CLASSIC, genetic circuits can be systematically tested to learn composition-to-function relationships.

The KQL Mysteries: Chapter 8

Rod’s Blog • 19 implied HN points • 06 Feb 24

🕹 Technology Data Analysis

A major security breach has occurred with sensitive data stolen, leading to a need for urgent action to track down the threat actor.
Jordan quickly jumps into action, using KQL queries to analyze data and identify patterns associated with the suspected threat actor.
The story leaves readers with a cliffhanger, hinting at upcoming developments and ensuring engagement for the next chapter.

Randomization maximalism

inexactscience • 39 implied HN points • 09 Aug 23

🔬 Science Data Analysis

Relying only on randomized experiments can be limiting. It's important to consider all types of evidence based on their quality.
Not every decision needs a complex A/B test; sometimes simpler data or even gut feelings are enough.
We should weigh the cost of getting reliable data against the value it provides. For some choices, high-quality data is a must, but for others, less rigorous information can do the job.

Adding Noise Improves RAG Performance

Cobus Greyling on LLMs, NLU, NLP, chatbots & voicebots • 19 implied HN points • 02 Feb 24

🕹 Technology Data Analysis

Adding irrelevant documents can actually improve accuracy in Retrieval-Augmented Generation systems. This goes against the common belief that only relevant documents are useful.
In some cases, having unrelated information can help the model find the right answer, even better than using only related documents.
It's important to carefully place both relevant and irrelevant documents when building RAG systems to make them work more effectively.

The Future of AI Diplomacy: Can The State Department Grok AI?

ChinaTalk • 133 implied HN points • 04 Mar 24

🇺🇸 U.S. Politics Data Analysis

AI can enhance diplomacy by streamlining bureaucratic tasks, providing accurate data for negotiations, and improving analysis processes.
Risk management in the State Department varies by task: functions like HR and IT services can move faster to match the private sector, while activities like foreign assistance and passport services call for more caution because of their public impact.
Strategic use of transparency can be a strength for the U.S. in diplomacy, as seen in the Biden administration's doctrine. Leveraging transparency internally and externally can have strategic advantages over closed societies.

The KQL Mysteries: Chapter 7

Rod’s Blog • 19 implied HN points • 30 Jan 24

🕹 Technology Data Analysis

Jordan Alghamdi is a skilled data analyst in Saudi Arabia who blends tradition with modern technology in her work at a state-of-the-art data center.
The data center where Jordan works represents Saudi Arabia's push towards modernization while preserving tradition, showcasing the country's advancement in technology.
Jordan's use of KQL, a query language, showcases her analytical skills as she unravels complex data to solve mysteries and address potential threats.

Maximizing Product Management Efficiency with ChatGPT and Generative AI

An Innovator's Sketchbook • 19 implied HN points • 28 Jan 24

🕹 Technology Data Analysis

Leverage AI to boost personal productivity in product management through planning, execution, and user feedback analysis.
Use large language models (LLMs) in product strategy for idea generation, evaluation, and decision-making.
Optimize day-to-day efficiency by using AI to break down goals into manageable tasks and plan daily schedules.

Big Tech and Generative AI Q3 '24 Update

Tanay’s Newsletter • 63 implied HN points • 04 Nov 24

🕹 Technology Data Analysis

Amazon is making big strides in AI by providing tools for developers and creating custom chips. They are seeing huge interest in their AI services, which are growing fast despite lower profit margins.
Google is using AI to improve its search capabilities and has rolled out new features to enhance user experience. Their AI models, called Gemini, are being adopted widely across their products and they are investing significantly in infrastructure.
Apple has launched its AI system, Apple Intelligence, focusing on privacy and enhancing the user experience of their products. Although they're investing in AI, their spending is still lower compared to competitors, but they plan to increase their efforts.

Series A activity: Week of January 15, 2024

Magid and Co • 19 implied HN points • 22 Jan 24

💼 Business Data Analysis

In the last week, there were only 15 Series A deals with funding amounts ranging from $5.5M to $55M.
The focus was on Series A deals worldwide (excluding China), where the raised amount was over $5M and not in therapeutics.
Readers can subscribe for free to receive new posts and support the author's work on Magid and Co.

How many people are just like you?

Jovex Substack • 19 implied HN points • 20 Jan 24

🔬 Science Data Analysis

The more unique facts about a person, the more identifiable they become. Less than 10 specific facts could potentially distinguish an individual from everyone else.
Correlation between personal facts may impact the uniqueness calculation, but still requires around 10 moderately specific facts to identify someone.
Using more specific facts further reduces the number of facts needed for identification. Such calculations can also show how few people share similar circumstances, making each individual's story unique.
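
The arithmetic behind these takeaways can be sketched roughly as follows. This is not the post's own calculation. It assumes a world population of about 8 billion, that the facts are statistically independent, and that every fact is shared by the same fraction of people. The function name `facts_needed` and the example fractions are hypothetical.

```python
import math

# Assumption: roughly 8 billion people alive today.
WORLD_POPULATION = 8_000_000_000


def facts_needed(population, share):
    """Number of independent facts, each true for `share` of people,
    needed before fewer than one person is expected to match them all.

    Solves population * share**k < 1 for the smallest integer k.
    """
    if not 0 < share < 1:
        raise ValueError("share must be strictly between 0 and 1")
    return math.ceil(math.log(population) / math.log(1 / share))


# Hypothetical specificity levels, chosen for illustration only.
for share in (0.5, 0.1, 0.01):
    k = facts_needed(WORLD_POPULATION, share)
    print(f"facts shared by {share:.0%} of people: {k} needed")
```

Under these assumptions, facts that each apply to one person in ten single someone out after about 10 facts, while one-in-a-hundred facts need only about 5. Correlated facts carry less information each, so real counts would be somewhat higher.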

An AWS For Sequencing?

ASeq Newsletter • 58 implied HN points • 16 Nov 24

🕹 Technology Data Analysis

Bioinformatics companies often struggle to succeed on their own, but some are finding unique ways to add value by providing analysis of sequencing data from external service providers.
Just like how companies can use AWS for their server needs, the idea is to create an AWS-like platform specifically for DNA sequencing, making services easier and more accessible.
Building a platform for sequencing could lower barriers for businesses and encourage new applications in the field, opening up more opportunities for innovation.

Last word on LNT

Gordian Knot News • 139 implied HN points • 14 Jan 24

🔬 Science Data Analysis

The Linear No-Threshold (LNT) model for predicting harm from radiation exposure is criticized as inaccurate.
Comparing different dose rate profiles with the same total dose is crucial to understanding radiation harm models.
Dose rate is a critical factor in DNA damage repair, impacting cancer incidence predictions in radiation exposure.

Angels in the Architecture, pt IV

Premium Grind • 19 implied HN points • 19 Jan 24

🕹 Technology Data Analysis

Interpreting VAS heatmaps is challenging due to lack of established guidelines and overlaps in definitions.
Studies have shown that traditional civic architecture consistently draws more viewer attention than modern styles.
Discrepancies exist between VAS results and actual human-subject eye-tracking studies, raising questions about accuracy and interpretation.

How to Sell Products and Retain Customers

Sarah's Newsletter • 119 implied HN points • 12 Apr 22

💼 Business Data Analysis

Understand your audience and solve their real problems to attract and retain customers.
Provide a smooth onboarding experience to help users transition from inefficient processes to using your product.
Customers who find your product valuable will be forgiving of small bugs, but focus on seamless integration within their ecosystem.

Identifying unmaintained open source packages at scale

Once a Maintainer • 5 implied HN points • 20 Nov 25

🕹 Technology Data Analysis

Open source packages can become abandoned when original developers lose interest, meaning they might not get important updates or security fixes.
To find abandoned packages, you can look at factors like how often the package has updates, the activity of commits, and what maintainers say about the package.
Machine learning models can help predict whether a package might be abandoned by combining various factors like release frequency, maintainer communication, and community engagement.

Data in a Downturn

Data People Etc. • 231 implied HN points • 20 Mar 23

🕹 Technology Data Analysis

Data teams are facing challenges with tool abandonment in the current economy.
Databases remain crucial in the data stack, with less need for new, specialized tools.
Building trust and bridging gaps between data and engineering teams is vital for successful data applications.

It Only Takes One Data Point to Disprove an Investment Thesis

The Data Score • 19 implied HN points • 09 Jan 24

💰 Finance Data Analysis

Testing for the counterfactual surfaces data points that contradict an investment thesis, and a single such data point is enough to disprove it.
A question-driven approach to investing focuses on formulating the right questions to deeply understand the investment landscape, prioritizing curiosity and critical thinking.
Designing an investment thesis involves setting measurable outcomes, defining timeframes, identifying dependencies, establishing checkpoints, and being aware of the current valuation. It's crucial to recognize what's in the valuation already and how your views differ.

AI and Junior White Collar Automation: Update after EIG’s New Report

State of the Future • 12 implied HN points • 12 Aug 25

🕹 Technology Data Analysis

AI is changing how work gets done, especially in handling tasks. It makes sense to focus on how AI affects the types of jobs rather than just the number of jobs.
There's evidence that AI hasn't led to big job losses in white-collar roles yet, but it's changing the landscape of entry-level positions. Many jobs for new graduates are declining.
As companies adopt AI, they are starting to shift tasks among current workers instead of laying people off. This means the impact of AI on jobs might show up later as firms adjust their hiring practices.

How TikTok breaks economics [Finance Fridays]

Technology Made Simple • 59 implied HN points • 29 Oct 22

💰 Finance Data Analysis

TikTok struggles with profitability due to competition, lack of valuable data, and the expensive analysis of user behavior.
The CCP's involvement in ByteDance enables them to fund TikTok despite losses for geopolitical influence, impacting the content promoted and the platform's sustainability.
Banning TikTok may not address the root issues; education on health, mental wellness, skepticism, and maintaining real social connections are vital for healthier social media engagement.

Case-Shiller: National House Price Index Up 2.3% year-over-year in May

CalculatedRisk Newsletter • 14 implied HN points • 29 Jul 25

💰 Finance Data Analysis

In May, national house prices increased by 2.3% compared to last year. This shows the market is still growing, but the growth is slowing down.
The data shows that house prices fell month over month for the third consecutive month in May. This means prices are dipping slightly after rising for a while.
Some cities are seeing bigger drops, like San Francisco, where prices fell 8.2%. This suggests that not all areas are doing well in the housing market.

Driving Change: 8 Learnings.

The Future Does Not Fit In The Containers Of The Past • 113 implied HN points • 28 Jan 24

💼 Business Data Analysis

Change is difficult but necessary to avoid irrelevance.
Understanding human emotions and incentives is key to driving change.
Reducing fear, addressing company culture, and inspiring leadership are crucial in navigating change.