The hottest Data Analysis Substack posts right now

And their main takeaways

A County health officer was asked by a Sacramento Supervisor to explain why the COVID data showed the vaccine made things worse

Steve Kirsch's newsletter • 12 implied HN points • 31 Oct 24

There is no clear medical reason for COVID vaccines to prevent infection. Natural infections can create immunity, but not the kind from an injected vaccine.
After vaccines were given out, the data showed that the rate of deaths actually increased and then stayed at that level for a year, even though it had been falling before the vaccines.
Some people in the medical field believe vaccines can cause harm, but are pressured not to publish their findings because of funding and institutional pressures.

Are We Getting Dumber?

Center for the Study of Partisanship and Ideology • 31 implied HN points • 30 Jan 24

🔬 Science Intelligence Fertility Genetics Psychology Data Analysis

There is a negative correlation between IQ and fertility across the world, suggesting a decline in intelligence over time.
More developed countries show a weaker decline in intelligence compared to less developed nations.
Embryo selection for intelligence could potentially offset the decline in intelligence, especially in wealthier countries.

The Infinite Data Hallucinator

Mindful Modeler • 59 implied HN points • 06 Dec 22

🕹 Technology Machine Learning Data Analysis APIs Scripting Data generation

The post explores using GPT-3 to create fictitious datasets for testing ML models and for educational purposes.
The 'Infinite Data Hallucinator' is a Jupyter notebook script that leverages the OpenAI API and pandas DataFrame to generate datasets based on a user-provided prompt.
While the generated datasets may have superficial coherence, they are not entirely realistic, and there are limitations due to token limits when creating larger datasets.

Data at Depth - Newsletter 2 (Dec 13, 2023)

Data at Depth • 19 implied HN points • 13 Dec 23

🕹 Technology Data Analysis Newsletter

Diversification is emphasized in the newsletter as an important concept.
The newsletter mentions new subscribers and exciting updates.
The post encourages readers to engage further by offering a 7-day free trial.

Oxford Nanopore Simplex Overall Error Rate ~3%? (Best model - New dataset)

ASeq Newsletter • 43 implied HN points • 12 Nov 23

🔬 Science Genomics Data Analysis

The overall error rate of Oxford Nanopore Simplex is around 3%, higher than the 0.5% claimed by the company
Filtering of data can significantly improve error rates, but with a potential throughput cost to consider
Duplex reads show a lower error rate compared to Simplex, making it a preferable option despite a throughput hit
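
For readers who think in quality scores, the two error rates quoted above can be placed on the Phred scale. This is a minimal sketch, assuming the standard Phred definition Q = -10 * log10(p); the function name `phred_quality` is illustrative and does not come from the post.

```python
import math


def phred_quality(error_rate: float) -> float:
    """Convert a per-base error probability to a Phred quality score."""
    if not 0 < error_rate <= 1:
        raise ValueError("error_rate must be in (0, 1]")
    return -10 * math.log10(error_rate)


# Error rates quoted in the summary above.
observed_q = phred_quality(0.03)   # ~3% overall error reported in the post
claimed_q = phred_quality(0.005)   # ~0.5% error claimed by the company
print(f"observed ~Q{observed_q:.1f}, claimed ~Q{claimed_q:.1f}")
```

On this scale the gap between the two figures is roughly eight Q points, which is one way to see why the discrepancy matters.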

Bad Storms, Bad Science

Natural Selections • 12 implied HN points • 22 Oct 24

🔬 Science Climate Science Environmental Studies Scientific Method Data Analysis Research Ethics

Climate science often relies on models that may not fully prove human actions are the main cause of temperature increases. It's important to question what we assume about these models.
Some media outlets present conclusions about climate change as facts, which can mislead people. They may not explore other possible reasons for climate events.
True science should consider multiple explanations for observations instead of insisting on a single cause. It's essential to keep an open mind in scientific discussions.

Decoding Holiday Promos: A Data Score Interview with Flywheel on Fashion Industry Pricing Trends

The Data Score • 19 implied HN points • 11 Dec 23

💼 Business Retail Data Analysis Fashion Industry Marketing

The fashion industry in the US is promoting more aggressively this holiday season, with an increase in the percentage of products discounted and a decrease in average percentage markdown compared to last year.
48% of fashion retailers are promoting more aggressively this year, while 48% are promoting less aggressively, showing variations in promotional strategies among different brands.
Flywheel's web-mined pricing data indicates a response to the holiday season through increased discounting activity, leading to a greater percentage of products being sold out.

The future of work

Sunday Letters • 19 implied HN points • 11 Dec 23

💼 Business Job Market Data Analysis Professional development Innovation

The job market is always changing, just like it did when agriculture jobs shrank a century ago. People need to adapt and learn new skills to keep up.
Everyone now has the chance to do data analysis, which is great for innovation. Fast and low-cost experiments help us find unexpected insights.
Understanding basic concepts like mean vs median is becoming more important. It helps people ask better questions and make sense of the data they encounter.
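
The mean vs median point can be illustrated with a small worked example. This is only a sketch with made-up numbers (the `salaries` list is hypothetical, not data from the post); it shows how a single outlier moves the mean far more than the median.

```python
from statistics import mean, median

# Hypothetical salaries (in thousands); one outlier pulls the mean upward.
salaries = [40, 42, 45, 47, 50, 400]

avg = mean(salaries)    # sensitive to the 400 outlier
mid = median(salaries)  # midpoint of the two central values, 45 and 47
print(f"mean={avg}, median={mid}")
```

Here the mean lands above every value except the outlier, while the median stays close to what a typical entry looks like.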

Disinformation Handbook: A Concise Guide to Countering Disinformation (2)

Natto Thoughts • 19 implied HN points • 07 Dec 23

🕹 Technology Disinformation Artificial Intelligence Social media Fact-checking Data Analysis

The post discusses disinformation and how it can harm individuals and society.
Tips are provided to detect and avoid disinformation, including advice on how to investigate sources and spot deepfakes.
Various professionals, such as litigators, intelligence analysts, fact-checkers, and historians, provide valuable insights for countering disinformation.

What happens when health authorities are forced to answer my questions?

Steve Kirsch's newsletter • 12 implied HN points • 19 Oct 24

🏥 Health Politics Public Health COVID-19 Vaccination Data Analysis Health Policy

Health authorities may avoid answering tough questions about vaccine effectiveness. It's important to push for clear and honest responses.
Data from nursing homes suggests that COVID vaccinations did not significantly reduce deaths. This raises concerns about the actual impact of the vaccines.
There are claims that more vaccinations could be linked to increased COVID infections. It's crucial to understand why vaccination rates and infection rates may not align as expected.

Excelibur

davidj.substack • 71 implied HN points • 17 May 23

🕹 Technology Data Analysis Software Development User Interface Data Management BI Tools

Excel scalability can be improved by integrating technologies like DuckDB for handling larger datasets.
Data cleanliness can be improved by exposing hidden issues to the user for resolution.
Implementing a full semantic layer in Excel could make data pulling easier and more secure.

Alone, together, in the data

Data Thoughts • 79 implied HN points • 21 Oct 22

🕹 Technology Data Analysis Software Development Community Building Open Source Tech Conferences

Working in data often feels lonely, since a lot of the work is done solo on a computer, but there's magic in that solitude.
Events and communities bring people together, making these lonely moments feel connected and meaningful, especially in the data field.
The joy of working with data comes from the love of the craft itself, not just the outcomes or recognition, and that passion can survive even in tough times.

Here’s What Happened: TikTok Ban, Reddit Pro, Outages and the best Ads of the week

The Social Juice • 24 implied HN points • 10 Mar 24

🕹 Technology Social media AI Marketing Advertising Data Analysis

There is speculation about a TikTok ban in the US, with a possible crackdown bill in discussion.
Reddit introduced Reddit Pro, a new social and data toolkit for businesses.
Several major platforms like Facebook, Instagram, LinkedIn, and YouTube experienced global outages recently, impacting user experience.

Tip: Turning on Search Job Mode in the Microsoft Sentinel Logs Blade

Rod’s Blog • 19 implied HN points • 28 Nov 23

🕹 Technology Software Tools Data Analysis Cybersecurity

Search Jobs in Microsoft Sentinel help search through large datasets for specific events matching criteria.
Search Jobs have their own dedicated section in the Microsoft Sentinel menu blades, reflecting their importance.
Turning on Search Job Mode in Microsoft Sentinel Logs Blade streamlines searching with just a simple toggle switch.

Surveillance firm proposes "Border GPT"

All-Source Intelligence Fusion • 61 implied HN points • 21 Jun 23

🕹 Technology Surveillance Data Analysis Artificial Intelligence Privacy Concerns

Surveillance firm proposes 'Border GPT' for border agents to use language models on traveler data.
Different panel members have varying opinions on the integration of AI and surveillance tech in border enforcement.
Engaging tech companies with border enforcement agencies is presented as important for efficient use of resources.

Asking Rents Mostly Unchanged Year-over-year

CalculatedRisk Newsletter • 23 implied HN points • 06 Mar 24

💰 Finance Real Estate Market Trends Data Analysis Housing Market Economic Indicators

Asking rents have remained mostly stable year-over-year.
Rents for apartments show a slight decline year-over-year, but have picked up recently.
Single-family rental prices have seen slow but consistent growth year over year.

Fun with Community Notes data

Conspirador Norteño • 52 implied HN points • 12 Aug 23

🕹 Technology Data Analysis Social media Online Content User engagement

X/Twitter provides full Community Notes data for download and analysis
Community Notes usage has been increasing steadily since late 2022
Right-wing accounts are more frequently labeled with multiple Community Notes compared to left-wing accounts

What people are not seeing about TimeGPT

Three Data Point Thursday • 19 implied HN points • 16 Nov 23

🕹 Technology AI Data Analysis Machine Learning Forecasting Model development

Time series models, like TimeGPT, are advancing and will provide a significant boost in machine learning capabilities.
Adding time as a feature in models can enhance data analysis due to the information richness of recent data.
Although skepticism exists around time series machine learning models, advancements in generic models like TimeGPT are removing some barriers.

This school shooting data was featured by The Economist

School Shooting Data Analysis and Reports • 19 implied HN points • 15 Nov 23

🇺🇸 U.S. Politics Gun Violence Security Measures School Safety Data Analysis

School shootings go beyond high profile incidents like Parkland, impacting hundreds of schools with lockdowns and swatting hoaxes, creating a broader emotional and social toll on students.
Swatting, the practice of making false 911 calls to trigger a police response, poses a real danger to schools and has become a widespread issue, including multi-state serial swattings.
Collaboration between The Economist and the K-12 School Shooting Database sheds light on the increasing security spending in schools, revealing the mismatch between rising security measures and the continued occurrences of shootings.

Which mRNA vaccine is safer?

Steve Kirsch's newsletter • 5 implied HN points • 10 Jan 25

🏥 Health Politics Vaccines Public Health Safety Data Analysis Policy

The Moderna vaccine might be riskier than the Pfizer vaccine based on some studies, suggesting it has a higher chance of serious side effects.
Recent information indicates that the safety comparison between the two vaccines might not be as clear as previously thought.
Staying up to date with new data is important for anyone who may help others decide which vaccine to take.

Permission to Hate

Marginally Compelling • 41 implied HN points • 03 Oct 23

🇺🇸 U.S. Politics Partisanship Policy Covid data Vaccination rates Data Analysis

The focus on partisanship in Covid results gives people moral permission to hate their neighbors.
Covid restrictions based on partisanship did not necessarily save lives as thought.
Hating based on political party lines may distract from broader factors like income and education disparities.

Zero ELT could be the death of the Modern Data Stack

The Orchestra Data Leadership Newsletter • 19 implied HN points • 13 Nov 23

🕹 Technology Data Analysis Data Tools Data Management Data Integration

Zero ELT aims to streamline data processing by eliminating traditional extraction, loading, and transformation tools.
Zero ELT tools are evolving to specialize by use case rather than by function, which creates a trade-off between stack complexity and having the best tool for the job.
Zero ELT tools, while promising in simplifying processes, may create data silos, lack interoperability with other tools, and bring about stack complexity issues.

Why do confounders always work to make the COVID vaccines look unsafe? I asked an expert!

Steve Kirsch's newsletter • 11 implied HN points • 15 Oct 24

🏥 Health Politics Vaccines Safety Data Analysis Public Health Research

Confounders are factors that can distort data, making vaccines seem unsafe, but they should affect results randomly. It raises questions about why they only appear to show a negative impact on vaccines.
There is a significant difference in mortality rates between different vaccine brands, suggesting there may be deeper issues like manufacturing defects or distribution biases impacting safety results.
Despite individual observations of negative vaccine effects, people are often told to trust aggregated data from authorities, which can lead to doubts about the reliability of personal experiences and observations.

Most can learn analysis, but won't become analysts

Counting Stuff • 54 implied HN points • 04 Jul 23

🕹 Technology Data Analysis Data science Career development Education

Everyone can learn to analyze, but not everyone will make good analysts.
Analysis is fundamental and necessary in daily life and professional settings.
Being a data analyst requires juggling different domains and approaches.

Prompting GPT-4 For Chart Image Analysis: Is It Up To The Challenge?

Data at Depth • 19 implied HN points • 10 Nov 23

🕹 Technology Artificial Intelligence Data Analysis

GPT-4 can now analyze and interpret image data, a useful new capability.
There is a focus on utilizing GPT-4 for analyzing line and bar chart images specifically.
Consider exploring the tool's performance with different types of visual data to assess its effectiveness fully.

Don't Run Coibion-Gorodnichenko Regressions with Micro Data

inexactscience • 39 implied HN points • 27 Mar 23

💰 Finance Economics Data Analysis Statistics Research Methods Forecasting

Running Coibion-Gorodnichenko regressions with individual data can lead to misleading results. It's important to use appropriate data types to avoid confusion in the findings.
Individual forecasts tend to produce negative results compared to positive results in average forecasts. This means that the insights from these regressions can differ significantly based on the data used.
The methodology is sensitive to noise and measurement errors, which can skew results. Researchers need to be cautious and robust in their approach to ensure accurate interpretations.

Latch Registry: An integrated database for multi-omics

LatchBio • 36 implied HN points • 26 Oct 23

🕹 Technology Data Management Biotech APIs Data Analysis

Managing multi-omics data is challenging as organizations grow
Existing solutions fall short due to lack of dynamic linking and validation
Latch Registry offers an integrated database solution for multi-omics data management

Microsoft Sentinel SOC 101: How to Detect and Mitigate Zero-day Exploits with Microsoft Sentinel

Rod’s Blog • 19 implied HN points • 10 Oct 23

🕹 Technology Security Cloud Computing Threat Detection Data Analysis Automation

Zero-day exploits are dangerous because they exploit unknown software vulnerabilities and can have severe consequences like data breaches and system disruptions.
To protect against zero-day exploits, organizations can monitor reported vulnerabilities, install next-generation antivirus solutions, perform rigorous patch management, segment networks with firewalls, and deploy advanced endpoint protection solutions.
Microsoft Sentinel, a cloud-native SIEM solution, can help organizations protect against zero-day exploits by collecting data at cloud scale, detecting threats with analytics and intelligence, and investigating and responding with automation and orchestration.

[Research Update] Sparse Autoencoder features are bimodal

From AI to ZI • 19 implied HN points • 22 Jun 23

🔬 Science Machine Learning Data Analysis Research Neural Networks

Low-MCS features in sparse autoencoders may be random or unrelated to the feature dictionary.
MCS scores of features in small dictionaries against larger ones show high correlation.
Increasing the number of features in a dictionary finds more high-MCS features, but even more low-MCS features.

Must Learn KQL Part 10: The Count Operator

Rod’s Blog • 19 implied HN points • 31 May 23

🕹 Technology Data Analysis

Using the count operator in KQL can help understand the overall impact of a situation by providing the exact number of occurrences of a specific event or data in a table.
The count operator syntax is simple, with just the table name followed by the count operator, making it easy to implement in queries.
Adding the count operator to queries can significantly enhance their impact by providing summarized, relevant data instead of rows of information to manually sift through.

Why Basic Inventory Data isn't so Easy

Mike Talks AI • 19 implied HN points • 23 Mar 23

💼 Business Inventory Management Supply Chain Data Analysis Risk management

Inventory data can have multiple demand streams to consider.
Watch out for the bullwhip effect that can inflate demand and lead to excess inventory.
Be mindful of calendars and time buckets to avoid unexpected spikes in inventory.
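
The calendar and time-bucket caveat can be made concrete with a small sketch. It assumes a hypothetical, perfectly flat demand of 100 units every Monday (the constant `WEEKLY_DEMAND` and the function `monthly_totals` are illustrative, not from the post) and sums that demand into calendar months.

```python
from collections import Counter
from datetime import date, timedelta

WEEKLY_DEMAND = 100  # hypothetical constant units shipped every Monday


def monthly_totals(year: int) -> dict[int, int]:
    """Sum a constant weekly demand into calendar-month buckets."""
    totals = Counter()
    day = date(year, 1, 1)
    # Advance to the first Monday of the year.
    while day.weekday() != 0:
        day += timedelta(days=1)
    # Add one week's demand to the month each Monday falls in.
    while day.year == year:
        totals[day.month] += WEEKLY_DEMAND
        day += timedelta(weeks=1)
    return dict(totals)


totals_2023 = monthly_totals(2023)
print(totals_2023)
```

Even though the underlying demand never changes, months that contain five Mondays show a total 25% higher than months with four, which is the kind of bucket-driven spike the takeaway warns about.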

Must Learn KQL Part 8: The Where Operator

Rod’s Blog • 19 implied HN points • 31 May 23

🕹 Technology Data Analysis

The Where Operator in KQL is essential for filtering and retrieving exact, actionable data, improving query performance.
When learning KQL, it's beneficial to type out queries character-by-character to solidify new knowledge.
Consider using the KQL Playground as a learning environment to avoid frustrations with example queries not showing results.

Series A activity: Week of June 5, 2023

Magid and Co • 19 implied HN points • 12 Jun 23

💼 Business Investment Deals Data Analysis

This post provides data on Series A deals done in the last week.
The information covers Series A deals worldwide (excluding China) where companies raised over $5M and are not focused on therapeutics.
Readers can subscribe for free to receive new posts and support the author's work.

Trial Balloons

Brain Lenses • 19 implied HN points • 07 Mar 23

🇺🇸 U.S. Politics Policy Public Relations Decision-making Media Data Analysis

A trial balloon is a test of messaging or direction to gauge public response.
Using trial balloons can help predict reactions to potential decisions.
Trial balloons are widely used in politics, business, and other areas to shape public opinion.

Must Learn KQL Part 7: Schema Talk

Rod’s Blog • 19 implied HN points • 31 May 23

🕹 Technology Data Analysis Cloud Computing Query Language

Understanding the table schema in KQL is vital as it helps in finding data in an organized manner with the use of columns and types.
KQL column types are basic, time, and complex, and knowing them alters the query approach for specific columns.
The UI in KQL provides shortcuts for querying tables, expanding tables to view schema, using functions like stored procedures, and filtering data columns.

Objective setting, breakfast buffets and AI limits

Datent • 19 implied HN points • 06 Jul 23

🕹 Technology AI Automation Machine Learning Data Analysis

Objective setting is a critical skill in the AI era but can be difficult to master.
When setting objectives for AI, consider the potential for unintended consequences.
AI tools like AutoGPT show the importance of human oversight and the need for careful objective setting.

Interactive Mapping Tutorial With Python: Visualizing US Education Trends

Data at Depth • 19 implied HN points • 08 Jun 23

🕹 Technology Data Analysis Data Visualization Python

Data visualization skills are crucial for modern data analysis, and mapping skills are a valuable addition to visualization abilities.
Python libraries like Folium, Plotly, and Dash can be used for effective display of data.
Interactive mapping tutorials using Python can help in visualizing US education trends with tools like Folium, Plotly, and Dash.

YIMBies Overpromise

Wooly's Post Repository • 19 implied HN points • 23 Jul 23

💼 Business Real Estate Data Analysis Housing Market Urban Development Economics

The data on housing prices and construction can be confusing and counterintuitive, leading to difficulties in drawing clear conclusions.
YIMBY goals require a significant amount of construction to impact housing prices, but achieving such high construction rates can be challenging.
Confidence in real estate research should be lowered due to the complexity and potential errors in the data, making it important to approach conclusions with caution.

One Thing Web-3 Could Learn From the Space X Starship Launch.

Athena Scale • 19 implied HN points • 20 Apr 23

🕹 Technology Innovation Data Analysis Web Development

SpaceX's success lies in data gathering and analysis.
In Web-3, gather intel and data to improve your project over time.
Great products take time to develop and succeed.
