The hottest Data Analysis Substack posts right now

And their main takeaways
Steve Kirsch's newsletter 12 implied HN points 31 Oct 24
  1. There is no clear medical reason for COVID vaccines to prevent infection. Natural infections can create immunity, but not the kind from an injected vaccine.
  2. After vaccines were given out, the data showed that the rate of deaths actually increased and stayed the same for a year, even though it was going down before the vaccines.
  3. Some people in the medical field believe vaccines can cause harm, but are pressured not to publish their findings because of funding and institutional pressures.
Center for the Study of Partisanship and Ideology 31 implied HN points 30 Jan 24
  1. There is a negative correlation between IQ and fertility across the world, suggesting a decline in intelligence over time.
  2. More developed countries show a weaker decline in intelligence compared to less developed nations.
  3. Embryo selection for intelligence could potentially offset the decline in intelligence, especially in wealthier countries.
Mindful Modeler 59 implied HN points 06 Dec 22
  1. 'The Infinite Data Hallucinator' explores using GPT-3 to create fictional datasets for testing ML models and for teaching.
  2. The 'Infinite Data Hallucinator' is a Jupyter notebook script that leverages the OpenAI API and pandas DataFrame to generate datasets based on a user-provided prompt.
  3. While the generated datasets may have superficial coherence, they are not entirely realistic, and there are limitations due to token limits when creating larger datasets.
Natural Selections 12 implied HN points 22 Oct 24
  1. Climate science often relies on models that may not fully prove human actions are the main cause of temperature increases. It's important to question what we assume about these models.
  2. Some media outlets present conclusions about climate change as facts, which can mislead people. They may not explore other possible reasons for climate events.
  3. True science should consider multiple explanations for observations instead of insisting on a single cause. It's essential to keep an open mind in scientific discussions.
The Data Score 19 implied HN points 11 Dec 23
  1. The fashion industry in the US is promoting more aggressively this holiday season, with an increase in the percentage of products discounted and a decrease in average percentage markdown compared to last year.
  2. 48% of fashion retailers are promoting more aggressively this year, while 48% are promoting less aggressively, showing variations in promotional strategies among different brands.
  3. Flywheel's web-mined pricing data indicates a response to the holiday season through increased discounting activity, leading to a greater percentage of products being sold out.
Sunday Letters 19 implied HN points 11 Dec 23
  1. The job market is always changing, just like it did when agriculture jobs shrank a century ago. People need to adapt and learn new skills to keep up.
  2. Everyone now has the chance to do data analysis, which is great for innovation. Fast and low-cost experiments help us find unexpected insights.
  3. Understanding basic concepts like mean vs median is becoming more important. It helps people ask better questions and make sense of the data they encounter.
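
The mean-versus-median point above can be illustrated with a small sketch. The salary figures are invented for illustration and do not come from the post; only Python's standard `statistics` module is used.

```python
import statistics

# Hypothetical salaries (illustrative values, not from the post):
# four typical earners and one extreme outlier.
salaries = [40_000, 45_000, 50_000, 55_000, 1_000_000]

# The mean is pulled upward by the single outlier.
mean_salary = statistics.mean(salaries)

# The median is the middle value and ignores how extreme the outlier is.
median_salary = statistics.median(salaries)

print(f"mean:   {mean_salary}")
print(f"median: {median_salary}")
```

When the two numbers diverge this much, the data is likely skewed, and the median is usually the better summary of a "typical" value.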
Natto Thoughts 19 implied HN points 07 Dec 23
  1. The post discusses disinformation and how it can harm individuals and society.
  2. Tips are provided to detect and avoid disinformation, including advice on how to investigate sources and spot deepfakes.
  3. Professionals such as litigators, intelligence analysts, fact-checkers, and historians provide valuable insights for countering disinformation.
Steve Kirsch's newsletter 12 implied HN points 19 Oct 24
  1. Health authorities may avoid answering tough questions about vaccine effectiveness. It's important to push for clear and honest responses.
  2. Data from nursing homes suggests that COVID vaccinations did not significantly reduce deaths. This raises concerns about the actual impact of the vaccines.
  3. There are claims that more vaccinations could be linked to increased COVID infections. It's crucial to understand why vaccination rates and infection rates may not align as expected.
Data Thoughts 79 implied HN points 21 Oct 22
  1. Working in data often feels lonely, since a lot of the work is done solo on a computer, but there's magic in that solitude.
  2. Events and communities bring people together, making these lonely moments feel connected and meaningful, especially in the data field.
  3. The joy of working with data comes from the love of the craft itself, not just the outcomes or recognition, and that passion can survive even in tough times.
The Social Juice 24 implied HN points 10 Mar 24
  1. There is speculation about a TikTok ban in the US, with a possible crackdown bill in discussion.
  2. Reddit introduced Reddit Pro, a new social and data toolkit for businesses.
  3. Several major platforms like Facebook, Instagram, LinkedIn, and YouTube experienced global outages recently, impacting user experience.
Rod’s Blog 19 implied HN points 28 Nov 23
  1. Search Jobs in Microsoft Sentinel help search through large datasets for specific events matching criteria.
  2. Search Jobs have their own dedicated section in the Microsoft Sentinel menu blades, reflecting their importance.
  3. Turning on Search Job Mode in Microsoft Sentinel Logs Blade streamlines searching with just a simple toggle switch.
All-Source Intelligence Fusion 61 implied HN points 21 Jun 23
  1. Surveillance firm proposes 'Border GPT' for border agents to use language models on traveler data.
  2. Different panel members have varying opinions on the integration of AI and surveillance tech in border enforcement.
  3. The post stresses the importance of tech companies engaging with border enforcement agencies so that resources are used efficiently.
Three Data Point Thursday 19 implied HN points 16 Nov 23
  1. Time series models, like TimeGPT, are advancing and will provide a significant boost in machine learning capabilities.
  2. Adding time as a feature in models can enhance data analysis due to the information richness of recent data.
  3. Although skepticism exists around time series machine learning models, advancements in generic models like TimeGPT are removing some barriers.
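
As a rough illustration of the second takeaway, time can be turned into model features by deriving a few fields from a timestamp. This is a minimal sketch, not the post's method: the helper name `time_features`, the chosen fields, and the example dates are all assumptions.

```python
from datetime import datetime


def time_features(ts: datetime, now: datetime) -> dict:
    """Derive simple features from a timestamp (hypothetical helper)."""
    return {
        "hour": ts.hour,               # time of day
        "day_of_week": ts.weekday(),   # Monday = 0, Sunday = 6
        "month": ts.month,             # coarse seasonality
        "age_days": (now - ts).days,   # recency: whole days since the event
    }


# Example dates chosen for illustration only.
now = datetime(2023, 11, 16, 12, 0)
event = datetime(2023, 11, 13, 9, 30)

features = time_features(event, now)
print(features)
```

A recency field such as `age_days` gives a model a way to weight newer observations differently from older ones.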
School Shooting Data Analysis and Reports 19 implied HN points 15 Nov 23
  1. School shootings go beyond high profile incidents like Parkland, impacting hundreds of schools with lockdowns and swatting hoaxes, creating a broader emotional and social toll on students.
  2. Swatting (making false 911 calls to trigger a police response) poses a real danger to schools and has become a widespread issue, including multi-state serial swattings.
  3. Collaboration between The Economist and the K-12 School Shooting Database sheds light on the increasing security spending in schools, revealing the mismatch between rising security measures and the continued occurrences of shootings.
Steve Kirsch's newsletter 5 implied HN points 10 Jan 25
  1. The Moderna vaccine might be riskier than the Pfizer vaccine based on some studies, suggesting it has a higher chance of serious side effects.
  2. Recent information indicates that the safety comparison between the two vaccines might not be as clear as previously thought.
  3. Staying up to date with new data is important for anyone who may help others decide which vaccine to take.
The Orchestra Data Leadership Newsletter 19 implied HN points 13 Nov 23
  1. Zero ELT aims to streamline data processing by eliminating traditional extraction, loading, and transformation tools.
  2. Zero ELT tools are evolving to specialize by use case rather than by function, which creates a trade-off between stack complexity and having the best tool for the job.
  3. Zero ELT tools, while promising in simplifying processes, may create data silos, lack interoperability with other tools, and bring about stack complexity issues.
Steve Kirsch's newsletter 11 implied HN points 15 Oct 24
  1. Confounders are factors that can distort data, making vaccines seem unsafe, but they should affect results randomly. It raises questions about why they only appear to show a negative impact on vaccines.
  2. There is a significant difference in mortality rates between different vaccine brands, suggesting there may be deeper issues like manufacturing defects or distribution biases impacting safety results.
  3. Despite individual observations of negative vaccine effects, people are often told to trust aggregated data from authorities, which can lead to doubts about the reliability of personal experiences and observations.
inexactscience 39 implied HN points 27 Mar 23
  1. Running Coibion-Gorodnichenko regressions with individual data can lead to misleading results. It's important to use appropriate data types to avoid confusion in the findings.
  2. Individual forecasts tend to yield negative results, whereas average forecasts yield positive ones. This means that the insights from these regressions can differ significantly based on the data used.
  3. The methodology is sensitive to noise and measurement errors, which can skew results. Researchers need to be cautious and robust in their approach to ensure accurate interpretations.
Rod’s Blog 19 implied HN points 10 Oct 23
  1. Zero-day exploits are dangerous because they exploit unknown software vulnerabilities and can have severe consequences like data breaches and system disruptions.
  2. To protect against zero-day exploits, organizations can monitor reported vulnerabilities, install next-generation antivirus solutions, perform rigorous patch management, segment networks with firewalls, and deploy advanced endpoint protection solutions.
  3. Microsoft Sentinel, a cloud-native SIEM solution, can help organizations protect against zero-day exploits by collecting data at cloud scale, detecting threats with analytics and intelligence, and investigating and responding with automation and orchestration.
Rod’s Blog 19 implied HN points 31 May 23
  1. Using the count operator in KQL can help understand the overall impact of a situation by providing the exact number of occurrences of a specific event or data in a table.
  2. The count operator syntax is simple, with just the table name followed by the count operator, making it easy to implement in queries.
  3. Adding the count operator to queries can significantly enhance their impact by providing summarized, relevant data instead of rows of information to manually sift through.
Rod’s Blog 19 implied HN points 31 May 23
  1. The where operator in KQL is essential for filtering and retrieving exact, actionable data, improving query performance.
  2. When learning KQL, it's beneficial to type out queries character-by-character to solidify new knowledge.
  3. Consider using the KQL Playground as a learning environment to avoid frustrations with example queries not showing results.
Rod’s Blog 19 implied HN points 31 May 23
  1. Understanding the table schema in KQL is vital as it helps in finding data in an organized manner with the use of columns and types.
  2. KQL column types fall into three groups (basic, time, and complex), and knowing a column's type changes how you query it.
  3. The UI in KQL provides shortcuts for querying tables, expanding tables to view their schema, using functions (which work like stored procedures), and filtering data columns.
Data at Depth 19 implied HN points 08 Jun 23
  1. Data visualization skills are crucial for modern data analysis, and mapping skills are a valuable addition to visualization abilities.
  2. Python libraries like Folium, Plotly, and Dash can be used for effective display of data.
  3. Interactive mapping tutorials using Python can help in visualizing US education trends with tools like Folium, Plotly, and Dash.
Wooly's Post Repository 19 implied HN points 23 Jul 23
  1. The data on housing prices and construction can be confusing and counterintuitive, leading to difficulties in drawing clear conclusions.
  2. YIMBY goals require a significant amount of construction to impact housing prices, but achieving such high construction rates can be challenging.
  3. Confidence in real estate research should be lowered due to the complexity and potential errors in the data, making it important to approach conclusions with caution.