The hottest Data Analysis Substack posts right now

And their main takeaways
Gonzo ML 63 implied HN points 29 Jan 25
  1. The paper introduces a method called ACDC that automates the process of finding important circuits in neural networks. This can help us better understand how these networks work.
  2. Researchers follow a three-step workflow to study model behavior, and ACDC fully automates the last step which helps identify connections that matter for a specific task.
  3. While ACDC shows promise, it isn't perfect. It may miss some important connections and needs adjustments for different tasks to improve its accuracy.
Public Universal Friend 79 implied HN points 02 Sep 24
  1. Using a customer engagement platform like Customer.io can help marketers improve their targeting and maximize growth. It offers better data management and less need for technical support.
  2. Spring is a great time for businesses to focus on improving conversions through digital marketing strategies. Real-time data can help companies get more return on their investment.
  3. Personal connections and genuine interactions are valuable, even in business communication. Taking the time to show real interest can make a difference.
SeattleDataGuy’s Newsletter 447 implied HN points 08 Nov 24
  1. Data teams need to know the main numbers that matter for their business. This helps them understand how the company is performing.
  2. High-level metrics like revenue and expenses can seem too big to grasp. Breaking these down into smaller parts makes them easier to understand.
  3. These smaller, detailed metrics can reveal valuable insights that affect decisions and strategies for the business.
Independent SAGE continues 1418 implied HN points 20 Mar 24
  1. Independent SAGE has launched a Substack to share insights about Covid research and data. They aim to provide valuable information directly from experts to the public.
  2. They plan to post updates roughly every two weeks, including responses to important new research and news. This helps keep everyone informed about the ongoing situation.
  3. The Substack will remain free for subscribers, encouraging more people to stay updated on Covid developments and public health measures.
Engineering Enablement 21 implied HN points 05 Feb 25
  1. Metrics for developers should help improve their work experience, not just measure their output. Goodhart's Law reminds us that once metrics are tied to rewards, they can become misleading.
  2. Developer experience is more about effectiveness than happiness. Measuring how developers feel needs to focus on the frustrations they face, and not just on making them comfortable.
  3. Using benchmarks is important but context is key. Just like medical tests, numbers need interpretation to make sense; comparing different teams requires understanding their unique challenges.
Ground Truths 3980 implied HN points 19 Feb 24
  1. Polygenic risk scores can provide valuable information on high genetic risk for diseases like heart disease and cancer, beyond traditional clinical risk factors.
  2. The use of polygenic risk scores is advancing thanks to efforts like the eMERGE consortium, incorporating multi-ancestry data and rigorous validation.
  3. Actionable polygenic risk scores have the potential to reduce health disparities and enhance preventive strategies in medical practice.
Jakob Nielsen on UX 21 implied HN points 13 Feb 25
  1. AI models are getting better at reducing false information, called hallucinations. This means they are less likely to make things up over time.
  2. Bigger AI models generally make fewer mistakes. As AI technology improves, we can expect even fewer errors from future models.
  3. While waiting for better AI, improving user experience can help users spot and double-check misleading information, making it easier to trust AI outputs.
Wednesday Wisdom 113 implied HN points 01 Jan 25
  1. Relying too much on data can lead to wrong decisions because numbers don't always tell the full story. Sometimes, human judgment or understanding is needed.
  2. Data can create a false sense of certainty, making people ignore the uncertainties and assumptions behind those numbers. It's important to be honest about what the data truly represents.
  3. Setting goals based on numbers can make teams lose sight of the real-world processes they are supposed to improve. Chasing metrics blindly can lead to poor outcomes.
Conspirador Norteño 68 implied HN points 18 Jan 25
  1. You can spot fake followers on Bluesky by looking for accounts with similar join dates and generic profiles. These accounts often have no posts and repetitive bios.
  2. Using a method where you track the followers of suspected fake accounts can help identify whole networks of fake followers. By downloading and filtering their followers, you can map out these networks.
  3. The Bluesky platform has a real-time feature called the firehose that makes it easier to catch fake follower activity as it happens. However, this can give some false positives, so users need to be careful.
Chartbook 400 implied HN points 21 Oct 24
  1. The TIGER indices are showing a negative trend, indicating economic challenges ahead. This suggests that global economic recovery may be slower than expected.
  2. South Sudan is facing significant difficulties, highlighting ongoing humanitarian issues. These problems need urgent attention to improve the situation for its people.
  3. There are connections being made to the 1990s, suggesting that some current geopolitical situations may resemble past conflicts. This raises concerns about the repetition of history in today's world.
SemiAnalysis 7576 implied HN points 27 Sep 23
  1. Moore's Law and Eroom's Law describe opposite trends in semiconductors and drug research, respectively. The post compares the two in terms of time, money, and output.
  2. Healthcare, a $4 trillion industry, has lagged behind the technological progress that Moore's Law has driven elsewhere.
  3. An acquisition of Illumina by Nvidia could bridge the gap in genomics, addressing bottlenecks and enabling full-stack healthcare solutions.
Franz likes to code 39 implied HN points 05 Sep 24
  1. If you're having trouble with the Google Trends Python package, you can switch to using Wikipedia's page view statistics instead. It's a reliable and official way to get data on search trends.
  2. Wikipedia provides a rich API that allows you to fetch daily or hourly view counts for specific articles. This can help analyze how topics gain interest over time.
  3. You can use a short Python script to fetch the page views for any Wikipedia article, making it easy to replace Google Trends in your research and get the data you need.
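The approach in the takeaways above can be sketched as follows. This is a minimal illustration, not the post's own script: it assumes the public Wikimedia REST pageviews endpoint for per-article daily counts, and the function names and the User-Agent string are placeholders made up for this example.

```python
import json
import urllib.parse
import urllib.request

# Assumed endpoint: the Wikimedia REST API's per-article pageviews route.
BASE = "https://wikimedia.org/api/rest_v1/metrics/pageviews/per-article"


def pageviews_url(article, start, end, project="en.wikipedia"):
    """Build the daily per-article URL; start and end are YYYYMMDD strings."""
    title = urllib.parse.quote(article.replace(" ", "_"), safe="")
    return f"{BASE}/{project}/all-access/all-agents/{title}/daily/{start}/{end}"


def parse_views(payload):
    """Map each day (YYYYMMDD) to its view count from a decoded JSON response."""
    return {item["timestamp"][:8]: item["views"] for item in payload["items"]}


def fetch_views(article, start, end):
    """Call the API (needs network access) and return {day: views}."""
    request = urllib.request.Request(
        pageviews_url(article, start, end),
        # Placeholder User-Agent; replace it with your own contact details.
        headers={"User-Agent": "pageviews-example/0.1 (contact: you@example.com)"},
    )
    with urllib.request.urlopen(request) as response:
        return parse_views(json.load(response))
```

Calling `fetch_views("Data science", "20240101", "20240107")` would return one entry per day in that range. Keeping the URL builder and the parser separate from the network call makes both easy to check offline.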
Elizabeth Laraki 419 implied HN points 28 May 24
  1. Kerry Rodden, a UX researcher, helped YouTube understand how users navigated the site. By deeply analyzing user data, they found out what people really wanted from YouTube.
  2. One big surprise was that most YouTube sessions didn't start on the homepage. Instead, many users went directly to watch videos they found elsewhere on the internet.
  3. Kerry created clear visualizations of user data that showed how people moved through YouTube. This helped the company improve its homepage and focus on personalizing content for users.
In My Tribe 410 implied HN points 25 Jan 25
  1. Many experts believe that relying on government decisions can be inefficient because it often favors those with political power instead of addressing real needs.
  2. Inequality is a natural part of society, and efforts to eliminate it through government action can lead to problems, including promoting wokeness.
  3. Economic data can often be misleading due to measurement errors, making it hard to trust figures that inform important decisions like GDP or monetary policies.
Implications, by Scott Belsky 1356 implied HN points 04 Jan 24
  1. The future will be personalized to your preferences, with digital experiences tailored to you.
  2. Local OS-native AI models will improve everyday life and redefine consumer AI, focusing on personalization, trust, and privacy.
  3. Small brands will become more competitive with big brands, AI will influence purchase decisions, and education will undergo a significant transformation.
Richard Hanania's Newsletter 3657 implied HN points 12 Feb 24
  1. Social scientists often resort to statistical relationships when randomized experiments are not feasible, which can lead to flawed conclusions due to selection effects and confounding variables.
  2. Flawed data is often worse than having no data at all, as it can mislead individuals into making decisions based on inaccurate information.
  3. To form reasonable opinions on social, political, and economic issues, it is essential to prioritize well-grounded ideas backed by theoretical reasoning and empirical data over blindly following data from flawed social science research.
Import AI 439 implied HN points 06 May 24
  1. Skepticism about AI safety policy persists because people draw different conclusions from the same technical information, so it is important to consider varied perspectives.
  2. Chinese researchers have developed a method called SOPHON to openly release AI models while preventing finetuning for misuse, offering a solution for protecting against subsequent harm.
  3. Datasets like OpenStreetView-5M will improve the training of machine learning systems for geolocation. This helps automate intelligence analysis, with potential applications in both military intelligence and civilian sectors.
Software Design: Tidy First? 132 implied HN points 05 Dec 24
  1. Measuring lines of code in functions can be more complicated than expected. It's helpful to keep track of this while working on software projects.
  2. Looking for patterns in software, like Pareto distributions, can provide valuable insights. It's good practice to analyze your own code for these patterns.
  3. Documenting your findings is important. Sharing your experiences can help others who are trying to understand their software better.
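One way to try this on your own Python code is sketched below. It is an illustration under stated assumptions rather than the post's method: it uses the standard `ast` module, counts every line from `def` to the end of the body (blank lines and comments included), and counts nested functions both on their own and inside their parent. The helper names are made up for this example.

```python
import ast


def function_lengths(source):
    """Return (name, line_count) pairs for every function defined in source."""
    lengths = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            # end_lineno is available on Python 3.8 and later.
            lengths.append((node.name, node.end_lineno - node.lineno + 1))
    return lengths


def share_in_longest(lengths, fraction=0.2):
    """Share of all function lines held by the longest `fraction` of functions."""
    ordered = sorted((count for _, count in lengths), reverse=True)
    if not ordered:
        return 0.0
    top = max(1, round(len(ordered) * fraction))
    return sum(ordered[:top]) / sum(ordered)
```

A result near 0.8 from `share_in_longest` would suggest a Pareto-like pattern, where a small share of functions holds most of the code. A much lower value points to more evenly sized functions.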
Neeloy’s Substack 119 implied HN points 24 Jul 24
  1. Many International Math Olympiad gold medalists end up pursuing careers in different fields, not just in finance or academia. It's interesting to see how their paths vary after such early success.
  2. Data collection on these medalists shows a clear trend where China dominates in terms of gold medals, with a majority of their students achieving this top honor. This highlights the competitive environment in math education in that country.
  3. The dataset used to track these medalists has its limitations, particularly due to language and cultural barriers in finding information. However, the findings still provide valuable insights into the outcomes of these talented individuals.
The Counterfactual 199 implied HN points 27 Jun 24
  1. Always look at the whole distribution of data, not just the average. The average can be affected by extreme values, so it's crucial to see the bigger picture to understand what the data really tells us.
  2. Consider the baseline or reference point when evaluating numbers. Knowing how a number compares to others helps us understand if it's large or small, which gives us better context.
  3. Understand the story behind the data-generating process. This means recognizing the factors that led to the results we see, which helps in identifying possible biases or alternative explanations.
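A small example may make the first point concrete. The numbers below are invented for illustration, and `summarize` is a hypothetical helper. It reports the mean alongside the median, quartiles, and range, so one extreme value is visible instead of hidden in the average.

```python
import statistics


def summarize(values):
    """Report the mean next to distribution statistics for the same data."""
    ordered = sorted(values)
    q1, _, q3 = statistics.quantiles(ordered, n=4)
    return {
        "mean": statistics.mean(ordered),
        "median": statistics.median(ordered),
        "q1": q1,
        "q3": q3,
        "min": ordered[0],
        "max": ordered[-1],
    }


# Invented salaries in thousands; one extreme value pulls the mean upward.
salaries = [40, 42, 45, 47, 50, 400]
summary = summarize(salaries)
print(summary["mean"], summary["median"])  # mean is 104, median is 46.0
```

Here the mean is more than twice the median, even though five of the six values sit between 40 and 50.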
Jakob Nielsen on UX 27 implied HN points 30 Jan 25
  1. DeepSeek's AI model is cheaper and uses a lot less computing power than other big models, but it still performs well. This shows smaller models can be very competitive.
  2. Investments in AI are expected to keep growing, even with cheaper models available. Companies will still spend billions to advance AI technology and achieve superintelligence.
  3. As AI gets cheaper, more people will use it and businesses will likely spend more on AI services. The demand for AI will increase as it becomes more accessible.
Nail It and Scale It 119 implied HN points 22 Jul 24
  1. Many online advertising benchmarks are unreliable because they don't account for differences in pricing and offers. This means you might be comparing apples to oranges, leading to wrong conclusions.
  2. To get better benchmarks, focus on two key metrics: Cost-Per-Click (CPC) and Conversion Rate. These give you a clearer picture of how your ads are performing compared to others.
  3. Joining groups or talking to industry experts can help you find more accurate conversion rates for your products. Sharing data with peers is a good way to understand what's normal in your field.
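The two metrics named above reduce to simple ratios. The sketch below uses the standard definitions (spend divided by clicks, and conversions divided by clicks). The `AdStats` class and the sample figures are hypothetical and do not come from the post.

```python
from dataclasses import dataclass


@dataclass
class AdStats:
    spend: float       # total ad spend in your currency
    clicks: int        # clicks on the ad
    conversions: int   # purchases or sign-ups attributed to those clicks

    @property
    def cost_per_click(self):
        return self.spend / self.clicks

    @property
    def conversion_rate(self):
        return self.conversions / self.clicks

    @property
    def cost_per_conversion(self):
        # Depends on your price and offer, so it compares less well across businesses.
        return self.spend / self.conversions


# Invented campaign figures for illustration.
campaign = AdStats(spend=500.0, clicks=250, conversions=10)
print(campaign.cost_per_click)       # 2.0
print(campaign.cost_per_conversion)  # 50.0
```

In this invented campaign the conversion rate is 10 / 250, or 4%. CPC and conversion rate can each be compared with peers separately, which a single blended cost-per-sale figure does not allow.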
Musings on Markets 1099 implied HN points 05 Jan 24
  1. All companies are included in data analysis to get a full picture, not just big ones. This helps avoid bias and shows a more accurate view of industries.
  2. The data covers many financial variables that help understand company decisions about investment, financing, and dividends. It also uses unique ways to calculate statistics for more accurate insights.
  3. The statistics are updated regularly to reflect the latest available information. Users should utilize the data wisely and be aware of any changes in accounting standards or currency issues.
Day One 758 implied HN points 24 Feb 24
  1. Building trust and authority through valuable content is essential for selling products or services online.
  2. Utilizing testimonials and free high-quality content can greatly persuade potential customers to make a purchase.
  3. Addressing objections, providing ongoing support, and reducing buyer's remorse are key to maintaining customer satisfaction and loyalty.
Weight and Healthcare 818 implied HN points 10 Feb 24
  1. The study on Tirzepatide showed that weight loss for participants slowed after 36 weeks, with those switching to placebo experiencing weight regain while those continuing the drug had a slight weight reduction in the following 52 weeks.
  2. Side effects of Tirzepatide included gastrointestinal issues like nausea, diarrhea, constipation, and vomiting. Close to 82% of participants reported experiencing at least one adverse event during the treatment period.
  3. The study's findings indicate that a significant percentage of participants taking Tirzepatide did not meet the weight reduction thresholds. The study design also had issues: participants lacked diverse representation, and there was no weight-neutral comparator group.
Nepetalactone Newsletter 1670 implied HN points 30 Apr 23
  1. There are two types of scientists: those who worship hierarchy and those who understand hierarchy is a cancer to the scientific method.
  2. The EMA found several objections to Pfizer's data, showing that it did not meet GMP standards.
  3. Concerns were raised by the EMA about Pfizer's data integrity, lack of biological characterization, and inconsistencies in the data provided.
Conspirador Norteño 128 implied HN points 06 Dec 24
  1. Monitoring the Bluesky firehose can help quickly spot fake accounts. By looking for repeated names and profiles, it's easier to identify spam activity.
  2. A large number of spam accounts often share similar biographies. One group had over a thousand accounts with variations of the same few phrases.
  3. Many spam accounts use stolen images as profile pictures. This makes them look less authentic and easier to identify as spam.
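The bio-matching idea can be sketched in a few lines. This is not the author's tooling: the `(handle, bio)` pairs are assumed to have been collected already (for example from the firehose), the handles are invented, and the normalization is deliberately crude. It keeps only ASCII letters, so it would need adjusting for bios in other scripts.

```python
import re
from collections import defaultdict


def normalize_bio(bio):
    """Lowercase and drop punctuation, digits, and extra spaces so near-copies match."""
    return " ".join(re.sub(r"[^a-z\s]", " ", bio.lower()).split())


def bio_clusters(accounts, min_size=3):
    """Group handles whose normalized bios are identical; keep only large groups."""
    groups = defaultdict(list)
    for handle, bio in accounts:
        key = normalize_bio(bio)
        if key:  # skip empty bios
            groups[key].append(handle)
    return {bio: handles for bio, handles in groups.items() if len(handles) >= min_size}


# Invented accounts for illustration.
accounts = [
    ("a.example", "Crypto fan!! 123"),
    ("b.example", "crypto FAN"),
    ("c.example", "Crypto, fan."),
    ("d.example", "I like hiking"),
]
print(bio_clusters(accounts))
# {'crypto fan': ['a.example', 'b.example', 'c.example']}
```

Exact matching after normalization only catches trivial variations. Bios built by recombining a few phrases would need a fuzzier comparison, and any cluster still deserves a manual check before it is labeled spam.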
The Uncertainty Mindset (soon to become tbd) 99 implied HN points 24 Jul 24
  1. AI systems look like they can think independently, but they really can't. They are tools that need humans to make decisions about value.
  2. Meaning-making is a core human skill that AI lacks. Only humans can decide what actions are meaningful and worthwhile.
  3. When we treat AI as if it can make important decisions, we risk misusing it. It's crucial to keep humans involved in the decision-making process.
The AI Frontier 79 implied HN points 01 Aug 24
  1. Vibes-based evaluations are a helpful starting point for assessing AI quality, especially when specific metrics are hard to define. They allow for initial impressions based on user interactions rather than strict guidelines.
  2. Customers often have unique and unexpected requests that can't easily fit into predefined test sets. Vibes allow for flexibility in understanding real-world usage.
  3. While vibes are useful, they also have downsides, like strong first impressions and limited feedback. A mix of vibes and structured evaluations can provide a better overall understanding of an AI's performance.
Resilient Cyber 79 implied HN points 01 Aug 24
  1. The Exploit Prediction Scoring System (EPSS) helps predict how likely a software vulnerability is to be exploited. It provides a score, so organizations can focus on the vulnerabilities that really matter.
  2. Most vulnerabilities that are reported, about 94%, aren’t even exploited in real life. This means organizations waste a lot of resources on vulnerabilities that pose no threat, highlighting the importance of focusing on the ones that are actually exploited.
  3. The EPSS tool works better than older systems like the Common Vulnerability Scoring System (CVSS). It helps organizations prioritize their efforts and manage vulnerabilities more efficiently.
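As a rough illustration of prioritizing by EPSS, the sketch below filters and ranks a list of findings. Everything in it is an assumption made for this example: the `Finding` class, the placeholder CVE identifiers, the scores, and the 0.1 threshold. It relies only on EPSS being a probability between 0 and 1 and CVSS being a severity score between 0 and 10.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    cve_id: str
    epss: float  # estimated probability of exploitation, 0.0 to 1.0
    cvss: float  # severity score, 0.0 to 10.0


def prioritize(findings, epss_threshold=0.1):
    """Keep findings at or above the threshold, most likely to be exploited first."""
    urgent = [f for f in findings if f.epss >= epss_threshold]
    # CVSS breaks ties between findings with the same EPSS score.
    return sorted(urgent, key=lambda f: (f.epss, f.cvss), reverse=True)


# Placeholder identifiers and invented scores.
findings = [
    Finding("CVE-0000-0001", epss=0.02, cvss=9.8),
    Finding("CVE-0000-0002", epss=0.85, cvss=6.5),
    Finding("CVE-0000-0003", epss=0.40, cvss=7.2),
]
for finding in prioritize(findings):
    print(finding.cve_id, finding.epss)
```

Note that the first finding has the highest CVSS score but falls below the threshold. That is the kind of re-ordering the takeaways describe. The threshold itself is a policy choice, not something EPSS prescribes.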
The Security Industry 10 implied HN points 03 Feb 25
  1. HarvestIQ now combines two assistants into one, simplifying interactions for users. This helps reduce confusion and makes it easier to get information about cybersecurity vendors and products.
  2. Users can ask the Cyber Assistant for various tasks like product comparisons, SWOT analyses, and customized news summaries. These features aim to enhance decision-making in cybersecurity.
  3. The IT-Harvest Dashboard and HarvestIQ serve different purposes. The Dashboard is great for exploring detailed data, while HarvestIQ is more about getting direct answers and insights.
Jampa’s Substack 40 HN points 21 Aug 24
  1. Finding a place to live in a small, low-tech city can be really challenging. There aren't many real estate options or online listings, so one might need to explore the area by driving around.
  2. Using technology like OpenStreetMap and AI can help in identifying neighborhoods and evaluating their quality. This can save a lot of time compared to traditional methods.
  3. It's important to check the neighborhood in person, even after using tech tools. Seeing the area first-hand can give a better understanding of what to expect and help find suitable homes.