The hottest Data Analytics Substack posts right now

And their main takeaways
Big Technology 4003 implied HN points 07 Feb 25
  1. ChatGPT is seeing a big surge in usage after some slow months. It’s now doing much better than its competitors.
  2. Recent data shows ChatGPT has reached a key turning point in its growth. This is a positive shift that many are noticing.
  3. The chatbot now attracts more users and interest, making it a front-runner in the AI space. Its popularity is on the rise.
Inside Data by Mikkel Dengsøe 24 implied HN points 13 Feb 25
  1. Your data team size should be about 1-5% of your total company staff. Fintech companies usually have a higher percentage of data roles.
  2. The mix of different data roles is important. Having too many analysts can slow things down, while too many engineers might not deliver useful insights.
  3. Data salaries in Europe vary by experience. For example, a junior data role typically pays about $70k, while senior roles can reach $110k or more.
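As a rough illustration of the 1-5% guideline above, the sketch below turns a company headcount into an implied data team size range. The function name `data_team_range` and the 400-person example are hypothetical and do not come from the post.

```python
def data_team_range(headcount: int, low: float = 0.01, high: float = 0.05) -> tuple[int, int]:
    """Return the (min, max) data team size implied by a 1-5% share of staff."""
    if headcount < 0:
        raise ValueError("headcount must be non-negative")
    # Round to whole people; the guideline itself is only approximate.
    return round(headcount * low), round(headcount * high)


# Hypothetical 400-person company
print(data_team_range(400))  # (4, 20)
```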
VuTrinh. 279 implied HN points 14 Sep 24
  1. Uber evolved from simple data management with MySQL to a more complex system using Hadoop to handle huge amounts of data efficiently.
  2. They faced challenges with data reliability and latency, which slowed down their ability to make quick decisions.
  3. Uber introduced a system called Hudi that allowed for faster updates and better data management, helping them keep their data fresh and accurate.
Chartbook 300 implied HN points 21 Jan 25
  1. The Bloomberg Economic Surprise Index for the US shows how unexpected events in the economy can change predictions. It's important to pay attention to these surprises to get a better understanding of the current economic climate.
  2. Understanding when threats are effective or not can help in managing situations better. Knowing the right time to take action can make a big difference in outcomes.
  3. Quantum technology is being compared to AI as a new frontier in innovation. It's exciting to think about how these technologies might change our future.
benn.substack 639 implied HN points 27 Dec 24
  1. Data-driven companies get a lot of attention, but many people still prefer investing in companies led by experienced individuals. This shows that experience holds significant value in business decisions.
  2. People like to be seen as unique or contrarian, but they often know what others like. This means that even when choosing something different, they still have a sense of the mainstream.
  3. There’s a funny perspective on what robots are, with younger generations seeing different meanings in technology compared to older ones. What one generation sees as a robot, another might just see as a gadget.
VuTrinh. 399 implied HN points 20 Aug 24
  1. Discord started with its own tool called Derived to manage data, but it found this system limited as it grew. They needed a better way to handle complex data tasks.
  2. They switched to using popular tools like Dagster and dbt. This helped them automate and better manage their data processes.
  3. With the new setup, Discord can now make changes quickly and safely, which improves how they analyze and use their vast amounts of data.
The Data Ecosystem 439 implied HN points 28 Jul 24
  1. Data quality isn't just a simple fix; it's a complex issue that requires a deep understanding of the entire data landscape. You can't just throw money at it and expect it to get better.
  2. It's crucial to identify and prioritize your most important data assets instead of trying to fix everything at once. Focusing on what truly matters will help you allocate resources effectively.
  3. Implementing tools for data quality is important but should come after you've set clear standards and strategies. Just using technology won’t solve problems if you don’t understand your data and its needs.
Elevate 1113 implied HN points 09 Jan 24
  1. Effective managers share key traits that significantly affect employee performance, happiness, and retention, as shown by Google's Project Oxygen.
  2. Soft skills like coaching, communication, and support are more valued than technical expertise by employees, emphasizing the importance of emotional intelligence in management.
  3. Using rigorous people analytics, organizations can identify and develop high-impact management behaviors specific to their unique culture, leading to improved leadership and employee satisfaction.
VuTrinh. 119 implied HN points 16 Jul 24
  1. Meta uses a complex data warehouse to manage millions of tables and keeps data only as long as it's needed. Data is organized into namespaces for efficient querying.
  2. They built tools like iData for data discovery and Scuba for real-time analytics. These tools help engineers find and analyze data quickly.
  3. Data engineers at Meta develop pipelines mainly with SQL and Python, using internal tools for orchestration and monitoring to ensure everything runs smoothly.
Substack 658 implied HN points 26 Jun 24
  1. Substack now has a feature that shows writers detailed statistics about their posts. This helps creators see how well their posts are doing and where new subscribers are coming from.
  2. There is a new Discussion tab that makes it easier for writers to engage with comments and interactions on their posts. This way, they can manage conversations in one place without searching through notifications.
  3. The Substack app is driving a lot of new subscriptions. The app helps users discover content and connects writers to their audience more effectively.
Inside Data by Mikkel Dengsøe 16 implied HN points 16 Jan 25
  1. Start by clearly defining how you will use data. This helps set the purpose for your data products.
  2. It's important to have clear ownership of data and understand what needs testing. This makes accountability easier.
  3. Continuously monitor and improve your data quality. Regular reviews help catch issues early and keep trust in your data.
Data Analysis Journal 687 implied HN points 08 Jan 24
  1. Bootcamps are becoming a less common route into data analyst or data engineer roles, due to economic factors.
  2. Analytics leaders face challenges in setting boundaries and avoiding overlap with finance teams in accounting functions.
  3. Decentralized data team setups are generally more efficient, and the future may see more of this with changes in tax regulations.
The Data Ecosystem 119 implied HN points 19 May 24
  1. Investing in data is a strategic move, not just about spending money. It's important to align data efforts with business goals to see real value.
  2. When pitching for data investment, focus on the benefits it will bring. Clear communication of value can help rebuild trust with leadership.
  3. Measuring the success of data investments through defined KPIs is essential. This helps in making future improvement and investment decisions.
Space Ambition 119 implied HN points 17 May 24
  1. Earth observation is key for weather and climate studies. It helps scientists track weather patterns and understand climate change using data from satellites.
  2. Satellites are important for monitoring natural and human-made disasters. They provide real-time data that helps in managing disaster response and understanding impacts.
  3. Remote sensing data supports various sectors like finance, ecology, and infrastructure. It aids in resource management, economic predictions, and assessing environmental changes.
SUP! Hubert’s Substack 50 implied HN points 22 Nov 24
  1. Shift-left analytics means doing analysis early in the data process. This helps in getting insights faster and making quick decisions.
  2. It focuses on checking data quality right away, so only reliable data is used. This leads to more accurate insights and avoids problems caused by bad data.
  3. Collaboration between teams is encouraged in this approach. By working together from the start, everyone can ensure their analyses are useful and aligned with business goals.
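One way to picture the "check data quality right away" idea above is a validation step that runs at ingestion, before any analysis. This is a minimal sketch under assumed rules: the field names `order_id` and `amount`, and both helper functions, are hypothetical and not taken from the post.

```python
def validate_record(record: dict) -> list[str]:
    """Return a list of data quality problems found in one incoming record."""
    problems = []
    if not record.get("order_id"):
        problems.append("missing order_id")
    amount = record.get("amount")
    if not isinstance(amount, (int, float)) or amount < 0:
        problems.append("amount must be a non-negative number")
    return problems


def ingest(records: list[dict]) -> tuple[list[dict], list[tuple[dict, list[str]]]]:
    """Split records into accepted rows and rejected rows with their reasons."""
    accepted, rejected = [], []
    for record in records:
        problems = validate_record(record)
        if problems:
            rejected.append((record, problems))
        else:
            accepted.append(record)
    return accepted, rejected
```

Only the accepted rows would flow on to downstream analysis; the rejected rows keep their reasons so the producing team can fix them at the source.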
Data Analysis Journal 569 implied HN points 03 May 23
  1. Event-based analytics is crucial for understanding user behavior and product performance.
  2. Session-based analytics focus on website traffic while event-based analytics track user interactions like clicks and actions.
  3. Implementing and maintaining event-based analytics can be challenging due to issues with data integration and interpretation.
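The difference between the two views above can be sketched on the same raw data. The example assumes a 30-minute inactivity timeout for sessions and a simple event shape (`user`, `ts` in seconds, `name`); all of these are illustrative assumptions, not details from the post.

```python
from collections import Counter

SESSION_GAP_SECONDS = 30 * 60  # assumed inactivity timeout


def count_sessions(events: list[dict]) -> int:
    """Session view: count visits, starting a new one after a long gap."""
    sessions = 0
    last_seen: dict[str, int] = {}
    for event in sorted(events, key=lambda e: e["ts"]):
        user = event["user"]
        if user not in last_seen or event["ts"] - last_seen[user] > SESSION_GAP_SECONDS:
            sessions += 1
        last_seen[user] = event["ts"]
    return sessions


def count_events(events: list[dict]) -> Counter:
    """Event view: count each named user interaction."""
    return Counter(event["name"] for event in events)


# Hypothetical sample data
events = [
    {"user": "u1", "ts": 0, "name": "page_view"},
    {"user": "u1", "ts": 60, "name": "click_signup"},
    {"user": "u1", "ts": 7200, "name": "page_view"},
    {"user": "u2", "ts": 100, "name": "page_view"},
]
print(count_sessions(events))  # 3
print(count_events(events)["click_signup"])  # 1
```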
HyperArc 39 implied HN points 11 Jul 24
  1. A metrics layer helps standardize how companies measure data, making it easier for everyone to understand what is important. It can automate calculations, like rolling averages, which saves time and reduces confusion.
  2. Traditional business intelligence tools often lose useful underlying information, which makes it hard to understand how certain metrics were created. More context is needed to ensure decisions are well-informed and based on complete data.
  3. HyperArc offers a solution by capturing the team's insights and reasoning during analysis. It helps keep track of not just the final metrics, but also the thought process behind them, making it easier to revisit and understand decisions in the future.
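To make the rolling-average point above concrete, here is a minimal sketch of a shared metric definition. The `METRICS` registry and the metric name `signups_3d_avg` are hypothetical; this is not HyperArc's API, only an illustration of defining a calculation once so every consumer gets the same result.

```python
from collections import deque


def rolling_average(values: list[float], window: int) -> list[float]:
    """Trailing average over the last `window` values (shorter at the start)."""
    if window < 1:
        raise ValueError("window must be at least 1")
    recent = deque(maxlen=window)
    averages = []
    for value in values:
        recent.append(value)
        averages.append(sum(recent) / len(recent))
    return averages


# Hypothetical metric registry: one shared definition per metric name.
METRICS = {
    "signups_3d_avg": lambda daily: rolling_average(daily, window=3),
}

daily_signups = [10, 20, 30, 40]
print(METRICS["signups_3d_avg"](daily_signups))  # [10.0, 15.0, 20.0, 30.0]
```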
The Data Jargon Newsletter 158 implied HN points 05 Mar 24
  1. Data lakes can be convenient but often lead to problems when trying to manage the data effectively. Keeping things simple with familiar tools can help make the data more useful.
  2. Using Dagster and DuckDB allows you to process data efficiently without complicated setups. You can do key tasks like aggregation and data cleaning right in your data flow.
  3. It's important to consider memory limits and choose the right file formats, like Parquet, for better processing. This way, you can keep your data pipeline running smoothly and avoid needless costs.
Cybernetic Forests 279 implied HN points 05 Nov 23
  1. Generative AI is essentially a new form of Big Data, emphasizing pattern analysis to automate processes.
  2. The expansion of data is essential for the existence of generative AI tools, demonstrating a rebranding of data analytics into AI.
  3. The tech industry's focus on data monetization and predictive analytics has led to virtual interactions that distance us from real human connection and community.
timo's substack 294 implied HN points 28 Feb 23
  1. Marketing analytics, BI, and product analytics have different requirements for source data and data handling.
  2. Product analytics involves more exploration and pattern-finding compared to marketing analytics and BI.
  3. Adopting product analytics requires a different approach, mindset, and toolset compared to traditional analytics setups.
Data Thoughts 3 HN points 10 Sep 24
  1. Analytics should be handled like an assembly line to make it more efficient and accessible. This means creating standard processes to measure and track important business metrics.
  2. Most companies need to focus on basic descriptive analytics, which involves identifying and measuring key metrics. These metrics will help businesses understand what drives their success.
  3. Having well-defined metrics is essential before deeper analysis can happen. Insights from data come from well-measured processes, allowing teams to explore and understand their business better.
VTEX’s Tech Blog 99 implied HN points 10 Mar 24
  1. VTEX successfully scaled its monitoring system to handle 150 million metrics using Amazon Managed Service for Prometheus. This helped them keep track of their numerous services efficiently.
  2. By adopting this system, VTEX cut its observability expenses by about 41%. This shows that smart choices in technology can save money.
  3. The new architecture allows VTEX to respond to problems faster and reduces the chances of system failures. It increased the reliability of their metrics, making everyday operations smoother.
timo's substack 117 implied HN points 06 Feb 24
  1. Data modeling for event data involves handling various source data and supporting diverse analysis use cases.
  2. Event data modeling can be organized into layers, from raw source data to consumption-ready data for analytics tools.
  3. Qualifying events to activities in event data modeling helps improve data usability and user experience in analytics tools.
Data Science Weekly Newsletter 319 implied HN points 07 Jul 23
  1. Generative design is making strides in drug discovery, but there are still challenges to address for better outcomes.
  2. The UK government is investing in a Foundation Model Taskforce to harness AI for societal benefits and safety.
  3. Keeping updated with developments in data science, such as new models and applications, is essential for professionals in the field.
Data at Depth 79 implied HN points 21 Mar 24
  1. The newsletter shares the creator's journey, including an increase in followers on Medium and steady Substack subscribers.
  2. The author discusses their recent creative projects and articles, reflecting on the title creation process.
  3. Readers can access a 7-day free trial to explore the full post archives of the Data at Depth newsletter.
The Orchestra Data Leadership Newsletter 79 implied HN points 18 Mar 24
  1. CEOs are moving away from hiring full data teams and are opting for small consultancies to set up their data stack, reducing risk and cost.
  2. One-person data teams in startups face overwhelming responsibilities, leading to chaos and potentially costly decisions.
  3. New technologies like Orchestra help single-person data teams maintain visibility and orchestration without expensive tools, accelerating the data value businesses receive.
Kyle Poyar’s Growth Unhinged 339 implied HN points 28 Feb 24
  1. Databox focused on improving activation, which led to an increase of about 10 percentage points, from 30% to over 40%.
  2. Experimenting with the onboarding process, like allowing users to explore the product before connecting data, can significantly impact user engagement and activation rates.
  3. Implementing strategies like a reverse trial and a guided onboarding process can help not only improve activation rates but also showcase more value to users upfront.
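The activation figures above (30% to just over 40%) can be read either as an absolute gain in percentage points or as a relative gain. A small sketch of the difference, using 40% as a stand-in for "over 40%"; the function name `activation_change` is hypothetical:

```python
def activation_change(before: float, after: float) -> tuple[float, float]:
    """Return (percentage-point change, relative change in percent)."""
    points = (after - before) * 100
    relative = (after - before) / before * 100
    return points, relative


points, relative = activation_change(0.30, 0.40)
print(f"{points:.0f} percentage points, {relative:.0f}% relative increase")
```

Under these assumed inputs the absolute gain is about 10 percentage points, while the relative gain is roughly a third.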
VuTrinh. 79 implied HN points 10 Feb 24
  1. Snowflake separates storage and compute, allowing for flexible scaling and improved performance. This means that data storage can grow separately from computing power, making it easier to manage resources.
  2. Data can be stored in a cloud-based format that supports both structured and semi-structured data. This flexibility allows users to easily handle various data types without needing to define a strict schema.
  3. Snowflake implements unique optimization techniques, like data skipping and a push-based query execution model, which enhance performance and efficiency when processing large amounts of data.
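The data skipping idea mentioned above can be illustrated with a generic min/max pruning sketch: each chunk of data carries small metadata, and a query skips chunks whose metadata proves no row can match. This is not Snowflake's actual implementation; the `Partition` class and helper names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Partition:
    """A chunk of rows plus min/max metadata for one column."""
    rows: list[int]
    min_value: int
    max_value: int


def make_partition(rows: list[int]) -> Partition:
    """Build a partition and record its min/max metadata."""
    return Partition(rows=rows, min_value=min(rows), max_value=max(rows))


def scan_greater_than(partitions: list[Partition], threshold: int) -> tuple[list[int], int]:
    """Return matching rows and how many partitions were actually read."""
    matches, scanned = [], 0
    for part in partitions:
        # Skip the partition if its metadata proves no row can match.
        if part.max_value <= threshold:
            continue
        scanned += 1
        matches.extend(v for v in part.rows if v > threshold)
    return matches, scanned


partitions = [
    make_partition([1, 5, 9]),
    make_partition([12, 15, 18]),
    make_partition([20, 25, 30]),
]
print(scan_greater_than(partitions, 16))  # ([18, 20, 25, 30], 2)
```

Here the first partition is never read, because its maximum value (9) cannot satisfy the filter.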
Detection at Scale 79 implied HN points 05 Feb 24
  1. The author is moving from CEO to CTO to lead Panther's technical team, allowing more focus on delivering security outcomes via the product.
  2. The post introduces the concept of Detection Engineering, emphasizing reliability, scalability, and automation in security practices.
  3. Panther is adapting its approach to evolving security needs, enhancing code-driven detection for broader use and improving correlation, analytics, and visualization capabilities.
Sung’s Substack 139 implied HN points 14 Mar 23
  1. Data engineering involves many tedious tasks and manual checks, hindering the ability to reach a state of flow.
  2. Software engineers have smoother workflows and better tools compared to data engineers, allowing them to focus on their work and enjoy the process.
  3. There is potential to improve the data engineering workflow by implementing real-time monitoring, interactive previews, and streamlined processes to enhance the experience.
Sarah's Newsletter 299 implied HN points 19 Apr 22
  1. Having modern tools doesn't guarantee value; what matters more is how analytics teams use those tools to drive organizational change.
  2. The focus should be on delivering value to the organization rather than just building data platforms or using the most modern tools.
  3. Start simple with the minimum viable data stack and only add complexity when necessary. Focus on solving real problems, and evaluate tools based on problem-solving, maintenance, and scalability.