The hottest Data Teams Substack posts right now

And their main takeaways
The Orchestra Data Leadership Newsletter 39 implied HN points 21 May 24
  1. Web scraping with AI can enhance intelligence gathering by efficiently collecting and processing data from various public sources on the internet.
  2. Leveraging Large Language Models (LLMs) can improve the accuracy and robustness of web scraping systems when dealing with changes in HTML code structure.
  3. Using tools like Nimble for web scraping allows for more efficient and accurate data collection by training models on different types of websites for specific use cases.
benn.substack 741 implied HN points 21 Apr 23
  1. Analysts should reflect on their role and avoid behaving like Jared Kushner.
  2. Being a data analyst involves providing informed insights, not just being a 'nicer, kinder' Jared Kushner.
  3. Focusing on keeping the company well-informed through regular updates can be more effective than traditional data reporting.
benn.substack 511 implied HN points 28 Jul 23
  1. Data quality involves a tradeoff between stability and agility.
  2. Data resiliency tools like SDF focus on tracing data lineage to improve debugging and fixing issues.
  3. Managing messy data often requires making choices between stability and adaptability in data infrastructure.
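To make the lineage idea in takeaway 2 concrete, here is a minimal sketch of tracing a table's upstream dependencies when debugging a bad number. It does not represent how SDF or any specific tool works; the `LINEAGE` mapping, the table names, and the `upstream` helper are all hypothetical.

```python
# Hypothetical lineage graph: each table maps to the tables it is built from.
LINEAGE = {
    "revenue_dashboard": ["fct_orders", "dim_customers"],
    "fct_orders": ["raw_orders", "raw_payments"],
    "dim_customers": ["raw_customers"],
}


def upstream(table, lineage):
    """Return every table that `table` depends on, directly or indirectly."""
    seen = set()
    stack = [table]
    while stack:
        current = stack.pop()
        # Tables with no entry in the mapping are treated as raw sources.
        for parent in lineage.get(current, []):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen
```

If `revenue_dashboard` looks wrong, `upstream("revenue_dashboard", LINEAGE)` lists the candidate tables to inspect, which narrows the search to the sources that actually feed it.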
The Orchestra Data Leadership Newsletter 39 implied HN points 30 Dec 23
  1. Data teams are increasingly turning to low-code solutions to streamline data release pipelines. Many still use tools like Airflow but question whether extensive code writing and infrastructure maintenance are necessary.
  2. The complex cloud environment has led to the development of specialized data tools, making the orchestration of data pipelines challenging and highlighting the importance of governance, data quality, and scalability.
  3. No-code solutions like dbt Core and Hightouch are already integrated into many data tools, which simplifies orchestration. This suggests that the future of data architecture might combine workflow orchestrators with efficient data quality checks.
Data People Etc. 231 implied HN points 20 Mar 23
  1. Data teams are facing challenges with tool abandonment in the current economy.
  2. Databases remain crucial in the data stack, with less need for new, specialized tools.
  3. Building trust and bridging gaps between data and engineering teams is vital for successful data applications.
The Orchestra Data Leadership Newsletter 19 implied HN points 05 Nov 23
  1. Consider data contracts if your internal data changes often to ensure collaboration between software engineering and data engineering teams.
  2. If you have important metrics that depend on software engineering actions, like defining 'Active Users,' data contracts can help maintain data quality.
  3. In cases where software engineering and data engineering roles overlap, implementing data contracts can streamline data ingestion processes and improve data quality.
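As an illustration of what a data contract can look like in practice, the sketch below checks incoming records against an agreed schema before ingestion. It is only one possible shape under stated assumptions: the `ContractField` class, the `ACTIVE_USER_EVENT` contract, its field names, and the `violations` helper are hypothetical and not taken from the post.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ContractField:
    name: str
    type_: type
    required: bool = True


# Hypothetical contract for an event that feeds an "Active Users" metric.
ACTIVE_USER_EVENT = [
    ContractField("user_id", str),
    ContractField("event_type", str),
    ContractField("occurred_at", str),
    ContractField("session_id", str, required=False),
]


def violations(record, contract):
    """Return a list of human-readable contract violations for one record."""
    problems = []
    for field in contract:
        value = record.get(field.name)
        if value is None:
            # Missing or null: only a problem when the field is required.
            if field.required:
                problems.append(f"missing required field: {field.name}")
            continue
        if not isinstance(value, field.type_):
            problems.append(
                f"wrong type for {field.name}: "
                f"expected {field.type_.__name__}, got {type(value).__name__}"
            )
    return problems
```

A software engineering team could run a check like this in CI before shipping an event change, so that a renamed or retyped field is caught before it reaches the data team's pipelines.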
Inside Data by Mikkel Dengsøe 24 implied HN points 13 Feb 25
  1. Your data team size should be about 1-5% of your total company staff. Fintech companies usually have a higher percentage of data roles.
  2. The mix of different data roles is important. Having too many analysts can slow things down, while too many engineers might not deliver useful insights.
  3. Data salaries in Europe vary by experience. For example, a junior data role typically pays about $70k, while senior roles can reach $110k or more.
Three Data Point Thursday 19 implied HN points 20 Apr 23
  1. dbt Labs acquired Transform to target a new market segment beyond analytics engineers.
  2. Tech companies typically expand by starting small, then broadening their market focus and adding features.
  3. Data is not the same as analytics; a top-down approach to making data vital in a company is crucial.