The hottest SQL Substack posts right now

And their main takeaways
Data Engineering Central 216 implied HN points 13 Feb 23
  1. Data engineers often struggle to implement unit tests because of pressure to move fast and a historical lack of emphasis on testing.
  2. Unit testable code in data engineering involves keeping functions small, minimizing side effects, and ensuring reusability.
  3. Implementing unit tests can elevate a data team's performance and lead to better software quality and bug control.
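
The second point above can be illustrated with a minimal sketch. It is not taken from the post: the function names (`normalize_email`, `dedupe_emails`) and the sample rows are hypothetical, chosen only to show small, side-effect-free functions that a test can call directly.

```python
from typing import Iterable


def normalize_email(raw: str) -> str:
    """Pure function: no I/O, same input always gives the same output."""
    return raw.strip().lower()


def dedupe_emails(rows: Iterable[dict]) -> list:
    """Keep the first row seen for each normalized email address."""
    seen = set()
    result = []
    for row in rows:
        email = normalize_email(row["email"])
        if email not in seen:
            seen.add(email)
            # Build a new dict instead of mutating the caller's row.
            result.append({**row, "email": email})
    return result


def test_dedupe_emails() -> None:
    rows = [
        {"id": 1, "email": " Ann@Example.com "},
        {"id": 2, "email": "ann@example.com"},
        {"id": 3, "email": "bob@example.com"},
    ]
    assert dedupe_emails(rows) == [
        {"id": 1, "email": "ann@example.com"},
        {"id": 3, "email": "bob@example.com"},
    ]


test_dedupe_emails()
```

Because neither function reads from a database or writes a file, the test needs no fixtures; loading and saving would live in a thin wrapper around these calls.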
The Orchestra Data Leadership Newsletter 19 implied HN points 16 Nov 23
  1. SQL is a powerful data manipulation tool that has different dialects and evolved over time to fit various database software needs.
  2. New SQL tools like dbt, SQLMesh, and Semantic Data Fabric aim to improve data testing, quality, and governance in data engineering processes.
  3. The value in data engineering lies more in processes, culture, and diligence than in relying on fancy tools alone to prevent mistakes.
Leading Developers 3 HN points 13 Feb 24
  1. SQL skills are crucial for managers because they can help answer business questions, understand technical designs, and provide a huge return on effort invested.
  2. Don't stop with just learning joins in SQL. Advancing to using CTEs, window functions, and partitions can greatly enhance your ability to write complex queries.
  3. Window functions fall into three groups: ranking, aggregation, and positional functions. Within a partition, they either compute a value across a set of rows or return a single value from a specific row.
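
As a rough illustration of the three kinds of window function named above, the sketch below runs one query against an in-memory SQLite database through Python's standard `sqlite3` module. The `sales` table and its rows are made up for the example, and it assumes the bundled SQLite is version 3.25 or newer, which is when window functions were added.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, rep TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        ("east", "ann", 300),
        ("east", "bob", 200),
        ("west", "cat", 500),
        ("west", "dan", 100),
    ],
)

query = """
SELECT
    region,
    rep,
    amount,
    -- ranking function: position of each rep within the region
    RANK() OVER (PARTITION BY region ORDER BY amount DESC) AS rank_in_region,
    -- aggregation function: one total per region, repeated on every row
    SUM(amount) OVER (PARTITION BY region) AS region_total,
    -- positional function: amount from the previous row in the partition
    LAG(amount) OVER (PARTITION BY region ORDER BY amount DESC) AS prev_amount
FROM sales
ORDER BY region, rank_in_region
"""
rows = conn.execute(query).fetchall()

for row in rows:
    print(row)
```

Each input row appears once in the result; the first row of every partition has no predecessor, so its `prev_amount` is `None`.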
ingest this! 1 HN point 19 Feb 24
  1. Build data apps using markdown and SQL with the Evidence framework, which offers a way to create polished data products.
  2. Explore the future synergy of knowledge graphs and large language models (LLMs) for enhanced technologies.
  3. Engage with the latest in data engineering by checking out a full exploration of the open-source data engineering landscape for 2024.
Reflective Software Engineering 0 implied HN points 12 Jan 24
  1. Having unit tests for SQL queries can help catch bugs introduced during code refactorings or changes.
  2. When writing unit tests for SQL queries, focus on testing the specific parts responsible for building the query rather than the entire method.
  3. Refactoring code for testability can involve moving pure functions outside of the class for easier testing and simplifying methods to focus on specific tasks.
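
One way to picture the refactoring described above is sketched below. It is an assumed design, not the post's code: `build_orders_query`, `OrderRepository`, and the `orders` table are hypothetical names. The query-building logic is a pure function outside the class, so a test can check the generated SQL and parameters without a database connection.

```python
from typing import Optional


def build_orders_query(
    status: Optional[str] = None, min_total: Optional[float] = None
) -> tuple:
    """Return (sql, params) for the given filters. Pure: no database access."""
    sql = "SELECT id, status, total FROM orders"
    clauses, params = [], []
    if status is not None:
        clauses.append("status = ?")
        params.append(status)
    if min_total is not None:
        clauses.append("total >= ?")
        params.append(min_total)
    if clauses:
        sql += " WHERE " + " AND ".join(clauses)
    return sql, params


class OrderRepository:
    """Thin wrapper: the only part that touches the connection."""

    def __init__(self, conn):
        self.conn = conn

    def find(self, status=None, min_total=None):
        sql, params = build_orders_query(status, min_total)
        return self.conn.execute(sql, params).fetchall()


def test_build_orders_query() -> None:
    assert build_orders_query() == ("SELECT id, status, total FROM orders", [])
    assert build_orders_query(status="paid", min_total=10) == (
        "SELECT id, status, total FROM orders WHERE status = ? AND total >= ?",
        ["paid", 10],
    )


test_build_orders_query()
```

A later refactoring that accidentally drops a filter or reorders the placeholders would fail this test immediately, which is the kind of bug the first takeaway has in mind.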
Conserving CPU's cycles ... 0 implied HN points 26 Jun 24
  1. Incremental sort was added in PostgreSQL 13 (released in 2020) to enhance sorting strategies and improve efficiency in handling large datasets and analytical queries.
  2. Estimation instability in PostgreSQL's sort operations can lead to unexpected query plans and performance differences, emphasizing the importance of careful estimation.
  3. The vulnerability in PostgreSQL's optimizer code showcases how the choice of expression evaluation can impact query performance, highlighting a need for optimization improvements.
DataSketch’s Substack 0 implied HN points 07 Oct 24
  1. Window functions let you do calculations across rows related to your current row without losing any details. This helps you get both summarized and detailed data at the same time.
  2. Using window functions can make complex data tasks easier, like ranking items or finding running totals. They are very helpful in fields like healthcare to analyze patient data and improve efficiency.
  3. It's important to test how window functions perform on a smaller dataset before using them widely. Combining multiple window functions and partitioning your data smartly can also boost performance.
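
The "summarized and detailed at the same time" point can be seen by comparing `GROUP BY` with a window function on the same data. This is a small sketch using Python's standard `sqlite3` module (SQLite 3.25 or newer assumed); the `visits` table and its three rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE visits (patient TEXT, visit_day INTEGER, cost INTEGER)")
conn.executemany(
    "INSERT INTO visits VALUES (?, ?, ?)",
    [("p1", 1, 100), ("p1", 5, 50), ("p2", 2, 70)],
)

# GROUP BY collapses each patient's visits into a single summary row.
grouped = conn.execute(
    "SELECT patient, SUM(cost) FROM visits GROUP BY patient ORDER BY patient"
).fetchall()

# The window version keeps every visit and adds a running total beside it.
windowed = conn.execute(
    """
    SELECT patient, visit_day, cost,
           SUM(cost) OVER (PARTITION BY patient ORDER BY visit_day) AS running_cost
    FROM visits
    ORDER BY patient, visit_day
    """
).fetchall()

print(len(grouped), "summary rows;", len(windowed), "detail rows")
```

Running the same comparison on a small sample like this one before pointing the query at a full table is in line with the third takeaway.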
The Orchestra Data Leadership Newsletter 0 implied HN points 31 Oct 23
  1. Incremental models are crucial for managing big data: they keep complex queries efficient and help maintain data quality.
  2. Design patterns in data modeling, such as Star Schema and Data Vault, play a significant role in how dbt models are structured and managed.
  3. Using Jinja templating and implementing continuous data integration processes are key elements in handling big models effectively and ensuring data reliability.