The hottest SQL Substack posts right now

And their main takeaways

A Selection Of SQL Tutorials - Issue 163

Data Analysis Journal • 550 implied HN points • 27 Sep 23

Practicing SQL in your local database is the best way to improve.
There are many free SQL tutorials and courses available for beginners to advanced users.
Interview preparation for SQL roles is essential, and there are resources to help you prepare.

Introduction To Analytics Engineering

Data Analysis Journal • 353 implied HN points • 22 Mar 23

🕹 Technology Data Analysis Data science SQL Data Engineering

Analytics engineers bridge the gap between data engineers and data analysts by focusing on producing high-quality data.
Analytics engineers use tools like dbt to streamline data modeling, testing, and documentation.
Data quality is crucial in decision-making, making analytics engineering more important than ever.

Top 10 Advanced SQL Functions For Data Analysis - Issue 151

Data Analysis Journal • 334 implied HN points • 05 Jul 23

🕹 Technology Data Analysis SQL Statistics

It takes time and practice to advance from basic SQL to a proficient level.
Using Python or R along with BI tools can save time in data analysis.
SQL is a crucial skill for anyone involved in data analysis and statistics.

How To Pass A SQL Interview For A Data Scientist Position - Issue 140

Data Analysis Journal • 275 implied HN points • 19 Apr 23

🕹 Technology Data science SQL

Data science job interviews may test candidates on Python and SQL proficiency.
Technical coding interview questions for data science positions can include SQL challenges.
Being proficient in SQL and data analysis is essential for succeeding in a data scientist position.

Engineers want the ergonomics of SQL query languages. So why do NoSQL databases exist?

🔮 Crafting Tech Teams • 59 implied HN points • 18 Apr 24

🕹 Technology Databases SQL Cloud

Engineers want query languages with the user-friendly ergonomics of SQL.
NoSQL databases exist because some applications have different needs and data structures.
SQL is expanding to various technologies like Kafka, ClickHouse, Elasticsearch, and more.

Unit Testing for Data Engineers.

Data Engineering Central • 216 implied HN points • 13 Feb 23

🕹 Technology Data Engineering Unit Testing SQL Code Refactoring

Data engineers often struggle to implement unit tests because of factors such as a focus on moving fast and a historical lack of emphasis on testing.
Unit testable code in data engineering involves keeping functions small, minimizing side effects, and ensuring reusability.
Implementing unit tests can elevate a data team's performance and lead to better software quality and bug control.

1BRC: Who's the Fastest to Process a Billion Java Records? - JVM Weekly vol. 67

JVM Weekly • 98 implied HN points • 11 Jan 24

🕹 Technology Software Development Programming Java Web Development SQL

The One Billion Row Challenge (1BRC) tests how quickly Java programs can process a large data set.
Phoenix Template Engine simplifies backend-generated HTML with Java in Spring projects.
Instancio 4.0 automates test data object creation for unit tests.

Why Minimal Modeling has no 3-way links

Minimal Modeling • 101 implied HN points • 11 Jul 23

🕹 Technology Data Modeling Database Design Normalization Data Analysis SQL

In minimal modeling, links are defined with two anchors, not three.
Using two-way links can model examples effectively without the need for 3-way links.
Links in minimal modeling can introduce confusion when sentences in natural language aren't validated as actual links.

Generating SQL with LLMs for fun and profit

I Am Not a Robot • 69 HN points • 22 Jun 23

🕹 Technology AI SQL Programming Machine Learning

Language models can generate SQL queries, but they can also produce malicious queries if you are not careful.
Infinite loops and data exfiltration are both risks with generated SQL queries.
Consider restricting permissions, making the database read-only, and guarding against prompt injection to reduce SQL injection risks with language models.

The hottest SQL tools you have no use for

The Orchestra Data Leadership Newsletter • 19 implied HN points • 16 Nov 23

🕹 Technology Data Engineering SQL Data Tools

SQL is a powerful data manipulation tool that has different dialects and has evolved over time to fit the needs of various database software.
New SQL tools like dbt, SQLMesh, and Semantic Data Fabric aim to improve data testing, quality, and governance in data engineering processes.
The value in data engineering lies more in processes, culture, and diligence than in relying on fancy tools to prevent mistakes.

Improve your SQL skills X2 in 5 minutes

Leading Developers • 3 HN points • 13 Feb 24

🕹 Technology Data Management SQL Data Analysis Programming

SQL skills are crucial for managers because they can help answer business questions, understand technical designs, and provide a huge return on effort invested.
Don't stop at learning joins in SQL. Moving on to CTEs, window functions, and partitions can greatly enhance your ability to write complex queries.
Window functions in SQL, such as ranking functions, aggregate functions, and positional functions, help with advanced queries: they calculate across sets of rows or return a single value from a specific row within a partition.
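
The three kinds of window function named above can be shown side by side. The sketch below is an illustration under assumptions: the `sales` table and its rows are made up, and it runs on SQLite through Python's `sqlite3` module, which needs SQLite 3.25 or later for window function support.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, month INTEGER, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [("east", 1, 100), ("east", 2, 150), ("west", 1, 200), ("west", 2, 120)],
)

query = """
SELECT region, month, amount,
       -- ranking function: position of each row within its region
       RANK() OVER (PARTITION BY region ORDER BY amount DESC) AS rank_in_region,
       -- aggregate function: region total repeated on every detail row
       SUM(amount) OVER (PARTITION BY region) AS region_total,
       -- positional function: value from the previous month's row
       LAG(amount) OVER (PARTITION BY region ORDER BY month) AS prev_amount
FROM sales
ORDER BY region, month
"""
rows = conn.execute(query).fetchall()

for row in rows:
    print(row)
# ('east', 1, 100, 2, 250, None)
# ('east', 2, 150, 1, 250, 100)
# ('west', 1, 200, 1, 320, None)
# ('west', 2, 120, 2, 320, 200)
```

Unlike `GROUP BY`, every input row survives in the output, so the per-partition values sit next to the detail they were computed from.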

Many explanations of JOIN are wrong, and people get confused

Minimal Modeling • 1 HN point • 25 Nov 23

🕹 Technology Databases SQL Data Modeling Algorithm Programming

Many common explanations of JOIN are incorrect and can lead to confusion.
The common explanations are mostly right only when using specific conditions like ID equality.
The generalized behavior of JOIN should have been a separate operator to avoid confusion and optimize performance.
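
One way to see the point about ID equality is to join on a condition that is not an equality at all. The example below is an illustration, not taken from the post: the `orders` and `tiers` tables are hypothetical, and SQLite is assumed as the engine. It shows an inner join behaving like a filter over every pair of rows, so one row can match several rows on the other side.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (id INTEGER, amount INTEGER);
CREATE TABLE tiers (name TEXT, min_amount INTEGER);
INSERT INTO orders VALUES (1, 50), (2, 500);
INSERT INTO tiers VALUES ('bronze', 0), ('silver', 100), ('gold', 400);
""")

# JOIN with an inequality condition: an order matches every tier it reaches.
by_condition = conn.execute("""
SELECT o.id, t.name
FROM orders o JOIN tiers t ON o.amount >= t.min_amount
ORDER BY o.id, t.min_amount
""").fetchall()

# The same result written as a cross product followed by a filter.
as_filtered_cross = conn.execute("""
SELECT o.id, t.name
FROM orders o CROSS JOIN tiers t
WHERE o.amount >= t.min_amount
ORDER BY o.id, t.min_amount
""").fetchall()

print(by_condition)
# [(1, 'bronze'), (2, 'bronze'), (2, 'silver'), (2, 'gold')]
```

Both queries return the same rows. The picture of a join as "looking up the matching row" only holds when the condition is an equality on a unique key.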

Build data apps with markdown and SQL

ingest this! • 1 HN point • 19 Feb 24

🕹 Technology Data Engineering Open Source Knowledge Graphs SQL

Build data apps using Markdown and SQL with the Evidence framework, which offers a way to create polished data products.
Explore the future synergy of knowledge graphs and large language models (LLMs) for enhanced technologies.
Engage with the latest in data engineering by checking out a full exploration of the open-source data engineering landscape for 2024.

Querying a Semantic Data Model

Making Things • 0 implied HN points • 13 Nov 23

🕹 Technology Data Modeling Querying SQL

A semantic data model includes pre-built calculations and relationships.
There are two main types of queries: lookup and aggregating.
In a semantic data model, querying involves selecting dimensions and measures, simplifying the process.

Writing unit tests for SQL queries

Reflective Software Engineering • 0 implied HN points • 12 Jan 24

🕹 Technology SQL Unit Testing Refactoring Bug Fixing Data Engineering

Having unit tests for SQL queries can help catch bugs introduced during code refactorings or changes.
When writing unit tests for SQL queries, focus on testing the specific parts responsible for building the query rather than the entire method.
Refactoring code for testability can involve moving pure functions outside of the class for easier testing and simplifying methods to focus on specific tasks.
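
The refactoring described above can be sketched as follows. Everything here is hypothetical and chosen for illustration: the `build_user_query` function, the `users` table, and its columns are not from the post. The idea is that the query-building logic lives in a pure function outside any class, so tests can check it without a database connection.

```python
from typing import Optional


def build_user_query(min_age: Optional[int] = None, active_only: bool = False):
    """Pure function: returns (sql, params) and never touches a database."""
    clauses, params = [], []
    if min_age is not None:
        clauses.append("age >= ?")
        params.append(min_age)
    if active_only:
        clauses.append("active = 1")
    sql = "SELECT id, name FROM users"
    if clauses:
        sql += " WHERE " + " AND ".join(clauses)
    return sql, params


def test_no_filters():
    assert build_user_query() == ("SELECT id, name FROM users", [])


def test_combined_filters():
    sql, params = build_user_query(min_age=18, active_only=True)
    assert sql == "SELECT id, name FROM users WHERE age >= ? AND active = 1"
    assert params == [18]


# Run the tests directly; a test runner such as pytest would discover them too.
test_no_filters()
test_combined_filters()
```

Because the function takes plain values and returns plain values, a refactoring that accidentally drops a clause or reorders parameters fails a test immediately.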

PostgreSQL Sort estimation instability

Conserving CPU's cycles ... • 0 implied HN points • 26 Jun 24

🕹 Technology Databases Optimization SQL

Incremental sort was added in PostgreSQL 13 (released in 2020) to enhance sorting strategies and improve efficiency in handling large datasets and analytical queries.
Estimation instability in PostgreSQL's sort operations can lead to unexpected query plans and performance differences, emphasizing the importance of careful estimation.
The vulnerability in PostgreSQL's optimizer code showcases how the choice of expression evaluation can impact query performance, highlighting a need for optimization improvements.

Malloy's 10x

Making Things • 0 implied HN points • 23 Nov 23

🕹 Technology Software Data Efficiency SQL

If you can make something 10x more efficient, you have a winner.
Malloy aims to replace SQL for asking questions of data.
Malloy's efficiency shines when multiple queries are involved, offering reusability and speed.

Mastering Window Functions: Your Gateway to Advanced SQL Analytics

DataSketch’s Substack • 0 implied HN points • 07 Oct 24

🕹 Technology Data Engineering SQL Analytics Software Development Database Management

Window functions let you do calculations across rows related to your current row without losing any details. This helps you get both summarized and detailed data at the same time.
Using window functions can make complex data tasks easier, like ranking items or finding running totals. They are very helpful in fields like healthcare to analyze patient data and improve efficiency.
It's important to test how window functions perform on a smaller dataset before using them widely. Combining multiple window functions and partitioning your data smartly can also boost performance.

Understanding your Big Data problem: Transforming Big Data with Incremental Models

The Orchestra Data Leadership Newsletter • 0 implied HN points • 31 Oct 23

🕹 Technology Data Engineering Data Modeling SQL Testing

Incremental models are crucial for managing big data: they help run complex queries efficiently and maintain data quality.
Design patterns in data modeling, such as Star Schema and Data Vault, play a significant role in how dbt models are structured and managed.
Using Jinja templating and implementing continuous data integration processes are key elements in handling big models effectively and ensuring data reliability.