The hottest Stream Processing Substack posts right now

And their main takeaways
SwirlAI Newsletter 255 implied HN points 07 May 23
  1. Watermarks in stream processing track event-time progress and determine when arriving records should be treated as late data.
  2. In SQL, the logical order of query execution is FROM and JOIN, WHERE, GROUP BY, HAVING, SELECT, ORDER BY, LIMIT.
  3. To optimize SQL Queries, reduce dataset sizes for joins and use subqueries for pre-filtering.
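The watermark idea in the first takeaway can be sketched in a few lines of plain Python. This is a minimal illustration, not any engine's API: the event shape, the fixed allowed-lateness bound, and the function names are all assumptions made for the example.

```python
from dataclasses import dataclass

@dataclass
class Event:
    key: str
    event_time: float  # seconds since epoch

# Illustrative bound; real engines make allowed lateness configurable.
ALLOWED_LATENESS = 5.0

def process(events):
    """Classify events as on-time or late using a simple watermark:
    watermark = (max event time seen so far) - (allowed lateness)."""
    max_event_time = float("-inf")
    on_time, late = [], []
    for ev in events:
        max_event_time = max(max_event_time, ev.event_time)
        watermark = max_event_time - ALLOWED_LATENESS
        # An event with a timestamp older than the watermark is late data.
        (late if ev.event_time < watermark else on_time).append(ev)
    return on_time, late

on_time, late = process([Event("a", 100.0), Event("b", 103.0), Event("c", 96.0)])
# Event "c" arrives with event_time 96.0 < watermark 98.0, so it lands in `late`.
```

Real systems differ in what they do with late data (drop it, route it to a side output, or trigger window re-computation), but the classification step looks like this.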
Bytewax 19 implied HN points 19 Dec 23
  1. One common use case for stream processing is transforming data into a format for different systems or needs.
  2. Bytewax is a Python stream processing framework that allows real-time data processing and customization.
  3. Bytewax supports custom connectors for data sources and sinks, making it adaptable to a wide range of data processing tasks.
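The source-to-transform-to-sink shape described above can be sketched framework-agnostically. The names below (`csv_source`, `reshape`, `list_sink`) are illustrative stand-ins for the connector pattern, not Bytewax's actual operator API:

```python
def csv_source(lines):
    """Custom 'source connector': yields one dict per CSV-like line."""
    header = lines[0].split(",")
    for line in lines[1:]:
        yield dict(zip(header, line.split(",")))

def reshape(record):
    """Transform each record into the format a downstream system expects."""
    return {"sensor": record["id"], "celsius": (float(record["temp_f"]) - 32) * 5 / 9}

def list_sink(stream):
    """Custom 'sink connector': here it just collects records into a list."""
    return list(stream)

raw = ["id,temp_f", "s1,212", "s2,32"]
out = list_sink(map(reshape, csv_source(raw)))
# out == [{"sensor": "s1", "celsius": 100.0}, {"sensor": "s2", "celsius": 0.0}]
```

In a real Bytewax dataflow, the source and sink would implement the framework's connector interfaces and the transform would be attached as an operator, but the division of responsibilities is the same.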
Data People Etc. 106 implied HN points 03 Apr 23
  1. Event-driven orchestrators are not suitable for stream processing because they are built around tasks with definite starts and ends, while streams are unbounded and continuous.
  2. Event-driven applications operate asynchronously by triggering tasks based on events like files appearing in a directory.
  3. Unlike stream processors, orchestrators like Airflow and Dagster do not have the ability to hold state, distribute tasks for parallel execution, or shuffle data between tasks.
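The distinction can be illustrated with a toy contrast. Both functions below use hypothetical names invented for this sketch; neither is Airflow, Dagster, or any real stream processor's API:

```python
def batch_task(input_rows):
    """Orchestrator-style task: a definite start and end, no state retained
    between runs. The orchestrator schedules it when a triggering event fires."""
    return sum(input_rows)

def running_totals(stream):
    """Stream-processor-style operator: consumes an unbounded iterator and
    holds state (a running total per key) across events."""
    state = {}
    for key, value in stream:
        state[key] = state.get(key, 0) + value
        yield key, state[key]

events = iter([("a", 1), ("b", 2), ("a", 3)])
print(list(running_totals(events)))  # [('a', 1), ('b', 2), ('a', 4)]
```

The state dict is what the orchestrators in the takeaway lack: `batch_task` starts fresh every invocation, while `running_totals` remembers what it has seen.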
Software Snack Bites 50 implied HN points 28 Jun 23
  1. Memphis provides a better developer experience for stream processing.
  2. Memphis is designed for quick setup, cost efficiency, and user-friendly monitoring.
  3. Memphis positions itself as a platform for companies looking to replace or augment their existing streaming infrastructure.
Tributary Data 0 implied HN points 03 Jan 23
  1. Kafka and Flink fit operational use cases that are critical to business operations, thanks to message ordering, low latency, and exactly-once delivery guarantees.
  2. Polyglot persistence, using different data stores for the read and write paths, can resolve the mismatch between write and read models in microservices data management.
  3. Implementing a backend rate limiter using Flink as a Kafka consumer can help prevent exhausting an external system (e.g., a database) due to high message arrival rates from Kafka.
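The backend rate limiter in the third takeaway is, at its core, a token bucket placed between the consumer and the external system. The post implements this with Flink; the class below is an illustrative plain-Python stand-in for the idea, not Flink API:

```python
import time

class TokenBucket:
    """Minimal token-bucket rate limiter: at most `rate` requests/second
    on average, with bursts up to `capacity`."""

    def __init__(self, rate, capacity, now=time.monotonic):
        self.rate = rate          # tokens refilled per second
        self.capacity = capacity  # maximum burst size
        self.tokens = capacity
        self.now = now            # injectable clock, handy for testing
        self.last = now()

    def allow(self):
        """Return True if one request may proceed right now."""
        t = self.now()
        self.tokens = min(self.capacity, self.tokens + (t - self.last) * self.rate)
        self.last = t
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        # Bucket empty: the consumer should buffer or back off instead of
        # forwarding the message to the database.
        return False
```

A consumer loop would call `allow()` per message and park messages (or pause the partition) when it returns False, so a burst from Kafka never exhausts the downstream database.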