Engineering At Scale

Engineering At Scale is a weekly column that explains complex engineering concepts such as databases, system design, and architecture in accessible terms, and offers guidance on engineering careers. It covers topics ranging from API gateways and database sharding to scalability, cloud application development, and vector databases, along with practical advice for system design interviews.

Databases, System Design, Software Architecture, Engineering Careers, Microservices, Cloud Computing, Distributed Computing, API Design, Scalability, Performance Optimization

The hottest Substack posts of Engineering At Scale

And their main takeaways
72 implied HN points 11 Feb 24
  1. An API gateway acts as an intermediary in a microservices architecture: it receives client requests and routes them to the appropriate microservices, simplifying communication for clients.
  2. An API gateway enhances security by authenticating and authorizing requests, applies rate limiting to prevent attacks, and improves performance through caching and protocol conversion.
  3. Downsides of API gateways include increased latency from the extra network hop, a potential single point of failure, and added complexity in the system architecture.
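
The gateway responsibilities listed above (authentication, rate limiting, routing) can be sketched in a few lines of Python. This is a minimal in-process illustration, not how any particular gateway product works: the `Gateway` class, its method names, the API-key check, and the manually reset fixed-window counter are all assumptions made for the example.

```python
from typing import Callable, Dict, Set, Tuple

Handler = Callable[[str], str]


class Gateway:
    """Toy gateway: authenticate, rate-limit, then route by path prefix."""

    def __init__(self, api_keys: Set[str], limit_per_window: int) -> None:
        self.api_keys = api_keys          # hypothetical set of valid client keys
        self.limit = limit_per_window     # max requests per client per window
        self.routes: Dict[str, Handler] = {}
        self.counts: Dict[str, int] = {}

    def register(self, prefix: str, handler: Handler) -> None:
        """Map a path prefix to the backend service that handles it."""
        self.routes[prefix] = handler

    def reset_window(self) -> None:
        """Start a new rate-limit window (a real gateway would use a clock)."""
        self.counts.clear()

    def handle(self, api_key: str, path: str) -> Tuple[int, str]:
        # 1. Authentication: reject unknown clients before doing any work.
        if api_key not in self.api_keys:
            return 401, "unauthorized"
        # 2. Rate limiting: cap the number of requests per client per window.
        used = self.counts.get(api_key, 0)
        if used >= self.limit:
            return 429, "rate limit exceeded"
        self.counts[api_key] = used + 1
        # 3. Routing: the longest matching prefix wins.
        for prefix in sorted(self.routes, key=len, reverse=True):
            if path.startswith(prefix):
                return 200, self.routes[prefix](path)
        return 404, "no route"
```

Clients only ever call `handle`, which is the simplification the first takeaway describes; the extra call layer is also where the added latency and the single point of failure come from.
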
4 HN points 03 Mar 24
  1. Uber developed CacheFront, an integrated caching solution, to overcome problems that arose from using Redis for caching, such as maintenance overhead, reduced developer productivity, and difficult region failovers.
  2. The architecture of Docstore, Uber's distributed database, consists of a control plane, a query engine, and a storage engine; each layer has its own responsibilities, such as query execution, data persistence, and transaction management.
  3. CacheFront's design addressed non-functional requirements such as consistency guarantees, cache warming and region failovers, fault tolerance, hot partitions, and performance and cost improvements.
29 implied HN points 29 Jul 23
  1. Database sharding splits a large dataset into chunks stored on different machines, increasing storage capacity and distributing queries for better performance.
  2. Sharding improves availability by avoiding a single point of failure and raises read/write throughput by spreading the query load.
  3. The drawbacks of sharding are higher cost and maintenance overhead. It differs from partitioning, in which the data stays on a single machine.
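
As a rough sketch of the idea above, the following Python models each shard as an in-memory dictionary standing in for a separate machine. The `ShardedStore` class and the choice of a SHA-256 based routing function are assumptions for illustration; the summarized post does not prescribe them.

```python
import hashlib
from typing import Dict, List, Optional


class ShardedStore:
    """Key-value store split across several shards (dicts stand in for machines)."""

    def __init__(self, num_shards: int) -> None:
        if num_shards < 1:
            raise ValueError("need at least one shard")
        self.shards: List[Dict[str, str]] = [{} for _ in range(num_shards)]

    def shard_for(self, key: str) -> int:
        # Use a stable hash so the same key always maps to the same shard,
        # regardless of process or Python's per-run hash randomization.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") % len(self.shards)

    def put(self, key: str, value: str) -> None:
        self.shards[self.shard_for(key)][key] = value

    def get(self, key: str) -> Optional[str]:
        # Only one shard is consulted, which is how query load gets distributed.
        return self.shards[self.shard_for(key)].get(key)
```

Each read or write touches exactly one shard, so both storage and query load are spread out; the cost is that every shard is one more thing to run and maintain.
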
3 HN points 26 Jan 24
  1. Microservices offer advantages like scalability and fault-tolerance, but come with challenges like increased latency and management overhead.
  2. One proposed solution is to write applications as monoliths, let a runtime handle deployment, and roll out changes atomically, which addresses these microservices challenges.
  3. By modularizing code into components, abstracting communication details, and managing deployment lifecycles, the solution aims to improve performance and reduce costs.
14 implied HN points 24 Jun 23
  1. PostgreSQL currently uses a process-based model for handling client connections and managing data.
  2. The process-based model offers advantages like fault isolation, security guarantees, and efficient resource management.
  3. Despite these advantages, the PostgreSQL community is considering a future switch to a thread-based model.
2 HN points 15 Jan 24
  1. Load balancers distribute client requests across different servers, improving system reliability and scalability.
  2. Load balancers help systems cope with growing traffic by distributing workloads evenly, so that no single server is overwhelmed.
  3. Load balancers come in hardware, software, and cloud varieties, each with its own benefits.
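
One simple way to distribute requests evenly is round-robin selection that skips servers marked unhealthy. The sketch below assumes that strategy; the summary does not say which algorithm the post covers, and the `RoundRobinBalancer` class and its method names are hypothetical.

```python
from typing import List, Set


class RoundRobinBalancer:
    """Hand out servers in rotation, skipping any that are marked down."""

    def __init__(self, servers: List[str]) -> None:
        if not servers:
            raise ValueError("need at least one server")
        self.servers = list(servers)
        self.unhealthy: Set[str] = set()
        self._next = 0  # index of the next server to try

    def mark_down(self, server: str) -> None:
        self.unhealthy.add(server)

    def mark_up(self, server: str) -> None:
        self.unhealthy.discard(server)

    def pick(self) -> str:
        # Try each server at most once per call, starting where we left off.
        for _ in range(len(self.servers)):
            server = self.servers[self._next]
            self._next = (self._next + 1) % len(self.servers)
            if server not in self.unhealthy:
                return server
        raise RuntimeError("no healthy servers available")
```

Skipping unhealthy servers is what gives the reliability benefit: a failed backend stops receiving traffic while the others keep serving.
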
3 HN points 15 Jul 23
  1. Vector databases are gaining traction in the tech industry, driven by AI applications and growing investment.
  2. Data can be classified into structured, semi-structured, and unstructured categories, each requiring different database solutions.
  3. Vector databases excel at handling unstructured data such as images and videos, and provide specialized search capabilities for applications like recommendation systems and fraud detection.
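
The "specialized search" in question is usually similarity search over embedding vectors. The sketch below is a brute-force version using cosine similarity; it is an assumption for illustration only, since production vector databases typically rely on approximate nearest-neighbor indexes rather than scanning every vector. The function names and the toy two-dimensional vectors are made up.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(
    query: Sequence[float], index: Dict[str, Sequence[float]], k: int
) -> List[Tuple[str, float]]:
    """Return the k stored items most similar to the query, best first."""
    scored = [(name, cosine_similarity(query, vec)) for name, vec in index.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]
```

In a recommendation setting, the stored vectors would be embeddings of items and the query an embedding of what the user just viewed; the nearest vectors are the recommendations.
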
2 HN points 05 Aug 23
  1. Range-based sharding divides data by key ranges, much like organizing books on shelves so they are easier to find.
  2. Hash-based sharding distributes data evenly across shards using a hash function, but may require rebalancing data when the number of shards changes.
  3. Consistent hashing minimizes data movement when shards are added or removed, improving scalability, while geo-based sharding stores data close to users for better performance.
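
The difference between plain hash-based sharding and consistent hashing is easiest to see by counting how many keys move when a shard is added. The following is a minimal sketch under assumed details: the SHA-256 based `stable_hash`, the `HashRing` class, and the use of 100 virtual nodes per shard are illustrative choices, not taken from the post.

```python
import bisect
import hashlib
from typing import Dict, List


def stable_hash(value: str) -> int:
    """64-bit hash that is stable across processes (unlike built-in hash())."""
    return int.from_bytes(hashlib.sha256(value.encode("utf-8")).digest()[:8], "big")


def modulo_shard(key: str, shards: List[str]) -> str:
    """Plain hash-based sharding: hash the key, take it modulo the shard count."""
    return shards[stable_hash(key) % len(shards)]


class HashRing:
    """Consistent hashing: shards own points on a ring; a key goes to the next point."""

    def __init__(self, shards: List[str], vnodes: int = 100) -> None:
        self._points: List[int] = []      # sorted ring positions
        self._owner: Dict[int, str] = {}  # ring position -> shard name
        self._vnodes = vnodes             # virtual nodes smooth out the distribution
        for shard in shards:
            self.add(shard)

    def add(self, shard: str) -> None:
        for i in range(self._vnodes):
            point = stable_hash(f"{shard}#{i}")
            self._owner[point] = shard
            bisect.insort(self._points, point)

    def lookup(self, key: str) -> str:
        # First ring point clockwise from the key's hash, wrapping around at the end.
        idx = bisect.bisect_right(self._points, stable_hash(key)) % len(self._points)
        return self._owner[self._points[idx]]
```

With modulo sharding, changing the shard count changes the result of the modulo for most keys, so most data has to move. On the ring, a new shard only takes over the keys that fall just before its own points, so the only keys that move are the ones that land on the new shard.
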
0 implied HN points 10 Feb 23
  1. APIs are interfaces that accept inputs and produce outputs.
  2. APIs are the building blocks of websites and allow communication between clients and servers.
  3. Real-world examples of API usage include Google Maps, Twitter, Stripe, and cloud APIs.
0 implied HN points 10 Jun 23
  1. Scalability is crucial for software systems to handle increasing demand and data.
  2. Building scalable systems can involve horizontal scaling (adding more machines) or vertical scaling (adding more resources to the same machine).
  3. Cloud technologies, like auto-scaling and managed databases, offer solutions for building scalable systems.
0 implied HN points 04 Jul 23
  1. Building reliable systems in an unreliable world is crucial for the success of products and services.
  2. Failures in distributed systems can lead to challenges like duplicate transactions, but idempotent APIs can help ensure consistency.
  3. Idempotent APIs are key in guaranteeing data integrity, simplifying error handling, and enhancing fault tolerance in distributed systems.
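
A common way to make an API idempotent is to have the client send a unique idempotency key with each logical request, so that a retry after a timeout replays the stored response instead of repeating the side effect. The sketch below assumes that approach; the `PaymentService` class, its in-memory dictionaries, and the field names are hypothetical.

```python
from typing import Dict


class PaymentService:
    """Charges an account at most once per idempotency key."""

    def __init__(self) -> None:
        self.charged_total: Dict[str, int] = {}  # account -> total cents charged
        self._responses: Dict[str, dict] = {}    # idempotency key -> stored response

    def charge(self, idempotency_key: str, account: str, amount_cents: int) -> dict:
        # A retry with a key we have already seen returns the original response
        # without applying the charge again.
        if idempotency_key in self._responses:
            return self._responses[idempotency_key]
        self.charged_total[account] = self.charged_total.get(account, 0) + amount_cents
        response = {
            "status": "charged",
            "account": account,
            "amount_cents": amount_cents,
        }
        self._responses[idempotency_key] = response
        return response
```

Because a repeated call is harmless, the client's error handling reduces to "retry with the same key", which is how idempotency avoids duplicate transactions after a network failure.
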
0 implied HN points 10 Feb 23
  1. An announcement about a new subscriber chat space in the Substack app.
  2. To join the chat, download the Substack app, available for iOS and Android.
  3. Start by clicking the link to download the app, then open it to access the chat feature.