The hottest Data Structures Substack posts right now

And their main takeaways
Confessions of a Code Addict 1683 implied HN points 12 Jan 25
  1. Unix engineers faced a big challenge in fitting a large dictionary into just 64kB of RAM. They came up with clever ways to compress the data and use efficient structures to make everything fit.
  2. A key part of their solution was the Bloom filter, which helped quickly check if words were in the dictionary without needing to look up every single word, saving time.
  3. They also used innovative coding methods to further reduce the size of the data needed for the dictionary, allowing for fast lookups while staying within the strict memory limits of their hardware.
The Python Coding Stack • by Stephen Gruppetta 259 implied HN points 13 Oct 24
  1. In Python, lists don't actually hold the items themselves but instead hold references to those items. This means you can change what is in a list without changing the list itself.
  2. If you create a list by multiplying an existing list, all the elements will reference the same object instead of creating separate objects. This can lead to unexpected results, like altering one element affecting all the others.
  3. When dealing with immutable items, such as strings, it doesn't matter if references point to the same object. Since immutable objects cannot be changed, there are no issues with such references.
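The shared-reference behaviour described in these takeaways can be seen in a few lines. This is a minimal sketch, not code from the post; the variable names are made up for illustration.

```python
# A list built by multiplying a nested list repeats one reference.
rows = [[0] * 3] * 2           # both "rows" are the same inner list object
rows[0][0] = 1
shared = rows[1][0]            # the change is visible through the other reference

# A comprehension creates a fresh inner list for each row instead.
independent = [[0] * 3 for _ in range(2)]
independent[0][0] = 1
separate = independent[1][0]   # untouched, because the rows are distinct objects
```

The inner `[0] * 3` is harmless because integers are immutable; only the outer multiplication of a mutable list causes the surprise.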
Minimal Modeling 608 implied HN points 05 Dec 24
  1. Fourth Normal Form (4NF) is mainly about creating simple two-column tables to link related data, like teachers and their skills. This straightforward design is often overlooked in favor of complex definitions.
  2. Many explanations of 4NF start with confusing three-column tables and then break them down into simpler forms. This approach makes it harder for learners to grasp the concept quickly and effectively.
  3. The term 'multivalued dependency' can be simplified to just mean a list of unique IDs. You don’t really need to focus on this term to design good database tables; it's more of a historical detail.
Confessions of a Code Addict 529 implied HN points 09 Nov 24
  1. In Python, you can check if a list is empty by using 'if not mylist' instead of 'if len(mylist) == 0'. This way is faster and is more widely accepted as the Pythonic approach.
  2. Some people find the truthiness method confusing, but it often boils down to bad coding practices, like unclear variable names. Keeping your code clean and well-named can make this style clearer and more readable.
  3. Using 'len()' to check for emptiness isn't wrong, but you should choose based on your situation. The main point is that the Pythonic method isn't ambiguous; it just needs proper context and quality coding.
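As a small illustration of the truthiness check, the sketch below uses a hypothetical helper named `describe`; it is not taken from the post.

```python
def describe(items):
    """Return a short description of a list (hypothetical helper)."""
    # An empty list is falsy, so `not items` is true when it has no elements.
    if not items:
        return "empty"
    return f"{len(items)} item(s)"
```

Keep in mind that `not items` is also true for `None`, which is one reason clear naming and context matter when using this style.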
Push to Prod 59 implied HN points 13 Aug 24
  1. When a system gets slow, it’s often because of queues. Queues help manage requests but can create delays if not handled properly.
  2. Different types of queues can slow down your system, like thread pools, connection pools, and TCP queues. Keeping these optimized can improve performance.
  3. Using thread dumps can help identify problems in your system. They can show which threads are blocked and help you fix the slowdowns.
System Design Classroom 239 implied HN points 24 May 24
  1. Hashmaps are useful for storing data by connecting unique keys to their values, making it easy to find and retrieve information quickly.
  2. When two different keys accidentally produce the same hash code, it's called a collision. There are ways to handle this, like chaining and open addressing.
  3. Hashmaps can do lookups, insertions, and deletions really fast, usually in constant time, but they can slow down if too many items cause collisions.
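Chaining, one of the collision strategies mentioned above, can be sketched as follows. The class name `ChainedHashMap` and its methods are illustrative assumptions, not an implementation from the post.

```python
class ChainedHashMap:
    """Toy hash map that resolves collisions by chaining (illustrative only)."""

    def __init__(self, bucket_count=8):
        # Each bucket is a list of (key, value) pairs that hashed to the same slot.
        self.buckets = [[] for _ in range(bucket_count)]

    def _bucket(self, key):
        return self.buckets[hash(key) % len(self.buckets)]

    def put(self, key, value):
        bucket = self._bucket(key)
        for i, (existing_key, _) in enumerate(bucket):
            if existing_key == key:
                bucket[i] = (key, value)  # overwrite an existing key
                return
        bucket.append((key, value))       # new key: extend the chain

    def get(self, key, default=None):
        for existing_key, value in self._bucket(key):
            if existing_key == key:
                return value
        return default
```

With a single bucket every key collides, and lookups degrade to a linear scan of the chain, which is the slowdown the third takeaway refers to.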
Technology Made Simple 639 implied HN points 01 Jan 24
  1. Graphs are efficient at encoding and representing relationships between entities, making them useful for fraud detection tasks.
  2. Graph Neural Networks excel at fraud detection due to their ability to visualize strong correlations among fraudulent activities that share common properties, adapt to new fraud patterns, and offer transparency in AI systems.
  3. Graph Neural Networks require less labeled data and feature engineering compared to other techniques, have better explainability, and work well with semi-supervised learning, making them a powerful tool for fraud detection.
Technology Made Simple 279 implied HN points 28 Feb 24
  1. The sliding window technique is a powerful algorithmic model used for problem-solving in coding interviews and software engineering, offering efficiency and practicality.
  2. Benefits of using the sliding window technique include reducing duplicate work, maintaining consistent linear time complexity, and its utility in AI feature extraction processes.
  3. Spotting the sliding window technique involves identifying keywords like maximum, minimum, longest, or shortest, dealing with continuous elements, and converting brute-force approaches into efficient solutions.
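A typical case is the maximum sum of any k consecutive elements. The sketch below is an assumed example problem, not one quoted from the post; it replaces the brute-force re-summing of each window with a single linear pass.

```python
def max_window_sum(values, k):
    """Largest sum of k consecutive elements, computed in one linear pass."""
    if k <= 0 or k > len(values):
        raise ValueError("k must be between 1 and len(values)")
    window = sum(values[:k])   # sum of the first window
    best = window
    for i in range(k, len(values)):
        # Slide right: add the entering element, drop the leaving one.
        window += values[i] - values[i - k]
        best = max(best, window)
    return best
```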
Permit.io’s Substack 59 implied HN points 23 May 24
  1. JWTs are great for authentication but should be used carefully. They are not meant for detailed permission checks and can create security issues if misused.
  2. They are static once issued, meaning any changes to a user's role won't be reflected until the token expires. This can lead to potential security risks.
  3. JWTs are suitable for stateless, distributed systems and coarse-grained authorization, but for fine-grained control, other tools should be used.
Cobus Greyling on LLMs, NLU, NLP, chatbots & voicebots 39 implied HN points 17 Jun 24
  1. LangGraph helps create clearer conversations by using graphs to map out how dialog flows between different points, making it easier to manage conversations in AI systems.
  2. Prompt chaining connects smaller tasks in a sequence, allowing AI models to handle complex jobs step by step, but can feel rigid like traditional chatbots.
  3. Autonomous Agents bring a higher level of flexibility in how actions are taken, but they can also lead to concerns about having enough control over their decision-making process.
Low Latency Trading Insights 117 implied HN points 11 Feb 24
  1. The requirements for a rate-limiting algorithm include precise event counting, fast performance especially during market turbulence, and minimal impact on cache memory.
  2. Creating a rate-limiting algorithm using a multimap for counting events has inefficiencies; a better solution involves enhancements for optimal performance.
  3. A bounded approximation approach for rate limiting achieves memory efficiency by assuming a minimum time precision and implementing a clever advance-and-clear mechanism.
Technology Made Simple 299 implied HN points 22 Jan 23
  1. Understanding Data Structures and Algorithms is crucial for success in technical fields like software development.
  2. Many resources focus on DSA for coding interviews, but it's important to go beyond that to deepen your knowledge.
  3. Learning DSA effectively doesn't have to involve answering countless questions or watching numerous tutorials; there are better approaches available.
Technology Made Simple 179 implied HN points 18 Jul 23
  1. Trees are powerful data structures that are great for efficient organization and retrieval of data in software engineering.
  2. Recursion works well with trees due to their recursive substructure, making implementation of recursive functions easier.
  3. Decision trees in AI excel at discerning complex patterns, providing interpretable results, and are versatile in various domains such as finance, healthcare, and marketing.
Data Engineering Central 157 implied HN points 13 Mar 23
  1. Understanding Data Structures and Algorithms is important for becoming a better engineer, even if you may not use them daily.
  2. Linked Lists are a linear data structure where elements are not stored contiguously in memory but are linked using pointers.
  3. Creating a simple Linked List in Rust involves defining nodes with values and pointers to other nodes, creating a LinkedList to hold these nodes, and then linking them to form a chain.
Technology Made Simple 99 implied HN points 21 Nov 23
  1. Stacks are powerful data structures in software engineering and can be modified extensively to suit different use cases.
  2. Implementing Stacks using a Singly Linked List can be beneficial for dynamic resizing, though Arrays are often preferred due to memory considerations.
  3. Exploring variations like Persistent Stacks, Limiting Stack Size, Ensuring Type Safety, Thread Safety, Tracking Min/Max, and Undo Operations can enhance the functionality and efficiency of Stacks in various scenarios.
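Of the variations listed, minimum tracking is easy to sketch. The version below is one common approach, assumed here for illustration rather than taken from the post: each entry stores the minimum of everything at or below it.

```python
class MinStack:
    """Stack that reports its current minimum in O(1) (illustrative sketch)."""

    def __init__(self):
        # Each entry is (value, minimum of the stack up to and including it).
        self._items = []

    def push(self, value):
        current_min = value if not self._items else min(value, self._items[-1][1])
        self._items.append((value, current_min))

    def pop(self):
        return self._items.pop()[0]

    def minimum(self):
        return self._items[-1][1]
```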
Hung's Notes 79 implied HN points 13 Dec 23
  1. Global Incremental IDs are important for preventing ID collisions in distributed systems, especially during tasks like data backup and event ordering.
  2. UUID and Snowflake ID are two common types of global IDs, each with unique advantages and disadvantages. For instance, UUIDs are larger but widely used, while Snowflake IDs are smaller but more complex to generate.
  3. Different systems, like Sonyflake and Tinyid, offer specialized methods for generating IDs, helping to ensure performance and avoiding database bottlenecks.
Technology Made Simple 99 implied HN points 04 May 23
  1. The post discusses Problem 85: Count Complete Tree Nodes [Amazon], focusing on recursion, trees, and data structures.
  2. It is about solving a problem related to counting the number of nodes in a complete binary tree efficiently.
  3. The post mentions the importance of community engagement in choosing problems to discuss and the growth of the author's newsletter.
Technology Made Simple 59 implied HN points 05 May 23
  1. The post discusses a problem related to counting the number of nodes in a complete binary tree, emphasizing the importance of understanding recursion, trees, and data structures.
  2. It mentions starting with a brute force solution to count nodes but highlights the need for optimization to achieve time complexity better than O(n).
  3. The approach for solving the problem involves using a recursive template to count nodes efficiently by considering the root and the number of nodes in the left and right subtrees.
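One well-known way to beat O(n) on this problem is to compare the leftmost and rightmost depths: when they match, the subtree is perfect and its size follows from a formula. The sketch below shows that idea; it may differ from the post's exact solution, and the `Node` class is a hypothetical stand-in.

```python
class Node:
    """Minimal binary tree node (hypothetical helper for this sketch)."""

    def __init__(self, left=None, right=None):
        self.left = left
        self.right = right


def count_nodes(root):
    """Count nodes in a complete binary tree in O(log^2 n) time."""
    if root is None:
        return 0
    left_height, node = 0, root
    while node:                      # depth along the leftmost path
        left_height += 1
        node = node.left
    right_height, node = 0, root
    while node:                      # depth along the rightmost path
        right_height += 1
        node = node.right
    if left_height == right_height:  # perfect subtree: size is 2^h - 1
        return 2 ** left_height - 1
    # Otherwise count the root plus both subtrees recursively.
    return 1 + count_nodes(root.left) + count_nodes(root.right)
```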
Technology Made Simple 39 implied HN points 27 Jan 23
  1. The problem discussed is about validating a binary search tree, ensuring the left subtree contains smaller values, the right subtree contains greater values, and both are valid binary search trees.
  2. Examples are provided to illustrate the concept, showing a valid and an invalid binary search tree.
  3. Constraints include the number of nodes and the value ranges in the tree.
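The validity rule in the first takeaway can be checked by passing allowed bounds down the tree. This is a minimal sketch assuming strict ordering (no duplicate values); `TreeNode` is a hypothetical helper class.

```python
class TreeNode:
    """Minimal binary tree node holding a value (hypothetical helper)."""

    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right


def is_valid_bst(node, low=None, high=None):
    """True if every value lies strictly between the bounds set by its ancestors."""
    if node is None:
        return True
    if low is not None and node.value <= low:
        return False
    if high is not None and node.value >= high:
        return False
    # Left subtree must stay below this value, right subtree above it.
    return (is_valid_bst(node.left, low, node.value)
            and is_valid_bst(node.right, node.value, high))
```

Checking only each node against its direct children is not enough; the bounds carry constraints from every ancestor.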
Technology Made Simple 59 implied HN points 28 Sep 22
  1. Using sentinel nodes in Doubly Linked Lists can improve performance and make code easier to read and implement.
  2. Implementing sentinel nodes removes special cases in DLL implementations, simplifies code, and makes it easier to prove correct.
  3. Although using sentinel nodes may require some extra memory, the simplification it brings to the code is often worth the tradeoff.
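How a sentinel removes special cases can be seen in a circular list with one dummy node. The sketch below is an assumed layout with hypothetical class names, not the post's implementation; note that insertion and removal never check for an empty list, a head, or a tail.

```python
class _Node:
    def __init__(self, value=None):
        self.value = value
        self.prev = self   # a lone node points at itself
        self.next = self


class SentinelList:
    """Doubly linked list built around a single sentinel node (sketch)."""

    def __init__(self):
        # sentinel.next is the head, sentinel.prev is the tail.
        self._sentinel = _Node()

    def _insert_between(self, value, before, after):
        node = _Node(value)
        node.prev, node.next = before, after
        before.next = node
        after.prev = node
        return node

    def push_front(self, value):
        return self._insert_between(value, self._sentinel, self._sentinel.next)

    def push_back(self, value):
        return self._insert_between(value, self._sentinel.prev, self._sentinel)

    def remove(self, node):
        # No head/tail checks needed: every real node has two neighbours.
        node.prev.next = node.next
        node.next.prev = node.prev

    def to_list(self):
        out, node = [], self._sentinel.next
        while node is not self._sentinel:
            out.append(node.value)
            node = node.next
        return out
```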
Confessions of a Code Addict 46 HN points 14 Sep 23
  1. Python uses Bloom filters in its string data structure to speed up certain string processing functions like strip and splitlines.
  2. The unique Bloom filter implementation in CPython uses an unsigned long type to represent the bit vector, making storing and querying items more efficient.
  3. CPython determines the position in the bit vector for adding and querying characters by using the lower n-bits of the character, avoiding costly hash computations.
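The idea can be approximated in Python with an integer used as the bit vector. This is a rough sketch only: the width of 64 bits and the function names are assumptions for illustration, and the real CPython code is written in C.

```python
BLOOM_WIDTH = 64  # assumed bit-vector width for this sketch


def make_mask(chars):
    """Set one bit per character, indexed by the low bits of its code point."""
    mask = 0
    for ch in chars:
        mask |= 1 << (ord(ch) & (BLOOM_WIDTH - 1))
    return mask


def maybe_in(mask, ch):
    """False means definitely absent; True means possibly present."""
    return bool(mask & (1 << (ord(ch) & (BLOOM_WIDTH - 1))))


def strip_left(text, chars):
    """Strip leading characters, using the mask as a cheap pre-check."""
    mask = make_mask(chars)
    i = 0
    # The exact membership test runs only when the mask says "maybe".
    while i < len(text) and maybe_in(mask, text[i]) and text[i] in chars:
        i += 1
    return text[i:]
```

Because different characters can share the same low bits, a set bit is only a hint, so the exact check is still needed to rule out false positives.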
Technology Made Simple 79 implied HN points 06 Apr 22
  1. Experts often give bad advice for studying Data Structures and Algorithms, like relying solely on Leetcode.
  2. To effectively learn DSA, take time to understand the history and purpose of each data structure beyond just learning the mechanics.
  3. Don't rush through learning Data Structures and Algorithms; taking it slow and grasping the fundamentals thoroughly will lead to better mastery and understanding.
Technology Made Simple 79 implied HN points 30 Mar 22
  1. BFS and DFS algorithms are foundational and crucial for various graph traversal problems, forming the basis for more complicated algorithms.
  2. Topological Sort, Dijkstra's Algorithm, and A* are important graph traversal algorithms to master, especially for weighted graphs and AI applications like self-driving cars.
  3. To pick the right graph traversal algorithm, first identify whether you need the shortest path (use BFS for unweighted graphs or A* for weighted graphs) or need to visit the entire graph (use DFS).
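For the unweighted shortest-path case, BFS finds the answer because it explores nodes in order of distance from the start. A minimal sketch, assuming the graph is given as a dict mapping each node to a list of neighbours (the function name is illustrative):

```python
from collections import deque


def shortest_path_length(graph, start, goal):
    """Number of edges on a shortest path in an unweighted graph, or None."""
    if start == goal:
        return 0
    visited = {start}
    queue = deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        for neighbour in graph.get(node, ()):
            if neighbour == goal:
                return dist + 1
            if neighbour not in visited:
                visited.add(neighbour)
                queue.append((neighbour, dist + 1))
    return None  # goal is unreachable from start
```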
Technology Made Simple 59 implied HN points 10 Jul 22
  1. Bloom Filters are probabilistic data structures used to efficiently test for membership.
  2. Bloom Filters work by having a bit array of size m with k hash functions mapping values to indices, setting the indices to 1 for a given input.
  3. Bloom Filters are great for reducing unnecessary disk access, but they can result in false positives and need regeneration as more values are added.
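The bit array of size m with k hash functions can be sketched as below. Deriving the k hashes from seeded SHA-256 digests is an assumption made to keep the example self-contained; real implementations usually pick faster hash functions and pack the bits more tightly.

```python
import hashlib


class BloomFilter:
    """Toy Bloom filter with m bit positions and k hash functions (sketch)."""

    def __init__(self, m=1024, k=3):
        self.m = m
        self.k = k
        self.bits = bytearray(m)  # one byte per bit position, for readability

    def _indices(self, item):
        # Derive k indices by hashing the item with k different seeds.
        for seed in range(self.k):
            digest = hashlib.sha256(f"{seed}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.m

    def add(self, item):
        for i in self._indices(item):
            self.bits[i] = 1

    def might_contain(self, item):
        # False is definitive; True can be a false positive.
        return all(self.bits[i] for i in self._indices(item))
```

As more items are added, more bits are set and the false-positive rate climbs, which is why the filter eventually needs to be regenerated with a larger m.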
Technology Made Simple 39 implied HN points 30 Sep 22
  1. The problem focuses on designing a class to find the kth largest element in a stream, emphasizing that it is the kth largest element in sorted order, not the kth distinct element.
  2. The implementation includes initializing the class with k and a set of numbers, then appending values to the stream to return the kth largest element.
  3. The constraints for the problem include specific limitations on the range of values and number of calls that can be made.
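A common way to solve this is a min-heap that keeps only the k largest values seen so far, so the heap's root is the answer. The sketch below is one such approach, not necessarily the post's; it assumes the stream holds at least k values by the time a result is read.

```python
import heapq


class KthLargest:
    """Tracks the kth largest value in a stream using a size-k min-heap."""

    def __init__(self, k, nums):
        self.k = k
        self.heap = []
        for n in nums:
            self.add(n)

    def add(self, value):
        if len(self.heap) < self.k:
            heapq.heappush(self.heap, value)
        elif value > self.heap[0]:
            # Replace the smallest of the current top-k values.
            heapq.heapreplace(self.heap, value)
        # Assumes at least k values have been seen (see lead-in).
        return self.heap[0]
```

Duplicates are kept as separate entries, matching the "sorted order, not distinct" requirement.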
Technology Made Simple 59 implied HN points 29 Mar 22
  1. Graphs can be seen from various perspectives: charts and plots (stats), maps with complex algorithms (graph theory), and adjacency lists for coding. Understanding these perspectives is crucial for effective use of graphs.
  2. Identifying whether a problem could be a graph problem involves recognizing the entities (nodes), relationships (edges), and weights in the context of a system. This spotting framework helps in solving graph-related problems efficiently.
  3. Practicing graph spotting as a skill involves starting with easy problems to identify graph components quickly. Familiarity with graphs and the ability to spot them easily is crucial for solving graph problems in interviews.
Technology Made Simple 39 implied HN points 02 Aug 22
  1. In graph traversal, marking nodes as visited in place instead of tracking them in a separate set reduces memory usage and can bring the extra space used for visited-tracking down from O(n) to O(1).
  2. This technique is straightforward to implement, takes no extra space, and can be a significant improvement in graph traversal algorithms.
  3. When implementing this technique, be cautious about the value used to mark visited cells and always confirm with your interviewer about input data type to avoid conflicts.
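A grid problem such as counting islands shows the technique. The example problem, the `"#"` marker, and the function name below are assumptions for illustration; the marker must be a value that cannot appear in the input, which is the caution raised in the third takeaway.

```python
def count_islands(grid):
    """Count groups of connected "1" cells, marking visited cells in place."""
    VISITED = "#"  # assumed marker; must not be a legitimate input value
    count = 0
    for r in range(len(grid)):
        for c in range(len(grid[r])):
            if grid[r][c] != "1":
                continue
            count += 1
            grid[r][c] = VISITED          # overwrite instead of adding to a set
            stack = [(r, c)]
            while stack:
                y, x = stack.pop()
                for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                    in_bounds = 0 <= ny < len(grid) and 0 <= nx < len(grid[ny])
                    if in_bounds and grid[ny][nx] == "1":
                        grid[ny][nx] = VISITED
                        stack.append((ny, nx))
    return count
```

The separate visited set is gone, though the traversal stack itself still takes space, and the input grid is modified as a side effect.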
The ZenMode 15 HN points 10 Feb 24
  1. Caches like Redis store frequently used data for faster retrieval, improving response times, reducing database load, and lowering the cost of running high-traffic applications.
  2. Redis is fast due to in-memory storage, optimized data structures, reduced I/O operations, single-threaded architecture, and event-driven design, but has limitations like limited capacity and issues with data persistence.
  3. Choosing the right caching system, like Redis, requires considering factors like data size, access patterns, consistency requirements, and fault tolerance for high availability and durability.
Technology Made Simple 39 implied HN points 11 Jun 22
  1. The goal is a data structure whose operations, such as plus, minus, get_max, and get_min, each run in O(1) time.
  2. Utilizing a Doubly Linked List allows for maintaining a sorted collection of keys, enabling quick access to elements with the lowest and highest values.
  3. Developing algorithms to handle key count increments and decrements while preserving the sorted order of the linked list is crucial for a functional solution.
Technology Made Simple 39 implied HN points 01 Jun 22
  1. Using bit fields can significantly reduce storage requirements for tracking user preferences. A technique by Vimeo engineers uses O(1) space instead of the O(k) of the traditional method, making the solution far more efficient.
  2. Bit fields utilize bitwise operators to represent content filters, enabling quick comparisons in constant time. This approach is memory and time-efficient.
  3. Implementing bit fields for tracking user preferences allows for efficient filtering of content by performing a bitwise AND operation between the user's and video's bit fields. This results in a quick eligibility check for the user.
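The bitwise AND check can be sketched in a few lines. The flag names and the `is_allowed` helper are hypothetical and chosen only for illustration; they are not Vimeo's actual filters.

```python
# Hypothetical content flags, one bit each.
VIOLENCE = 1 << 0
NUDITY = 1 << 1
PROFANITY = 1 << 2


def is_allowed(user_blocked, video_flags):
    """A video is eligible if it carries none of the flags the user blocks."""
    # A single AND compares every filter at once, in constant time.
    return (user_blocked & video_flags) == 0


user_blocked = VIOLENCE | PROFANITY   # one integer stores all preferences
```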
Web Dev Explorer 3 HN points 29 Apr 24
  1. Data stored on the stack is static, fixed in size, with a fixed lifecycle, and cannot be referenced across different stack frames.
  2. Data stored on the heap is dynamic, not fixed in size, has a flexible lifecycle, and can be referenced across different stack frames.
  3. Various programming languages use different memory management approaches, like manual management in C, garbage collection in Java, ARC in Objective-C and Swift, and ownership mechanism in Rust.
Freelance Footprints 8 HN points 20 Feb 24
  1. The leaky bucket algorithm helps manage the rate of requests a web application can handle. It uses the idea of a bucket that can fill up and overflow if too many requests come in at once.
  2. In this algorithm, there are two key settings: the maximum number of requests allowed at a time and the rate at which requests are processed. This controls how quickly requests are dealt with and prevents overload.
  3. The leaky bucket algorithm is widely used in tech, such as by companies like SeatGeek for their waiting room systems, to ensure smooth user experiences without exceeding server limits.
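The two settings, bucket capacity and leak rate, map onto a small class. This is a minimal sketch under stated assumptions: the class and parameter names are made up, and the caller passes the current time explicitly so the behaviour is easy to follow.

```python
class LeakyBucket:
    """Leaky-bucket rate limiter (illustrative sketch)."""

    def __init__(self, capacity, leak_rate_per_sec):
        self.capacity = capacity          # max requests held at once
        self.leak_rate = leak_rate_per_sec  # requests drained per second
        self.level = 0.0
        self.last = 0.0

    def allow(self, now):
        """Return True if a request arriving at time `now` (seconds) fits."""
        # Drain the bucket for the time elapsed since the last request.
        elapsed = now - self.last
        self.level = max(0.0, self.level - elapsed * self.leak_rate)
        self.last = now
        if self.level + 1 <= self.capacity:
            self.level += 1
            return True
        return False                      # bucket would overflow: reject
```

In a real service the timestamp would typically come from a monotonic clock such as `time.monotonic()`.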