The hottest Cloud Computing Substack posts right now

And their main takeaways
Blog System/5 • 744 implied HN points • 24 Nov 25
  1. Bazel is getting better with mandatory features like bzlmod and a real BUILD Foundation to support its community. This means it is maturing and becoming easier to use.
  2. The Bazel team is really focused on making builds faster and more efficient, with cool new tools like Skycache for speeding things up on the client side.
  3. Community-driven tools are expanding Bazel's reach, solving old problems. For example, Aspect's task runner helps fill in gaps and improve work processes.
VuTrinh. • 339 implied HN points • 23 Jul 24
  1. AWS offers a variety of tools for data engineering like S3, Lambda, and Step Functions, which can help anyone build scalable projects. These tools are often underused compared to newer options but are still very effective.
  2. Services like SNS and SQS can help manage data flow and processing. SNS allows for publishing messages while SQS aids in handling high event volumes asynchronously.
  3. Using AWS for data engineering is often simpler than switching to modern tools. It's easier to add new AWS services to your existing workflow than to migrate to something completely new.
Brad DeLong's Grasping Reality • 453 implied HN points • 05 Dec 25
  1. The AI boom probably won’t deliver a superintelligent AGI, but it will leave a lot of useful infrastructure, open models, and tools that improve weather forecasting, drug discovery, copilots, and other practical applications.
  2. Proprietary LLM businesses face high operating costs, thin moats, and fast commoditization, while big platforms are mainly spending to defend existing monopolies, so much innovation will diffuse rather than create new dominant platforms.
  3. If AI capex is financed mostly with equity, a crash would look more like the dot‑com bust and leave stranded but reusable assets; watch signals like falling GPU prices, datacenter subleases, and free copilot bundles, and plan policies to repurpose assets and limit attention‑harvesting harms.
Frankly Speaking • 50 implied HN points • 12 Feb 26
  1. Google could become a major security player by consolidating essential "plumbing" tools like SSO, EDR, and email into a neutral infrastructure layer, with Wiz providing visibility and Gemini automating workflows. This would let builders customize and remediate problems instead of battling closed, admin-focused tools.
  2. AI is collapsing the per-seat SaaS and point-product model; security must scale with code, agents, and automation rather than more headcount. Organizations that automate extensively shorten breach lifecycles and lower costs.
  3. Google’s vertical integration—cloud, Workspace, and a powerful AI model—plus usage-based pricing and targeted acquisitions could make it a builder-friendly alternative to legacy security vendors. That positioning plays to engineers who want API-first, customizable infrastructure rather than proprietary, admin-heavy systems.
Interconnected • 4751 implied HN points • 13 Jan 25
  1. Chinese AI models can answer sensitive questions when run locally, but they often censor answers in cloud settings. This shows a difference in behavior based on where the models are hosted.
  2. Censorship in AI models is more about the cloud platforms than the models themselves. This poses challenges for Chinese cloud providers wanting to compete internationally.
  3. Even though some see Chinese AI as censored, it can still be powerful and competitive. Users may prefer to download and run these models locally to avoid censorship and make the most of their capabilities.
Resilient Cyber • 99 implied HN points • 20 Aug 24
  1. Application Detection & Response (ADR) is becoming important because attackers are increasingly targeting application vulnerabilities. This shift means we need better tools that focus specifically on applications.
  2. Modern software systems are complex, making it hard for traditional security tools to catch real threats. That's why understanding how these systems interact can help identify harmful behavior more effectively.
  3. There’s a big push to find and fix security issues early in the development process. However, this focus on early detection often misses what's actually happening in real-life applications, making runtime security like ADR crucial.
Big Technology • 5129 implied HN points • 03 Dec 24
  1. Amazon is focusing heavily on AI and has introduced new AI chips, reasoning tools, and a large AI training cluster to enhance their cloud services. They want customers to have more options and better performance for their AI needs.
  2. AWS believes in providing choices to customers instead of pushing one single solution. They aim to support various AI models for different use cases, which gives developers flexibility in how they build their applications.
  3. For energy solutions, Amazon is investing in nuclear energy. They see it as a clean and important part of the future energy mix, especially as demand for energy continues to grow.
VuTrinh. • 199 implied HN points • 20 Jul 24
  1. Kafka producers are responsible for sending messages to servers. They prepare the messages, choose where to send them, and then actually send them to the Kafka brokers.
  2. There are different ways to send messages: fire-and-forget, synchronous, and asynchronous. Each method has its pros and cons, depending on whether you want speed or reliability.
  3. Producers can control message acknowledgment with the 'acks' parameter to determine when a message is considered successfully sent. This parameter affects data safety, with options that range from no acknowledgment to full confirmation from all replicas.
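The acknowledgment levels described above can be sketched as a small pure-Python model. This is an illustration only, not Kafka client code: the `Replica` class and `is_acknowledged` function are hypothetical names, and the model assumes the standard `acks` values `0`, `1`, and `all`.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Replica:
    name: str
    is_leader: bool
    in_sync: bool      # member of the in-sync replica set
    has_written: bool  # has this replica persisted the record?


def is_acknowledged(acks: str, replicas: List[Replica]) -> bool:
    """Return True if a producer using this acks setting would treat the send as successful."""
    leader = next(r for r in replicas if r.is_leader)
    if acks == "0":
        # No acknowledgment: the producer does not wait for any broker response.
        return True
    if acks == "1":
        # Leader-only acknowledgment: followers may still be behind.
        return leader.has_written
    if acks == "all":
        # Full acknowledgment: every in-sync replica must have the record.
        return all(r.has_written for r in replicas if r.in_sync)
    raise ValueError(f"unsupported acks value: {acks!r}")


# Hypothetical partition: the leader has the record, one in-sync follower does not yet.
replicas = [
    Replica("broker-1", is_leader=True, in_sync=True, has_written=True),
    Replica("broker-2", is_leader=False, in_sync=True, has_written=False),
]
for setting in ("0", "1", "all"):
    print(setting, is_acknowledged(setting, replicas))
```

In this model the same partition state counts as a success under `0` and `1` but not under `all`, which is the speed-versus-safety trade-off the takeaway describes.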
Practical Data Engineering Substack • 79 implied HN points • 18 Aug 24
  1. The evolution of open table formats has improved how we manage data by introducing log-oriented designs. These designs help us keep track of data changes and make data management more efficient.
  2. Modern open table formats like Apache Hudi and Delta Lake offer database-like features on data lakes, ensuring data integrity and allowing for easier updates and querying.
  3. New projects are working on creating a unified table format that can work with different technologies. This means that in the future, switching between data formats could be simpler and more streamlined.
VuTrinh. • 219 implied HN points • 02 Jul 24
  1. PayPal operates a massive Kafka system with over 85 clusters and handles around 1.3 trillion messages daily. They manage data growth by using multiple geographical data centers for efficiency.
  2. To improve user experience and security, PayPal developed tools like the Kafka Config Service for easier broker management and added access control lists to restrict who can connect to their Kafka clusters.
  3. PayPal focuses on automation and monitoring, implementing systems to patch vulnerabilities quickly and manage topics, while also optimizing metrics so that issues with their Kafka platform surface sooner.
VuTrinh. • 319 implied HN points • 08 Jun 24
  1. LinkedIn processes around 4 trillion events every day, using Apache Beam to unify their streaming and batch data processing. This helps them run pipelines more efficiently and save development time.
  2. By switching to Apache Beam, LinkedIn significantly improved their performance metrics. For example, one pipeline's processing time went from over 7 hours to just 25 minutes.
  3. Their anti-abuse systems became much faster with Beam, reducing the time taken to identify abusive actions from a day to just 5 minutes. This increase in efficiency greatly enhances user safety and experience.
Bite code! • 1223 implied HN points • 06 Jul 25
  1. Emscripten support is now official, which makes it easier to run Python in web browsers. This means you can execute Python code without needing a server.
  2. Mypy has released a new version that fixes some annoying issues and allows more flexible coding styles. Now you can redefine variables more easily without strict type checks.
  3. FastAPI's creator has started a new company to make it simpler to deploy FastAPI projects. This service aims to streamline the deployment process with just one command.
Brad DeLong's Grasping Reality • 169 implied HN points • 18 Dec 25
  1. Big tech is building lots of AI infrastructure not because it’s betting the farm on core AI products, but to capture the rents from the AI boom by selling infrastructure and services.
  2. The AI labs are the ones digging for breakthrough models and customer demand, but core AI products may have low margins and fickle users, so those businesses carry higher risk of a bust.
  3. Cloud and platform companies often commoditize or give away core AI tools to protect their high‑margin businesses, and investors are increasingly valuing firms based on real cash generation rather than AI hype.
VuTrinh. • 119 implied HN points • 27 Jul 24
  1. Kafka uses a pull model for consumers, allowing them to control the message retrieval rate. This helps consumers manage workloads without being overwhelmed.
  2. Consumer groups in Kafka let multiple consumers share the load of reading from topics, but each partition is only read by one consumer at a time for efficient processing.
  3. Kafka handles rebalancing when consumers join or leave a group. This can be done eagerly, stopping all consumers, or cooperatively, allowing ongoing consumption from unaffected partitions.
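The difference between eager and cooperative rebalancing can be sketched with a toy model. It assumes a plain round-robin assignment for simplicity (real cooperative rebalancing typically relies on a sticky assignor that moves fewer partitions), and the function names are hypothetical, not part of any Kafka client.

```python
from typing import Dict, List, Set


def assign_round_robin(partitions: List[int], consumers: List[str]) -> Dict[str, Set[int]]:
    """Give each partition to exactly one consumer in the group."""
    ordered = sorted(consumers)
    assignment: Dict[str, Set[int]] = {c: set() for c in ordered}
    for i, partition in enumerate(sorted(partitions)):
        assignment[ordered[i % len(ordered)]].add(partition)
    return assignment


def revoked_partitions(
    old: Dict[str, Set[int]], new: Dict[str, Set[int]], cooperative: bool
) -> Dict[str, Set[int]]:
    """Partitions each existing consumer must stop reading during the rebalance."""
    if not cooperative:
        # Eager: every consumer gives up everything it owns, then rejoins.
        return {c: set(parts) for c, parts in old.items()}
    # Cooperative: only partitions that change owner are revoked.
    return {c: parts - new.get(c, set()) for c, parts in old.items()}


partitions = list(range(6))
before = assign_round_robin(partitions, ["a", "b"])
after = assign_round_robin(partitions, ["a", "b", "c"])  # consumer "c" joins

print("eager      :", revoked_partitions(before, after, cooperative=False))
print("cooperative:", revoked_partitions(before, after, cooperative=True))
```

Under the cooperative mode, consumers `a` and `b` keep reading the partitions they retain while only the moved partitions are paused.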
VuTrinh. • 339 implied HN points • 25 May 24
  1. Twitter processes an incredible 400 billion events daily, using a mix of technologies for handling large data flows. They built special tools to ensure they can keep up with all this information in real-time.
  2. After facing challenges with their old setup, Twitter switched to a new architecture that simplified operations. This new system allows them to handle data much faster and more efficiently.
  3. With the new system, Twitter achieved lower latency and fewer errors in data processing. This means they can get more accurate results and better manage their resources than before.
Alex's Personal Blog • 98 implied HN points • 13 Jan 26
  1. Apple picking Google to power its AI features concentrates distribution and AI-provider power, making it harder for smaller rivals to compete and raising antitrust concerns.
  2. Politicians are blaming data-center energy use for rising utility costs, and Microsoft is promising to reduce consumer impacts by funding infrastructure, paying full local taxes, and training local workers.
  3. Anthropic’s Claude Cowork moves AI from developer tools toward a personal, persistent assistant, but it’s very compute-heavy and currently limited to expensive plans until more capacity is brought online.
Alex's Personal Blog • 197 implied HN points • 08 Dec 25
  1. A global payments startup restructured its investor base and is pushing into the U.S. to counter worries about Chinese ties, but it’s still unclear if that will calm regulators or customers.
  2. IBM bought Confluent to get closer to enterprise data streams and strengthen its AI and automation offerings, a strategic play that boosts growth without changing IBM’s scale much.
  3. OpenAI is leaning into the B2B market with rapid growth in enterprise seats and claims that its tools save workers substantial time, showing strong corporate demand even as consumer monetization lags.
Cloud Irregular • 2661 implied HN points • 10 Dec 24
  1. At this year's AWS re:Invent, there were no major new services launched, which is quite different from previous years. Instead, AWS focused on enhancing existing services and features.
  2. In the past, AWS released many new services, but many of them didn't succeed. This led to dissatisfaction within the developer community.
  3. Now, AWS seems to be concentrating on improving its core offerings. This change could help revive interest and excitement in the AWS developer community.
Bite code! • 10520 implied HN points • 24 Jun 23
  1. XML was once believed to be the future, but turned out to create technical debt instead.
  2. Blindly following every technology hype can lead to failed projects and wasted money.
  3. Using the right tool for the right job is crucial in software development, avoiding unnecessary complexity and costs.
VuTrinh. • 139 implied HN points • 09 Jul 24
  1. Uber recently introduced Kafka Tiered Storage, which allows storage and compute resources to work separately. This means you can add storage without needing to upgrade processing power.
  2. The tiered storage system has two parts: local storage for fast access and remote storage for long-term data. This setup helps manage data efficiently and keeps the local storage less cluttered.
  3. Older data can be read directly from remote storage, which keeps local storage free to serve applications that need quick access to recent messages.
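The two-tier read path can be sketched as follows. This is a deliberately simplified model, not Uber's or Kafka's implementation: it tiers individual messages rather than log segments, and `TieredLog` and `local_retention` are hypothetical names.

```python
from typing import Dict, Tuple


class TieredLog:
    """Toy log that keeps recent messages locally and moves older ones to a remote tier."""

    def __init__(self, local_retention: int) -> None:
        self.local_retention = local_retention  # max messages kept on local storage
        self.local: Dict[int, str] = {}         # offset -> message (recent, fast)
        self.remote: Dict[int, str] = {}        # offset -> message (older, long-term)
        self.next_offset = 0

    def append(self, message: str) -> int:
        offset = self.next_offset
        self.local[offset] = message
        self.next_offset += 1
        # Once local storage exceeds its retention, move the oldest entries out.
        while len(self.local) > self.local_retention:
            oldest = min(self.local)
            self.remote[oldest] = self.local.pop(oldest)
        return offset

    def read(self, offset: int) -> Tuple[str, str]:
        """Return (tier, message) so the caller can see where the data came from."""
        if offset in self.local:
            return "local", self.local[offset]
        if offset in self.remote:
            return "remote", self.remote[offset]
        raise KeyError(f"offset {offset} not found")


log = TieredLog(local_retention=2)
for i in range(5):
    log.append(f"m{i}")
print(log.read(4))  # recent message, served from the local tier
print(log.read(0))  # old message, served from the remote tier
```

Because remote capacity grows independently of the local tier, storage can be added without touching the brokers' processing resources, which is the separation the first takeaway points to.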
Generating Conversation • 163 implied HN points • 11 Dec 25
  1. AI is settling into a regular generational platform shift like cloud or mobile, so expect lots of change but not a sudden collapse of society. This means the broad fabric of daily life and institutions will largely persist even as AI reshapes industries.
  2. This is not a bear case—AI will create massive value and spawn new dominant companies, but it’s unlikely to be orders of magnitude bigger than past platform shifts. We already have plenty of capability today to build important, valuable products.
  3. Models will specialize to different human and enterprise preferences, so we’ll see many tailored models and apps rather than one universal breakthrough. That points to steady, incremental improvements and lots of product-level innovation over the next decade.
Cloud Irregular • 7244 implied HN points • 24 Oct 23
  1. DHH believes established companies that can amortize capital investments should reconsider the cloud.
  2. Different types of companies require different approaches to cloud vs. data center.
  3. Switching back from the cloud to data center may bring back old problems that cloud solutions had addressed.
VuTrinh. • 159 implied HN points • 22 Jun 24
  1. Uber uses a Remote Shuffle Service (RSS) to handle large amounts of Spark shuffle data more efficiently. This means data is sent to a remote server instead of being saved on local disks during processing.
  2. By changing how data is transferred, the new system helps reduce failures and improve the lifespan of hardware. Now, servers can handle more jobs without crashing and SSDs last longer.
  3. RSS also streamlines the process for the reduce tasks, as they now only need to pull data from one server instead of multiple ones. This saves time and resources, making everything run smoother.
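The idea that each reduce task pulls from a single server can be sketched with a toy push-based shuffle. The modulo routing rule, server names, and function names here are assumptions made for illustration; the source does not describe how Uber's service maps partitions to servers.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# server name -> partition id -> records pushed by map tasks
ShuffleState = Dict[str, Dict[int, List[str]]]


def server_for_partition(partition_id: int, servers: List[str]) -> str:
    # Assumed routing rule: all data for one partition lands on one server.
    return servers[partition_id % len(servers)]


def push_map_output(
    state: ShuffleState, servers: List[str], map_output: Dict[int, List[str]]
) -> None:
    """A map task pushes each partition's records to that partition's server."""
    for partition_id, records in map_output.items():
        server = server_for_partition(partition_id, servers)
        state[server][partition_id].extend(records)


def fetch_for_reducer(
    state: ShuffleState, servers: List[str], partition_id: int
) -> Tuple[str, List[str]]:
    """A reduce task reads its whole partition from a single server."""
    server = server_for_partition(partition_id, servers)
    return server, state[server][partition_id]


servers = ["rss-0", "rss-1"]
state: ShuffleState = defaultdict(lambda: defaultdict(list))
push_map_output(state, servers, {0: ["a"], 1: ["b"]})  # output of map task 1
push_map_output(state, servers, {0: ["c"], 1: ["d"]})  # output of map task 2
print(fetch_for_reducer(state, servers, 0))
```

Without a remote service, the reducer for partition 0 would have had to contact every node that ran a map task; here it makes one fetch.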
VuTrinh. • 259 implied HN points • 18 May 24
  1. Hadoop Distributed File System (HDFS) is great for managing large amounts of data across many servers. It ensures data is stored reliably and can be accessed quickly.
  2. HDFS uses a NameNode that keeps track of where data is stored and multiple DataNodes that hold actual data copies. This design helps with data management and availability.
  3. Replication is key in HDFS, as it keeps multiple copies of data across different nodes to prevent loss. This makes HDFS robust even if some servers fail.
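The NameNode and DataNode split can be sketched as a metadata map from block IDs to the nodes holding copies. This toy model is an assumption-laden illustration: it uses simple rotation instead of HDFS's rack-aware placement, the class and method names are hypothetical, and the replication factor of 3 mirrors the HDFS default.

```python
from typing import Dict, List, Set


class NameNode:
    """Toy metadata service: knows where block replicas live, stores no data itself."""

    def __init__(self, datanodes: List[str], replication: int = 3) -> None:
        if replication > len(datanodes):
            raise ValueError("replication factor exceeds number of DataNodes")
        self.datanodes = list(datanodes)
        self.replication = replication
        self.live: Set[str] = set(datanodes)
        self.block_locations: Dict[str, List[str]] = {}

    def place_block(self, block_id: str) -> List[str]:
        # Simplified placement: rotate through the DataNodes (not rack-aware).
        start = len(self.block_locations) % len(self.datanodes)
        chosen = [
            self.datanodes[(start + i) % len(self.datanodes)]
            for i in range(self.replication)
        ]
        self.block_locations[block_id] = chosen
        return chosen

    def mark_dead(self, datanode: str) -> None:
        self.live.discard(datanode)

    def readable_from(self, block_id: str) -> List[str]:
        """DataNodes that are still alive and hold a replica of the block."""
        return [n for n in self.block_locations[block_id] if n in self.live]


nn = NameNode(["dn1", "dn2", "dn3", "dn4"])
nn.place_block("blk_1")
nn.place_block("blk_2")
nn.mark_dead("dn2")  # simulate a server failure
print(nn.readable_from("blk_1"))
print(nn.readable_from("blk_2"))
```

Both blocks remain readable after the failure because each still has two live replicas.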
Resilient Cyber • 39 implied HN points • 20 Aug 24
  1. Security tool sprawl is increasing in organizations, with many now using 70 to 90 different tools, making it harder to manage effectively.
  2. AI can speed up fixing coding vulnerabilities, but much AI-generated code is itself insecure and needs careful review by developers.
  3. Understanding systems and processes is key to tackling the complexities of cybersecurity, rather than blaming external forces for challenges in job applications.
SemiAnalysis • 6667 implied HN points • 02 Oct 23
  1. Amazon and Anthropic signed a significant deal, with Amazon investing in Anthropic, which could impact the future of AI infrastructure.
  2. Amazon has faced challenges in generative AI due to lack of direct access to data and issues with internal model development.
  3. The collaboration between Anthropic and Amazon could accelerate Anthropic's ability to build foundation models but also poses risks and challenges.
Clouded Judgement • 20 implied HN points • 20 Feb 26
  1. A global NAND/SSD shortage has emerged as AI demand has ballooned, driving big gains in memory-related stocks and creating a structural supply problem.
  2. AI has shifted from being compute-bound to data- and memory-bound. Inference, KV caches, and the flood of AI-generated artifacts need huge, low-latency memory and expose inefficiencies in legacy tiering and NAS data paths.
  3. The answer is efficiency, not just buying more flash: orchestrate data so local GPU NVMe can be used as fast Tier‑0, tier cold data to HDDs, recover stranded capacity, use hybrid cloud, and deduplicate across regions to cut flash demand.
Metacritic Capital • 13 implied HN points • 23 Feb 26
  1. Hyperscalers are three different businesses at once: Traditional IaaS (sticky, high‑margin cloud services), Token Factories (LLM inference APIs sold by token consumption), and AI mega‑deals (capex‑heavy long‑term GPU/data‑center contracts with labs).
  2. Token Factory work is commoditizing and price‑sensitive: customers can swap models or providers quickly, so serving costs and model access drive competitiveness more than platform lock‑in.
  3. AI mega‑deals change growth quality and valuation: hosting labs can boost revenue but often yields lower, fixed IRRs, so investors must model revenue, capex, and margins separately for each business and run a DCF.
The Social Juice • 29 implied HN points • 08 Feb 26
  1. Governments are ramping up regulation of social platforms and their recommendation engines. Some countries are even proposing bans for under-16s and opening investigations into AI tools.
  2. Big tech ad businesses are still making record money, with Google, YouTube, Amazon Ads and others reporting big revenue gains. At the same time companies are pouring huge sums into AI and facing slower user growth or rising costs.
  3. AI is rapidly reshaping advertising and product features, from AI-generated Super Bowl ads to agentic ad tools and chat assistants. That surge is creating new safety, legal and measurement headaches around deepfakes, moderation and publisher defenses.
Cloud Irregular • 4878 implied HN points • 03 Jan 24
  1. Leaving a familiar job for the unknown can be both challenging and exhilarating.
  2. At times, there may not be a clear, traditional career path to follow, and you might need to create your own unique journey.
  3. Prioritizing creating things that bring joy to people can drive your career decisions and future goals.
SemiAnalysis • 6263 implied HN points • 01 Sep 23
  1. Google's TPUv5e offers a cost advantage over AI chips from other companies for training and running inference on models with under 200 billion parameters.
  2. TPUv5e and TPUv5 prioritize efficiency and low power consumption over peak performance, with a focus on minimizing total cost of ownership.
  3. Google's TPUv5e system features high bandwidth communication between chips, linear cost scaling, and efficient software tools for ease of use.
Hung's Notes • 79 implied HN points • 18 Jul 24
  1. Migrating authorization logic from an old system to a new one can take a long time and requires careful planning to avoid errors.
  2. Each part of a business can manage its own authorization rules, making it easier for them to control access based on their specific needs.
  3. As systems grow, it's important to keep improving and adapting to new challenges, like optimizing runtime decisions and better analyzing access logs.
The Lunduke Journal of Technology • 6893 implied HN points • 26 Apr 23
  1. Big tech companies are promoting the idea of using less capable computers that connect by remote desktop to central servers.
  2. Microsoft is pushing Windows 365 Frontline where users connect to a remote Windows 11 desktop provided by Microsoft.
  3. Google is providing low-power Chromebooks to employees and encouraging the use of Google Cloudtop for desktop software, eliminating the need for powerful computers.
Data Science Weekly Newsletter • 159 implied HN points • 31 May 24
  1. Mediocre machine learning can be very risky for businesses, as it may lead to significant financial losses. Companies need to ensure their ML products are reliable and efficient.
  2. Understanding logistic regression can be made easier by using predicted probabilities. This approach helps in clearly presenting data analysis results, especially to those who may not be familiar with technical terms.
  3. Data quality management is becoming essential in today's data-driven world. It's important to keep track of how data is tested and monitored to maintain trust and accuracy in business decisions.
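The predicted-probability approach to logistic regression mentioned above can be sketched in a few lines: the model's linear score is a log-odds value, and the logistic function turns it into a probability that is easier to present. The intercept, coefficient, and feature values below are made up for illustration.

```python
import math
from typing import Sequence


def predicted_probability(
    intercept: float, coefficients: Sequence[float], features: Sequence[float]
) -> float:
    """Convert a logistic regression's linear score (log-odds) into a probability."""
    log_odds = intercept + sum(b * x for b, x in zip(coefficients, features))
    return 1.0 / (1.0 + math.exp(-log_odds))


# Hypothetical fitted model with a single feature.
intercept, coefficients = -1.5, [0.8]
for x in (0.0, 1.0, 2.0, 3.0):
    p = predicted_probability(intercept, coefficients, [x])
    print(f"feature={x:.1f} -> predicted probability={p:.3f}")
```

Reporting "the predicted probability rises from p1 to p2 as the feature goes from 1 to 2" tends to be clearer for a non-technical audience than quoting a coefficient of 0.8 on the log-odds scale.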
Resilient Cyber • 119 implied HN points • 18 Jun 24
  1. The SEC's case against SolarWinds could change how Chief Information Security Officers are viewed in the industry, potentially discouraging talented people from taking on these roles.
  2. Organizations need to actively prepare for cyberattacks through tabletop exercises, which can help teams respond better during real security incidents.
  3. Microsoft's cybersecurity issues have raised concerns regarding national security, highlighting the need for stronger security practices and accountability in tech companies.
Resilient Cyber • 159 implied HN points • 28 May 24
  1. Non-Human Identities (NHIs) are the machine-based accounts used in businesses, often outnumbering human accounts significantly. They include things like service accounts and API keys, which are essential for modern tech operations.
  2. NHIs are a major security risk since they can have lots of permissions and are often left unmonitored. This makes them a target for hackers looking to exploit weak points in security systems.
  3. It’s important for companies to have strong governance around NHIs. Without proper controls, these machine identities can lead to security gaps and make it easier for attackers to gain access to systems.
Cloud Irregular • 3696 implied HN points • 22 Jan 24
  1. The cloud landscape is shifting from big hyperscalers to more specialized services like standalone databases and DIY cloud-in-a-box.
  2. Using tools like Nightshade to protect art from being exploited by AI may not be the best strategy; focusing on creating original, high-quality art is key.
  3. Google, despite criticism, remains a significant player in the tech industry, seen as a symbol of intellectual prowess and innovation.
VuTrinh. • 99 implied HN points • 25 Jun 24
  1. Uber is moving its huge amount of data to Google Cloud to keep up with its growth. They want a smooth transition that won't disrupt current users.
  2. They are using existing technologies to make sure the change is easy. This includes tools that will help keep data safe and accessible during the move.
  3. Managing costs is a big concern for Uber. They plan to track and control spending carefully as they switch to cloud services.