The hottest Data Substack posts right now

And their main takeaways

o3 will see you now

Alex's Personal Blog • 0 implied HN points • 20 Dec 24

🕹 Technology Data

OpenAI's new model, o3, shows significant improvements in programming tasks and exam scores. It indicates that AI is advancing fast and can tackle challenging problems.
Inflation rates are slightly lower than expected, which might affect consumer spending and interest rates. However, the markets seem to be recovering despite this uncertainty.
Elon Musk is building ties with various right-wing political groups in Europe. His support for these parties suggests a trend toward anti-immigration and nationalistic policies.

Sort Product Update – January 2025

Database Engineering by Sort • 0 implied HN points • 06 Jan 25

🕹 Technology Data

The new Data Explorer is designed to be user-friendly and looks similar to a spreadsheet, making it easier to manage data. You can filter rows and propose changes quickly with just a few clicks.
A feature called 'Describe Changes' allows users to detail updates to data in simple language, like changing a customer’s address. The improvements also make it easier to view these described changes.
The founders encourage user feedback and suggestions for future updates, highlighting their commitment to improving the platform.

2024's greatest hits on Air Street Press

Guide to AI • 0 implied HN points • 29 Dec 24

🕹 Technology Data

Data acquisition is crucial for AI startups. It's important to know different methods like using synthetic data and scraping from various sources.
Strong storytelling helps tech companies succeed. Good storytelling is needed to explain technology and attract support.
AI's energy needs are growing, making nuclear energy a potential solution. However, the speed of building new infrastructure to support it must improve.

Revolutionize Your Ops: How Sort Transforms Data Chaos into Operational Excellence

Database Engineering by Sort • 0 implied HN points • 21 Jan 25

💼 Business Data

Sort is a database tool that helps operations teams manage their data better. It keeps everyone on the same page with up-to-date information.
With Sort, teams can quickly track changes and resolve issues together, reducing confusion and improving teamwork.
Using Sort can lead to faster decisions, fewer mistakes, and overall better efficiency in operations management.

I Know PostgreSQL Now

My Makerspace • 0 implied HN points • 02 Feb 25

🕹 Technology Data

Using PostgreSQL 10 from amazon-linux-extras can save you a lot of hassle. It's simple and works well in AWS Lambda.
Newer versions of PostgreSQL can cause issues in this setup, so it's often better to stick with stable, older versions.
Make sure to set up your VPC correctly to connect to Aurora. Also, always use environment variables for your database credentials.
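
The advice about environment variables can be sketched as follows. This is a minimal illustration, not the post author's code: the variable names (DB_HOST, DB_PORT, DB_NAME, DB_USER, DB_PASSWORD) and the helper db_config_from_env are assumptions chosen for the example.

```python
import os


def db_config_from_env(env=os.environ):
    """Build PostgreSQL connection settings from environment variables.

    The variable names below are hypothetical; use whatever names your
    deployment actually defines.
    """
    required = ["DB_HOST", "DB_NAME", "DB_USER", "DB_PASSWORD"]
    # Fail early with a clear message instead of connecting with blanks.
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return {
        "host": env["DB_HOST"],
        # Fall back to the default PostgreSQL port when DB_PORT is unset.
        "port": int(env.get("DB_PORT", "5432")),
        "dbname": env["DB_NAME"],
        "user": env["DB_USER"],
        "password": env["DB_PASSWORD"],
    }
```

The resulting dictionary could then be passed to a PostgreSQL driver, for example psycopg2.connect(**db_config_from_env()), so that no credentials are hard-coded in the function source.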

Good LLM use cases

Expand Mapping with Mike Morrow • 0 implied HN points • 12 Feb 25

🕹 Technology Data

Many people are trying to use LLMs, but often they aren't sure what problems to solve. It's important to find the right match between the tool and the issue.
LLMs can be really useful for tasks like translation, helping people find information, and working with data. These are some of the best ways to use them.
Successful LLM applications will focus on these core uses. It's all about using the technology for what it does best.

FortuneGPT: When AI Reads Your Future (and Your Slack Messages)

Phoenix Substack • 0 implied HN points • 25 Feb 25

🕹 Technology Data

FortuneGPT mixes tarot reading with AI to predict your future based on your data and habits. It's like having a digital fortune teller who uses real information to give you insights.
The app learns from each reading, becoming better at understanding your worries over time. It can adjust its advice based on your mood and past decisions.
FortuneGPT offers a free version and multiple paid plans that upsell deeper insights and predictions. It's designed to keep users engaged and curious, almost like a subscription service for mystical insights.

Seven Highlights from AMD's CEO Dr Lisa Su

More Than Moore • 0 implied HN points • 24 Feb 25

🕹 Technology Data

AMD expects its AI business to grow to over $10 billion a year. This shows the company is treating artificial intelligence as a big part of its future.
They are planning to create an AI Developer Cloud, which will help developers access tools for building AI applications. This could make it easier for more people to work on AI projects.
AMD believes that training AI models will be the main focus in 2025. This means they are shifting gears from just inference tasks to actually training the models needed for AI.

Libraries release historical book datasets for AI training

philsiarri • 0 implied HN points • 15 Jun 25

🕹 Technology Data

Libraries are releasing old public-domain books to help train AI models. The collections include large numbers of books in many languages, going back to the 15th century.
Using these public texts helps avoid the legal problems tech companies face when they use copyrighted material. It can also improve the quality and reliability of AI.
The dataset from Harvard is available for anyone to use on the Hugging Face platform. This gives researchers and developers a valuable resource for their AI projects.

Big Tech Digest #21 💥

Big Tech Digest • 0 implied HN points • 15 Jul 25

🕹 Technology Data

AI can sometimes make job candidates seem overly perfect in interviews. It's important to know how to spot AI-generated responses to ensure fair hiring.
Team leaders may face skepticism when introducing AI tools. Having strong conviction and clear communication can help in gaining team acceptance.
Optimizing technology, like reducing latency in a service or improving performance in software, can result in significant benefits, making systems faster and more efficient.

The Retrieval-AI Shift: Win on Cost, Compliance, Career

OSS.fund Newsletter • 0 implied HN points • 10 Jul 25

🕹 Technology Data

Retrieval-Augmented Generation (RAG) is becoming the preferred choice for businesses because it's much cheaper and faster than traditional methods.
With RAG, roles in companies are changing. Workers will focus more on creative tasks and less on data collection and routine analysis.
Skills related to RAG are very much in demand now, with companies looking for people who understand new tools and can design effective systems.

Cybersecurity judo, Nvidia chip procurement, and fishy antibody strategies

The Strategy Toolkit • 0 implied HN points • 23 Jun 25

🕹 Technology Data

Cybersecurity can sometimes turn threats into opportunities. Just like in martial arts, using an attacker's strength against them can be effective.
Some hackers are now using open source tools to carry out cyber attacks. This helps them blend in and avoid detection by cybersecurity teams.
New tools, like ECHO, are helping to automate the removal of malware quickly. This tool can resolve issues in minutes instead of days, making it easier to protect networks.

Context in LLMs and the blockchain

networked • 0 implied HN points • 16 Jul 25

🕹 Technology Data

Large language models (LLMs) and blockchains both need current information to stay relevant. LLMs are trained on data but can quickly become outdated, while blockchains can hold data forever but can't verify if that data is actually accurate.
To make LLMs and blockchains more useful, they need to access real-world information. LLMs now use tools like web searches to update their knowledge, and blockchains use oracles to get outside data.
However, LLMs are still useful even without real-time data, while blockchains rely heavily on external information. This difference shows how LLMs can operate independently with their own capabilities.

Agentic AI: When Your AI Starts Making Plans (And You Don’t Like Them)

Phoenix Substack • 0 implied HN points • 23 Jul 25

🕹 Technology Data

Agentic AI can act on its own, making it different from traditional AI. It can take actions like scheduling meetings and managing contractors without asking for permission.
Security is a big concern with agentic AI because it can be tricked by manipulated data. It's important to remember that you can't just set up a traditional firewall to protect against these smarter agents.
To stay safe, companies should focus on creating unstable and adaptable AI systems. This means regularly updating and changing their systems to prevent AI from becoming too comfortable or predictable.

The Rise of Shadow Datasets in AI

The PhilaVerse • 0 implied HN points • 31 Jul 25

🕹 Technology Data

AI development is now focusing on the quality of training data instead of just collecting more data. Having the right data is more important than having a lot of it.
Organizations are creating exclusive and specialized datasets that can't be easily copied. This makes the training of AI models more unique.
These curated datasets are becoming crucial for how AI systems are judged and compared in the industry. They help distinguish one AI model from another.

The Orchestration Layer Is the New Operating System

OSS.fund Newsletter • 0 implied HN points • 07 Aug 25

🕹 Technology Data

The orchestration layer is becoming the main focus for AI in businesses. Companies that control the workflow can better manage budgets and resources.
AI models are cheap and common, making workflow orchestration more valuable. The companies that successfully manage these workflows will gain a big edge over others.
Investors are now looking at how well a company manages workflows, rather than just the technology itself. This means that being good at running the flow can lead to better business outcomes.

Get on your soapbox

davidj.substack • 0 implied HN points • 12 Aug 25

🕹 Technology Data

Historically, people shared messages publicly by speaking to crowds in person since most weren't literate. This made direct communication important.
As technology advanced, broadcasting to larger audiences became possible, but the challenge has always been making messages relevant to everyone.
With tools like AI, we can now address individuals personally based on their preferences, which could make communication more engaging or even manipulative.

20250810

Curious futures (KGhosh) • 0 implied HN points • 10 Aug 25

🕹 Technology Data

AI tools in software development might actually slow down experienced developers rather than speeding them up. This can be surprising since many hoped for a boost in efficiency.
To survive in a tech-driven world, skills like collaboration, creativity, and cunning are becoming more important. This can help people tackle challenges posed by cybersecurity threats.
The world is blending technology with creativity in funny and unexpected ways. From AI-produced shows to quirky corporate competitions, there's a lot of absurdity mixed with innovation.

How a Web2 Company Uses Crypto to Power Open Data APIs

Olshansky's Newsletter • 0 implied HN points • 22 Aug 25

🔮 Crypto Data

There's a chance to create a main hub for finding open data APIs, similar to how Google helps us find websites.
Currently, there's no real marketplace for APIs, making it hard for developers to find what they need.
Two main things are needed: a way to easily find APIs, and assurance that the data they provide is reliable and high-quality.

#36: When AI Starts Paying Rent

OSS.fund Newsletter • 0 implied HN points • 13 Nov 25

🕹 Technology Data

AI projects should focus on delivering real, measurable value instead of just being interesting experiments. A good example is setting a clear payback target and sticking to it.
Using AI in existing systems without requiring big changes can lead to better adoption and effectiveness. It’s better to integrate with what works rather than trying to overhaul everything.
Having clear governance and keeping track of costs is essential when scaling AI. This means knowing who makes decisions and monitoring performance closely to quickly address any issues.

Poisoning LLMs, parasitic ants, & AI technology stack strategies

The Strategy Toolkit • 0 implied HN points • 01 Dec 25

🕹 Technology Data

Even a small number of bad documents in training data can harm large language models. Just 250 malicious documents can create serious security issues.
The number of malicious documents needed for a poisoning attack does not appear to grow with the size of the model. This means defenses against such attacks are essential for all models, big or small.
Current findings suggest that keeping training data clean and safe is crucial, as small amounts of poison can easily compromise model safety.

How Marx’s Theory of the "Commons" Explains the AI Crisis

Experiments with NLP and GPT-3 • 0 implied HN points • 22 Dec 25

🕹 Technology Data

Big AI companies scrape the open internet and turn shared human-created content into private, proprietary models, effectively enclosing the digital commons. This happens without creators' meaningful consent, so a public resource is being turned into corporate capital.
Creators and workers are being pushed into a digital proletariat: they lose control over their work, see its value squeezed, and often must work for or compete against AI built on their labor. This creates alienation where people may have to pay to use models trained on their own contributions.
Regulation and licensing can legally lock in big firms' advantages like modern enclosure acts, making it hard for smaller or open alternatives to compete. At the same time the internet's creative ecosystem risks depletion, since if humans stop producing, AI could end up training on its own output and ruin the system.

Apps Are Dead; They Just Don’t Know It Yet

FREST Substack • 0 implied HN points • 10 Mar 26

🕹 Technology Data

Apps as isolated containers are becoming unmanageable because AI makes building software cheap, so organizing your digital life around thousands of separate apps won’t scale.
The app model arose from economic moats like hard distribution and costly infrastructure, and those moats are eroding as infrastructure is commoditised and AI lowers development costs.
The future is fluid computation over shared data, where AI lets you manipulate any data across tools and interfaces without being locked into individual apps.