The hottest Data Substack posts right now

And their main takeaways
Chartbook 472 implied HN points 23 Nov 25
  1. Oracle is investing heavily in AI, promising to spend billions on chips and data centers. This shows they're serious about competing in the AI market.
  2. China is experiencing slow loan growth, which might indicate economic challenges ahead. It's important to watch how this trend unfolds.
  3. There's a feeling of gloom about places like Penn Station, suggesting that urban areas could be facing tougher times ahead. It's a reminder to pay attention to our public spaces.
Big Technology 5629 implied HN points 12 Dec 24
  1. The competition between the U.S. and China in AI will heat up, with each country trying to promote its AI technology globally. This battle will affect which AI systems become the global standard.
  2. In 2025, we might see AI agents become more useful in everyday life, helping with tasks like managing emails and planning trips. People will likely start trusting these agents to handle bigger parts of their work and personal lives.
  3. Military use of AI is expected to grow significantly, with AI agents being implemented to process large amounts of data and improve logistical operations. This could change how wars are fought and complicate decisions about military autonomy.
OSS.fund Newsletter 113 implied HN points 29 Jan 26
  1. AI-powered semantic layers can query messy, fragmented systems and deliver unified read-only insights fast, making many long master-data consolidation projects unnecessary for read-heavy analytics.
  2. You still need traditional MDM for writes, transactional consistency, and regulatory requirements like GDPR, because semantic abstraction doesn’t tell you where to update or delete authoritative records.
  3. A practical approach is to segment use cases into read vs write, run semantic tests on top business questions to capture immediate value, and invest in targeted MDM only for the write/compliance-critical scenarios.
The Bear Cave 303 implied HN points 04 Dec 25
  1. Pattern Group claims to use technology and data to help brands sell better on e-commerce platforms, but many say it's just a middleman selling products on Amazon.
  2. The company's business model, which involves buying from brands at wholesale prices and reselling at retail prices, has slim profit margins and isn't easy to grow.
  3. During an interview, the CEO struggled to explain how the business works, leading some to question if it's worth investing in.
Chartbook 429 implied HN points 14 Nov 25
  1. AI is being integrated into the workforce in various ways, influencing how jobs are done.
  2. There's a focus on understanding the global impacts of extreme weather, like hail, on different regions.
  3. Historical contexts, such as pre-tomato Italy, provide interesting insights into how food and medicine have evolved over time.
Don't Worry About the Vase 1344 implied HN points 31 Jul 25
  1. The US AI Action Plan is praised for its practical proposals but criticized for its focus on competition, which could harm safety and international cooperation.
  2. There are increasing concerns about the sustainability of offering unlimited AI usage due to high demand and costs, suggesting a shift towards charging based on usage.
  3. Many people still feel uncertain about AI's impact on jobs, with a divide in opinions on whether it will create or eliminate more opportunities in the future.
The Bear Cave 326 implied HN points 20 Nov 25
  1. Sportradar is a big player in sports tech, helping sportsbooks with data and software. They work with major sports leagues to provide real-time data for betting.
  2. There's concern that Sportradar might be involved with shady gambling operations, even while claiming to monitor fair play. They have partnerships that may not always align with regulated markets.
  3. With growing competition and complex regulations, investors are warned not to overlook the potential challenges faced by Sportradar as they navigate the gambling world.
Mindful Modeler 639 implied HN points 23 Apr 24
  1. Different machine learning models behave differently when extrapolating beyond the range of their training features, and their inductive biases drive that behavior.
  2. Inductive biases steer a learning algorithm by ruling out some functions or favoring particular functional forms.
  3. Understanding inductive biases can lead to more creative and data-friendly modeling practices in machine learning.
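The extrapolation point above can be illustrated with a toy sketch. It is not taken from the post: the data, the two hand-written models, and all names (`fit_line`, `predict_nearest`, `x_far`) are assumptions chosen for illustration. A least-squares line assumes the trend continues, while a 1-nearest-neighbour predictor assumes the closest seen example is the best guess.

```python
# Toy comparison of two inductive biases on the same data (illustrative only).

def fit_line(xs, ys):
    """Ordinary least squares for y = a*x + b; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, mean_y - a * mean_x

def predict_nearest(xs, ys, x_new):
    """1-nearest-neighbour: copy the target of the closest training point."""
    idx = min(range(len(xs)), key=lambda i: abs(xs[i] - x_new))
    return ys[idx]

# Hypothetical training data from y = 2x, observed only for x in [0, 4].
train_x = [0.0, 1.0, 2.0, 3.0, 4.0]
train_y = [2.0 * x for x in train_x]

slope, intercept = fit_line(train_x, train_y)

x_far = 10.0  # well outside the training range
linear_pred = slope * x_far + intercept
nearest_pred = predict_nearest(train_x, train_y, x_far)

print(linear_pred)   # the line keeps rising beyond the data
print(nearest_pred)  # the neighbour model stays flat at the last seen target
```

Both models fit the training points exactly, yet they disagree far from the data. That disagreement comes from their built-in assumptions rather than from the data itself.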
Taylor Lorenz's Newsletter 1552 implied HN points 20 Jun 25
  1. Data collection is everywhere online, and companies can take your personal information and share it with the government. This could be used to track what you do or even how you express yourself.
  2. The U.S. government is trying to create a centralized platform to buy sensitive personal data about citizens. This raises serious concerns about privacy and freedom of speech.
  3. It's really easy for people to find your personal information online, so using services like DeleteMe can help keep your data safe by removing it from brokers who sell it.
Generating Conversation 46 implied HN points 12 Feb 26
  1. Make tasks tiny: small, incremental units of work let users catch mistakes early, build trust, and produce dense feedback that powers a strong data advantage.
  2. A low‑stakes autocomplete/IDE UX makes it easy to accept or reject suggestions, so even imperfect prompts save time and generate lots of useful training signals.
  3. Design agents for fast iteration and cumulative correctness rather than one‑shot perfection — cheap inference and quick feedback loops let users get to the right answer over a few tries and move much faster.
Marcus on AI 3636 implied HN points 10 Dec 24
  1. Sora struggles to understand basic physics. It doesn't know how objects should behave in space or time.
  2. Past warnings about Sora's physics issues still hold true. Even with more data, it seems these problems won't go away.
  3. Investing a lot of money into Sora hasn't fixed its understanding of physics. The approach we're using to teach it seems to be failing.
Jacob’s Tech Tavern 2624 implied HN points 27 Jan 25
  1. Never store API keys on the client side. It's too easy for hackers to get them that way.
  2. If bad actors steal your API keys, they can run up big bills on your account.
  3. To keep your data safe, always follow basic security rules and store sensitive information securely.
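One common way to follow the advice above is to keep the key on a server and let the client talk only to that server. The sketch below is a minimal illustration under that assumption, not the post's own code; the environment variable name `UPSTREAM_API_KEY` and the helper names are hypothetical.

```python
import os

API_KEY_ENV = "UPSTREAM_API_KEY"  # hypothetical variable name, set only on the server

def load_api_key(env=os.environ):
    """Read the key from the server's environment; fail loudly if it is missing."""
    key = env.get(API_KEY_ENV)
    if not key:
        raise RuntimeError(f"{API_KEY_ENV} is not set on the server")
    return key

def build_upstream_headers(env=os.environ):
    """Headers the server attaches when it calls the third-party API."""
    return {"Authorization": f"Bearer {load_api_key(env)}"}

def response_for_client(upstream_payload):
    """Return only the data the client needs; credentials are never echoed back."""
    return {"result": upstream_payload}
```

The client app ships with no secret at all. It calls your server, and the server adds the `Authorization` header before forwarding the request.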
Anima Mundi 185 implied HN points 10 Dec 25
  1. AI is reshaping priorities in the economy, making human needs less important as machines take the lead. People are adjusting to this new reality where they are secondary.
  2. The physical demands of AI are causing environmental and geopolitical issues. Data centers consume vast amounts of electricity and water, often at the expense of local communities.
  3. As AI becomes more capable, human roles are diminishing, and this could lead to many people becoming economically unnecessary. We need to rethink our values and recognize human worth beyond just economic productivity.
Don't Worry About the Vase 2374 implied HN points 13 Feb 25
  1. The Paris AI Anti-Safety Summit failed to build on previous successes, raising concerns about nationalism and the lack of clear plans for AI safety. The outcome has left many observers worried and pessimistic.
  2. Elon Musk's huge bid for OpenAI's assets complicates the situation, especially as another bid threatens to overshadow the original efforts to secure AI's future.
  3. OpenAI is quickly releasing new versions of their models, which brings excitement but also skepticism about their true capabilities and risks.
Silver Bulletin 39 implied HN points 09 Feb 26
  1. An associate editor position (initially part-time, with the potential to expand) will focus on editing others' work, commissioning and editing freelancers, shaping style and editorial planning, and doing quality control on data, charts, and models.
  2. Applicants need at least two years of editing experience, a strong interest in topics like electoral politics and sports, and a precise, statistics-savvy eye for data and factual accuracy.
  3. The job pays $45–55/hour for roughly 15–20 hours per week with a 50-hour minimum guarantee, requires US work eligibility and weekday availability, and has an application deadline of Feb 24 with interviews in early March.
12challenges 171 implied HN points 17 Dec 25
  1. MARCOS is a simple crowdsourced system and web tool that maps which train carriage door corresponds to which station exit so you know exactly where to stand.
  2. If the data is made free and global it could save commuters small amounts of time every day and make stations easier to navigate for parents, elderly people, and busy travelers.
  3. The project is currently empty and needs help: people can star the GitHub repository, add stations via pull requests, and share it widely, but the effort is meant to be a Secret Santa surprise for Marcos.
Import AI 439 implied HN points 29 Apr 24
  1. Chinese researchers introduced MMT-Bench, a benchmark for assessing visual reasoning in language models with diverse tasks and scenarios.
  2. Researchers developed a system to turn 2D photos into 3D gameworlds, showing AI's capability to transform real-world imagery into interactive experiences.
  3. A consortium of researchers addressed 213 AI safety challenges across 18 areas, emphasizing the urgent need for solutions to ensure the reliability and safety of language models.
One Useful Thing 2229 implied HN points 26 Jan 25
  1. When choosing an AI, consider using a paid version for better features. Claude, Gemini, and ChatGPT are the top choices right now.
  2. New AI advances include live interaction and reasoning capabilities. This helps AIs understand and respond more naturally, making them feel more human.
  3. Privacy is now better handled by major AI models, and you can customize them for your specific needs. Explore different AIs to find one that fits your style.
The AI Frontier 459 implied HN points 11 Apr 24
  1. You can't really set yourself apart with just AI models because they're becoming similar across different companies. What matters more is the unique data you use to feed those models.
  2. Even if your prompts seem special, they won't give you a long-term advantage. Competitors can quickly figure out how to improve their prompts, making them less valuable for differentiation.
  3. To succeed in building AI applications, focus on understanding and using your customers' data effectively. Good data engineering can really make a difference in how well your application performs.
Gradient Flow 259 implied HN points 30 May 24
  1. GraphRAG enhances traditional RAG by incorporating knowledge graphs, improving content retrieval and answer generation for complex queries.
  2. GraphRAG offers various architectures like knowledge graph with semantic clustering, knowledge graph and vector database integration, and knowledge graph-based query augmentation for different applications.
  3. Building a comprehensive knowledge graph comes with challenges like domain understanding, data quality, and evolving data sources, requiring significant resources and expert knowledge.
Odds and Ends of History 2077 implied HN points 17 Jan 25
  1. AI can help local councils find and fix potholes more efficiently. It uses cameras and algorithms to spot problems without needing workers to stop and inspect manually.
  2. The technology can identify not only potholes but also other issues like broken signs and overgrown vegetation. This means councils can be proactive in road maintenance.
  3. Using AI for road maintenance can save time and resources for councils. This allows them to collect useful data and prioritize repairs better, despite limited budgets.
Cobus Greyling on LLMs, NLU, NLP, chatbots & voicebots 39 implied HN points 22 Aug 24
  1. Graphs help show complicated data in a simple way. By using nodes and edges, you can easily see how everything connects.
  2. No-code tools let anyone, even those without programming skills, create complex workflows. This makes development quicker and more accessible for everyone.
  3. There's a growing need for tools that can organize and connect different AI flows. This would help everything work better together and solve problems more effectively.
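The nodes-and-edges idea above can be sketched as a plain adjacency list. The step names below are hypothetical and do not come from the post; the traversal is an ordinary breadth-first search that lists every step reachable from a starting node.

```python
from collections import deque

# Hypothetical application flow: each key is a node, each list holds its outgoing edges.
flow = {
    "user_input": ["classify_intent"],
    "classify_intent": ["search_docs", "small_talk"],
    "search_docs": ["compose_answer"],
    "small_talk": ["compose_answer"],
    "compose_answer": [],
}

def reachable_from(graph, start):
    """Breadth-first traversal; returns nodes in the order they are first reached."""
    seen = [start]
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in graph[node]:
            if nxt not in seen:
                seen.append(nxt)
                queue.append(nxt)
    return seen

order = reachable_from(flow, "user_input")
print(order)
```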
Teaching computers how to talk 167 implied HN points 03 Dec 25
  1. Language models are just predictions and approximations of text, which means they can sometimes make up information that sounds believable but isn't true.
  2. These models don't understand the world the way humans do; they only see words related to other words, so they can get confused easily and not follow conversations well.
  3. People who develop language models try to make them safer, but sometimes these systems can be tricked, and that’s a serious concern since they can't truly differentiate between safe and dangerous content.
Alberto Cairo's The Art of Insight 279 implied HN points 10 May 24
  1. Reducing complexity in data visualization can lead to oversimplifying important human stories. It's essential to remember that simplification can erase important details that affect people's lives.
  2. The history of data visualization is linked to darker aspects of society, like slavery and eugenics. Recognizing this helps us understand the impact of our tools and the stories we choose to tell.
  3. Visualization can be a powerful tool to reveal new insights when used correctly. By learning from the past, we can aim to avoid repeating mistakes and address inequalities.
Cobus Greyling on LLMs, NLU, NLP, chatbots & voicebots 39 implied HN points 19 Aug 24
  1. Graph-based representations are becoming popular in AI, making it easier to visualize application flows and manage data relationships. This helps in understanding complex connections between data points.
  2. There are two ways to create graph representations: one is using code to create a visual flow, and the other is using a graphical user interface (GUI) to build the flow directly. This dual approach caters to different needs and levels of user expertise.
  3. Graph data structures allow for both firm control over applications and the flexibility needed for agent-based systems. This is useful for tasks where interactions and decisions must adapt based on inputs or user approvals.
Generating Conversation 210 implied HN points 06 Nov 25
  1. The costs of using AI models are not dropping as quickly as before, which means businesses need to be more careful about managing their expenses. Companies might have to focus on their profit margins and find ways to optimize expenses.
  2. Choosing the right AI model is becoming more important because they are getting more specialized. Users need to think carefully about which models to use for specific tasks to get the best performance and cost-effectiveness.
  3. AI service usage can be unpredictable, so companies will need to adapt to changing demand patterns for resources. This may involve new pricing strategies to better reflect the complexity of different tasks and ensure efficiency.
Cobus Greyling on LLMs, NLU, NLP, chatbots & voicebots 59 implied HN points 31 Jul 24
  1. OpenAI bought Rockset to make their data retrieval system better, which helps in using AI more effectively.
  2. The acquisition shows that LLMs are being seen more like a tool, and the focus is shifting to building useful applications using these technologies.
  3. Rockset's technology will help OpenAI work better with developers and make it easier to access and use real-time data for AI products.
The Data Ecosystem 179 implied HN points 26 May 24
  1. A business strategy is the game plan for a company to reach its goals. It involves having a clear vision, mission, and set of goals to guide the organization.
  2. Good business strategies have defined components that everyone in the company knows. This helps avoid confusion and keeps everyone focused on the same objectives.
  3. Data plays a crucial role in shaping modern business strategies. Companies need to integrate data and analytics into their plans to make informed decisions and stay competitive.
First 1000 1041 implied HN points 28 Feb 23
  1. Let the 1% help you build; they're probably more willing than you think.
  2. Reward the 9% for their efforts; they just want to know they'll be recognized.
  3. Make the 90% feel something; sometimes emotion is more powerful than utility.
Peter Boghossian 1041 implied HN points 02 May 23
  1. The news media and public figures can create inaccurate narratives that influence perceptions.
  2. Educating people about accurate data is crucial to addressing social issues like crime and policing.
  3. Examining and fact-checking data can reveal insights that challenge popular movements and ideologies.
Cobus Greyling on LLMs, NLU, NLP, chatbots & voicebots 39 implied HN points 12 Aug 24
  1. OpenAI has improved its API to ensure that outputs always match a set JSON format. This helps developers know exactly what kind of data they will get back.
  2. The previous method of generating JSON outputs was inconsistent, making it hard to use in real-world applications. Now, there's a more reliable way to create structured outputs.
  3. Developers can now use features like Function Calling and a new response format to make their apps interact better with AI, ensuring clearer communication between systems.
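To show why a guaranteed output shape matters to the receiving code, here is a minimal sketch of the check an application might otherwise have to run itself. It uses only the standard `json` module and does not call or imitate the OpenAI API; the field names in `EXPECTED_FIELDS` and the sample reply are assumptions made for illustration.

```python
import json

# Hypothetical shape the application expects from a model reply.
EXPECTED_FIELDS = {"city": str, "temperature_c": (int, float)}

def parse_structured_reply(raw):
    """Parse a JSON reply and verify required fields and their types."""
    data = json.loads(raw)  # raises ValueError (JSONDecodeError) on malformed JSON
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    for name, expected_type in EXPECTED_FIELDS.items():
        if name not in data:
            raise ValueError(f"missing field: {name}")
        if not isinstance(data[name], expected_type):
            raise ValueError(f"wrong type for field: {name}")
    return data

reply = parse_structured_reply('{"city": "Oslo", "temperature_c": 4.5}')
print(reply["city"], reply["temperature_c"])
```

When the provider enforces the format on its side, a check like this becomes a safety net rather than the main line of defense.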
Peter Boghossian 982 implied HN points 05 May 23
  1. Young men, specifically black Americans, are disproportionately involved in gun violence in the US.
  2. Out-of-wedlock birth rates are a significant factor in contributing to violence, particularly in the black community.
  3. There is a need to address the root causes of rising out-of-wedlock birth rates, which spiked after 1963, to prevent further violence.
Brad DeLong's Grasping Reality 392 implied HN points 09 Aug 25
  1. AI can be incredibly useful, but it's still very different from human thinking. We need to learn how to recognize its mistakes and make the most of its capabilities.
  2. Talking to AI can be like having an unusual roommate. It may sometimes give strange answers, but with patience, we can learn how to get better results.
  3. It's important to be both curious and critical when using AI. We should explore what it can do while also being aware of its limits.
Source Code by Fume 22 HN points 26 Aug 24
  1. Many people have different views on the future of AI; some believe it will change a lot soon, while others think it won't become much smarter. It's suggested that rather than getting smarter, AI will just get cheaper and faster.
  2. There's a concern that large language models (LLMs) might not be improving in reasoning skills as expected. They have become more affordable over time, but that doesn't necessarily mean they are getting better at complex tasks.
  3. The Chinese Room Argument highlights that AI can follow instructions without understanding. Even if AI tools become faster, they might still lack the creativity to generate unique ideas, but they can still help with routine tasks.
FutureIQ 3 implied HN points 13 Mar 26
  1. Trust wins in high-stakes fields: using credentialed sources and training models only on vetted, domain‑specific literature (not the open internet) makes professionals trust the system and cuts hallucinations.
  2. Own exclusive data and build a flywheel: getting top practitioners and journals to use and partner creates unique, high‑quality signals that improve the product and attract more users and partners.
  3. Capture tacit, time‑sensitive context to monetize defensibly: real‑time usage data and tight integrations let you offer services big generalist models can’t replicate, creating a deep, hard‑to‑clone moat.
Marcus on AI 3398 implied HN points 17 Feb 24
  1. Generative models like Sora often make up details, leading to hallucination-style errors in their output.
  2. Systems like Sora, despite having immense computational power and being grounded in both text and images, still struggle with generating accurate and realistic content.
  3. Sora's errors stem from its inability to comprehend global context, leading to flawed outputs even when individual details are correct.
Enterprise AI Trends 379 implied HN points 07 Aug 25
  1. OpenAI is combining all its models into one, called GPT-5, which makes things easier for users since they won’t need to choose from different versions anymore.
  2. This new model setup helps OpenAI save money by managing costs better and keeping everything efficient, like a smart system that uses just the right amount of power for each task.
  3. With GPT-5 being cheaper and better than some competitor models, it pushes other companies, like Anthropic, to innovate and lower their prices to stay competitive.
Topsoil 550 implied HN points 06 Jan 24
  1. Precision agriculture uses technology to adjust equipment for field variability, improving efficiency.
  2. Precision agriculture offers benefits like increased yields, time savings, and environmental sustainability.
  3. While valuable, precision agriculture is not a one-size-fits-all solution and adoption can be complex.
Mindful Modeler 399 implied HN points 20 Feb 24
  1. Generalization in machine learning is essential for a model to perform well on unseen data.
  2. There are different types of generalization in machine learning: from training data to unseen data, from training data to application, and from sample data to a larger population.
  3. The No Free Lunch theorem in machine learning highlights that generalization always requires assumptions and effort; no model generalizes further for free.