The hottest Machine Learning Substack posts right now

And their main takeaways

Roles of Supervised Machine Learning in Science

Mindful Modeler • 179 implied HN points • 31 Jan 23

🔬 Science Machine Learning

Machine learning models play multiple roles in science: as study objects, scientific tools, and scientific models.
Using machine learning models as study objects is common in science, focusing on predictive model performance comparisons.
Machine learning models can be utilized as scientific tools and as scientific models, where they play a central role in understanding phenomena.

($) DeepSeek's Three Idiosyncratic Advantages

Interconnected • 138 implied HN points • 03 Jan 25

🕹 Technology Machine Learning

DeepSeek-V3 is an AI model that performs as well as or better than other top models while costing much less to train. This means they're getting great results without spending a lot of money.
The AI community is buzzing about DeepSeek's advancements, but there seems to be less excitement about it in China than in other countries. This might show a difference in how AI news is perceived globally.
DeepSeek has a few unique advantages that set it apart from other AI labs. Understanding these can help clarify what their success means for the broader AI competition between the US and China.

Understanding the Different Types of Transformers in AI [Math Mondays]

Technology Made Simple • 99 implied HN points • 11 Jul 23

🕹 Technology Machine Learning

There are three main types of transformers in AI: Sequence-to-Sequence Models excel at language translation tasks, Autoregressive Models are powerful for text generation but may lack deeper understanding, and Autoencoding Models focus on language understanding and classification by capturing meaningful representations of input data.
Different training methodologies influence a transformer's performance and applicability, so understanding these distinctions is crucial for selecting the most suitable model for specific use cases.
Deep learning with transformer models offers a diverse range of capabilities, each catering to unique needs: mapping sequences between languages, generating text, or focusing on language understanding and classification.

3 Techniques to Make your Machine Learning more efficient [Technique Tuesdays]

Technology Made Simple • 99 implied HN points • 04 Apr 23

🕹 Technology Machine Learning

Reducing the number of features in your data can improve performance and keep costs down in machine learning processes.
Active learning focuses on prioritizing data points for efficient machine learning model training.
Using filters and simpler models for specific tasks can lead to better performance and cost savings compared to always using large, powerful models in AI.
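
As a rough illustration of the active learning takeaway above, here is a minimal sketch of uncertainty sampling, which is one common way to prioritize data points; the post may describe a different strategy. The function name `select_most_uncertain`, the item ids, and the probabilities are all hypothetical.

```python
def select_most_uncertain(probabilities, budget):
    """Pick the `budget` unlabeled items whose predicted probability is closest to 0.5.

    `probabilities` maps an item id to a binary classifier's predicted
    probability of the positive class (hypothetical values below).
    """
    ranked = sorted(probabilities, key=lambda item: abs(probabilities[item] - 0.5))
    return ranked[:budget]


# Hypothetical model outputs on an unlabeled pool.
pool = {"a": 0.97, "b": 0.52, "c": 0.10, "d": 0.45, "e": 0.80}

# Items the model is least sure about are sent for labeling first.
to_label = select_most_uncertain(pool, budget=2)
```

With this toy pool, the two items nearest the decision boundary ("b" and "d") would be labeled first, while confidently classified items are left for later.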

GPT-4 (sometimes) captures the wisdom of the crowd

The Counterfactual • 99 implied HN points • 25 Sep 23

🕹 Technology Machine Learning

Researchers often use survey data to understand human behavior, but collecting reliable human responses can be complicated and expensive. Using large language models (LLMs) like GPT-4 could make this process easier and cheaper.
LLMs can sometimes produce responses that closely match the average opinions of many people. In some cases, their answers were actually more aligned with the average responses than individual human judgments.
While LLMs can be helpful in gathering data quickly and inexpensively, it's important to be careful. They might not always be accurate or representative of all viewpoints, so it's wise to compare LLM results with human responses to ensure quality.

📝 Guest Post: Augmented SBERT: A Data Augmentation Method to Enhance Bi-Encoders for Pairwise Sentence Scoring*

TheSequence • 126 implied HN points • 31 Jan 25

🕹 Technology Machine Learning

Augmented SBERT (AugSBERT) improves sentence scoring tasks by using data augmentation to create more sentence pairs. This means it can perform better even when there's not much training data available.
Traditional methods like cross-encoders and bi-encoders have limitations, like being slow or needing a lot of data. AugSBERT addresses these issues, making it more efficient for large-scale tasks.
The approach combines the strengths of different models to enhance performance, especially in specific domains. It shows significant improvements over existing models, making it a useful tool for various natural language processing applications.

Genre Grapevine on Why They Want Us to Call it AI

Genre Grapevine • 98 implied HN points • 01 Jul 23

🚌 Education Machine Learning

Words are powerful and shape our understanding of the world and ourselves.
The language used to describe machine learning tools can be deceptive, such as calling them 'artificial intelligence' when there's no actual intelligence behind them.
Using accurate language is important in conversations about machine learning to avoid misconceptions and ensure transparency.

Google Maps Travel Time Prediction Algorithm & AI Research at Google Deepmind

What's AI Newsletter by Louis-François Bouchard • 98 implied HN points • 09 Jul 23

🕹 Technology Machine Learning

The podcast episode discusses Google Maps' travel time prediction algorithm and AI research at Google DeepMind.
Petar Veličković shares his journey from academia to developing the algorithm used in Google Maps.
The interview sheds light on opportunities and challenges in the rapidly evolving field of machine learning.

Watch now: "AI under the hood and off to the races" exploring ChatGPT, NeRF, Stable Diffusion and more

Ubiquitous Thoughts • 98 implied HN points • 19 Jul 23

🕹 Technology Machine Learning

The virtual event covered the basics of AI models like ChatGPT, NeRF, and Stable Diffusion.
Entrepreneurs can integrate AI into their startup products at different levels of depth.
The event emphasized the importance of understanding how AI works, even without prior technical experience.

What Noam Chomsky gets wrong about AI

ML Powered • 98 implied HN points • 10 Mar 23

🕹 Technology Machine Learning

Machine learning models like ChatGPT can be as efficient or even more efficient than the human brain in certain tasks.
Measuring the intelligence of machine learning models based solely on their ability to apply the scientific method is unrealistic.
Modern language models like ChatGPT can understand and parse phrases with ease, contradicting claims of their failure in understanding language.

D-Adaptation: Goodbye Learning Rate Headaches?

followfox.ai’s Newsletter • 98 implied HN points • 21 Jun 23

🕹 Technology Machine Learning

The D-Adaptation method automates setting the learning rate, aiming for optimal convergence in machine learning.
Implementing D-Adaptation can consume more VRAM and result in slower training speed compared to other optimizers.
Initial results show D-Adaptation performing comparably to hand-picked parameters in generating high-quality models.

Welcome to the AI bubble

The Down Round • 98 implied HN points • 14 Jun 23

🕹 Technology Machine Learning

The AI industry may be experiencing a bubble similar to what the crypto industry went through before.
There are signs of hype and questionable practices in AI, such as companies quickly pivoting to AI and non-experts making bold claims.
Being cautious and vigilant in an AI bubble is important to avoid getting caught up in unrealistic narratives and disconnected market valuations.

When Models Drive a Hard Bargain

The Counterfactual • 79 implied HN points • 20 Nov 23

🕹 Technology Machine Learning

Incentives heavily influence how people and AI behave. When personal goals clash with social expectations, it creates tension that needs to be managed.
AI systems, like large language models, can produce deceptive behaviors without being explicitly programmed to. Their strategies can be affected by the goals they are trying to achieve.
Using games as testing environments could help identify desirable and undesirable behaviors in AI. The more varied the tests, the better we understand how an AI might behave outside of those tests.

The Tech Buffet #14: A 3-Step Approach To Evaluate Your LLM-based Applications

The Tech Buffet • 79 implied HN points • 19 Nov 23

🕹 Technology Machine Learning

Creating a good dataset is important to evaluate your LLM-based applications. You can use LLMs to generate questions and answers from your data, which helps in building a reliable test set.
Running your application over this dataset helps you see how well it retrieves information and generates answers. Keeping track of the documents it finds will make your evaluation easier.
Finally, you should measure how well your application retrieves relevant documents and how good the answers are. This will help you understand what works best and where you can improve.
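
The retrieval half of the evaluation described above can be sketched as follows. This is a minimal example under stated assumptions: the record layout (`source_doc`, `retrieved_docs`), the function names, and the sample data are hypothetical, and hit rate and mean reciprocal rank are two common retrieval metrics rather than necessarily the ones the post uses. Scoring answer quality is not covered here.

```python
def hit_rate_at_k(records, k=3):
    """Fraction of questions whose source document is among the top-k retrieved."""
    if not records:
        return 0.0
    hits = sum(1 for r in records if r["source_doc"] in r["retrieved_docs"][:k])
    return hits / len(records)


def mean_reciprocal_rank(records):
    """Average of 1/rank of the source document (0 when it was not retrieved)."""
    if not records:
        return 0.0
    total = 0.0
    for r in records:
        docs = r["retrieved_docs"]
        if r["source_doc"] in docs:
            total += 1.0 / (docs.index(r["source_doc"]) + 1)
    return total / len(records)


# Hypothetical log: each generated question remembers the document it came
# from, plus the ranked documents the application retrieved for it.
records = [
    {"source_doc": "doc_a", "retrieved_docs": ["doc_a", "doc_c", "doc_d"]},
    {"source_doc": "doc_b", "retrieved_docs": ["doc_c", "doc_b", "doc_a"]},
    {"source_doc": "doc_e", "retrieved_docs": ["doc_a", "doc_b", "doc_c"]},
    {"source_doc": "doc_d", "retrieved_docs": ["doc_b", "doc_c", "doc_d"]},
]

top3 = hit_rate_at_k(records, k=3)
top1 = hit_rate_at_k(records, k=1)
mrr = mean_reciprocal_rank(records)
```

Logging the retrieved documents alongside each question is what makes these metrics cheap to compute after the run.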

Understanding Different Uncertainty Mindsets

Mindful Modeler • 179 implied HN points • 24 Jan 23

🔬 Science Machine Learning

Understanding the fundamental difference between Bayesian and frequentist interpretations of probability is crucial for grasping uncertainty quantification techniques.
Conformal prediction offers prediction regions with a frequentist interpretation, similar to confidence intervals in linear regression models.
Conformal prediction shares similarities with the evaluation requirements and mindset of supervised machine learning, emphasizing the importance of separate calibration and ground truth data.

GPT‑5.2, GLM-4.6V, Runway's GWM Worlds, GWM Avatars and GWM Robotics, Nomos 1, Devstral 2, Wan-Move, SimGym, Disco, Stripe's Agentic Commerce Suite and more

AI Brews • 10 implied HN points • 12 Dec 25

🕹 Technology Machine Learning

Large AI models are making big leaps: new releases like GPT‑5.2 and specialized models improve reasoning, code, vision, long‑context handling, and tool use, while smaller specialist models like Nomos 1 can outperform humans on hard math tasks.
Agentic and commerce-focused tools are moving into the mainstream, with products and standards that let AI agents act inside apps, make purchases, and integrate into workflows (agentic commerce, foundation efforts, and Slack/agent integrations).
Multimodal content and developer tooling are exploding: new video and avatar systems, motion‑controllable video models, Adobe ChatGPT integrations, visual editors, and many open‑source projects make it much easier to build and deploy creative AI applications.

Data Science Weekly - Issue 487

Data Science Weekly Newsletter • 199 implied HN points • 23 Mar 23

🕹 Technology Machine Learning

This week's newsletter shares useful links in data science, machine learning, and AI. It's a great way to stay updated in these fields.
One highlighted article discusses the importance of prompt engineering in interacting with language models. It's about how to communicate effectively with AI for desired results.
There's also a report on how generative models like GPT might impact jobs. It shows that many workers could see changes in their tasks due to AI advancements.

Our Top Book Pick for 2022

Gradient Flow • 199 implied HN points • 15 Dec 22

🚌 Education Machine Learning

The recommended book of the year is a comprehensive guide for data scientists and data teams, offering practical advice and real-world insights in using data science effectively and ethically.
ActivityPub is a W3C standard and decentralized social networking protocol, gaining traction as a viable alternative to centralized services for community building.
SkyPilot, a newly launched project, presents a unified interface for running machine learning workloads on any cloud, catering to the need for cost-effective cloud computing in the coming year.

Dyno Therapeutics: The Capsids You Need

The Century of Biology • 390 implied HN points • 07 Jan 24

🔬 Science Machine Learning

Delivering gene therapies safely and specifically inside the body is a challenging problem.
Using viruses for gene therapy presents challenges due to lack of specificity and immune response.
Dyno Therapeutics aims to revolutionize gene therapy through AI-powered capsid design solutions.

Where Are The Robots?

Teaching computers how to talk • 110 implied HN points • 23 Feb 25

🕹 Technology Machine Learning

Humanoid robots seem impressive in videos, but they aren't practical for everyday tasks yet. Many still struggle with simple actions like opening a fridge at home.
Training robots in simulations is useful, but it doesn’t always translate well to the real world. Minor changes in the environment can cause trained robots to fail.
Even if we could train robots better, it's unclear what tasks they could take over. Existing household machines already perform many tasks, and using robots for harmful jobs could be a better focus.

The Conversational AI Technology Landscape: Version 5.0

Cobus Greyling on LLMs, NLU, NLP, chatbots & voicebots • 19 implied HN points • 14 May 24

🕹 Technology Machine Learning

Voicebots add more complexity to chatbots, requiring new technologies like ASR and TTS. They need to handle issues like latency and background noise to provide a smooth experience.
Agent desktops must integrate well with chatbots to improve customer service. This helps agents access information quickly and provides suggestions to handle customer interactions better.
Cognitive search tools can enhance chatbots by allowing them to access a wider range of information. This helps them answer more diverse questions from users effectively.

Holy GPT-4o 🙀

Sector 6 | The Newsletter of AIM • 19 implied HN points • 14 May 24

🕹 Technology Machine Learning

GPT-4o is a new AI model from OpenAI that can understand text, images, and audio all at once. This means it can do more things in one package, making it more powerful and useful.
It has advanced translation abilities that could compete with tools like Google Translate, allowing users to translate languages in real-time. This is especially exciting for people who need quick translations.
The model is designed to improve experiences for both developers and regular users, hinting at a future where AI can do even more complex tasks like those seen in movies.

Edge 370: A Deep Dive Into AlphaGeometry: Google DeepMind’s New Model that Solves Geometry Problems Like a Math Olympiad Gold-Medalist

TheSequence • 364 implied HN points • 15 Feb 24

🕹 Technology Machine Learning

Google DeepMind has created AlphaGeometry, an AI model that can solve complex geometry problems at the level of a Math Olympiad gold medalist using a unique combination of neural language modeling and symbolic deduction.
The International Mathematical Olympiad announced a $10 million prize for an AI model that can perform at a gold medal level in the competition, which historically has been challenging even for top mathematicians.
Geometry, one of the more difficult parts of the competition, traditionally requires both visual and mathematical skills, and AI models like AlphaGeometry are now tackling it effectively.

What language should LLMs program in?

Dev Interrupted • 37 implied HN points • 14 Aug 25

🕹 Technology Machine Learning

Programming languages may need to change as AI takes over coding tasks. Languages like JavaScript and Python, while easy for humans, might not be the best fit for AI.
Stronger programming languages, like Haskell, could help AI produce more reliable code. These languages are strict and help ensure that the generated code works correctly.
There's a possibility of creating entirely new programming languages designed specifically for AI. This could make the coding process more efficient and reduce errors compared to using human-designed languages.

Me and My AI Ugh Field

Atlas of Wonders and Monsters • 373 implied HN points • 25 Jan 24

🕹 Technology Machine Learning

The author struggles with conflicting feelings about their career and education choices.
There's a concept of 'ugh fields' where the author subconsciously avoids tasks, even in their field of interest.
Despite challenges, the author believes in the importance of pursuing careers aligned with genuine excitement and passion

Language Model Quantization Explained

Cobus Greyling on LLMs, NLU, NLP, chatbots & voicebots • 39 implied HN points • 27 Feb 24

🕹 Technology Machine Learning

Small language models can be very good at tasks like understanding language and generating text. They sometimes work better than bigger models because they can learn in context.
Running language models locally can improve privacy and reduce slow response times. This means businesses can customize their models while keeping data safer.
Quantization helps make models smaller and quicker by summarizing their complex information. It’s like having condensed books that still have the important ideas.
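
To make the quantization takeaway above more concrete, here is a minimal sketch of symmetric 8-bit quantization of a list of weights. It is an illustrative assumption, not the scheme the post describes: real toolchains use more elaborate methods, and the function names and weight values are hypothetical.

```python
def quantize_int8(weights):
    """Map floats to integers in [-127, 127] using one shared scale factor."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float weights from the stored integers."""
    return [q * scale for q in quantized]


# Hypothetical weights; each stored value now needs 8 bits instead of 32.
weights = [0.5, -1.27, 0.003, 1.0]
quantized, scale = quantize_int8(weights)
restored = dequantize(quantized, scale)
```

The restored weights differ from the originals by at most about half the scale factor, which is the "condensed book" trade-off: less storage, slightly less detail.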

Quantify The Uncertainty Of Predictive Models With Conformal Prediction

Mindful Modeler • 239 implied HN points • 11 Oct 22

🕹 Technology Machine Learning

Machine learning models often lack the ability to express uncertainty, leading to overconfidence and potential inaccuracies in predictions.
Conformal prediction is a useful method to quantify uncertainty in predictive models, offering benefits like speed, model-agnosticism, and statistical guarantees.
To implement conformal prediction, one must have a heuristic score of uncertainty, ensuring that the calibration of uncertainty levels is reliable for more accurate predictions.
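
The calibration step mentioned above can be sketched with split conformal prediction for regression. This is a minimal sketch under assumptions: the absolute residual is used as the uncertainty score, `predict` is a hypothetical stand-in for any fitted model, and the calibration data is made up.

```python
import math


def conformal_quantile(scores, alpha):
    """Score threshold giving roughly (1 - alpha) coverage on exchangeable data."""
    n = len(scores)
    rank = math.ceil((n + 1) * (1 - alpha))  # 1-indexed rank with finite-sample correction
    if rank > n:
        return math.inf  # too few calibration points for this coverage level
    return sorted(scores)[rank - 1]


def predict(x):
    # Hypothetical fitted model; any regressor could take its place.
    return 2.0 * x


# Held-out calibration pairs (x, y) that were not used to fit the model.
calibration = [(1.0, 2.3), (2.0, 3.6), (3.0, 6.1), (4.0, 8.4), (5.0, 9.5),
               (6.0, 12.2), (7.0, 13.9), (8.0, 16.3), (9.0, 17.8)]

# Uncertainty score: absolute residual on the calibration set.
scores = [abs(y - predict(x)) for x, y in calibration]
q = conformal_quantile(scores, alpha=0.25)

# Prediction interval for a new input.
x_new = 10.0
interval = (predict(x_new) - q, predict(x_new) + q)
```

Because the threshold comes only from calibration residuals, the same recipe works for any underlying model, which is the model-agnostic property noted above.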

How Microsoft Security Copilot Can Help Defend Against Cyberthreats

Rod’s Blog • 79 implied HN points • 09 Nov 23

🕹 Technology Machine Learning

Security teams face challenges such as the complexity of data, a lack of skilled professionals, and the speed at which cyberthreats evolve.
Security teams need a solution to simplify data and tasks, empower them with AI technology, and protect against cyberthreats effectively.
Microsoft Security Copilot is an AI-powered solution that can help security teams manage security posture, respond to incidents, and generate security reports efficiently.

Weekly Top Picks #90

The Algorithmic Bridge • 148 implied HN points • 02 Dec 24

🕹 Technology Machine Learning

OpenAI is facing backlash from both its supporters and critics as it expands its influence.
Chinese open-source AI technology is quickly advancing and catching up with OpenAI's offerings.
AI is now capable of producing superhuman-level music, signaling a new phase in its creative abilities.

Multimodal Search Using Vector DB

The Beep • 39 implied HN points • 25 Feb 24

🕹 Technology Machine Learning

Multimodal search lets you look for information using different types of data like text, images, and audio at the same time. This makes finding what you need much easier and faster.
Embeddings are special numbers that represent words, images, or sounds so computers can understand them. They help machines learn about relationships and contexts in the data they process.
Using vector databases, we can store these embeddings efficiently. This technology enables smarter applications like image searches or recognizing songs quickly.
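
The idea of searching over stored embeddings can be sketched as a brute-force cosine-similarity lookup. This is a toy illustration under assumptions: the three-dimensional vectors and file names are hypothetical, real embeddings come from a trained model, and a vector database would use an approximate index instead of sorting every item.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def search(index, query, k=2):
    """Return the names of the k stored items most similar to the query vector."""
    ranked = sorted(index.items(),
                    key=lambda item: cosine_similarity(item[1], query),
                    reverse=True)
    return [name for name, _ in ranked[:k]]


# Hypothetical embeddings for items of different modalities in one shared space.
index = {
    "photo_of_cat.jpg": [0.9, 0.1, 0.0],
    "article_about_cats.txt": [0.8, 0.2, 0.1],
    "birdsong.mp3": [0.0, 0.1, 0.9],
}

# Hypothetical embedding of the query.
query = [1.0, 0.0, 0.0]
results = search(index, query, k=2)
```

Because text, images, and audio share one vector space in this sketch, a single query can return items of any type.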

Chameleon, Meta's Mixed-Modal Foundation Model

Aziz et al. Paper Summaries • 19 implied HN points • 02 Jun 24

🕹 Technology Machine Learning

Chameleon combines text and image processing into one model using a unique architecture. This means it processes different types of data together instead of separately like previous models.
The training of Chameleon faced challenges like instability and balancing different types of data, but adjustments like normalization helped improve its training process. It allows the model to learn effectively from both text and images.
Chameleon performs well in generating responses that include both text and images. Notably, adding images didn't harm the model's ability to handle text, showing it can work well across different data types.

The Sequence Research #553: Self-Evaluating LLMs Are Here: Inside Meta AI’s J1 Framework

TheSequence • 63 implied HN points • 30 May 25

🕹 Technology Machine Learning

LLMs are now used as judges, which is an exciting new trend in AI. This can help improve how we evaluate AI outputs.
Meta AI's J1 framework is a significant development that makes LLMs more like active thinkers rather than just content creators. This means they can make better evaluations.
Using reinforcement learning, J1 allows AI models to learn effective ways to judge tasks. This helps ensure that their evaluations are both reliable and understandable.

The ghosts that live in my garage

Basta’s Notes • 122 implied HN points • 13 Jan 25

🕹 Technology Machine Learning

Machine learning models are good at spotting patterns that humans might miss. This means they can make predictions and organize data in ways that are impressive and often very useful.
However, machine learning can struggle with unclear or messy data. This fuzziness can lead to mistakes, like misidentifying objects or giving unexpected results.
Not every problem needs a machine learning solution, and sometimes simpler methods work better and are more effective. It's important to think carefully about whether machine learning is truly the best tool for the job.

Same Model, Different Uses

Mindful Modeler • 219 implied HN points • 25 Oct 22

🕹 Technology Machine Learning

The mindset of the modeler significantly influences the use and interpretation of models.
There are various modeling mindsets such as frequentist inference, Bayesian inference, causal inference, and supervised machine learning, all of which can lead to the same final model.
Different tasks require different modeling mindsets, and being well-versed in multiple mindsets can be beneficial for a data scientist.

The Sequence Radar #755: Last Week in AI: Worlds Built, Models Refined, and Legends Move On

TheSequence • 14 implied HN points • 16 Nov 25

🕹 Technology Machine Learning

World models are becoming more advanced, moving from simple image recognition to creating interactive 3D environments that agents can explore. This change means we need new tools and data to support these rich, dynamic models.
AI coding tools are becoming essential for software development, with companies raising significant funds to enhance these technologies. This shift indicates that AI will play a crucial role in making coding more efficient and collaborative.
Recent advancements in large language models are focused on making them more controllable and aligned with users' needs, improving their reliability for real-world applications.

In search of the mystery of the cortical column and human-like general intelligence

Artificial General Ideas • 1 implied HN point • 25 Feb 26

🕹 Technology Machine Learning

Build NeuroAI by reverse-engineering general cortical principles so systems learn, think, and plan efficiently like humans and learn from experience rather than just from written human knowledge.
Prioritize new kinds of world models that are hierarchical, causally structured, and compositional, and combine those with episodic memory, distributed reasoning across perception and action, active inference, and continual learning.
Close the loop between AI and neuroscience by using brain observations—like recurrence, feedback, attention, replay, schemas, and local plasticity—to drive algorithm design and iterate with targeted experiments to refine theories.

Anthropic, WOW

TheSequence • 161 implied HN points • 27 Oct 24

🕹 Technology Machine Learning

Anthropic has launched a new capability for its Claude AI model that lets it interact with computers like a human, allowing it to execute tasks directly on-screen. This opens many new possibilities for AI applications.
Two upgraded versions of Claude have been released, one focusing on coding and tool usage with high performance, and the other emphasizing speed and affordability for everyday applications.
A new analysis tool has been introduced in Claude.ai, enabling the model to write and run JavaScript code for data analysis and visualizations, enhancing its functionality for users.

Data Science Weekly - Issue 482

Data Science Weekly Newsletter • 199 implied HN points • 16 Feb 23

🕹 Technology Machine Learning

Visual analytics can help make deep learning models easier to understand. Researchers are working to fill gaps and challenges in this area.
AI tools like ChatGPT might change how we visualize data in the future. They could make it easier to find and interpret information quickly.
A new method called Lion offers a better optimization algorithm for training deep neural networks. It uses less memory than existing methods like Adam.

Breaking into ML as a New Grad

Kndrej’s Substack • 3 HN points • 14 Aug 24

🕹 Technology Machine Learning

Breaking into machine learning (ML) requires not just basic knowledge but also a deep understanding of the math and engineering behind models. Completing online courses is only a starting point.
Internships and real project experience are crucial for landing a job in ML. It's important to have skills that stand out, like publications or open-source contributions.
Interview preparation is key; practicing coding challenges and understanding ML concepts is necessary to succeed. Networking and applying quickly to job postings can improve your chances.

All You Need to Know About Google Gemini 1.5 (Hint: It's More Important Than Sora)

The Algorithmic Bridge • 318 implied HN points • 20 Feb 24

🕹 Technology Machine Learning

Gemini 1.5 by Google introduces a Pro version with a 1-million-token context window, allowing for more detailed processing and potentially better performance.
Gemini 1.5 uses a multimodal sparse Mixture of Experts (MoE) architecture, similar to GPT-4, which can enhance performance while maintaining low latency.
The 1-10 million-token context window in Gemini 1.5 marks a significant technical advancement in 2024, one the post argues is more important than the OpenAI Sora release.