The hottest Machine Learning Substack posts right now

And their main takeaways

Must Learn AI Security Part 6: Model Inversion Attacks Against AI

Rod’s Blog • 39 implied HN points • 23 Aug 23

🕹 Technology Machine Learning

A Model Inversion attack against AI reconstructs training data using only access to the model's output, posing risks to data privacy.
There are two main types of Model Inversion attacks: black-box attack and white-box attack, differing in the level of access the attacker has to the AI model.
Model Inversion attacks can have severe consequences like privacy violation, identity theft, loss of trust, legal issues, and misuse of sensitive information, emphasizing the need for robust security measures.

Must Learn AI Security Part 2: Data Poisoning Attacks Against AI

Rod’s Blog • 39 implied HN points • 08 Aug 23

🕹 Technology Machine Learning

Data Poisoning attacks aim to manipulate machine learning models by introducing misleading data during the training phase. Protecting data integrity is crucial in defending against these attacks.
Data Poisoning attacks involve steps like targeting a model, injecting misleading data into the training set, training the model on this poisoned data, and exploiting the compromised model.
These attacks can lead to loss of model integrity, confidentiality breaches, and damage to reputation. Key strategies for mitigating Data Poisoning attacks include validating data and monitoring data access, application activity, and model behavior.

Imprecise Computers

Fully Distributed by Ori Eldarov • 39 implied HN points • 13 Mar 23

🕹 Technology Machine Learning

Computers have shifted from deterministic to imprecise models, impacting our trust in technology.
The explainability problem in AI poses challenges in understanding how AI systems arrive at conclusions.
Building a safe AI future involves rigorous testing, continuous model tuning, and government involvement.

XGBoost is the Secret of ML Energy

Sector 6 | The Newsletter of AIM • 39 implied HN points • 06 Sep 23

🕹 Technology Machine Learning

XGBoost, or Extreme Gradient Boosting, helps improve the performance and speed of machine learning models that deal with tabular data. It's known for being really good at finding patterns and making predictions.
This algorithm works best for supervised learning when you have lots of training examples, especially when you have both categorical and numeric data. It can handle a mix of different data types well.
If you're working with a dataset that has many features, XGBoost is a strong choice to enhance the capabilities of your machine learning model. It makes it easier to get accurate results.

Google’s Missed AI Opportunity

Sector 6 | The Newsletter of AIM • 39 implied HN points • 31 Aug 23

🕹 Technology Machine Learning

Google missed a huge chance by overlooking the Transformer paper in 2017, which cost them around $6.2 billion. This mistake allowed others to build successful AI startups.
The authors of the Transformer paper have moved on to create their own companies, showing the impact of their work and how they’ve found success after leaving Google.
Such missed opportunities highlight the importance of recognizing and supporting innovative research within companies like Google.

A Benchmark for Verifying Chain-Of-Thought

Cobus Greyling on LLMs, NLU, NLP, chatbots & voicebots • 19 implied HN points • 07 Feb 24

🕹 Technology Machine Learning

A new dataset called REVEAL helps check if reasoning used in answers is correct or logical. It assesses whether each part of the reasoning leads to the final answer.
REVEAL focuses on verifying claims based on provided evidence. It does not check how the evidence was found, but how well the reasoning uses it.
Creating detailed datasets like REVEAL is complex and time-consuming. It requires skilled annotators to carefully evaluate the logic and relevance in each reasoning step.

The Sequence Pulse: The ML Architecture Powering LinkedIn's Skills Graph

TheSequence • 154 implied HN points • 31 Jan 24

🕹 Technology Machine Learning

LinkedIn uses transformer models for mapping jobs to job seekers.
LinkedIn's Skills Graph is a sophisticated stack that facilitates job searches and skill acquisition.
LinkedIn integrates skills listed on profiles, job descriptions, and LinkedIn Learning courses for a robust skills-based framework.

Edge 461: The Many Challenges of Knowledge Distillation

TheSequence • 56 implied HN points • 31 Dec 24

🕹 Technology Machine Learning

Knowledge distillation can be tricky because there’s a big size difference between the teacher model and the student model. The teacher model usually has a lot more parameters, making it hard to share all the useful information with the smaller student model.
Transferring the complex knowledge from a large model to a smaller one isn't straightforward. The smaller model might not be able to capture all the details that the larger model has learned.
Despite the benefits, there are significant challenges that need to be tackled when using knowledge distillation in machine learning. These challenges stem from the complexity and scale of the models involved.
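
The summary above does not give a training objective, so the following is only a minimal sketch of the commonly used soft-target formulation: the student is trained to match the teacher's temperature-softened output distribution. The function names and the default temperature are illustrative assumptions, and the usual T² scaling factor and the hard-label loss term are omitted for brevity.

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax; a higher temperature gives a softer distribution."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) computed on the softened distributions."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
```

The loss is zero when the student reproduces the teacher's logits and grows as the two distributions diverge, which is one way to quantify how much of the teacher's "knowledge" a smaller student fails to capture.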

BizML: a framework for success in applied machine learning

TechTalks • 19 implied HN points • 05 Feb 24

🕹 Technology Machine Learning

Most machine learning projects fail due to a gap in understanding between data scientists and business professionals.
Eric Siegel introduces bizML, a six-step framework for successful machine learning projects that emphasizes starting with the end business goal.
Improving human understanding and leadership is crucial for the success of advanced technologies like machine learning.

Corrective RAG (CRAG)

Cobus Greyling on LLMs, NLU, NLP, chatbots & voicebots • 19 implied HN points • 05 Feb 24

🕹 Technology Machine Learning

Corrective Retrieval Augmented Generation (CRAG) helps improve how data is used in language models by correcting errors from retrieved information.
It uses a special tool called a retrieval evaluator to check the quality of the data and decide if it's correct, incorrect, or unclear.
CRAG is designed to work well with different systems, making it easier to apply in various situations while enhancing document use.
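
As a rough illustration of the three-way decision described above, the sketch below scores a retrieved document against a query and buckets the result as correct, incorrect, or unclear. The word-overlap scorer and the two thresholds are hypothetical stand-ins; the summary does not say how the actual retrieval evaluator computes its score.

```python
def overlap_score(query, document):
    """Toy relevance score: fraction of query words that appear in the document."""
    q = set(query.lower().split())
    d = set(document.lower().split())
    return len(q & d) / len(q) if q else 0.0

def triage(score, lower=0.3, upper=0.7):
    """Map a relevance score to one of three verdicts (thresholds are assumed)."""
    if score >= upper:
        return "correct"
    if score <= lower:
        return "incorrect"
    return "unclear"

def evaluate(query, document):
    return triage(overlap_score(query, document))
```

In a fuller pipeline, each verdict would trigger a different follow-up action, for example using the document as is, discarding it, or seeking additional evidence.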

Intelligence and combinatorial complexity

Sunday Letters • 39 implied HN points • 27 Aug 23

🕹 Technology Machine Learning

More agents working together can create better intelligence than a single agent. This is surprising because we might think one advanced model is enough, but collaboration can enhance performance.
Human-like patterns help improve AI performance. Just as we can review our work for errors, AI systems can use different modes to refine their outputs.
Complex systems come with challenges like errors and biases. As AI gets more complicated, these issues tend to increase, similar to problems found in complex biological systems.

Embed Retrieve Win

Gradient Flow • 99 implied HN points • 29 Sep 22

🕹 Technology Machine Learning

Embeddings are low-dimensional vector representations of data that make AI applications faster and cheaper while maintaining quality.
Vector databases are designed for vector embeddings and are becoming essential for modern search engines and recommendation systems.
Generative models like diffusion models are gaining attention in the research community and offer great opportunities for exploration and innovative projects.
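
To make the embedding-and-retrieval idea concrete, here is a minimal sketch of nearest-neighbor search by cosine similarity over a tiny in-memory index. The two-dimensional vectors and the names `cosine` and `top_k` are made up for illustration; a real vector database would use learned embeddings and an approximate index rather than a full sort.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def top_k(query, index, k=1):
    """Return the k item names whose vectors are most similar to the query."""
    ranked = sorted(index.items(), key=lambda kv: cosine(query, kv[1]), reverse=True)
    return [name for name, _ in ranked[:k]]

# Hypothetical toy index: item name -> embedding vector
index = {"cat": [1.0, 0.0], "dog": [0.9, 0.1], "car": [0.0, 1.0]}
nearest = top_k([1.0, 0.05], index, k=2)
```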

Frankly Speaking - AI is a blessing to security

Frankly Speaking • 254 implied HN points • 26 Apr 23

🕹 Technology Machine Learning

Security teams should fully embrace AI, which drives innovation and increased spending in the security market.
Major data breaches and growing awareness of the implications of personal data have increased the focus on privacy.
AI should be embraced in security as a collaborative tool that enables and expands the security market.

sqlmesh model kinds - 1

davidj.substack • 59 implied HN points • 06 Dec 24

🕹 Technology Machine Learning

There are different types of models in sqlmesh, such as full, view, and embedded models, each having unique functions and uses. It's important to choose the right model type based on how fresh or how often you need the data.
SCD Type 2 models are useful for managing records that change over time, as they track the history of changes. This can make analyzing data trends much easier and faster.
External models in sqlmesh allow you to reference database objects not managed by your project. This can simplify data modeling and documentation, as they automatically gather useful metadata.
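
The history tracking behind SCD Type 2 can be sketched in plain Python, independent of sqlmesh: when a tracked value changes, the current row is closed with an end date and a new current row is appended. The column names `valid_from` and `valid_to` and the helper `apply_scd2` are assumptions for illustration, not sqlmesh's API.

```python
def apply_scd2(history, key, new_value, as_of):
    """Update an SCD Type 2 history list in place and return it.

    Each row is a dict with key, value, valid_from, and valid_to;
    valid_to is None for the row that is currently in effect.
    """
    for row in history:
        if row["key"] == key and row["valid_to"] is None:
            if row["value"] == new_value:
                return history  # nothing changed, keep the current row open
            row["valid_to"] = as_of  # close the old version
    history.append(
        {"key": key, "value": new_value, "valid_from": as_of, "valid_to": None}
    )
    return history

history = []
apply_scd2(history, 1, "bronze", "2024-01-01")
apply_scd2(history, 1, "gold", "2024-06-01")
apply_scd2(history, 1, "gold", "2024-07-01")  # no change, so no new row
```

Because old rows are closed rather than overwritten, a query for any past date can recover the value that was in effect at that time.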

Friday Finds - Bonus Links

Data Science Weekly Newsletter • 19 implied HN points • 02 Feb 24

🕹 Technology Machine Learning

Paid subscribers get extra links and content. It's a nice way to say thank you for their support.
There are interesting discussions on topics like AI and machine learning. These conversations help people learn more about the field.
Links to simulations and insights about reality powered by AI are shared. They could spark curiosity and understanding about modern technology.

Adding Noise Improves RAG Performance

Cobus Greyling on LLMs, NLU, NLP, chatbots & voicebots • 19 implied HN points • 02 Feb 24

🕹 Technology Machine Learning

Adding irrelevant documents can actually improve accuracy in Retrieval-Augmented Generation systems. This goes against the common belief that only relevant documents are useful.
In some cases, having unrelated information can help the model find the right answer, even better than using only related documents.
It's important to carefully place both relevant and irrelevant documents when building RAG systems to make them work more effectively.

The Sequence Knowledge #545 : Beyond Language, Learning About Multimodal Benchmarks

TheSequence • 28 implied HN points • 20 May 25

🕹 Technology Machine Learning

Multimodal benchmarks are tools to evaluate AI systems that use different types of data like text, images, and audio. They help ensure that AI can handle complex tasks that combine these inputs effectively.
One important benchmark in this area is MMMU, which tests AI on 11,500 questions across various subjects. It requires AI to work with text and visuals together, rewarding deeper understanding rather than shortcuts.
The design of these benchmarks, like MMMU, helps reveal how well AI understands different topics and where it may struggle. This can lead to improvements in AI technology.

Why Open Source AI Will Win

Public Experiments • 196 HN points • 15 Sep 23

🕹 Technology Machine Learning

Open source AI can compete with industry labs despite resource differences.
For AI native startups, owning and controlling core AI products is crucial.
Open source AI models offer more control, privacy, and security compared to closed source models.

The Tech Buffet #23: What Nobody Tells You About RAGs

The Tech Buffet • 1 HN point • 22 Aug 24

🕹 Technology Machine Learning

It's important to understand the business needs before jumping into building a Retrieval-Augmented Generation (RAG) system. Knowing the user's context and how they will use the system will save time and improve outcomes.
Different types of data need to be indexed in specific ways for a RAG to work effectively. This means treating text, images, tables, and code differently to maximize the system's performance.
The quality of the data chunks you use significantly affects the answers generated by a RAG. Taking the time to create clear, relevant chunks will lead to better responses from the system.
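
As one concrete take on chunking, the sketch below splits text into fixed-size word windows with overlap so that a sentence cut at a boundary still appears whole in a neighboring chunk. The word-based splitting, the default sizes, and the name `chunk_words` are illustrative assumptions; the summary does not prescribe a particular chunking method.

```python
def chunk_words(text, size=50, overlap=10):
    """Split text into chunks of `size` words, each sharing `overlap` words with the previous one."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # the last window already reached the end of the text
    return chunks
```

For example, `chunk_words("a b c d e f g", size=4, overlap=2)` yields three chunks, each repeating the last two words of the one before it.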

Brief: How Microsoft Secures Its Copilots

Rod’s Blog • 19 implied HN points • 01 Feb 24

🕹 Technology Machine Learning

Microsoft's Copilot for Microsoft 365 adheres to strict data privacy and security regulations like GDPR, ensuring organizational data confidentiality.
The Copilot system integrates large language models with Microsoft Graph and 365 apps, maintaining enterprise-level data protection during processing.
By utilizing the Azure OpenAI Service controlled by Microsoft, Copilot ensures that business data is not used to train models, offering organizations control over their data processing.

MultiHop-RAG

Cobus Greyling on LLMs, NLU, NLP, chatbots & voicebots • 19 implied HN points • 31 Jan 24

🕹 Technology Machine Learning

Multi-hop retrieval-augmented generation (RAG) helps answer complex questions by pulling information from multiple sources. It connects different pieces of data to create a clear and complete answer.
Using a data-centric approach is becoming more important for improving large language models (LLMs). This means focusing on the quality and relevance of the data to enhance how models learn and generate responses.
The development of prompt pipelines in RAG systems is gaining attention. These pipelines help organize the process of retrieving and combining information, making it easier for models to handle text-related tasks.

Edge 449: Getting Into Adversarial Distillation

TheSequence • 63 implied HN points • 19 Nov 24

🕹 Technology Machine Learning

Adversarial distillation is a new model training method inspired by generative adversarial networks (GANs). It uses a setup where one part generates data and another part tries to tell if it's real or fake.
This method helps improve knowledge transfer in models by combining typical distillation techniques with adversarial training. It's like guiding a student while testing their understanding.
The process involves a generator that creates synthetic samples and a discriminator that distinguishes these samples from real ones, making learning more effective.

Gambling with language models

Rain Clouds • 51 implied HN points • 31 Dec 24

🕹 Technology Machine Learning

AI models such as ModernBERT can help predict which stocks might perform better based on financial reports and market data. This means you can get insights without needing to be a finance expert.
The project combines cloud computing with machine learning, making it easier to process large amounts of financial data quickly. This is important for anyone looking to analyze stocks more efficiently.
While the model can make predictions, it's important to remember that investing in stocks always carries risks. Just because a model suggests a stock might do well, it doesn't guarantee success.

How to become very good at Machine Learning [Storytime Saturdays]

Technology Made Simple • 79 implied HN points • 18 Sep 22

🚌 Education Machine Learning

The author shares a unique approach to mastering Machine Learning without a Master's degree or costly courses, using free online resources.
The author emphasizes building a comprehensive understanding of Machine Learning concepts beyond basic project work like Kaggle challenges.
The post discusses a system for learning that has benefited those seeking mentorship from the author.

The case for an economic AGI AI Lab

Subsack • 4 implied HN points • 09 Dec 25

🕹 Technology Machine Learning

Markets are dynamic, adversarial environments that force AI to adapt under uncertainty, making them a stronger real‑world benchmark than static puzzles. They test whether knowledge survives contact with reality, not just pattern recognition.
Building an AI that works in markets demands new capabilities — sample efficiency, continual learning without catastrophic forgetting, long‑term memory, deep multimodal world models, and game‑theoretic strategic reasoning. Those constraints push research beyond today’s scale‑and‑transformer centric approach.
Economic AGI offers a clear monetisation path: outperforming markets, running prediction markets, or allocating capital can directly convert intelligence into revenue. That revenue can make labs financially sustainable and fund further AGI research.

AI (mis)alignment, Waluigi, and the Knobe Effect

The Counterfactual • 59 implied HN points • 15 Apr 23

🕹 Technology Machine Learning

It can be easier for AI language models to produce harmful responses than helpful ones. This idea is known as the Waluigi Effect.
AI models learn from human text, including human biases like the Knobe Effect, where people assign more blame for accidental harm than credit for accidental good.
When prompted to behave a certain way, AI can easily shift to the opposite behavior, showing how delicate their training can be and how misunderstandings can happen.

It's 2024 and they just want to learn

Democratizing Automation • 150 implied HN points • 03 Jan 24

🕹 Technology Machine Learning

2024 will be a year of rapid progress in ML communities, with advancements in large language models expected.
Energy and motivation are high in the machine learning field, driving people to tap into that excitement and work towards their goals.
Builders are encouraged to focus on building value-aware systems and pursuing ML goals with clear principles and values.

AI Roles: Modifier

Embracing Enigmas • 19 implied HN points • 29 Jan 24

🕹 Technology Machine Learning

Modifiers on AI teams manipulate components to get desired outputs.
Modifiers experiment at the edge to show what's possible.
Good modifiers constantly question, experiment, and push boundaries.

Getting Faster for Your Own LLM Inference

The Beep • 19 implied HN points • 28 Jan 24

🕹 Technology Machine Learning

Lowering the precision of LLMs can make them run faster. Switching from 32-bit to 16 or even 8-bit can save memory and boost speed during processing.
Using prompt compression helps reduce the amount of information LLMs have to process. By making prompts shorter but still meaningful, the workload is lighter and speeds up performance.
Quantization is a key technique for making LLMs usable on everyday computers. It allows big models to be more manageable by reducing their size without losing too much accuracy.
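
The precision trade-off can be illustrated with a minimal sketch of symmetric 8-bit quantization: each weight is stored as an integer in [-127, 127] plus one shared floating-point scale. This is a simplified per-tensor scheme chosen for illustration, not necessarily the method the post uses; production libraries typically quantize per channel or per block.

```python
def quantize_int8(weights):
    """Map floats to integers in [-127, 127] using a single shared scale."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale

def dequantize(quantized, scale):
    """Recover approximate float values from the stored integers."""
    return [v * scale for v in quantized]

weights = [0.5, -1.0, 0.25]
quantized, scale = quantize_int8(weights)
restored = dequantize(quantized, scale)
```

Each restored weight differs from the original by at most about half a quantization step (`scale / 2`), which is the "not losing too much accuracy" part of the trade for a 4x smaller representation than 32-bit floats.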

Model commoditization and product moats

Democratizing Automation • 126 implied HN points • 13 Mar 24

🕹 Technology Machine Learning

Models like GPT-4 have been replicated in many organizations, so moats are less significant in the language model space.
The open LLM ecosystem is progressing, but challenges in data infrastructure and coordination could lead to a gap between open and closed models.
Despite some skepticism, language models have been steadily becoming more reliable, making them increasingly useful for various applications, with potential for new transformative uses.

The Sequence Opinion #470: Open Endedness AI Could be All We Need

TheSequence • 49 implied HN points • 16 Jan 25

🕹 Technology Machine Learning

Open-Endedness AI focuses on creating systems that can learn and adapt over time, rather than just completing specific tasks. This allows AI to innovate and find new solutions continuously.
This new approach to AI research aims for something called artificial general intelligence (AGI), which means AI that can perform a wide range of tasks like a human can. It's a big step towards smarter technology.
However, developing Open-Endedness AI comes with challenges. Researchers must find ways to ensure these systems can learn effectively without becoming unreliable or out of control.

What We Talk About When We Talk About AI

Center for Veb Account Research Newsletter • 3 implied HN points • 12 Dec 25

🕹 Technology Machine Learning

AI is best understood as a set of decision‑making tools that 'satisfice' — they search for good‑enough solutions in complex models instead of finding perfect mathematical optima like operations research.
AI tools expand a user or organization's administrative capacity by enabling new actions and complex modeling, but they can be brittle and depend heavily on training data and organizational process; the financial hype or stock valuations around AI are distinct from its practical usefulness.
Intelligence and consciousness are not the same: systems can perform many cognitive tasks and even be 'general' in the sense of producing and using satisficing models without being conscious, so task performance alone doesn't imply subjective experience.

Anthropomorphizing AI

Prompt Engineering • 19 implied HN points • 25 Jan 24

🕹 Technology Machine Learning

Interacting with generative AI can be enjoyable and helpful.
Even though AI lacks emotions, anthropomorphizing can enhance interactions.
AI can be designed to be self-aware and helpful, blurring the lines of human-like interactions.

Unpacking METR’s findings: Does AI slow developers down?

Engineering Enablement • 15 implied HN points • 06 Aug 25

🕹 Technology Machine Learning

A study found that using AI coding tools may actually slow developers down instead of speeding them up, which was surprising to many involved. Developers often focus on the fun of using AI rather than the time it takes to solve problems.
It's important for developers to use AI for specific tasks where it excels, like documentation and unit tests, rather than for tasks it struggles with. Understanding which tasks suit AI can make a big difference in productivity.
When working with AI, developers should be mindful of their time and set limits. If an AI tool isn't delivering results quickly, it might be better to switch to manual coding instead.

OpenAI's o-1 and inference-time scaling laws

Tanay’s Newsletter • 63 implied HN points • 28 Oct 24

🕹 Technology Machine Learning

OpenAI's o-1 model shows that giving AI more time to think can really improve its reasoning skills. This means that performance can go up just by allowing the model to process information longer during use.
The focus in AI development is shifting from just making models bigger to optimizing how they think at the time of use. This could save costs and make it easier to use AI in real-life situations.
With better reasoning abilities, AI can tackle more complex problems. This gives it a chance to solve tasks that were previously too difficult, which might open up many new opportunities.
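
One simple and widely known way to spend more compute at inference time is to sample several answers and keep the most common one. The sketch below shows only that voting step, with a hypothetical list of sampled answers standing in for model calls; it is not a description of how OpenAI's model works internally.

```python
from collections import Counter

def majority_vote(answers):
    """Return the most frequent answer; ties go to the answer seen first."""
    if not answers:
        raise ValueError("need at least one answer")
    return Counter(answers).most_common(1)[0][0]

# Hypothetical final answers from five independent samples of the same prompt
samples = ["42", "41", "42", "42", "40"]
best = majority_vote(samples)
```

Drawing more samples costs more compute per query but tends to make the voted answer more stable, which is the basic trade-off behind inference-time scaling.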

Visualise & Discover RAG Data

Cobus Greyling on LLMs, NLU, NLP, chatbots & voicebots • 19 implied HN points • 23 Jan 24

🕹 Technology Machine Learning

RAGxplorer is a tool that helps visualize and explore data chunks, making it easier to understand how they relate to different topics.
The process of Retrieval-Augmented Generation (RAG) involves breaking documents into smaller chunks to improve how data is retrieved and used with language models.
Visualizing data can help identify problems like missing information or unexpected results, allowing users to refine their questions or understand their data better.

The Sequence Opinion #465: Agentic AI and Darwinism

TheSequence • 49 implied HN points • 09 Jan 25

🕹 Technology Machine Learning

Open-Endedness AI aims to create systems that can learn and adapt over time, not just complete specific tasks. This means AI can continue growing and improving rather than being limited to set goals.
This new approach could allow AI to generate new ideas and solutions continuously, mirroring how evolution works in nature. It's like giving AI the tools to invent and innovate on its own.
There are still challenges in making Open-Endedness AI a reality, including figuring out how to allow machines to learn effectively over long periods. It's an exciting area, but we have a lot to figure out.

Artifacts 5: Mini RLHF book underway, Qwen 2.5, video datasets, audio models, and more

Democratizing Automation • 63 implied HN points • 24 Oct 24

🕹 Technology Machine Learning

A new textbook on RLHF is being written; it aims to help readers learn the subject, and its content is being improved through reader feedback.
Qwen 2.5 models are showing strong performance, competing well with models like Llama 3.1, but have less visibility in the community.
Several new models and datasets have been released, including some interesting multimodal options that can handle both text and images.

Interview with GlobalFoundries CEO

More Than Moore • 163 implied HN points • 08 Nov 23

🕹 Technology Machine Learning

GlobalFoundries focuses on 'essential chips' for a variety of industries, not just leading-edge technology.
The company is expanding its global footprint to meet growing demand and customer needs.
GlobalFoundries is positioning itself as a leader in providing solutions for AI at the edge and infrastructure technology.

LangSmith by LangChain

Cobus Greyling on LLMs, NLU, NLP, chatbots & voicebots • 19 implied HN points • 22 Jan 24

🕹 Technology Machine Learning

LangSmith helps organize and manage projects and data for applications built with LangChain. It allows you to see your tasks in a neat layout and check performance easily.
The platform offers tools for testing and improving agents, especially when handling multiple tasks at the same time. This helps ensure that applications run smoothly.
LangSmith allows users to create datasets that can improve agent performance. It also has features to evaluate how well agents are doing by comparing their outputs to expected results.