The hottest Machine Learning Substack posts right now

And their main takeaways

Teaching LLMs To Say “I don’t Know”

Cobus Greyling on LLMs, NLU, NLP, chatbots & voicebots • 0 implied HN points • 22 May 24

🕹 Technology Machine Learning

Large Language Models (LLMs) often make up answers when they don't know something, which can lead to inaccuracies. Instead, it's better for them to say 'I don’t know' when faced with unfamiliar topics.
LLMs can learn to give more accurate responses by being adjusted during training. They can be trained to recognize when they're unsure and respond cautiously instead of guessing.
Using reinforcement learning approaches can help reduce these incorrect guesses or 'hallucinations' by teaching models to express uncertainty and limit their responses to what they truly know.

A Short History Of LLMs & Conversational UIs

Cobus Greyling on LLMs, NLU, NLP, chatbots & voicebots • 0 implied HN points • 13 May 24

🕹 Technology Machine Learning

It's important to have a strong data plan when using AI because the technology is evolving quickly. Focusing on how to use data effectively can improve results.
Many businesses struggle with using large language models because they rely on external services. Having local versions could help, but technical challenges make this tough.
The role of AI in chatbot development has evolved from helping create better responses to managing whole conversations, which makes interactions feel more natural.

Large Language Model (LLM) Stack — Version 6

Cobus Greyling on LLMs, NLU, NLP, chatbots & voicebots • 0 implied HN points • 10 May 24

🕹 Technology Machine Learning

Many people are interested in using smaller language models and hosting them on their own systems. This shows a trend toward more privacy and control.
New tools like GALE and LangSmith are helping people be more productive with these language models. They make it easier to use and manage AI tools.
Fine-tuning language models is becoming popular to improve how they work, not just to add new information. This helps models behave better and meet user needs.

LLMs Excel At In-Context Learning (ICL), But What About Long In-context Learning?

Cobus Greyling on LLMs, NLU, NLP, chatbots & voicebots • 0 implied HN points • 24 Apr 24

🕹 Technology Machine Learning

Long context handling remains a challenge for large language models (LLMs). They can struggle significantly when tasks become too complex or when relevant information is in the middle of the input.
LLMs perform better when key information is at the start or end of the input, but their accuracy drops when dealing with longer, more difficult tasks.
Using retrieval augmented generation (RAG) can help improve performance, but it's essential to manage context effectively to avoid the 'lost in the middle' issue.
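
One simple way to act on the 'lost in the middle' observation is to reorder retrieved passages so the strongest ones sit at the edges of the prompt. The sketch below is illustrative only: the function name and the alternating placement are assumptions, not something taken from the post.

```python
def reorder_for_long_context(docs_by_relevance):
    """Place the most relevant documents at the start and end of the context.

    Assumes `docs_by_relevance` is already sorted from most to least relevant.
    """
    front, back = [], []
    for rank, doc in enumerate(docs_by_relevance):
        # Even ranks fill the front, odd ranks fill the back.
        (front if rank % 2 == 0 else back).append(doc)
    # Reverse the back half so relevance rises again toward the end.
    return front + back[::-1]


ranked = ["d1", "d2", "d3", "d4", "d5"]  # d1 is the most relevant
print(reorder_for_long_context(ranked))
```

With this ordering the least relevant passages land in the middle, where the model is reported to pay the least attention.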

Using LLMs For Autonomous Vehicles

Cobus Greyling on LLMs, NLU, NLP, chatbots & voicebots • 0 implied HN points • 23 Apr 24

🕹 Technology Machine Learning

Large Language Models (LLMs) can help autonomous vehicles predict if other cars will change lanes and explain those predictions clearly.
It's important for these predictions to be quick, ideally under 500 milliseconds, so cars can respond fast in traffic.
Integrating LLMs can improve trust in self-driving cars by making their decision-making process clearer and easier to understand.

Matching Retrieved Context With Question Context Using LogProbs With OpenAI for RAG

Cobus Greyling on LLMs, NLU, NLP, chatbots & voicebots • 0 implied HN points • 22 Apr 24

🕹 Technology Machine Learning

Logprobs (log probabilities of generated tokens) help assess how confident a model is in its answers. This can be used to flag incorrect or misleading answers.
When a question is asked, using logprobs can show if there’s enough information to answer it fully. This makes responses more reliable.
Log probabilities express very small probabilities on a scale that is easier to work with. This helps when analyzing model output and improving response quality.
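
As a rough illustration of the arithmetic, a log probability converts back to an ordinary probability with the exponential function, and a list of token logprobs can be averaged into a single confidence score. The helper names and the 0.8 threshold below are assumptions made for this sketch; the logprob values are made up rather than taken from any API response.

```python
import math


def to_probability(logprob):
    """Convert a natural-log probability back to a probability in [0, 1]."""
    return math.exp(logprob)


def sequence_confidence(token_logprobs):
    """Geometric-mean token probability, computed in log space for stability."""
    if not token_logprobs:
        raise ValueError("need at least one token logprob")
    return math.exp(sum(token_logprobs) / len(token_logprobs))


def has_enough_context(token_logprobs, threshold=0.8):
    """Hypothetical check: treat high average confidence as 'context suffices'."""
    return sequence_confidence(token_logprobs) >= threshold


confident = [-0.01, -0.02, -0.05]   # made-up values for illustration
uncertain = [-1.2, -2.5, -0.9]
print(has_enough_context(confident), has_enough_context(uncertain))
```

The threshold would need tuning per model and task; it is not a value the post recommends.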

Retrieval Augmented Fine-Tuning (RAFT)

Cobus Greyling on LLMs, NLU, NLP, chatbots & voicebots • 0 implied HN points • 28 Mar 24

🕹 Technology Machine Learning

RAFT helps language models focus on useful documents while answering questions and ignore irrelevant ones. This means the model can provide more accurate and relevant responses.
RAFT combines the benefits of supervised fine-tuning with retrieval-augmented generation. This allows the model to learn from both specific documents and broader patterns in data.
The way data is prepared for training in RAFT is really important. It ensures that each training example has a question, related documents, and a clear answer.
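
The data layout described above can be sketched as a small record type. This is a minimal sketch under stated assumptions: the class name, field names, and the idea of mixing one relevant document with randomly sampled distractors are illustrative choices, not the exact recipe from the post.

```python
import random
from dataclasses import dataclass


@dataclass
class RaftExample:
    question: str
    documents: list
    answer: str


def build_raft_example(question, relevant_doc, distractor_pool, answer,
                       num_distractors=2, seed=0):
    """Bundle a question with one relevant document, some distractors, and an answer."""
    rng = random.Random(seed)
    distractors = rng.sample(distractor_pool, num_distractors)
    documents = [relevant_doc] + distractors
    # Shuffle so the model cannot rely on the relevant document's position.
    rng.shuffle(documents)
    return RaftExample(question=question, documents=documents, answer=answer)


example = build_raft_example(
    question="What does RAG stand for?",
    relevant_doc="RAG stands for retrieval-augmented generation.",
    distractor_pool=["Doc about tokenizers.", "Doc about GPUs.", "Doc about chatbots."],
    answer="Retrieval-augmented generation.",
)
print(len(example.documents))
```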

Agentic RAG: Context-Augmented OpenAI Agents

Cobus Greyling on LLMs, NLU, NLP, chatbots & voicebots • 0 implied HN points • 14 Mar 24

🕹 Technology Machine Learning

Agentic RAG combines OpenAI's function calling with autonomous agents for better task management. This makes it easier to choose the right tools for different tasks.
LlamaIndex's ContextRetrieverOpenAIAgent allows you to use multiple tools while keeping the process straightforward. It helps manage complexity by organizing various functions effectively.
This new approach allows for more detailed queries and better analysis of data. It lets users run complex calculations while ensuring the results can be easily understood.

RAT — Retrieval Augmented Thoughts

Cobus Greyling on LLMs, NLU, NLP, chatbots & voicebots • 0 implied HN points • 13 Mar 24

🕹 Technology Machine Learning

RAT combines two methods: Chain-of-Thought (CoT) prompting and retrieval augmented generation (RAG). It helps improve complex reasoning tasks by revising thoughts step-by-step.
Finding a balance between efficiency and accuracy is important when using AI tools. Too many checks can slow down the process, but having high accuracy is crucial for user satisfaction.
RAT performs better than other methods in tasks like coding and creative writing. This approach helps avoid mistakes and gives more accurate responses.

Large Language Models Excel At In-Context Learning (ICL)

Cobus Greyling on LLMs, NLU, NLP, chatbots & voicebots • 0 implied HN points • 06 Mar 24

🕹 Technology Machine Learning

Large Language Models (LLMs) can learn better when given contextual information, which helps them be more accurate and reduce mistakes.
Retrieval-augmented generation (RAG) is a useful method because it allows models to customize responses without needing a lot of extra training.
Even with good context, LLMs can still create some incorrect responses, showing that they sometimes mix up information in a believable way.

RAG, Data Privacy, Attack Methods & Safe-Prompts

Cobus Greyling on LLMs, NLU, NLP, chatbots & voicebots • 0 implied HN points • 05 Mar 24

🕹 Technology Machine Learning

RAG helps protect sensitive data by making it harder for attackers to retrieve private information from training datasets. This provides better privacy for the users.
Creating safe prompts is essential. These prompts can guide the AI to avoid generating or exposing sensitive information effectively.
RAG systems can reduce the risk of revealing private data by changing how LLMs remember and retrieve information, which is a safer approach than using LLMs alone.

Time-Aware Adaptive RAG (TA-ARE)

Cobus Greyling on LLMs, NLU, NLP, chatbots & voicebots • 0 implied HN points • 01 Mar 24

🕹 Technology Machine Learning

Time-Aware Adaptive RAG (TA-ARE) helps decide when it's necessary to retrieve extra information for answering questions, making the process more efficient.
Adaptive retrieval is better than standard methods because it only retrieves information when needed, avoiding unnecessary resource costs.
The study suggests that understanding the timing of questions can improve how large language models respond, making them more capable without needing extra training.

Develop Generative Apps Locally

Cobus Greyling on LLMs, NLU, NLP, chatbots & voicebots • 0 implied HN points • 29 Feb 24

🕹 Technology Machine Learning

You can create generative apps that run completely on your own computer. This makes development easier and often faster.
Using tools like HuggingFace and TitanML's TakeOff Server, you can access and manage small language models without needing an internet connection.
Running inference locally improves speed, keeps your data private, and lets you work offline when needed.

LLM Drift, Prompt Drift & Cascading

Cobus Greyling on LLMs, NLU, NLP, chatbots & voicebots • 0 implied HN points • 23 Feb 24

🕹 Technology Machine Learning

LLM Drift means that a language model's responses can change a lot over time. It's important to keep an eye on how these models perform since they might get worse unexpectedly.
Prompt Drift occurs when the same input doesn't give the same result over time due to changes in the model or data. This can cause a gap between what users expect and what they actually get.
Cascading happens when one mistake in a chain of tasks leads to more problems in subsequent tasks. Once one part has an error, it can make everything else after it worse.
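
The cascading effect can be made concrete with a little arithmetic. Assuming each step in a chain succeeds independently with the same probability (a simplifying assumption, not a claim from the post), the chance that the whole chain is correct shrinks geometrically with its length.

```python
def chain_success_rate(step_accuracy, num_steps):
    """Probability that every step succeeds, assuming independent steps."""
    return step_accuracy ** num_steps


for steps in (1, 3, 5, 10):
    rate = chain_success_rate(0.9, steps)
    print(f"{steps} steps at 90% each: {rate:.3f}")
```

Real pipelines rarely have independent errors, so this should be read as intuition for why long chains need monitoring rather than as a prediction.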

Fine-Tuning or RAG?

Cobus Greyling on LLMs, NLU, NLP, chatbots & voicebots • 0 implied HN points • 21 Feb 24

🕹 Technology Machine Learning

Choosing between fine-tuning and RAG depends on costs, available data, and model performance. It's important to weigh the benefits against the money and effort needed.
RAG is often preferred because it provides context for questions and is easier to maintain. Fine-tuning can sometimes hurt the model due to forgetting past information.
While both approaches have strengths, RAG often outperforms fine-tuning by including relevant knowledge and context. Experimenting with different models can lead to better results.

Leveraging LLM In-Context Learning Abilities

Cobus Greyling on LLMs, NLU, NLP, chatbots & voicebots • 0 implied HN points • 20 Feb 24

🕹 Technology Machine Learning

Large Language Models (LLMs) learn best when given specific context in their prompts. They use this context to generate accurate answers instead of relying solely on what they were previously trained on.
Response time is very important when using LLMs, especially for conversational applications. Hosting LLMs locally can help reduce delays and save on costs.
The process of breaking down complex questions into smaller ones can lead to better answers. This involves organizing thoughts and evaluating the quality of the information used to answer the questions.

Beyond Chain-of-Thought LLM Reasoning

Cobus Greyling on LLMs, NLU, NLP, chatbots & voicebots • 0 implied HN points • 12 Feb 24

🕹 Technology Machine Learning

Indirect reasoning helps solve problems where direct reasoning fails. It uses logic to make connections that LLMs might struggle with.
This approach significantly improves accuracy in tasks like factual reasoning and mathematical proofs. It shows better performance compared to methods that rely only on direct reasoning.
The study suggests using simple prompts to guide LLMs in applying indirect reasoning, making it easier and more effective without needing complex frameworks.

Seven RAG Engineering Failure Points

Cobus Greyling on LLMs, NLU, NLP, chatbots & voicebots • 0 implied HN points • 06 Feb 24

🕹 Technology Machine Learning

Retrieval-Augmented Generation (RAG) reduces errors in information by combining data retrieval with language models. This helps produce more accurate and relevant responses.
RAG allows for better organization of data, making it easy to include specific industry-related information. This is important for tailoring responses to user needs.
There are several potential failure points in RAG, such as missing context or providing incomplete answers. It's crucial to design systems that can handle these issues effectively.

LLamaIndex Agentic RAG Demo

Cobus Greyling on LLMs, NLU, NLP, chatbots & voicebots • 0 implied HN points • 01 Feb 24

🕹 Technology Machine Learning

Agentic RAG uses a system of smaller agents to answer questions across multiple documents. Each smaller agent focuses on its own document, which helps organize the information better.
This setup allows for comparing different documents and summarizing specific ones easily. It's a flexible way to dig into complex topics.
The architecture is designed to scale by adding more agents as needed. This means it can grow and adapt to handle more information over time.

Chain-of-Symbol Prompting (CoS) For Large Language Models

Cobus Greyling on LLMs, NLU, NLP, chatbots & voicebots • 0 implied HN points • 29 Jan 24

🕹 Technology Machine Learning

LLMs struggle with understanding complex spatial tasks using just natural language. This research focuses on improving their ability to navigate virtual environments.
The new Chain-of-Symbol Prompting (CoS) method helps LLMs represent spatial relationships more effectively. It leads to much better performance in planning tasks compared to traditional methods.
Using symbols instead of natural language makes it easier for LLMs to learn and reduces the number of tokens needed in prompts. This results in clearer and more concise representations.

Prompt-RAG: Vector Embedding Free Retrieval-Augmented Generation

Cobus Greyling on LLMs, NLU, NLP, chatbots & voicebots • 0 implied HN points • 26 Jan 24

🕹 Technology Machine Learning

Prompt-RAG is a simpler way to use language models without needing complex data setups like vector embeddings. This makes it easier to apply for specific tasks.
It uses a Table of Contents to find the right information quickly, which helps generate more accurate responses to user questions.
While it's great for small projects, it may face challenges with larger data or technical scaling as needs grow.

Bulk Data Discovery

Cobus Greyling on LLMs, NLU, NLP, chatbots & voicebots • 0 implied HN points • 25 Jan 24

🕹 Technology Machine Learning

Data discovery is crucial for understanding unstructured data. It helps find user intent and classifies interactions effectively.
Using embeddings allows us to visualize data by grouping similar meanings. This helps spot patterns and outliers in conversations.
Data preparation involves identifying, collecting, and analyzing data. This step helps reveal valuable insights that support decision-making.

Meta Taxonomy Of Large Language Model Correction & Refinement

Cobus Greyling on LLMs, NLU, NLP, chatbots & voicebots • 0 implied HN points • 17 Jan 24

🕹 Technology Machine Learning

Researchers are developing different methods to improve the output of large language models (LLMs). This includes techniques like self-correction and feedback from both humans and models.
There are two main approaches when using LLMs: one relies heavily on the model itself, while the other uses external frameworks and human input to enhance accuracy.
Challenges with LLMs, like generating false or harmful content, can be addressed through careful correction strategies that can happen during or after the model's output is generated.

Considering Large Language Model Reasoning Step Length

Cobus Greyling on LLMs, NLU, NLP, chatbots & voicebots • 0 implied HN points • 16 Jan 24

🕹 Technology Machine Learning

Longer reasoning steps can really help large language models do better, even if they don't add new info. It's like taking your time to think things through.
For simpler tasks, fewer steps are better, but complex tasks can get a boost from having more detailed reasoning. It's all about matching the task with the right amount of thinking.
Even if the reasoning isn't completely correct, as long as it's long enough, it can still lead to good results. Sometimes the process matters more than being right.

Chain Of Natural Language Inference (CoNLI)

Cobus Greyling on LLMs, NLU, NLP, chatbots & voicebots • 0 implied HN points • 12 Jan 24

🕹 Technology Machine Learning

There are three types of hallucinations in AI-generated text: context-free, ungrounded, and self-conflicting. Each type means there's a different way the text can be misleading.
The CoNLI framework helps detect and reduce hallucinations in text responses. It can rewrite responses to improve their accuracy without needing special tuning.
CoNLI works even when the user has limited control over the AI model, making it easier to ensure that the generated output aligns with correct information.

Large Language Model Hallucination Mitigation Techniques

Cobus Greyling on LLMs, NLU, NLP, chatbots & voicebots • 0 implied HN points • 10 Jan 24

🕹 Technology Machine Learning

There are many techniques to prevent hallucinations in large language models. They can be grouped into two types: methods that adjust the model itself and those that change how you ask it questions.
Some effective techniques include using retrieval-augmented generation and prompting the model carefully. This means providing clear context and expected outcomes before asking for information.
To best reduce hallucinations, combining different strategies is key. No single method works perfectly, so using a mix of approaches helps improve the model's accuracy and reliability.

Random Chain-Of-Thought For LLMs & Distilling Self-Evaluation Capability

Cobus Greyling on LLMs, NLU, NLP, chatbots & voicebots • 0 implied HN points • 08 Jan 24

🕹 Technology Machine Learning

Complexity in processing data for large language models (LLMs) is growing. Breaking tasks into smaller parts is becoming a standard practice.
LLMs are now handling tasks that used to require human supervision, such as generating explanations or synthetic data.
Providing detailed context during inference is crucial to avoid mistakes and ensure better responses from LLMs.

Teaching LLMs To Say, “I don’t know”

Cobus Greyling on LLMs, NLU, NLP, chatbots & voicebots • 0 implied HN points • 04 Jan 24

🕹 Technology Machine Learning

Large Language Models (LLMs) often give answers even when they don't know, which can lead to incorrect information. It's important for them to learn to say 'I don't know' instead.
A new method called R-Tuning can help LLMs understand their limits by recognizing when they don't have enough information. This approach improves their ability to refuse answering unknowable questions.
By identifying gaps in their knowledge, LLMs can be trained better to avoid giving false answers, making them more reliable and accurate in conversation.
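
One way to picture the idea of identifying knowledge gaps is to compare a model's own answers with reference answers and mark each training example accordingly. The sketch below is an assumption-heavy illustration: `model_answer` is a hypothetical callable, exact string matching is a crude stand-in for a real correctness check, and the "I am sure." / "I am unsure." suffixes are illustrative wording.

```python
def build_refusal_aware_examples(qa_pairs, model_answer):
    """Label each question by whether the model already answers it correctly."""
    examples = []
    for question, reference in qa_pairs:
        predicted = model_answer(question)
        known = predicted.strip().lower() == reference.strip().lower()
        # Hypothetical suffixes expressing certainty or uncertainty.
        suffix = "I am sure." if known else "I am unsure."
        examples.append({"prompt": question, "target": f"{reference} {suffix}"})
    return examples


# Toy stand-in for a model: a lookup table with one wrong answer.
toy_model = {"Capital of France?": "Paris", "Capital of Australia?": "Sydney"}.get
pairs = [("Capital of France?", "Paris"), ("Capital of Australia?", "Canberra")]
for row in build_refusal_aware_examples(pairs, toy_model):
    print(row)
```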

LLM Performance Over Time & Task Contamination

Cobus Greyling on LLMs, NLU, NLP, chatbots & voicebots • 0 implied HN points • 02 Jan 24

🕹 Technology Machine Learning

LLMs do better on tasks related to older data compared to newer data. This means they might struggle with recent information.
Training data can affect how well LLMs perform in certain tasks. If they have seen examples before, they can do better than if it's completely new.
Task contamination can create a false impression of an LLM's abilities. It can seem like they are good at new tasks, but they might have already learned similar ones during training.

LLM-Generated Self-Explanations

Cobus Greyling on LLMs, NLU, NLP, chatbots & voicebots • 0 implied HN points • 21 Dec 23

🕹 Technology Machine Learning

LLMs can make predictions and explain how they arrived at those predictions. This helps in understanding their reasoning better.
Using a 'Chain-of-Thought' method can improve LLMs' ability to solve complex tasks, especially in areas like math and sentiment analysis.
There's a need for better ways to evaluate the explanations given by LLMs because current methods may not accurately determine which explanations are effective.

What Is Multi-Task Language Understanding or MMLU?

Cobus Greyling on LLMs, NLU, NLP, chatbots & voicebots • 0 implied HN points • 19 Dec 23

🕹 Technology Machine Learning

Massive Multitask Language Understanding (MMLU) measures how well language models perform on various subjects. It uses a huge set of multiple-choice questions to test their knowledge.
Though some language models like GPT-3 show improvement over random guessing, they still struggle with complex topics like ethics and law. They often don't recognize when they're wrong.
Model confidence isn't a good indicator of accuracy. For example, GPT-3 can be very confident in its answers, but still be far from correct.

Intelligent & Programable Prompt Pipelines From Haystack

Cobus Greyling on LLMs, NLU, NLP, chatbots & voicebots • 0 implied HN points • 18 Dec 23

🕹 Technology Machine Learning

Prompt pipelines help connect different prompts in a simpler way than using complex autonomous agents. This means making sure that data flows smoothly when using tools powered by AI.
While using JSON for output is helpful, there are challenges in maintaining a consistent structure. This can make it tricky to handle the data as it changes.
The Haystack framework offers a way to bridge basic prompts and more complex systems. It shows how to manage user input and AI output for better interactions.

A Comprehensive Survey of Large Language Models (LLMs)

Cobus Greyling on LLMs, NLU, NLP, chatbots & voicebots • 0 implied HN points • 13 Dec 23

🕹 Technology Machine Learning

The number of research papers on large language models (LLMs) has surged since 2019, rising from about one per day to nearly nine per day. This shows a growing interest in understanding these models.
Three important skills of LLMs are in-context learning, following instructions, and step-by-step reasoning. These abilities help models perform better on various tasks.
Open-source LLMs, like Meta's LLaMA, have made it easier for researchers to customize and grow these models, leading to more innovation in the field.

Five Stages Of LLM Implementation

Cobus Greyling on LLMs, NLU, NLP, chatbots & voicebots • 0 implied HN points • 11 Dec 23

🕹 Technology Machine Learning

Implementing LLMs (Large Language Models) changes how applications are developed. Many teams focus on building tools instead of actually using them, which creates a gap.
Getting data right is vital for successful LLM implementation. Companies should look closely at their data strategy to ensure LLMs perform well, especially during real-time use.
There are several stages to using LLMs effectively. Starting from design time benefits user experience by avoiding issues like high costs and slow responses when deployed.

OpenAI Announced 28 Models To Be Switched Off

Cobus Greyling on LLMs, NLU, NLP, chatbots & voicebots • 0 implied HN points • 07 Dec 23

🕹 Technology Machine Learning

OpenAI is shutting down 28 of its language models, and users need to switch to new models before the deadline. It's important for developers to find alternative models or consider self-hosting their solutions.
Cost is a big issue with using language models; output tokens usually cost more than input tokens. Users must monitor their token usage carefully to manage expenses.
LLM Drift is a real concern, as responses from language models can change significantly over time. Continuous monitoring is needed to ensure accuracy and performance remain stable.
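
The cost point is easy to check with a back-of-the-envelope calculation. The prices below are placeholders chosen for illustration, not real rates from any provider, and the function name is an assumption.

```python
def request_cost(input_tokens, output_tokens,
                 input_price_per_1k, output_price_per_1k):
    """Cost of one request when input and output tokens are priced separately."""
    return ((input_tokens / 1000) * input_price_per_1k
            + (output_tokens / 1000) * output_price_per_1k)


# Placeholder prices: output is assumed to cost three times as much as input.
cost = request_cost(input_tokens=2000, output_tokens=500,
                    input_price_per_1k=0.5, output_price_per_1k=1.5)
print(f"estimated cost: {cost:.2f}")
```

Logging both token counts per request makes it possible to see whether long prompts or long completions dominate the bill.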

Gemini From Google

Cobus Greyling on LLMs, NLU, NLP, chatbots & voicebots • 0 implied HN points • 07 Dec 23

🕹 Technology Machine Learning

Google's Gemini is a powerful AI that can understand and work with text, images, video, audio, and code all at once. This makes it really versatile and capable of handling different types of information.
Starting December 6, 2023, Google's Bard will use a version of Gemini Pro for better reasoning and understanding. This means Bard will soon be smarter and more helpful in answering questions.
Gemini has shown it can outperform human experts in language tasks. This is a significant achievement, indicating that AI is getting very close to human-like understanding in complex subjects.

Data Delivery To Large Language Models [Updated]

Cobus Greyling on LLMs, NLU, NLP, chatbots & voicebots • 0 implied HN points • 06 Dec 23

🕹 Technology Machine Learning

Every effective AI strategy needs a solid data strategy that includes data discovery, design, development, and delivery.
At inference, providing the right context and relevant data is crucial to help language models produce accurate responses.
Training models involves two key phases: meta-training for foundational knowledge and meta-learning for fine-tuning on specific tasks.

As-Needed Decomposition & Planning Using Large Language Models — (ADaPT)

Cobus Greyling on LLMs, NLU, NLP, chatbots & voicebots • 0 implied HN points • 05 Dec 23

🕹 Technology Machine Learning

ADaPT is a method that breaks down complex tasks into smaller steps only when needed. This helps manage complicated tasks better.
This approach uses a planner to come up with a big plan and then hands off simpler steps to another model for execution. This makes the process smoother.
ADaPT adds resilience and smart logic to using language models, allowing them to handle tasks that get tricky and require adjustments along the way.
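
The 'decompose only when needed' idea can be sketched as a short recursive function. This is a simplified illustration rather than the published algorithm: `execute` and `plan` are hypothetical callables standing in for the executor and planner models, and the sketch assumes every sub-task must succeed.

```python
def adapt(task, execute, plan, max_depth=3, depth=0):
    """Try the task directly; split it into sub-tasks only if that fails."""
    if execute(task):
        return True
    if depth >= max_depth:
        return False
    subtasks = plan(task)
    if not subtasks:
        return False
    # Every sub-task must succeed (a simplifying assumption of this sketch).
    return all(adapt(sub, execute, plan, max_depth, depth + 1) for sub in subtasks)


# Toy stand-ins: the executor only handles tasks that contain no " and ".
def toy_execute(task):
    return " and " not in task


def toy_plan(task):
    return task.split(" and ")


print(adapt("boil water and steep tea", toy_execute, toy_plan))
```

The depth limit keeps the recursion from running forever when a task cannot be broken down any further.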

Self-Consistency For Chain-Of-Thought Prompting

Cobus Greyling on LLMs, NLU, NLP, chatbots & voicebots • 0 implied HN points • 04 Dec 23

🕹 Technology Machine Learning

Self-consistency prompting helps improve the accuracy of language models when solving reasoning problems. It does this by generating different reasoning paths and choosing the most consistent answer.
Using self-consistency can lead to better performance in various tasks, including arithmetic and common-sense reasoning. It shows clear accuracy gains across multiple language models.
This approach requires careful sampling and processing of the reasoning paths to get the best final answer. It's all about making sense of the various responses to reach a clear conclusion.
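
The selection step can be sketched as a majority vote over the final answers extracted from several sampled reasoning paths. The function name and the lowercase/whitespace normalization are assumptions for this sketch; sampling the reasoning paths from a model is left out.

```python
from collections import Counter


def self_consistent_answer(sampled_answers):
    """Return the most frequent final answer and its share of the votes."""
    normalized = [a.strip().lower() for a in sampled_answers if a.strip()]
    if not normalized:
        raise ValueError("no non-empty answers to vote on")
    answer, votes = Counter(normalized).most_common(1)[0]
    return answer, votes / len(normalized)


# Final answers pulled from four hypothetical reasoning paths.
answer, agreement = self_consistent_answer(["18", "18 ", "26", "18"])
print(answer, agreement)
```

The agreement ratio doubles as a rough confidence signal: a low share of votes suggests the reasoning paths disagree and the answer deserves a second look.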

ChatGPT Is One-Year Old: Are Open-Source Large Language Models Catching Up?

Cobus Greyling on LLMs, NLU, NLP, chatbots & voicebots • 0 implied HN points • 01 Dec 23

🕹 Technology Machine Learning

Some open-source language models are doing better than ChatGPT in specific tasks, showing that they are improving quickly. For example, models like Lemur-70B-chat are better at certain coding tasks.
The study highlights that while open-source models are catching up, GPT models like ChatGPT still excel in areas like AI safety, making them important for commercial use.
Understanding the differences between raw LLMs, LLM APIs, and user interfaces is crucial, as people often mix these terms up in discussions about AI technology.