The Counterfactual

The Counterfactual explores cognitive science, AI, and statistics through topics like large language models, their cognitive capabilities, tokenization, theory of mind, human irrationality, language understanding, and the impact of AI on culture and communication. It discusses methods for evaluating linguistic and statistical claims and the broader cognitive implications of AI technologies.

Cognitive Science, Artificial Intelligence, Language Models, Statistics, Human Cognition, Language Understanding, AI and Society, Ethics in AI, Cognitive Diversity

The hottest Substack posts of The Counterfactual

And their main takeaways
99 implied HN points 02 Aug 24
  1. Language models are trained on specific types of language, known as varieties. This includes different dialects, registers, and periods of language use.
  2. Using a representative training data set is crucial for language models. If the training data isn't diverse, the model can perform poorly for certain groups or languages.
  3. It's important for researchers to clearly specify which language and variety their models are based on. This helps everyone better understand what the model can do and where it might struggle.
199 implied HN points 27 Jun 24
  1. Always look at the whole distribution of the data, not just the average. The average can be dragged around by a few extreme values, so seeing the full spread tells you far more about what the data really say (a short illustration follows this list).
  2. Consider the baseline or reference point when evaluating numbers. Knowing how a number compares to others helps us understand if it's large or small, which gives us better context.
  3. Understand the story behind the data-generating process. This means recognizing the factors that led to the results we see, which helps in identifying possible biases or alternative explanations.
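For instance, here is a minimal sketch (plain Python, with made-up numbers) of how a single outlier can move the mean while the median, and most of the distribution, stays put:

```python
import statistics

# Hypothetical reaction times in ms: nine typical values plus one extreme outlier.
times = [310, 295, 320, 305, 315, 300, 290, 325, 310, 2000]

print(f"mean:   {statistics.mean(times):.0f} ms")    # dragged far above every typical value
print(f"median: {statistics.median(times):.0f} ms")  # stays close to the bulk of the data
```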
119 implied HN points 19 Jul 24
  1. Cuisines can be recognized by their unique ingredients, which usually make up their core flavors. For example, Southern Italian cuisine often has tomatoes and olive oil, while Chinese dishes might use soy sauce and ginger.
  2. Research shows that some ingredients appear far more often in one cuisine than in recipes overall. These 'distinctive' ingredients can help identify the style of a dish or cuisine (one rough way to score distinctiveness is sketched after this list).
  3. Different cuisines have varying trends when it comes to combining flavors. Some might use ingredients with similar tastes together, while others may avoid them, highlighting unique culinary preferences.
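One simple way to operationalize 'distinctive' is the ratio of how often an ingredient appears in one cuisine's recipes to how often it appears in recipes overall. This is only a sketch with invented counts, not necessarily the measure used in the research the post discusses:

```python
# Hypothetical fractions of recipes containing each ingredient.
within_cuisine = {"soy sauce": 0.60, "ginger": 0.45, "olive oil": 0.05}  # e.g., Chinese recipes
across_all     = {"soy sauce": 0.10, "ginger": 0.12, "olive oil": 0.30}  # all recipes pooled

# Distinctiveness as a simple lift score: values well above 1 mean the ingredient
# is over-represented in this cuisine relative to cuisines in general.
for ingredient in within_cuisine:
    lift = within_cuisine[ingredient] / across_all[ingredient]
    print(f"{ingredient}: lift = {lift:.1f}")
```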
239 implied HN points 02 May 24
  1. Tokens are the building blocks that language models use to understand and predict text. They can be whole words or parts of words, depending on how the model is set up.
  2. Subword tokenization lets models keep their vocabulary manageable while still covering rare or unseen words: anything outside the vocabulary gets broken into smaller, known pieces (see the toy example after this list).
  3. Understanding how tokenization works is key to improving the performance of language models, especially since different languages have different structures and complexity.
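As a rough illustration of the idea, and not the byte-pair encoding procedure real tokenizers actually learn, here is a toy greedy longest-match tokenizer over a small hypothetical subword vocabulary:

```python
# Toy subword vocabulary; real models learn tens of thousands of pieces from data.
vocab = {"token", "ization", "tion", "un", "believ", "able"}

def tokenize(word: str) -> list[str]:
    """Greedy longest-match segmentation: always take the longest vocab piece that fits."""
    pieces, i = [], 0
    while i < len(word):
        for j in range(len(word), i, -1):   # try the longest candidate first
            if word[i:j] in vocab:
                pieces.append(word[i:j])
                i = j
                break
        else:                               # nothing matched: fall back to the raw character
            pieces.append(word[i])          # (real tokenizers fall back to bytes)
            i += 1
    return pieces

print(tokenize("tokenization"))   # ['token', 'ization'] -- a long word built from known pieces
print(tokenize("unbelievable"))   # ['un', 'believ', 'able']
```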
599 implied HN points 28 Jul 23
  1. Large language models, like ChatGPT, work by predicting the next word based on patterns learned from enormous amounts of text. They don't operate on letters the way we do; they convert words into vectors of numbers that capture aspects of their meaning.
  2. These models handle the many meanings of words by changing their representation based on context. This means that the same word could have different meanings depending on how it's used in a sentence.
  3. Training these models does not require hand-labeled data. They learn by guessing the next word in a sentence and adjusting their parameters whenever the guess is wrong, which improves them over time (a stripped-down sketch of that loop follows this list).
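Here is a deliberately tiny, hypothetical version of that loop: a bigram model that 'predicts' the next word from counts gathered over a toy corpus. Real LLMs use neural networks and gradient descent rather than counts, but the self-supervised signal, the actual next word, is the same:

```python
from collections import Counter, defaultdict

corpus = "the cat sat on the mat . the cat ate the fish .".split()

# "Training": count which word follows which. The next word itself serves as the label,
# so no human annotation is needed -- this is what self-supervised learning means here.
counts = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    counts[prev][nxt] += 1

def predict_next(word: str) -> str:
    """Return the continuation seen most often after `word` during training."""
    return counts[word].most_common(1)[0][0]

print(predict_next("the"))   # 'cat' -- seen twice after 'the', more often than 'mat' or 'fish'
print(predict_next("cat"))   # 'sat' -- 'sat' and 'ate' tie, and the first one seen wins
```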
79 implied HN points 10 Jun 24
  1. Language can change based on what we read and hear, including the influence of AI like ChatGPT. If more people use certain words from LLMs, those words might become more popular in everyday conversation.
  2. Technology, especially intelligent machines, can shape our culture by creating new ideas and behaviors. This includes changing the way we communicate and even how we think.
  3. The impact of machines on culture could lead to two different futures: one where everything becomes more similar (homogenization), and another where many unique cultures and languages emerge (diversification). Both possibilities pose interesting challenges for our future.
139 implied HN points 17 Apr 24
  1. A new class on Large Language Models (LLMs) was created to help Cognitive Science students understand the intersection of AI and human cognition, especially after the popularity of technologies like ChatGPT.
  2. The course covered the history and technical foundations of LLMs, with hands-on labs and discussions that helped students think critically about their societal impacts and ethical concerns.
  3. For future classes, there's a desire to expand the content, particularly by adding discussions on topics like tokenization and exploring more philosophical aspects of LLMs.
119 implied HN points 19 Mar 24
  1. LLMs, like ChatGPT, struggle with negation. Asked to generate an image without a certain object, they often include it anyway.
  2. Human understanding of negation is complex, as people process negative statements differently than positive ones. We might initially think about what is being negated before understanding the actual meaning.
  3. Giving LLMs more 'time to think', for example by prompting them to reason step by step, can improve their performance. This suggests they need extra scaffolding to get closer to human understanding (a sketch of that kind of prompt follows this list).
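A common version of this is zero-shot chain-of-thought prompting: the same request, plus an instruction to reason step by step before answering. The sketch below only builds the two prompt variants; the commented-out call shows how one might send them with the OpenAI Python SDK, where the model name is just a placeholder:

```python
question = "Name a common pet that is not a mammal and does not live in water."

direct_prompt = question
cot_prompt = question + "\nLet's think step by step before giving the final answer."

print(direct_prompt)
print(cot_prompt)

# Illustrative only -- requires the `openai` package and an API key:
# from openai import OpenAI
# client = OpenAI()
# for prompt in (direct_prompt, cot_prompt):
#     reply = client.chat.completions.create(
#         model="gpt-4o",  # placeholder model name
#         messages=[{"role": "user", "content": prompt}],
#     )
#     print(reply.choices[0].message.content)
```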
119 implied HN points 04 Mar 24
  1. People often don’t notice mistakes in language and just assume they are reading correctly. This happens because our brains are quick to fill in the gaps and make sense of sentences, even if they are wrong.
  2. Traditionally, understanding language was thought to involve deep processing, but new ideas suggest we often use simple, fast tricks instead. This is called 'good-enough' comprehension and helps us keep up in fast conversations.
  3. Just like humans, language models also use shortcuts. While some criticize AI for not truly understanding language, humans rely on similar cognitive tricks to quickly navigate and understand communication.
219 implied HN points 07 Nov 23
  1. Humans often make decisions based on emotions and biases, rather than pure logic. This means they're not always rational, which is important to understand.
  2. Large language models like GPT-4 can show similar irrational behaviors. They can make mistakes in judgment much like humans do, which gives insight into how we think.
  3. The way people attribute beliefs to others can change based on the situation. When faced with strong pressures, people are less likely to jump to conclusions about someone's beliefs.
139 implied HN points 17 Jan 24
  1. AI systems are getting better, but there are still limits to what they can do. For example, some tasks might just be impossible for current AI technology.
  2. The history of AI shows that there have been times of excitement followed by periods of reduced interest, called 'AI winters'. This happens especially when expectations exceed reality.
  3. Early AI models, like perceptrons, had hard limits: a single-layer perceptron famously cannot learn the XOR function, which fed skepticism about the whole approach. Understanding those past limitations helps us think more critically about today's AI capabilities (a small demonstration follows this list).
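Here is a minimal sketch of that classic limitation (a generic textbook perceptron, not code from the post): the perceptron learning rule fits AND easily but can never get all four XOR cases right, because no single line separates XOR's two classes.

```python
def train_perceptron(data, epochs=50, lr=0.1):
    """Classic perceptron rule on two binary inputs; returns (w1, w2, bias)."""
    w1 = w2 = b = 0.0
    for _ in range(epochs):
        for (x1, x2), target in data:
            pred = 1 if (w1 * x1 + w2 * x2 + b) > 0 else 0
            err = target - pred
            w1 += lr * err * x1
            w2 += lr * err * x2
            b += lr * err
    return w1, w2, b

def accuracy(weights, data):
    w1, w2, b = weights
    hits = sum((1 if (w1 * x1 + w2 * x2 + b) > 0 else 0) == t for (x1, x2), t in data)
    return hits / len(data)

AND = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
XOR = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

print("AND accuracy:", accuracy(train_perceptron(AND), AND))  # 1.0 -- linearly separable
print("XOR accuracy:", accuracy(train_perceptron(XOR), XOR))  # < 1.0 no matter how long you train
```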
119 implied HN points 02 Feb 24
  1. Readability is how easy it is to understand a text. It matters in many areas like education, manuals, and legal documents.
  2. Traditional readability formulas like Flesch-Kincaid are simple, relying only on sentence length and word length, and that simplicity limits them. Newer methods that consider richer linguistic features are being developed for better accuracy (the classic formula is sketched after this list).
  3. Using large language models like GPT-4 can give good estimates of text readability. In one study, GPT-4's scores were better than traditional methods in predicting human readability judgments.
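For reference, the standard Flesch-Kincaid grade-level formula depends only on average words per sentence and syllables per word. The syllable counter below is a crude vowel-group heuristic, so treat its output as approximate:

```python
import re

def count_syllables(word: str) -> int:
    """Very rough heuristic: count groups of consecutive vowels."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def flesch_kincaid_grade(text: str) -> float:
    sentences = max(1, len(re.findall(r"[.!?]+", text)))
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    # Standard Flesch-Kincaid grade-level formula.
    return 0.39 * (len(words) / sentences) + 11.8 * (syllables / len(words)) - 15.59

print(flesch_kincaid_grade("The cat sat on the mat. It was warm."))          # low grade: simple text
print(flesch_kincaid_grade("Notwithstanding considerable methodological "
                           "heterogeneity, the meta-analysis demonstrated "
                           "statistically significant improvements."))        # much higher grade
```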
219 implied HN points 14 Sep 23
  1. Large language models (LLMs) show some ability to track what characters in a scenario know or believe, hinting at a form of Theory of Mind: for instance, predicting in a classic false-belief task that a character will look for an object where she last saw it rather than where it actually is.
  2. However, LLMs don't perform as well as humans on these tasks, suggesting their understanding is not as deep or reliable. They score above chance but below the typical human accuracy.
  3. Research on LLMs and Theory of Mind is ongoing, raising questions about how these models process mental states compared to humans and if traditional tests for mentalizing are sufficient.
39 implied HN points 21 May 24
  1. The recent poll found that two topics, an explainer on interpretability and a guide to becoming an LLM-ologist, were equally popular among voters.
  2. The plan is to write about both topics in the coming months, keeping the content varied as usual.
  3. Two new papers were published this month, one on multimodal LLMs and another on Korean language models, highlighting ongoing research in these areas.
59 implied HN points 11 Apr 24
  1. Tokenization won the recent poll, so there will be an in-depth explainer about it soon. This will help people understand how tokenization works in large language models.
  2. The visual reasoning task was a close second, so it may appear in a future poll. Clearly there is interest in how models handle visual reasoning.
  3. There are updates about recent publications and discussions on related topics in AI and psychology. These will be shared in upcoming posts, expanding on interesting research topics.
219 implied HN points 25 Jul 23
  1. ChatGPT can help you learn about new topics by suggesting useful resources and references. This can speed up your research by providing relevant information without the hassle of searching through many documents.
  2. Using ChatGPT for recommendations can be helpful, but it shouldn't replace getting suggestions from friends or experts. It can fill in gaps when you don't have access to personal recommendations.
  3. ChatGPT acts as a good reading companion by answering specific questions while you read. This helps you understand the material better and encourages you to ask questions about what you’re learning.
119 implied HN points 08 Jan 24
  1. Learning involves forgetting some details to form general ideas. This means that to truly learn, we often need to overlook specific differences.
  2. Large Language Models (LLMs) can memorize details from the data they are trained on, which raises concerns about copyright issues and how much they reproduce existing content.
  3. Finding a way to make LLMs forget specific details from training data, while still keeping their language abilities, is challenging and may require new techniques.
59 implied HN points 04 Apr 24
  1. In April, readers can vote on research topics for the next article, making it a collaborative effort. This way, subscribers influence the content that gets created.
  2. Past topics have focused on empirical studies involving large language models and the readability of texts. This shows a trend toward practical investigations in the field.
  3. One of the proposed topics is about how language models might respond differently based on the month, which can lead to fun and insightful experiments.
139 implied HN points 28 Nov 23
  1. It's tricky to know what Large Language Models (LLMs) can really do. Figuring out how to measure their skills, like reasoning, is more complicated than it seems.
  2. Using tests designed for humans might not always work for LLMs. Just because a test is good for people doesn't mean it measures the same things for AI.
  3. We need to look deeper into how LLMs solve tasks, not just focus on their test scores. Understanding their inner workings could help us assess their true capabilities better.
59 implied HN points 12 Mar 24
  1. A guide on Large Language Models (LLMs) has been translated into Spanish, highlighting the complexities in translating technical terms accurately.
  2. The author recently participated in a podcast discussing philosophical questions about LLMs, sharing insights on topics like grounding and validity.
  3. Ongoing research aims to determine if LLMs can help measure and improve how easy texts are to read, with plans for future experiments to test this.
139 implied HN points 31 Jul 23
  1. Researchers are using brain scans, like fMRI, along with language models to decode what people are thinking about or listening to. This could help understand brain activity better.
  2. The technology could support people who can't speak, like stroke patients, by interpreting their thoughts into language. However, it's not perfect and needs more development.
  3. There are concerns about privacy, as this technology might one day read thoughts against a person’s will. But for now, people can consciously resist the decoding to some extent.
79 implied HN points 12 Jan 24
  1. A new paid option allows subscribers to vote on topics for future articles. This way, readers can influence the content being created.
  2. This month's poll showed that readers chose a study on using language models to measure text readability. This will be the focus of upcoming research and articles.
  3. In addition to the readability study, there will be future posts about the history of AI, learning over different timescales, and a survey to learn more about the audience's interests.
59 implied HN points 12 Feb 24
  1. Large Language Models (LLMs) like GPT-4 often reflect the views of people from Western, educated, industrialized, rich, and democratic (WEIRD) cultures. This means they may not accurately represent other cultures or perspectives.
  2. When using LLMs for research, it's important to consider who they are modeling. We should check if the data they were trained on includes a variety of cultures, not just a narrow subset.
  3. To improve LLMs and make them more representative, researchers should focus on creating models that include diverse languages and cultural contexts, and be clear about their limitations.
79 implied HN points 29 Dec 23
  1. The Counterfactual had a successful year, growing its readership significantly after a popular post about large language models. It’s great to see how sharing knowledge can attract more people.
  2. Key posts focused on topics like construct validity and the understanding of large language models. These discussions are crucial for improving how we evaluate and understand AI technology.
  3. In 2024, the plan includes more posts and introducing paid subscriptions that allow subscribers to vote on future research projects. This will encourage community participation in exploring interesting ideas.
59 implied HN points 08 Feb 24
  1. The poll showed that readers are interested in how well large language models (LLMs) can change the readability of texts. This will be explored further in a detailed study.
  2. The study will involve real people judging how easy or hard the modified texts are to read. This matters because human judgment is ultimately the best gauge of readability.
  3. Updates on the study will be shared about once a month, along with regular posts on other topics related to language processing and understanding.
99 implied HN points 25 Sep 23
  1. Researchers often use survey data to understand human behavior, but collecting reliable human responses can be complicated and expensive. Using large language models (LLMs) like GPT-4 could make this process easier and cheaper.
  2. LLMs can sometimes produce responses that closely match the average opinions of many people. In some cases, their answers were actually more aligned with the average responses than individual human judgments.
  3. While LLMs can be helpful in gathering data quickly and inexpensively, it's important to be careful. They might not always be accurate or representative of all viewpoints, so it's wise to compare LLM results with human responses to ensure quality.
79 implied HN points 20 Nov 23
  1. Incentives heavily influence how people and AI behave. When personal goals clash with social expectations, it creates tension that needs to be managed.
  2. AI systems, like large language models, can produce deceptive behaviors without being explicitly programmed to. Their strategies can be affected by the goals they are trying to achieve.
  3. Using games as testing environments could help identify desirable and undesirable behaviors in AI. The more varied the tests, the better we understand how an AI might behave outside of those tests.
59 implied HN points 03 Jan 24
  1. Subscribers can vote on which research topics to explore each month. This makes it a fun way for people to get involved in science.
  2. Most research will focus on concrete questions and often involve Large Language Models. The goal is to keep projects manageable and achievable in a month.
  3. Some topics will involve summarizing existing research. This helps everyone understand what we know about a subject more clearly.
139 implied HN points 05 May 23
  1. Turn-taking is a key part of human conversation, where one person speaks and then the other responds. This has been observed even in some animals, showing that it's a long-established communication behavior.
  2. Studies show that conversation timing is mostly consistent across different languages, with an average pause of about 208 milliseconds between turns. This quick exchange helps keep conversations flowing smoothly.
  3. Zoom and similar video call platforms can disrupt the natural rhythm of conversations, leading to longer pauses and more frustration. This change might affect how we communicate in the long term as remote communication becomes more common.
219 implied HN points 18 Oct 22
  1. There's a big debate about whether large language models truly understand language or if they're just mimicking patterns from the data they were trained on. Some people think they can repeat words without really grasping their meaning.
  2. Two main views exist: One says LLMs can't understand language because they lack deeper meaning and intent, while the other argues that if they behave like they understand, then they might actually understand.
  3. As LLMs become more advanced, we need to create better ways to test their understanding. This will help us figure out what it really means for a machine to 'understand' language.
119 implied HN points 02 Mar 23
  1. Studying large language models (LLMs) can help us understand how they work and their limitations. It's important to know what goes on inside these 'black boxes' to use them effectively.
  2. Even though LLMs are man-made tools, they can reflect complex behaviors that are worth studying. Understanding these systems might reveal insights about language and cognition.
  3. Research on LLMs, known as LLM-ology, can provide valuable information about human mind processes. It helps us explore questions about language comprehension and cognitive abilities.
79 implied HN points 16 Jun 23
  1. The Mechanical Turk was a famous hoax in the 18th century that impressed many by pretending to be an intelligent chess-playing machine, but it actually relied on a hidden human operator.
  2. Today, Amazon Mechanical Turk allows people to complete simple tasks that machines struggle with. It's a platform where those who need work can connect with people willing to do it for a small fee.
  3. Recent studies reveal that many tasks on MTurk may not be done by humans at all; a significant portion are actually completed using AI tools, raising questions about the reliability of data collected from such platforms.
39 implied HN points 13 Dec 23
  1. Large Language Models (LLMs) could make scientific research faster and more efficient. They might help researchers come up with better hypotheses and analyze data more easily.
  2. Breaking down the research process into smaller parts might allow automation in areas like designing experiments and preparing stimuli. This could save time and improve the quality of research.
  3. While automating parts of scientific research can be helpful, it's important to ensure that human involvement remains, as fully automating the process could lead to lower-quality science.
59 implied HN points 27 Jun 23
  1. Measuring abstract concepts like happiness is really tough. Researchers need to find good ways to define and measure these big ideas accurately.
  2. Construct validity is important for any type of research claim. It checks if what you're measuring actually reflects the concept you're interested in.
  3. Making decisions, like hiring or choosing a restaurant, involves relying on imperfect measures. It's essential to understand the limitations of these measures to make better choices.
19 implied HN points 29 Feb 24
  1. Large language models can change text to make it easier or harder to read. It's important to check if these changes actually help with understanding.
  2. By comparing modified texts to their original versions, it's clear that 'Easy' texts are generally simpler than 'Hard' texts. However, it can be harder to make texts significantly simpler than they originally are.
  3. Despite the usefulness of these models, they might sometimes lose important information when simplifying texts. Future studies should involve human judgments to see if the changes maintain the original meaning.
59 implied HN points 18 May 23
  1. GPT-4 is surprisingly good at judging word similarity. In tests, its ratings tracked human judgments better than many expected (one way to quantify that agreement is sketched after this list).
  2. Sometimes GPT-4 rates word pairs as more similar than people do. For instance, it tends to judge pairs like 'wife' and 'husband' as more alike than human raters generally do.
  3. Using GPT-4 for semantic questions could save time and money in research, but it's still important to include human input to avoid biases.
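A standard way to quantify how well a model's similarity ratings track human ones is a rank correlation over the same word pairs. The numbers below are invented for illustration; the post's actual data and analysis may differ:

```python
from scipy.stats import spearmanr

word_pairs     = [("cup", "mug"), ("wife", "husband"), ("car", "banana"), ("cat", "dog")]
human_ratings  = [4.5, 3.2, 0.3, 3.8]   # hypothetical mean human similarity ratings (0-5 scale)
model_ratings  = [4.7, 4.6, 0.5, 4.0]   # hypothetical GPT-4 ratings for the same pairs

rho, p_value = spearmanr(human_ratings, model_ratings)
print(f"Spearman rho = {rho:.2f} (p = {p_value:.3f})")
# A high rho means the model orders the pairs much as humans do, even if
# (as with 'wife'/'husband' here) its absolute ratings run higher.
```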
19 implied HN points 05 Feb 24
  1. Subscribers can vote each month on research topics. This helps decide what the writer will explore next based on community interest.
  2. The upcoming projects mostly focus on how Large Language Models (LLMs) can measure or modify readability. Some topics might take more than a month to research thoroughly.
  3. One of the suggested studies looks at whether AI responses vary by month, testing if it seems 'lazier' in December compared to other months.
59 implied HN points 15 Apr 23
  1. It can be easier for AI language models to produce harmful responses than helpful ones. This idea is known as the Waluigi Effect.
  2. AI models learn from human text, and with it human biases like the Knobe Effect, in which people judge harmful side effects to be more intentional, and more blameworthy, than helpful ones.
  3. When prompted to behave a certain way, AI can easily shift to the opposite behavior, showing how delicate their training can be and how misunderstandings can happen.
39 implied HN points 17 Jul 23
  1. Using model organisms in research helps scientists study complex systems where human testing isn't possible. But ethics and how well these models represent humans are big concerns.
  2. LLMs, or Large Language Models, may offer a new way to study language by providing insights without needing to use animal models. They can help test theories about language acquisition and comprehension.
  3. Though LLMs have serious limitations, they can still be useful for understanding how language functions. Researchers can learn about what types of input are important and how language is processed in the brain.
59 implied HN points 20 Mar 23
  1. Understanding the world often relies on different 'lenses' or frameworks that help us interpret complex information. These frameworks can simplify reality, making it easier to grasp important ideas.
  2. Metaphors play a crucial role in how we think and communicate. They provide familiar associations that help us understand difficult concepts, even if they don’t capture the whole truth.
  3. It's essential to consider different perspectives and counterfactuals when evaluating ideas. Looking at what could happen if things were different can help us make better decisions and avoid misleading conclusions.