The Counterfactual • 599 implied HN points • 28 Jul 23
- Large language models, like ChatGPT, work by predicting the next word based on patterns learned from vast amounts of text. They don't process raw letters the way we do; they break text into tokens and convert each token into a vector of numbers (an embedding) that captures aspects of its meaning.
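A minimal sketch of that word-to-numbers step, using a made-up three-word vocabulary and tiny 3-dimensional vectors (real models use tens of thousands of subword tokens and vectors with thousands of dimensions):

```python
# Hypothetical toy vocabulary: each word maps to a token ID.
vocab = {"the": 0, "cat": 1, "sat": 2}

# One small vector of numbers per token ID (an "embedding").
# These values are invented for illustration only.
embeddings = {
    0: [0.1, -0.3, 0.7],
    1: [0.9, 0.2, -0.1],
    2: [-0.4, 0.5, 0.3],
}

def encode(sentence):
    """Turn a sentence into a list of embedding vectors."""
    token_ids = [vocab[word] for word in sentence.lower().split()]
    return [embeddings[tid] for tid in token_ids]

vectors = encode("the cat sat")
print(vectors[0])  # the vector of numbers standing in for "the"
```

The model never sees the words themselves after this step; everything downstream operates on these lists of numbers.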
- These models handle words with multiple meanings by adjusting a word's representation based on its context: the same word gets a different vector, and therefore a different interpretation, depending on the sentence it appears in.
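A toy sketch of how context can shift a word's vector. Here a word's own vector is simply blended with the average of its neighbours' vectors, so "bank" ends up with different numbers in "river bank" versus "money bank". Real models do this with attention layers, and all vectors below are invented for illustration:

```python
# Hypothetical base vectors, chosen so the effect is easy to see.
base = {
    "bank": [1.0, 1.0],
    "river": [0.0, 2.0],
    "money": [2.0, 0.0],
    "the": [0.5, 0.5],
}

def contextualize(sentence, word):
    """Blend a word's vector with the average of its neighbours'."""
    words = sentence.lower().split()
    i = words.index(word)
    neighbours = [base[w] for j, w in enumerate(words) if j != i]
    avg = [sum(dim) / len(neighbours) for dim in zip(*neighbours)]
    # A 50/50 mix of the word itself and its context.
    return [0.5 * a + 0.5 * b for a, b in zip(base[word], avg)]

print(contextualize("the river bank", "bank"))  # pulled toward "river"
print(contextualize("the money bank", "bank"))  # pulled toward "money"
```

The same input word now carries different numbers in each sentence, which is the sense in which its "meaning" changes with context.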
- Training these models does not require labeled data. They learn by guessing the next word in a sentence and adjusting their internal weights based on how wrong each guess was, gradually improving over time.
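A sketch of why no human labels are needed: in next-word prediction, the "label" is just the word that actually comes next in ordinary text. A real model adjusts millions of weights by gradient descent; this toy version instead counts which word follows which, but it learns from raw text in the same label-free way:

```python
from collections import defaultdict

# Any plain text works as training data; no one had to annotate it.
corpus = "the cat sat on the mat the cat ran"

# Count how often each word follows each other word (a bigram model).
counts = defaultdict(lambda: defaultdict(int))
words = corpus.split()
for current, nxt in zip(words, words[1:]):
    counts[current][nxt] += 1  # the next word itself is the supervision

def predict_next(word):
    """Predict the most frequently observed next word."""
    followers = counts[word]
    return max(followers, key=followers.get)

print(predict_next("the"))  # "cat", since it follows "the" most often here
```

More data sharpens the counts in the same way more data sharpens a neural model's weights; the supervision signal is free because it comes from the text itself.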