Cobus Greyling on LLMs, NLU, NLP, chatbots & voicebots • 0 implied HN points • 14 Mar 23
- Speech-to-text has its own challenges, such as the disfluencies (fillers, repetitions, false starts) that occur when people talk. Accounting for these differences between spoken and written language can help improve how ChatGPT understands and processes voice input.
- Whisper can give ChatGPT access to large amounts of audio data. The model can then learn from a wider variety of information, which can improve its responses.
- The future of AI models includes using different types of data, not just text. This shift towards multi-modal models means ChatGPT could eventually handle audio, images, and more, making it more versatile.