Gonzo ML • 0 implied HN points • 10 Mar 24
- OLMo is an open language model from the Allen Institute for AI (AI2) that sets itself apart by being fully open: not just the weights, but the training code, training logs, intermediate checkpoints, and evaluation scripts are all released under the Apache 2.0 license (see the loading sketch after this list).
- The release covers 1B and 7B models, with a 65B version still in training. The architecture is a classic GPT-style decoder-only transformer with a few refinements, such as tokenizer special tokens for masking PII and non-parametric layer normalization (sketched below).
- OLMo was trained on AI2's own dataset, Dolma, which is currently English-only with plans to expand to other languages. Training runs on PyTorch FSDP, and evaluation uses AI2's Catwalk framework for downstream tasks and their Paloma benchmark for perplexity (FSDP sketch at the end of this list).
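Because the checkpoints are public, trying the model takes only a few lines. A minimal sketch, assuming the `allenai/OLMo-7B` checkpoint on the Hugging Face Hub and a `transformers` version with native OLMo support (older versions needed the separate `hf_olmo` package):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Public AI2 checkpoint; swap in another id (e.g. an intermediate
# training checkpoint revision) as needed.
model_id = "allenai/OLMo-7B"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

inputs = tokenizer("Language modeling is", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```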
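The "non-parametric layer normalization" mentioned above is simply LayerNorm without the learned per-feature gain and bias, which trades a small amount of expressivity for fewer parameters and better training stability. In PyTorch it is a one-flag change; a minimal sketch (the hidden size is illustrative, not OLMo's exact config):

```python
import torch
import torch.nn as nn

d_model = 4096  # illustrative hidden size

# Standard LayerNorm carries a learnable gain and bias per feature.
parametric_ln = nn.LayerNorm(d_model)

# Non-parametric variant: identical normalization, zero learnable
# parameters.
nonparametric_ln = nn.LayerNorm(d_model, elementwise_affine=False)

x = torch.randn(2, 16, d_model)
y = nonparametric_ln(x)  # normalized over the last dimension
print(sum(p.numel() for p in nonparametric_ln.parameters()))  # prints 0
```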
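PyTorch FSDP, used for the training runs, shards parameters, gradients, and optimizer state across GPUs, all-gathering shards on the fly during forward and backward passes. A minimal sketch of the wrapping step, assuming a `torchrun` launch; the toy model and hyperparameters are placeholders, not OLMo's actual configuration:

```python
import torch
import torch.distributed as dist
import torch.nn as nn
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP, MixedPrecision

# torchrun sets RANK / WORLD_SIZE / LOCAL_RANK for each process.
dist.init_process_group(backend="nccl")
torch.cuda.set_device(dist.get_rank() % torch.cuda.device_count())

# Placeholder stack; the real model is a GPT-style decoder.
model = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=512, nhead=8, batch_first=True),
    num_layers=4,
).cuda()

# bf16 mixed precision, as is common for large-scale LM training.
bf16 = MixedPrecision(
    param_dtype=torch.bfloat16,
    reduce_dtype=torch.bfloat16,
    buffer_dtype=torch.bfloat16,
)

# Each rank now holds only a shard of the parameters.
model = FSDP(model, mixed_precision=bf16)

optimizer = torch.optim.AdamW(model.parameters(), lr=3e-4)
```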