Large language models (LLMs) power tasks such as email completion and code explanation, but running them today typically requires hardware accelerators beyond what personal devices provide.
Running LLMs on-device gives users greater control over their data and makes it possible to build personalized generation models.
A community of developers is working to enable local LLM inference, empowering creators and researchers to use these models in their own projects.