Bringing Open Large Language Models to Consumer Devices

A growing range of users want to run their own open-source Large Language Models (LLMs) locally, on their own hardware. A recent wave of releases has focused on permissively licensed, truly open LLMs that serve both research and commercial interests.

The MLC LLM project aims to make these open LLMs accessible by making them practical and convenient to deploy on browsers, mobile devices, consumer-class GPUs, and other platforms. In particular, the project brings RedPajama support to a wide range of consumer devices with hardware acceleration. Machine Learning Compilation (MLC), built on TVM Unity, plays a critical role in enabling this efficient deployment and the broader democratization of open LLMs.

MLC LLM also allows convenient weight customization: users only need to provide a directory in Hugging Face format. The iOS app can then download personalized weights for the same model on demand via a link to the model artifacts, without recompilation or redeployment.

MLC LLM is still a young project, and much remains to be done, including writing developer documentation, modularizing the overall libraries, and expanding the prebuilt MLC pip development package.

The project is a collaboration between ETH Zürich, OctoML, CMU Catalyst, and the MLC community. It is only possible thanks to the open-source ecosystems it stands on, including the Apache TVM community, the PyTorch and Hugging Face communities, and the teams behind RedPajama, Dolly, Vicuna, SentencePiece, LLaMA, and Alpaca.
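As a rough illustration of what "a directory in Hugging Face format" means in the weight-customization workflow above, the sketch below checks that a local directory has the usual Hugging Face checkpoint layout (a `config.json`, tokenizer files, and weight shards). The helper name is hypothetical and not part of MLC LLM; exact weight filenames vary between checkpoints (single file vs. sharded, `.bin` vs. `.safetensors`).

```python
from pathlib import Path

# Files typically found in a Hugging Face-format model directory.
# These names reflect common convention, not a formal specification.
REQUIRED = {"config.json"}
TOKENIZER_CANDIDATES = {"tokenizer.json", "tokenizer.model", "vocab.json"}
WEIGHT_SUFFIXES = (".bin", ".safetensors")


def looks_like_hf_checkpoint(model_dir: str) -> bool:
    """Hypothetical helper: sanity-check that a local directory resembles
    a Hugging Face-style checkpoint before handing it to a compilation
    or weight-conversion workflow."""
    d = Path(model_dir)
    if not d.is_dir():
        return False
    names = {p.name for p in d.iterdir()}
    has_config = REQUIRED <= names
    has_tokenizer = bool(TOKENIZER_CANDIDATES & names)
    has_weights = any(n.endswith(WEIGHT_SUFFIXES) for n in names)
    return has_config and has_tokenizer and has_weights
```

A check like this is useful because a user-supplied directory is the only input the customization workflow needs, so catching a missing tokenizer or config file early gives a clearer error than a failure deep inside compilation.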

