As the name suggests, Large Language Models (LLMs) are large: they typically demand powerful servers and substantial computational resources. Their rising demand, however, has given birth to compact LLMs, and these new models can fit even on your phone.
In addition, many LLMs are paid products built on premium pricing models. In response to user demand, however, several companies have released open-source LLMs that can be optimised to run on your phone. These include:
Phi-2: Phi-2 can be quantised to lower bit-widths such as 4-bit or 3-bit precision, shrinking the model to roughly 1.17-1.48 GB so that it runs efficiently on mobile devices with limited memory and compute. The model was trained on a large corpus of web data, and it can handle tasks involving common-sense reasoning, language understanding, and logical reasoning.
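To see where figures like these come from, here is a back-of-the-envelope sketch of quantised model size: parameter count times bits per weight. It assumes Phi-2's published 2.7B parameter count and ignores the per-group scale/zero-point overhead that real quantisation formats add (a few percent, which is why actual files land in a range rather than at one exact size).

```python
def quantised_size_gib(num_params: float, bits_per_weight: int) -> float:
    """Approximate size in GiB of a model stored at the given weight precision."""
    size_bytes = num_params * bits_per_weight / 8  # bits -> bytes
    return size_bytes / 1024**3                    # bytes -> GiB

# Phi-2 has ~2.7 billion parameters.
phi2_params = 2.7e9
print(f"4-bit: ~{quantised_size_gib(phi2_params, 4):.2f} GiB")
print(f"3-bit: ~{quantised_size_gib(phi2_params, 3):.2f} GiB")
```

The raw weights alone come to about 1.26 GiB at 4-bit and 0.94 GiB at 3-bit; format overhead pushes real downloads toward the quoted 1.17-1.48 GB range.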
Gemma 2B: Gemma 2B delivers high performance in spite of its small size. It uses a multi-query attention mechanism, which reduces memory bandwidth requirements during inference, an advantage in on-device scenarios where memory bandwidth is often limited. Gemma 2B has a strong track record on language understanding, reasoning, and safety.
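The bandwidth saving comes from the KV cache: with multi-query attention (MQA), all query heads share a single key/value head, so far fewer cached keys and values are streamed from memory at each decoding step. The sketch below compares cache sizes; the layer/head counts are illustrative round numbers, not Gemma 2B's exact configuration.

```python
def kv_cache_bytes(layers: int, kv_heads: int, head_dim: int,
                   seq_len: int, bytes_per_elem: int = 2) -> int:
    """Total keys + values cached across all layers for one sequence (fp16)."""
    return 2 * layers * kv_heads * head_dim * seq_len * bytes_per_elem

layers, heads, head_dim, seq_len = 18, 8, 256, 2048
mha = kv_cache_bytes(layers, heads, head_dim, seq_len)  # one KV head per query head
mqa = kv_cache_bytes(layers, 1, head_dim, seq_len)      # all query heads share one KV head
print(f"MHA cache: {mha / 1024**2:.0f} MiB, MQA cache: {mqa / 1024**2:.0f} MiB")
```

With these numbers the MQA cache is 8x smaller (one KV head instead of eight), which is exactly the kind of saving that matters when a phone's memory bus is the bottleneck.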
Llama-2-7B: This model can run on devices with 6 GB+ of RAM. It requires that headroom and may not match the speed of cloud-based models, but for developers looking to create intelligent language-based features that run directly on smartphones, Llama-2-7B can be a good option.
Falcon-RW-1B: Falcon-RW-1B is well suited to resource-constrained devices like smartphones. The Falcon-RW-1B-Instruct-OpenOrca variant adds conversational capabilities to the base model, which can improve user engagement and expand its use cases in resource-constrained environments.
StableLM-3B: StableLM-3B can be quantised to lower bit-widths such as 4-bit precision, reducing the model size to about 3.6 GB so it runs efficiently on smartphones. Reportedly, StableLM-3B has outperformed Stability AI's own 7B StableLM-Base-Alpha-v2.