Meta Introduces LLaMA, Large Language Model That Runs On Less Power 02/27/2023

Meta will release a new large language model (LLM) it calls LLaMA, which is designed to help researchers advance their work in artificial intelligence, the company’s CEO Mark Zuckerberg announced in a Facebook post on Friday.

LLMs underpin applications such as OpenAI’s ChatGPT, Microsoft’s Bing AI, and Google’s Bard.

Meta’s new model, developed by its Fundamental AI Research (FAIR) team, comes as large tech companies are racing to integrate AI into their platforms and products.

“Like other large language models, LLaMA works by taking a sequence of words as an input and predicts a next word to recursively generate text,” the company said in a post published today. “To train our model, we chose text from the 20 languages with the most speakers, focusing on those with Latin and Cyrillic alphabets.”
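The loop Meta describes, predicting a next word and feeding it back in as input, can be illustrated with a toy sketch. This is not LLaMA's code: it swaps the neural network for a simple word-pair counter, and the names `train_bigram`, `generate`, and the sample corpus are hypothetical, chosen only to show the generate-and-append cycle.

```python
from collections import Counter, defaultdict


def train_bigram(text):
    """Count, for each word, which words follow it in the training text."""
    counts = defaultdict(Counter)
    words = text.split()
    for current, following in zip(words, words[1:]):
        counts[current][following] += 1
    return counts


def generate(counts, prompt, max_new_words=5):
    """Append the most frequent next word, then repeat using the longer sequence."""
    words = prompt.split()
    for _ in range(max_new_words):
        followers = counts.get(words[-1])
        if not followers:
            break  # the last word was never seen with a follower, so stop
        next_word = followers.most_common(1)[0][0]
        words.append(next_word)
    return " ".join(words)


# Hypothetical training text, far smaller than anything a real LLM would use.
corpus = "the model reads text and the model reads words and the model reads text"
counts = train_bigram(corpus)
print(generate(counts, "the", max_new_words=3))  # the model reads text
```

A real LLM conditions on the whole preceding sequence rather than only the last word, and it predicts tokens rather than whole words, but the outer loop has the same shape.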

The company acknowledged there’s more research to be done to address the risks of bias, toxic comments, and hallucinations in LLMs. LLaMA shares these challenges, too.

The plan is to share the code for LLaMA to enable researchers to more easily test new approaches to limiting or eliminating these problems in LLMs.

Along with the announcement, Meta published a research paper that includes evaluations on benchmarks measuring model bias and toxicity, both to demonstrate the model’s limitations and to support further research in this area.

“LLMs have shown a lot of promise in generating text, having conversations, summarizing written material, and more complicated tasks like solving math theorems or predicting protein structures,” Zuckerberg wrote in a post.

The focus of the work is to train a series of language models that achieve the best possible performance at various inference budgets by training on more tokens than is typically used. Meta claims in the research paper that LLaMA-13B outperforms GPT-3 on most benchmarks, despite being 10× smaller.

“We believe that this model will help democratize the access and study of LLMs, since it can be run on a single GPU,” the researchers wrote, without pointing to a specific model. Running on a single GPU would be a substantial reduction from the power required to run Bard.

Once trained, LLaMA-13B can also run on one data center-grade Nvidia Tesla V100 GPU. That will help smaller companies wanting to run tests on these systems, but it doesn’t mean much for lone researchers for whom such equipment is out of reach.
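A rough back-of-the-envelope calculation shows why a 13-billion-parameter model can plausibly fit on one such card. The sketch below assumes weights stored as 16-bit floats (2 bytes each) and counts only the weights, ignoring activations and other runtime overhead. The function name `weight_memory_gb` is hypothetical, and the 32 GB figure refers to the larger of the two V100 memory configurations (the card was also sold with 16 GB).

```python
def weight_memory_gb(num_parameters, bytes_per_parameter=2):
    """Estimate the memory needed to hold model weights, in gigabytes (10^9 bytes).

    Assumes 2 bytes per parameter (16-bit floats) unless told otherwise.
    """
    return num_parameters * bytes_per_parameter / 1e9


V100_MEMORY_GB = 32  # larger V100 variant; a 16 GB variant also exists

llama_13b_gb = weight_memory_gb(13e9)   # 13 billion parameters
gpt3_gb = weight_memory_gb(175e9)       # GPT-3 has 175 billion parameters

print(f"LLaMA-13B weights: {llama_13b_gb:.0f} GB")
print(f"GPT-3 weights:     {gpt3_gb:.0f} GB")
print(f"LLaMA-13B fits on one 32 GB V100: {llama_13b_gb <= V100_MEMORY_GB}")
```

Under these assumptions the LLaMA-13B weights come to about 26 GB, within a 32 GB card, while GPT-3's weights at the same precision would need roughly 350 GB spread across many GPUs.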