What is Meta AI's Llama 2?
Facebook parent Meta makes public its ChatGPT rival Llama 2 (Large Language Model Meta AI)
Facebook parent company Meta Platforms has built an artificial intelligence system that rivals the likes of ChatGPT and Google's Bard, but it's taking a different approach: releasing it for free.
Meta CEO Mark Zuckerberg said Tuesday that the company is partnering with Microsoft to introduce the next generation of its AI large language model and making the technology, known as Llama 2, free for research and commercial use.
Much like tech peers Google and Microsoft, the social media company has long had a big research team of computer scientists devoted to advancing AI technology. But it's been overshadowed as the release of ChatGPT sparked a rush to profit off of "generative AI" tools that can create new prose, images and other media.
Meta has also tried to distinguish itself by being more open than some of its Big Tech rivals about offering a peek at the data and code it uses to build AI systems. It has argued that such openness makes it easier for outside researchers to help identify and mitigate the bias and toxicity that AI systems pick up by ingesting how real people write and communicate.
"Open source drives innovation because it enables many more developers to build with new technology," Zuckerberg said in a Facebook post Tuesday. "It also improves safety and security because when software is open, more people can scrutinize it to identify and fix potential issues. I believe it would unlock more progress if the ecosystem were more open, which is why we're open sourcing Llama 2."
Zuckerberg pointed to Meta's history of open-sourcing its AI work, such as with its development of the widely used machine-learning framework PyTorch.
But the research paper introducing the new model reflects less openness than Meta has shown previously in its work to build models that require ingesting large troves of digitized writings such as books, news articles and social media feeds.
It says the latest model was trained on "a new mix of data from publicly available sources, which does not include data from Meta's products or services," but does not specify what data was used. It does say that Meta removed data from websites known to contain a "high volume of personal information about private individuals."
Meta used the acronym LLaMA, for Large Language Model Meta AI, to describe the first version of its model, announced in February. It's now dropped the capital letters for its second version, Llama 2.
Zuckerberg said people can download its new AI models directly or through a partnership that makes them available on Microsoft's cloud platform Azure "along with Microsoft's safety and content tools."
The financial terms of that partnership were not disclosed.
While Microsoft is described by Meta as a "preferred" partner, Meta said the models will also be available through Amazon Web Services, which is Microsoft's main cloud rival, as well as AI startup Hugging Face and others.
Microsoft is also a major funder and partner of OpenAI, the maker of ChatGPT. Neither ChatGPT nor similar offerings from Microsoft or Google are open source.
Microsoft and Meta also revealed the new AI partnership at Microsoft's annual event for business customers on Tuesday. Microsoft said in a separate statement that the two companies "share a commitment to democratizing AI and its benefits and we are excited that Meta is taking an open approach." Meta already is a customer of Microsoft's Azure cloud computing platform.
Microsoft also used the virtual event, called Inspire, to reveal that it will be charging businesses a monthly fee of $30 for each user of its flagship generative AI tool, Microsoft 365 Copilot, on top of what those organizations are already paying for Microsoft services.
In Meta's own words:
As part of Meta’s commitment to open science, today we are publicly releasing LLaMA (Large Language Model Meta AI), a state-of-the-art foundational large language model designed to help researchers advance their work in this subfield of AI. Smaller, more performant models such as LLaMA enable others in the research community who don’t have access to large amounts of infrastructure to study these models, further democratizing access in this important, fast-changing field.
Training smaller foundation models like LLaMA is desirable in the large language model space because it requires far less computing power and resources to test new approaches, validate others’ work, and explore new use cases. Foundation models train on a large set of unlabeled data, which makes them ideal for fine-tuning for a variety of tasks. We are making LLaMA available at several sizes (7B, 13B, 33B, and 65B parameters) and also sharing a LLaMA model card that details how we built the model in keeping with our approach to Responsible AI practices.
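Foundation-model pretraining and fine-tuning both use the same next-word objective over unlabeled text. As a rough illustration (not Meta's release tooling), here is a minimal fine-tuning sketch using the Hugging Face Transformers library; the checkpoint name, the local text file, and the training settings are placeholder assumptions.

```python
# Minimal fine-tuning sketch (illustrative only): adapt a causal language model
# to your own unlabeled text, using the same next-word objective as pretraining.
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)
from datasets import load_dataset

model_id = "meta-llama/Llama-2-7b-hf"        # assumed checkpoint name; access is gated
tokenizer = AutoTokenizer.from_pretrained(model_id)
tokenizer.pad_token = tokenizer.eos_token    # LLaMA tokenizers ship without a pad token
model = AutoModelForCausalLM.from_pretrained(model_id)

# "my_corpus.txt" stands in for whatever unlabeled text you fine-tune on.
dataset = load_dataset("text", data_files={"train": "my_corpus.txt"})

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)

train_set = dataset["train"].map(tokenize, batched=True, remove_columns=["text"])

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="llama-finetuned",
                           per_device_train_batch_size=1,
                           num_train_epochs=1),
    train_dataset=train_set,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),  # next-word objective
)
trainer.train()
```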
Over the last year, large language models — natural language processing (NLP) systems with billions of parameters — have shown new capabilities to generate creative text, solve mathematical theorems, predict protein structures, answer reading comprehension questions, and more. They are one of the clearest cases of the substantial potential benefits AI can offer at scale to billions of people.
Even with all the recent advancements in large language models, full research access to them remains limited because of the resources that are required to train and run such large models. This restricted access has limited researchers’ ability to understand how and why these large language models work, hindering progress on efforts to improve their robustness and mitigate known issues, such as bias, toxicity, and the potential for generating misinformation.
Smaller models trained on more tokens — which are pieces of words — are easier to retrain and fine-tune for specific potential product use cases. We trained LLaMA 65B and LLaMA 33B on 1.4 trillion tokens. Our smallest model, LLaMA 7B, is trained on one trillion tokens.
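To make "tokens — which are pieces of words" concrete, the short sketch below runs a LLaMA-style SentencePiece tokenizer over a sentence. The tokenizer repository name is an assumption chosen for illustration, and the exact split depends on the tokenizer you load.

```python
# Illustration of tokens as pieces of words (assumed tokenizer repo; output will vary).
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("hf-internal-testing/llama-tokenizer")
print(tokenizer.tokenize("Tokenization splits uncommon words into smaller pieces."))
# A LLaMA-style tokenizer typically yields subword pieces such as
# ['▁Token', 'ization', ...], where '▁' marks the start of a word.
```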
Like other large language models, LLaMA works by taking a sequence of words as an input and predicts a next word to recursively generate text. To train our model, we chose text from the 20 languages with the most speakers, focusing on those with Latin and Cyrillic alphabets.
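That next-word loop can be sketched in a few lines of PyTorch. The checkpoint name and the greedy decoding below are illustrative assumptions; any causal language model on the Hugging Face hub generates text the same way.

```python
# Minimal sketch of autoregressive generation: score the next token given the text
# so far, append the chosen token, and repeat. Illustrative only.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Llama-2-7b-hf"     # assumed checkpoint; access is gated
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)
model.eval()

input_ids = tokenizer("Large language models are", return_tensors="pt").input_ids

for _ in range(20):                        # generate 20 new tokens
    with torch.no_grad():
        logits = model(input_ids).logits   # scores for every vocabulary token at each position
    next_id = logits[:, -1, :].argmax(dim=-1, keepdim=True)  # greedy: most likely next token
    input_ids = torch.cat([input_ids, next_id], dim=-1)      # append and feed back in

print(tokenizer.decode(input_ids[0], skip_special_tokens=True))
```

In practice this loop is usually wrapped by helpers such as `model.generate()`, which add sampling strategies and caching, but the underlying mechanism is the recursive next-word prediction described above.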
There is still more research that needs to be done to address the risks of bias, toxic comments, and hallucinations in large language models. Like other models, LLaMA shares these challenges. As a foundation model, LLaMA is designed to be versatile and can be applied to many different use cases, versus a fine-tuned model that is designed for a specific task. By sharing the code for LLaMA, other researchers can more easily test new approaches to limiting or eliminating these problems in large language models. We also provide in the paper a set of evaluations on benchmarks evaluating model biases and toxicity to show the model’s limitations and to support further research in this crucial area.
To maintain integrity and prevent misuse, we are releasing our model under a noncommercial license focused on research use cases. Access to the model will be granted on a case-by-case basis to academic researchers; those affiliated with organizations in government, civil society, and academia; and industry research laboratories around the world. People interested in applying for access can find the link to the application in our research paper.
We believe that the entire AI community — academic researchers, civil society, policymakers, and industry — must work together to develop clear guidelines around responsible AI in general and responsible large language models in particular. We look forward to seeing what the community can learn — and eventually build — using LLaMA.
Source: Meta AI blog