arxiv:2302.13971

LLaMA: Open and Efficient Foundation Language Models

Published on Feb 27, 2023

Authors:

Gautier Izacard, Xavier Martinet

Abstract

We introduce LLaMA, a collection of foundation language models ranging from 7B to 65B parameters. We train our models on trillions of tokens, and show that it is possible to train state-of-the-art models using publicly available datasets exclusively, without resorting to proprietary and inaccessible datasets. In particular, LLaMA-13B outperforms GPT-3 (175B) on most benchmarks, and LLaMA-65B is competitive with the best models, Chinchilla-70B and PaLM-540B. We release all our models to the research community.

Community

clem

Mar 17, 2023

Amazing model!

osanseviero

Mar 17, 2023

Yes, but would love to get access to it :D

dalnk

Mar 17, 2023

absolutely incredible based on all my testing so far!

VictorSanh

Mar 17, 2023

So appreciative of this work!
Would love to have access hehe

KnutJaegersberg

Mar 18, 2023

It's amazing work. Sharing the weights for research is better than not sharing them at all, but they should also release them under a license that allows commercial usage. The feedback would be educational, because LeCun struggles to see the full potential of LLMs. Multimodality is handy, not key.