arXiv:2402.16819
Nemotron-4 15B Technical Report
Published on Feb 26, 2024 · Featured in Daily Papers on Feb 27, 2024
Abstract
We introduce Nemotron-4 15B, a 15-billion-parameter large multilingual language model trained on 8 trillion text tokens. Nemotron-4 15B demonstrates strong performance when assessed on English, multilingual, and coding tasks: it outperforms all existing similarly sized open models on 4 of 7 downstream evaluation areas and achieves performance competitive with the leading open models in the remaining ones. Specifically, Nemotron-4 15B exhibits the best multilingual capabilities of all similarly sized models, even outperforming models over four times larger as well as those explicitly specialized for multilingual tasks.
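As a hedged sketch of how a 15-billion-parameter causal language model like this would typically be loaded and queried with Hugging Face Transformers: the abstract does not state that Nemotron-4 15B weights are published on the Hub, so the repository name below is a hypothetical placeholder, not a confirmed model id.

# Hypothetical sketch: loading a Nemotron-4-15B-style checkpoint with
# Hugging Face Transformers. The repository name is a placeholder; the
# paper does not confirm that weights are available on the Hub.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "nvidia/nemotron-4-15b"  # hypothetical repository name

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # ~30 GB of weights for 15B params in bf16
    device_map="auto",           # shard across available GPUs
)

# The paper highlights multilingual strength, so a translation-style
# prompt is a natural smoke test.
prompt = "Translate to German: The weather is nice today."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))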