arxiv:2504.18225

Even Small Reasoners Should Quote Their Sources: Introducing the Pleias-RAG Model Family

Published on Apr 25 · Submitted by Pclanglais on Apr 28

Abstract

Two new mid-sized reasoning models, Pleias-RAG-350m and Pleias-RAG-1B, perform competitively on RAG benchmarks and support multilingual citation and grounding.

AI-generated summary

We introduce a new generation of small reasoning models for RAG, search, and source summarization. Pleias-RAG-350m and Pleias-RAG-1B are mid-trained on a large synthetic dataset emulating the retrieval of a wide variety of multilingual open sources from the Common Corpus. They provide native support for citation and grounding with literal quotes, and they reintegrate multiple features associated with RAG workflows, such as query routing, query reformulation, and source reranking. Pleias-RAG-350m and Pleias-RAG-1B outperform SLMs below 4 billion parameters on standardized RAG benchmarks (HotPotQA, 2Wiki) and are competitive with popular larger models, including Qwen-2.5-7B, Llama-3.1-8B, and Gemma-3-4B. They are the only SLMs to date that maintain consistent RAG performance across leading European languages and ensure systematic reference grounding for statements. Due to their small size, ease of deployment on constrained infrastructure, and higher factuality by design, the models unlock a range of new use cases for generative AI.

Community

Paper author · Paper submitter

Detailed model paper describing the mid-training recipe of Pleias-RAG-350M (https://huggingface.co/PleIAs/Pleias-RAG-350M) and Pleias-RAG-1B (https://huggingface.co/PleIAs/Pleias-RAG-1B).

Currently the SOTA models in their size range for RAG.
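
As a minimal sketch, the released checkpoints can be loaded as standard causal LMs with the Hugging Face transformers library. The query/source layout below is illustrative only, not the official template; the model cards above document the exact prompt format the models were mid-trained on (including the citation and grounding syntax).

```python
# Illustrative sketch: grounded RAG query against Pleias-RAG-350M.
# The prompt layout (query followed by numbered sources) is an assumption;
# see the model card at PleIAs/Pleias-RAG-350M for the actual template.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "PleIAs/Pleias-RAG-350M"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

query = "Who designed the Eiffel Tower?"
sources = [
    "Source 1: The Eiffel Tower was designed by Gustave Eiffel's engineering "
    "company and built for the 1889 World's Fair in Paris.",
    "Source 2: Construction of the tower began in 1887 and finished in 1889.",
]

# Query first, retrieved sources after; the model is expected to answer
# with literal quotes that cite the source numbers.
prompt = query + "\n\n" + "\n\n".join(sources)

inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:],
                       skip_special_tokens=True))
```

At 350M and 1B parameters the models also run comfortably on CPU-only or edge setups, which is the deployment scenario the paper emphasizes.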


Models citing this paper (2)

Datasets citing this paper (0)


Spaces citing this paper (1)

Collections including this paper (1)