
Chatrag-Deberta is a small, lightweight classification model that predicts whether a question should trigger retrieval of additional information with RAG or not.

Chatrag-Deberta is based on DeBERTa-v3-large, a 304M-parameter encoder-only model. Its initial version was fine-tuned on 20,000 example questions annotated by Mistral 7B.

## Use

A typical example of inference with Chatrag-Deberta is provided in the Google Colab demo or in inference_chatrag.py.
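
For reference, a minimal inference sketch using the transformers library; the `rag_probability` helper name is illustrative, and the assumption that index 1 of the classification head is the RAG class should be checked against the model's `id2label` config:

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_id = "AgentPublic/chatrag-deberta"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)
model.eval()

def rag_probability(query: str) -> float:
    """Return the model's probability that `query` requires RAG."""
    inputs = tokenizer(query, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(**inputs).logits
    # Assumption: index 1 of the classifier head is the "RAG" class;
    # verify against model.config.id2label.
    return torch.softmax(logits, dim=-1)[0, 1].item()

print(rag_probability("Comment puis-je renouveler un passeport ?"))
```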

For every submitted text, Chatrag-Deberta outputs the probability that the query requires RAG.

This makes it possible to tune an activation threshold depending on whether more or less RAG is desirable in the system.
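
As an illustration, a thresholded decision could be layered on top of the `rag_probability` sketch above; the 0.5 default is an arbitrary placeholder, not a value recommended by the model's authors:

```python
def should_use_rag(query: str, threshold: float = 0.5) -> bool:
    """Trigger retrieval only when the predicted RAG probability clears
    the activation threshold. Raise the threshold to retrieve less
    often, lower it to retrieve more often."""
    return rag_probability(query) >= threshold
```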

| Query | RAG probability | Result |
|---|---|---|
| Comment puis-je renouveler un passeport ? ("How can I renew a passport?") | 0.988455 | RAG |
| Combien font deux et deux ? ("What is two plus two?") | 0.041475 | No-RAG |
| Écris un début de lettre de recommandation pour la Dinum ("Write the opening of a recommendation letter for Dinum") | 0.103086 | No-RAG |
