Multilingual Support

#65
by abrehmaaan - opened

Hello community,

Has anyone used Mistral 7B for RAG with documents in languages other than English? I have tried, but the responses it gives are unsatisfactory.

Many popular LLMs are trained on publicly available web data, which is predominantly English text.

Because of this, a tokenizer learned from that pre-training data does not represent other languages well: non-English text tends to be split into many more tokens per word than English text.
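One rough way to check this is to compare tokens per word ("fertility") across languages. Below is a minimal sketch, not a measurement of Mistral itself: `toy_tokenize` and `TOY_VOCAB` are hypothetical stand-ins for a tokenizer with an English-only vocabulary that falls back to raw UTF-8 bytes for anything it does not know.

```python
from typing import Callable, List


def fertility(tokenize: Callable[[str], List[str]], text: str) -> float:
    """Average number of tokens per whitespace-separated word."""
    words = text.split()
    if not words:
        return 0.0
    return len(tokenize(text)) / len(words)


# Hypothetical toy vocabulary: only a handful of English words are known.
TOY_VOCAB = {"the", "weather", "is", "nice", "today"}


def toy_tokenize(text: str) -> List[str]:
    """Known words become one token; unknown words fall back to UTF-8 bytes."""
    tokens: List[str] = []
    for word in text.split():
        if word.lower() in TOY_VOCAB:
            tokens.append(word)
        else:
            # Byte fallback: one token per byte of the unknown word.
            tokens.extend(f"<0x{b:02X}>" for b in word.encode("utf-8"))
    return tokens


english = "The weather is nice today"
hindi = "आज मौसम अच्छा है"

en_fertility = fertility(toy_tokenize, english)
hi_fertility = fertility(toy_tokenize, hindi)

print(f"English tokens per word: {en_fertility:.2f}")
print(f"Hindi tokens per word:   {hi_fertility:.2f}")
```

To try this against a real model, you could pass the `tokenize` method of a Hugging Face tokenizer (for example one loaded with `AutoTokenizer.from_pretrained`) in place of `toy_tokenize`. A much higher value for your language than for English suggests the context window and the retrieved RAG chunks are being spent less efficiently.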

I have a similar requirement for Indic languages.