Was this model trained in Russian?

#1
by ganser4566 - opened

Could you tell us more? Is this a base model, or has it been adapted for Russian?

Owner

I'm sorry, I haven't had time to write a README yet. This is the pre-trained microsoft/Phi-3-mini-128k-instruct model, further trained on a Russian dataset.
The training examples used this format:
"Ниже приведена инструкция, описывающая задачу, в сочетании с вводными данными, обеспечивающими дальнейший контекст.
Напишите ответ, который соответствующим образом завершает запрос.

### Инструкция:
{instruction}

### Контекст:
{input_text}

### Ответ:
{response}"
The format is used only for completion. The model was trained both with and without context.
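
For anyone who wants to try the format above, here is a minimal Python sketch that builds such a prompt. The names `HEADER` and `build_prompt` are hypothetical, and dropping the "### Контекст:" block when there is no context is an assumption: the thread only says the model was trained both with and without context, not how the no-context prompt looks. The prompt stops after "### Ответ:" so that the model writes the response.

```python
# Header text copied from the format in this thread. In English, roughly:
# "Below is an instruction that describes a task, paired with input data
# that provides further context. Write a response that appropriately
# completes the request."
HEADER = (
    "Ниже приведена инструкция, описывающая задачу, в сочетании с вводными данными, "
    "обеспечивающими дальнейший контекст.\n"
    "Напишите ответ, который соответствующим образом завершает запрос.\n"
)


def build_prompt(instruction: str, input_text: str = "") -> str:
    """Build a completion prompt in the instruction/context/response format.

    Assumption: when input_text is empty, the context block is simply omitted.
    """
    parts = [HEADER, f"### Инструкция:\n{instruction}\n"]
    if input_text:
        parts.append(f"### Контекст:\n{input_text}\n")
    # Leave the response section open for the model to complete.
    parts.append("### Ответ:\n")
    return "\n".join(parts)


if __name__ == "__main__":
    print(build_prompt("Переведите текст на английский.", "Доброе утро!"))
```

The resulting string can be passed as plain text to whatever generation call you use; no chat template is applied in this sketch.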

Thank you very much for such a detailed answer! I downloaded your model and am testing it, but it doesn't speak Russian very well. If it's not confidential, how big was your Russian-language dataset? I mean the volume in gigabytes. And if the dataset is not secret, could you say what it contains?

Owner

Yes, of course, the data is not secret. I used a small open dataset, lksy/ru_instruct_gpt4. This is a trial run. I plan to enlarge the dataset in the future, and it will be interesting to see how quality on benchmarks improves.
