Was this model trained in Russian?
Could you tell us more? Is this the base model, or has it been tuned for Russian?
I'm sorry, I haven't had time to write a README yet. This is the pre-trained microsoft/Phi-3-mini-128k-instruct model, further trained on a Russian dataset.
The training examples used this format:
"Ниже приведена инструкция, описывающая задачу, в сочетании с вводными данными, обеспечивающими дальнейший контекст.
Напишите ответ, который соответствующим образом завершает запрос.
### Инструкция:
{instruction}
### Контекст:
{input_text}
### Ответ:
{response}"
Training was on the completion only. The model was trained both with and without context.
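For anyone who wants to prompt the model in the same format, here is a minimal Python sketch that assembles the template quoted above. The function name `build_prompt` is hypothetical, the exact line breaks between sections are an assumption, and so is the handling of the no-context case (the "### Контекст:" block is simply left out, since the thread does not say how those examples were formatted).

```python
from typing import Optional

# Preamble lines copied from the template quoted in this thread.
PREAMBLE = [
    "Ниже приведена инструкция, описывающая задачу, в сочетании с вводными "
    "данными, обеспечивающими дальнейший контекст.",
    "Напишите ответ, который соответствующим образом завершает запрос.",
]


def build_prompt(instruction: str, input_text: Optional[str] = None) -> str:
    """Assemble a prompt that ends right after "### Ответ:".

    Assumption: sections are separated by single newlines, and the
    context block is omitted entirely when no context is given.
    """
    lines = list(PREAMBLE)
    lines += ["### Инструкция:", instruction]
    if input_text:
        lines += ["### Контекст:", input_text]
    lines.append("### Ответ:")
    # The model is expected to continue with the response after this point.
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(build_prompt("Переведите на английский.", "Привет, мир!"))
```

The returned string can be passed to whatever generation call you already use; the model's continuation after "### Ответ:" is the response.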
Thank you very much for such a detailed answer! I downloaded your model and am testing it, but it doesn't speak Russian very well. If it isn't confidential, how big was your Russian-language dataset, in gigabytes? And if the dataset is not secret, could you describe its contents?
Yes, of course, the data is not secret. I used a small open dataset, lksy/ru_instruct_gpt4. This is a trial run. I plan to enlarge the dataset in the future, and it will be interesting to track how quality changes on benchmarks.