alvis44/Phi-3-mini-128k-instruct-RU · Was this model trained in Russian?

Jun 19, 2024

Could you tell us more, is this a basic model or is it configured for Russian?

Owner Jun 20, 2024

I'm sorry, I haven't had time to write a README yet. This is the microsoft/Phi-3-mini-128k-instruct model fine-tuned on a Russian dataset.
The prompt format is:
"Ниже приведена инструкция, описывающая задачу, в сочетании с вводными данными, обеспечивающими дальнейший контекст.
Напишите ответ, который соответствующим образом завершает запрос.

### Инструкция:
{instruction}

### Контекст:
{input_text}

### Ответ:
{response}"
It was trained only on the completion (the response part). The training examples included prompts both with and without the context section.
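
For anyone who wants to try the format above, here is a minimal sketch of a prompt builder in Python. The function name `build_prompt` is hypothetical, and the handling of the no-context case (dropping the whole "### Контекст:" section while keeping the same header) is an assumption; the thread does not say exactly how the context-free examples were laid out.

```python
# English gloss of the header: "Below is an instruction describing a task,
# paired with input that provides further context. Write a response that
# appropriately completes the request."
HEADER = (
    "Ниже приведена инструкция, описывающая задачу, в сочетании с вводными "
    "данными, обеспечивающими дальнейший контекст.\n"
    "Напишите ответ, который соответствующим образом завершает запрос.\n"
)


def build_prompt(instruction, input_text=None):
    """Build a prompt that ends right after '### Ответ:' so the model completes it.

    Assumption: when there is no context, the '### Контекст:' section is
    omitted entirely. The thread does not confirm this layout.
    """
    parts = [HEADER, f"### Инструкция:\n{instruction}\n"]
    if input_text:
        parts.append(f"### Контекст:\n{input_text}\n")
    parts.append("### Ответ:\n")
    # Joining with "\n" puts one blank line between sections, as in the template.
    return "\n".join(parts)


if __name__ == "__main__":
    print(build_prompt("Переведите текст на русский язык.", "Good morning!"))
```

The returned string stops at the response marker, so whatever the model generates next plays the role of `{response}` in the template.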

ganser4566

Jun 20, 2024

Thank you very much for such a detailed answer! I downloaded your model and am testing it, but it doesn't speak Russian very well. If it's not confidential, how big was your Russian-language dataset? I mean its size in gigabytes. And if the dataset is not secret, could you describe its contents?

alvis44

Owner Jun 21, 2024

Yes, of course, the data is not secret. I used a small open dataset, lksy/ru_instruct_gpt4. This is a trial run. I plan to enlarge the dataset in the future, and it will be interesting to track how quality improves on benchmarks.