File size: 1,540 Bytes
7bb8205 77e9e49 7bb8205 68a898f 1ad1cb3 ad08461 68a898f d0b3244 68a898f 7f7ca3e |
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 |
---
datasets:
- IlyaGusev/ru_turbo_alpaca
- IlyaGusev/ru_turbo_saiga
- IlyaGusev/ru_sharegpt_cleaned
language:
- ru
pipeline_tag: conversational
---
Colab: [link](https://colab.research.google.com/drive/1IBh4FMJPOGZAkX7DYWnIKdav_ZcKatlP)
v2:
- revision 95876e3d9854e937104f623a5fb7144ca990e8ba
- wandb [link](https://wandb.ai/ilyagusev/rulm_self_instruct/runs/8p3nfjqv/overview)
- 4 datasets: ru_turbo_alpaca, ru_turbo_saiga, ru_sharegpt_cleaned, oasst1_ru_main_branch
- Datasets merging script: [create_chat_set.py](https://github.com/IlyaGusev/rulm/blob/ef58f3d82d6e7b3784d42167ff69188d3766ab61/self_instruct/src/data_processing/create_chat_set.py)
- Loss: 0.942
- Context length: 2000
- Conversational template: `"<s>{role}\n{content}</s>"`
- Possible roles: `["system", "user", "bot"]`
- System prompt: `"Ты — Сайга, русскоязычный автоматический ассистент. Ты разговариваешь с людьми и помогаешь им."`
v1:
- revision 1ad1cb364e3e245a7a376884111e107cfc013911
- wandb [link](https://wandb.ai/ilyagusev/rulm_self_instruct/runs/kx2uytey/overview)
- 3 datasets: ru_turbo_alpaca, ru_turbo_saiga, ru_sharegpt_cleaned
- Loss: 0.883
- Context length: 2000
- Conversational template: `"<start>{role}\n{content} <end>\n"`
- Possible roles: `["system", "user", "bot"]`.
- System prompt: `"Ты — Сайга, русскоязычный автоматический ассистент. Ты разговариваешь с людьми и помогаешь им."` |