|
---
language:
- ru
tags:
- PyTorch
- Transformers
thumbnail: "https://github.com/sberbank-ai/ru-gpts"
---
|
|
|
# rugpt3small_based_on_gpt2 safetensors variant
|
The model was trained by the [SberDevices](https://sberdevices.ru/) team with the transformers library, using a sequence length of 1024, on 80B tokens for around 3 epochs. It was then fine-tuned with a context length of 2048.
|
|
|
Total training time was around one week on 32 GPUs.
|
|
|
# Authors |
|
+ NLP core team RnD [Telegram channel](https://t.me/nlpcoreteam):
  + Dmitry Zmitrovich
+ Safetensors variant by [Sashkanik13](https://huggingface.co/Sashkanik13)
|
|