
# rugpt3small_based_on_gpt2

The model was trained with a sequence length of 1024, using the Transformers library, by the SberDevices team on 80B tokens for around 3 epochs. After that, the model was fine-tuned with a context length of 2048.

Total training time was around one week on 32 GPUs.
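A minimal sketch of loading the model for text generation with the Transformers library follows. The model ID is taken from this card; the prompt and sampling parameters are illustrative, not from the card:

```python
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

# Model ID as listed on this card
model_id = "sberbank-ai/rugpt3small_based_on_gpt2"

tokenizer = GPT2Tokenizer.from_pretrained(model_id)
model = GPT2LMHeadModel.from_pretrained(model_id)
model.eval()

# Russian-language prompt; sampling settings below are illustrative
prompt = "Александр Сергеевич Пушкин родился в "
input_ids = tokenizer.encode(prompt, return_tensors="pt")

with torch.no_grad():
    output = model.generate(
        input_ids,
        max_length=50,
        do_sample=True,
        top_k=50,
        top_p=0.95,
    )

print(tokenizer.decode(output[0], skip_special_tokens=True))
```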
