ai-forever committed on
Commit aa2b602
1 Parent(s): 9e3c488

add model card

Files changed (1)
  1. README.md +14 -0
README.md ADDED
@@ -0,0 +1,14 @@
+ ---
+ language:
+ - ru
+ tags:
+ - PyTorch
+ - Transformers
+ thumbnail: "https://github.com/sberbank-ai/ru-gpts"
+ ---
+
+ # rugpt3large_based_on_gpt2
+ The model was trained with a sequence length of 1024, using the Transformers library, by the [SberDevices](https://sberdevices.ru/) team on 80B tokens for 3 epochs. It was then fine-tuned for 1 epoch with a sequence length of 2048.
+
+ Total training time was around 14 days on 128 GPUs for the 1024-token context and a few days on 16 GPUs for the 2048-token context.
+ Final perplexity on the test set is `13.6`.
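A minimal usage sketch with the Transformers library, assuming the checkpoint is published under the `ai-forever/rugpt3large_based_on_gpt2` model id; the id, prompt, and generation settings below are illustrative and not part of this commit:

```python
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

# Assumed model id; adjust if the repository is named differently.
model_id = "ai-forever/rugpt3large_based_on_gpt2"

tokenizer = GPT2Tokenizer.from_pretrained(model_id)
model = GPT2LMHeadModel.from_pretrained(model_id)
model.eval()

# Illustrative Russian prompt, since the model is trained on Russian text.
prompt = "Александр Сергеевич Пушкин родился в"
input_ids = tokenizer.encode(prompt, return_tensors="pt")

with torch.no_grad():
    output = model.generate(
        input_ids,
        max_length=50,
        do_sample=True,
        top_k=50,
        top_p=0.95,
    )

print(tokenizer.decode(output[0], skip_special_tokens=True))
```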