ai-forever
committed on
Commit ad40cd4 • 1 Parent(s): 53d8b4c
Upload README.md with huggingface_hub
README.md
CHANGED
@@ -25,7 +25,7 @@ It's one of the models derived from the base [mGPT-XL (1.3B)](https://huggingfac
 
 We've found additional data for 23 languages most of which are considered as minor and decided to further tune the base model. **Buryat mGPT 1.3B** was trained for another 1000 steps with batch_size=4 and context window of **2048** tokens on 1 A100.
 
-Final perplexity for this model on validation is 17.63
+Final perplexity for this model on validation is **17.63**.
 
 ![](https://i.imgur.com/v0x3Lxe.png)
 
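To put the numbers quoted in the README into context, here is a minimal sketch of the arithmetic behind them. It is not taken from the model's training code. It assumes that the reported perplexity is the exponential of the mean per-token cross-entropy measured in nats, that `batch_size=4` is the full batch per optimizer step (no gradient accumulation), and that every sequence fills the 2048-token window. The function names are hypothetical and chosen only for illustration.

```python
import math


def loss_from_perplexity(perplexity: float) -> float:
    """Mean per-token cross-entropy (nats) implied by a perplexity value.

    Assumes perplexity = exp(mean negative log-likelihood).
    """
    return math.log(perplexity)


def perplexity_from_loss(mean_nll: float) -> float:
    """Inverse of loss_from_perplexity under the same assumption."""
    return math.exp(mean_nll)


def tokens_seen_upper_bound(steps: int, batch_size: int, context_window: int) -> int:
    """Upper bound on tokens processed during fine-tuning.

    Assumes no gradient accumulation and that every sequence is
    packed to the full context window.
    """
    return steps * batch_size * context_window


# Values quoted in the README text of this diff.
reported_perplexity = 17.63
implied_loss = loss_from_perplexity(reported_perplexity)
max_tokens = tokens_seen_upper_bound(steps=1000, batch_size=4, context_window=2048)

print(f"implied validation loss: {implied_loss:.2f} nats/token")
print(f"tokens seen (upper bound): {max_tokens:,}")
```

Under these assumptions the fine-tuning run saw at most about 8.2 million tokens. The real figure is lower if sequences were padded rather than packed.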