Figure 4: Ablation: Test perplexity over the course of training for the vocabulary swap method on TinyLLAMA. Our method (green curve) vs. TinyLLAMA trained from scratch (blue curve).

The vocabulary swap was done the same way as for our [Czech-GPT-2](https://huggingface.co/BUT-FIT/Czech-GPT-2-XL-133k) model (see its model card for a comprehensive description).

For CSMPT7b, we managed to align 4,177 English tokens with corresponding Czech tokens.
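The alignment idea can be illustrated with a minimal sketch (a hypothetical helper, with plain dicts standing in for the real tokenizer vocabularies; not the exact implementation used here): embedding rows for tokens whose string form exists in both vocabularies are copied from the trained English model, while the remaining target rows are freshly initialised.

```python
import numpy as np

def swap_vocab_embeddings(src_vocab, tgt_vocab, src_emb, init_std=0.02, seed=0):
    """Build a target embedding matrix, copying rows for tokens shared by both vocabularies.

    src_vocab / tgt_vocab: dict mapping token string -> row index.
    src_emb: (len(src_vocab), dim) array of trained source embeddings.
    Returns the new target embedding matrix and the number of aligned tokens.
    """
    rng = np.random.default_rng(seed)
    dim = src_emb.shape[1]
    # unmatched tokens get a random init, as in standard from-scratch training
    tgt_emb = rng.normal(0.0, init_std, size=(len(tgt_vocab), dim))
    aligned = 0
    for tok, tgt_id in tgt_vocab.items():
        src_id = src_vocab.get(tok)
        if src_id is not None:
            tgt_emb[tgt_id] = src_emb[src_id]  # reuse the trained embedding
            aligned += 1
    return tgt_emb, aligned

# toy example: "the" and "a" occur in both vocabularies
src_vocab = {"the": 0, "a": 1, "dog": 2}
tgt_vocab = {"pes": 0, "the": 1, "a": 2}
src_emb = np.arange(12, dtype=float).reshape(3, 4)
tgt_emb, n_aligned = swap_vocab_embeddings(src_vocab, tgt_vocab, src_emb)
```

With real tokenizers, the same loop would run over their `get_vocab()` dicts; only the shared-string matching step is shown here.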

## Hyperparameters

Hyperparameters not mentioned here were kept the same as for MPT.