mfajcik committed
Commit 7531152
1 Parent(s): 7fc10ae

Update README.md

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -57,7 +57,7 @@ Figure 3: Test loss closeup, testing performed on split of internal-corpus #1. S
  ### Vocabulary Swap
  To transfer knowledge from the English model to Czech, we developed a simple method that (i) aligns several tokens between the two vocabularies and (ii) copies the embeddings from the original language to the new one.
  <img src="figures/tllama_test.png" width="900"/>
- Figure 4: Test perplexity over the course of training for the vocabulary swap method on TinyLLAMA. Our method (green curve) vs. TinyLLAMA trained from scratch (blue curve).
+ Figure 4: Test perplexity over the course of training for the vocabulary swap method (swapping 1.7K tokens) on TinyLLAMA. Our method (green curve) vs. TinyLLAMA trained from scratch (blue curve).

  The vocabulary swap was done the same way as in our [Czech-GPT-2](https://huggingface.co/BUT-FIT/Czech-GPT-2-XL-133k) model (see that model card for a comprehensive description).
  For CSMPT7b, we managed to align 4,177 English tokens with corresponding Czech tokens.
 
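Since the diff only describes the two steps in words, here is a minimal sketch of a vocabulary swap, assuming Hugging Face `transformers` tokenizers and a causal LM. The checkpoint names are placeholders, and exact surface-form matching stands in for whatever alignment procedure the authors actually used; this is an illustration, not the released training code.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder checkpoints: NOT the actual models used for CSMPT7b.
src_tok = AutoTokenizer.from_pretrained("english-base-model")   # original (English) vocabulary
tgt_tok = AutoTokenizer.from_pretrained("czech-tokenizer")      # new (Czech) vocabulary
model = AutoModelForCausalLM.from_pretrained("english-base-model")

src_vocab = src_tok.get_vocab()   # token string -> id (English)
tgt_vocab = tgt_tok.get_vocab()   # token string -> id (Czech)

# (i) Align tokens: here, tokens whose surface form appears in both vocabularies.
shared = set(src_vocab) & set(tgt_vocab)

# Keep a copy of the original embedding matrix before resizing.
old_emb = model.get_input_embeddings().weight.data.clone()

# Resize the embedding matrix to the new vocabulary size, re-initialize it,
# then (ii) copy over the embeddings of the aligned tokens.
# (A full implementation would also handle untied output embeddings.)
model.resize_token_embeddings(len(tgt_vocab))
new_emb = model.get_input_embeddings().weight.data
new_emb.normal_(mean=0.0, std=0.02)   # fresh init for unmatched tokens
for tok in shared:
    new_emb[tgt_vocab[tok]] = old_emb[src_vocab[tok]]

print(f"Copied embeddings for {len(shared)} aligned tokens.")
```

The aligned rows give the new-language model a warm start on tokens it shares with the source vocabulary, while the remaining rows are learned from scratch during continued pretraining, which is consistent with the perplexity gap shown in Figure 4.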