cointegrated committed on
Commit
09ac636
1 Parent(s): d65d330

Update README.md

Files changed (1)
  1. README.md +2 -1
README.md CHANGED
@@ -188,4 +188,5 @@ This model was fine-tuned on the [slone/nllb-200-10M-sample](https://huggingface
  the [NLLB dataset](https://huggingface.co/datasets/allenai/nllb) with 175 languages, using only the samples with BLASER score above 3.5.
 
  Because of its small size, it is really bad at translation, but can serve as a base model for further fine-tuning for a small number of languages.
- It is recommended to prune the vocabulary of this model before fine-tuning, to preserve only the tokens used with the intended languages.
+ It is recommended to [prune the vocabulary of this model](https://towardsdatascience.com/how-to-adapt-a-multilingual-t5-model-for-a-single-language-b9f94f3d9c90)
+ before fine-tuning, to preserve only the tokens used with the intended languages.
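
The vocabulary pruning recommended in this change can be illustrated with a toy sketch. It is not the procedure from the linked article and does not use any Hugging Face API: `prune_vocabulary`, the six-token vocabulary, and the list-of-lists embedding table are all hypothetical stand-ins. The idea shown is only the core bookkeeping: find the token ids that occur in text from the intended languages, keep those rows of the embedding table, and renumber the ids densely.

```python
def prune_vocabulary(vocab, embeddings, corpus_ids, always_keep=()):
    """Keep only tokens seen in corpus_ids (plus always_keep) and renumber them.

    vocab:       dict mapping token -> old id (hypothetical toy vocabulary)
    embeddings:  list of rows, indexed by old id
    corpus_ids:  iterable of lists of old ids (tokenized target-language text)
    always_keep: tokens to retain even if unseen (e.g. special tokens)
    """
    # Ids that actually occur in the target-language corpus.
    used = {i for sentence in corpus_ids for i in sentence}
    # Add ids of tokens that must survive regardless of usage.
    used |= {vocab[token] for token in always_keep}

    keep_ids = sorted(used)
    old_to_new = {old: new for new, old in enumerate(keep_ids)}

    id_to_token = {i: token for token, i in vocab.items()}
    new_vocab = {id_to_token[old]: new for old, new in old_to_new.items()}
    # Copy only the surviving embedding rows, in the new id order.
    new_embeddings = [embeddings[old] for old in keep_ids]
    return new_vocab, new_embeddings, old_to_new


# Toy data: an English/French vocabulary, pruned for English-only text.
vocab = {"<pad>": 0, "<unk>": 1, "hello": 2, "bonjour": 3, "world": 4, "monde": 5}
embeddings = [[float(i), float(i) + 0.5] for i in range(len(vocab))]
corpus_ids = [[2, 4], [4, 2, 2]]  # "hello world", "world hello hello"

new_vocab, new_embeddings, old_to_new = prune_vocabulary(
    vocab, embeddings, corpus_ids, always_keep=("<pad>", "<unk>")
)
print(new_vocab)
```

With a real model, the same remapping would also have to be applied to the tokenizer and to any output projection tied to the embedding table; those steps depend on the model and are not shown here.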