milmor committed
Commit a4946fb
1 Parent(s): aed0a76

Update README.md

Files changed (1)
  1. README.md +3 -3
README.md CHANGED
@@ -61,10 +61,10 @@ Also, we collected 3,000 extra samples from the web to increase the data.
  We employ two training stages using a multilingual T5-small. The advantage of this model is that it can handle different vocabularies and prefixes. T5-small is pre-trained on different tasks and languages (French, Romanian, English, German).

  ### Training-stage 1 (learning Spanish)
- In training stage 1, we first introduce Spanish to the model. The goal is to learn a new language rich in data (Spanish) and not lose the previous knowledge. We use the English-Spanish [Anki](https://www.manythings.org/anki/) dataset, which consists of 118.964 text pairs. Next, we train the model till convergence, adding the prefix "Translate Spanish to English: "
+ In training stage 1, we first introduce Spanish to the model. The goal is to learn a new, data-rich language (Spanish) without losing the previously acquired knowledge. We use the English-Spanish [Anki](https://www.manythings.org/anki/) dataset, which consists of 118,964 text pairs. The model is trained until convergence, adding the prefix "Translate Spanish to English: ".

  ### Training-stage 2 (learning Nahuatl)
- We use the pre-trained Spanish-English model to learn Spanish-Nahuatl. Since the amount of Nahuatl pairs is limited, we also add 20,000 samples from the English-Spanish Anki dataset to our dataset. This two-task training avoids overfitting and makes the model more robust.
+ We use the pre-trained Spanish-English model to learn Spanish-Nahuatl. Since the number of Nahuatl pairs is limited, we also add 20,000 samples from the English-Spanish Anki dataset. This two-task training avoids overfitting and makes the model more robust.

  ### Training setup
  We train the models on the same datasets for 660k steps using batch size = 16 and a learning rate of 2e-5.
@@ -79,7 +79,7 @@ We evaluate the model on the same 505 validation Nahuatl sentences for a fair comparison.
  | True | 1.31 | 6.18 | 28.21 |


- The English-Spanish pretraining improves BLEU and Chrf and leads to faster convergence. Is it possible to reproduce the evaluation on the [eval.ipynb](https://github.com/milmor/spanish-nahuatl-translation/blob/main/eval.ipynb) notebook.
+ The English-Spanish pretraining improves BLEU and chrF and leads to faster convergence. The evaluation can be reproduced in the [eval.ipynb](https://github.com/milmor/spanish-nahuatl-translation/blob/main/eval.ipynb) notebook.

  ## References
  - Colin Raffel, Noam Shazeer, Adam Roberts, Katherine Lee, Sharan Narang, Michael Matena, Yanqi Zhou, Wei Li, and Peter J Liu. 2019. Exploring the limits
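
The README text in the diff above relies on T5's prefix mechanism to select a task. The snippet below is a minimal sketch of prefix-conditioned generation with a T5-small checkpoint through the Hugging Face `transformers` API; the vanilla `t5-small` model id is only a stand-in, since the project's fine-tuned checkpoint is not named in this commit.

```python
# Minimal sketch: prefix-conditioned generation with a T5-small checkpoint.
# "t5-small" is a placeholder; the fine-tuned Spanish-Nahuatl checkpoint
# used by this project is not part of this commit.
from transformers import AutoTokenizer, T5ForConditionalGeneration

model_name = "t5-small"  # assumption: replace with the fine-tuned checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = T5ForConditionalGeneration.from_pretrained(model_name)

# The task is selected with a text prefix, as described in the README.
text = "Translate Spanish to English: muchas gracias por tu ayuda."
inputs = tokenizer(text, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```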
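For Training-stage 1, here is a sketch of turning English-Spanish Anki pairs into prefixed source/target examples. The `spa.txt` filename and the tab-separated English/Spanish column order are assumptions about the Anki download, not details stated in the diff.

```python
# Sketch: build "Translate Spanish to English: " training pairs from an
# Anki-style tab-separated file (English<TAB>Spanish per line).
# The path and column order are assumptions about the downloaded file.
def load_anki_pairs(path="spa.txt"):
    pairs = []
    with open(path, encoding="utf-8") as f:
        for line in f:
            cols = line.rstrip("\n").split("\t")
            if len(cols) < 2:
                continue
            english, spanish = cols[0], cols[1]
            pairs.append(
                {"source": "Translate Spanish to English: " + spanish,
                 "target": english}
            )
    return pairs
```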
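For Training-stage 2, the diff describes mixing the Spanish-Nahuatl pairs with 20,000 English-Spanish Anki samples. The sketch below shows that mixing step; the "Translate Spanish to Nahuatl: " prefix and all variable names are illustrative assumptions rather than details from this commit.

```python
import random

# Sketch: combine the two tasks for stage 2. `nahuatl_pairs` are
# (Spanish, Nahuatl) tuples; `anki_pairs` come from load_anki_pairs() above.
# The "Translate Spanish to Nahuatl: " prefix is an assumed counterpart to
# the English prefix quoted in the README.
def build_stage2_dataset(nahuatl_pairs, anki_pairs, n_anki=20_000, seed=0):
    random.seed(seed)
    mixed = [
        {"source": "Translate Spanish to Nahuatl: " + es, "target": nah}
        for es, nah in nahuatl_pairs
    ]
    mixed += random.sample(anki_pairs, k=min(n_anki, len(anki_pairs)))
    random.shuffle(mixed)
    return mixed
```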
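The training setup quoted in the diff (660k steps, batch size 16, learning rate 2e-5) can be expressed with `Seq2SeqTrainingArguments` as below; this is only one way to realize those hyperparameters, and the original repository may use a different training loop.

```python
from transformers import Seq2SeqTrainingArguments

# Sketch: the hyperparameters quoted in the README. Only the three
# documented values are taken from the commit; everything else is left
# at library defaults, and the output_dir is an assumed path.
training_args = Seq2SeqTrainingArguments(
    output_dir="t5-small-spanish-nahuatl",
    per_device_train_batch_size=16,
    learning_rate=2e-5,
    max_steps=660_000,
)
```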
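The BLEU and chrF columns in the results table correspond to corpus-level scores of the kind `sacrebleu` computes; below is a sketch of scoring predictions against the 505 validation references. The list names are placeholders, and the linked eval.ipynb notebook remains the authoritative evaluation.

```python
import sacrebleu

# Sketch: corpus-level BLEU and chrF, as reported in the results table.
# `hypotheses` and `references` are placeholder lists of 505 strings each.
def score(hypotheses, references):
    bleu = sacrebleu.corpus_bleu(hypotheses, [references])
    chrf = sacrebleu.corpus_chrf(hypotheses, [references])
    return {"bleu": bleu.score, "chrf": chrf.score}
```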