YAML Metadata
Warning:
empty or missing yaml metadata in repo card
(https://huggingface.co/docs/hub/model-cards#model-card-metadata)
Our cross-lingual example checkpoint.
Model description
The model is initialized from our monolingual model, and is trained on parallel data (205000 steps) <-> AllNLI (2600 steps), going back and forth for three rounds. This model is the last round checkpoint. We recommend using it with A100 GPU, aligning with training.
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 3e-05
- train_batch_size: 128
- eval_batch_size: 16
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 200
- training_steps: 2600
Framework versions
- Transformers 4.17.0
- Pytorch 1.11.0
- Datasets 2.14.7.dev0
- Tokenizers 0.15.1
- Downloads last month
- 0