
Our cross-lingual example checkpoint.

Model description

The model is initialized from our monolingual model and is trained by alternating between parallel data (205,000 steps) and AllNLI (2,600 steps), going back and forth for three rounds; this checkpoint is from the final round. We recommend running it on an A100 GPU, matching the hardware used for training.
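A minimal loading sketch with the Transformers library is given below. The repository id is a placeholder (not the actual repo name), and mean pooling over token embeddings is an assumption about how sentence representations are produced; adjust both to match the actual checkpoint.

```python
import torch
from transformers import AutoModel, AutoTokenizer

# Placeholder repo id; replace with the actual checkpoint name.
model_name = "our-org/cross-lingual-checkpoint"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModel.from_pretrained(model_name)

sentences = ["A cross-lingual example sentence.", "Ein Beispielsatz."]
inputs = tokenizer(sentences, padding=True, truncation=True, return_tensors="pt")

with torch.no_grad():
    outputs = model(**inputs)

# Mean-pool token embeddings into sentence embeddings (an assumption;
# swap in the pooling strategy the checkpoint was actually trained with).
mask = inputs["attention_mask"].unsqueeze(-1).float()
embeddings = (outputs.last_hidden_state * mask).sum(1) / mask.sum(1)
print(embeddings.shape)
```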

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 3e-05
  • train_batch_size: 128
  • eval_batch_size: 16
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 200
  • training_steps: 2600
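As a sketch, the hyperparameters above map onto the Transformers `TrainingArguments` roughly as follows. The output directory is arbitrary, model and dataset wiring are omitted, and `train_batch_size` is mapped to `per_device_train_batch_size` on the assumption of single-device training.

```python
from transformers import TrainingArguments

# Hyperparameters from this card expressed as TrainingArguments.
training_args = TrainingArguments(
    output_dir="checkpoint",          # placeholder path
    learning_rate=3e-05,
    per_device_train_batch_size=128,  # assumes a single device
    per_device_eval_batch_size=16,
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-08,
    lr_scheduler_type="linear",
    warmup_steps=200,
    max_steps=2600,                   # training_steps
)
```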

Framework versions

  • Transformers 4.17.0
  • PyTorch 1.11.0
  • Datasets 2.14.7.dev0
  • Tokenizers 0.15.1