This model is a fine-tuned checkpoint of mbart-large-50-many-to-many-mmt, adapted for Siddha Yoga Hindi-to-English translation. The base model was introduced in the paper Multilingual Translation with Extensible Multilingual Pretraining and Finetuning: https://arxiv.org/pdf/2008.00401.pdf

The base mBART-50 model can translate directly between any pair of its supported languages. To translate into a given target language, the target language ID must be forced as the first generated token; this is done by passing the forced_bos_token_id parameter to the model's generate method.

This model was fine-tuned by Nishant Chhetri as part of a Data Science dissertation project at BITS Pilani. Code to use the model for inference:
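The inference code itself is not included in the card, so the following is a minimal sketch using the standard mBART-50 classes from transformers. The checkpoint's Hugging Face repository id is not stated here, so it is left as a parameter rather than guessed; `translate_hi_to_en` is a hypothetical helper name.

```python
from transformers import MBartForConditionalGeneration, MBart50TokenizerFast


def translate_hi_to_en(text: str, model_name: str) -> str:
    """Translate a Hindi sentence to English with an mBART-50 checkpoint.

    model_name is the Hugging Face repository id of the fine-tuned
    checkpoint (not stated in this card, so it is passed in explicitly).
    """
    model = MBartForConditionalGeneration.from_pretrained(model_name)
    tokenizer = MBart50TokenizerFast.from_pretrained(model_name)

    tokenizer.src_lang = "hi_IN"  # mBART-50 language code for Hindi
    encoded = tokenizer(text, return_tensors="pt")
    generated = model.generate(
        **encoded,
        # Force English as the first generated token, as described above.
        forced_bos_token_id=tokenizer.lang_code_to_id["en_XX"],
    )
    return tokenizer.batch_decode(generated, skip_special_tokens=True)[0]
```

Example usage (with the real repo id substituted in): `translate_hi_to_en("नमस्ते", "<repo-id>")`.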

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.001
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 100

Framework versions

  • Transformers 4.33.3
  • Pytorch 2.0.1+cu118
  • Datasets 2.14.5
  • Tokenizers 0.13.3