
tmp

This model is a fine-tuned version of google/mt5-small on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: nan
  • Bleu: 0.0099
  • Gen Len: 3.3917

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 1024
  • eval_batch_size: 1024
  • seed: 13
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 2048
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 20.0
  • mixed_precision_training: Native AMP
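
The relationship between the batch-size entries and the linear schedule can be checked with a little arithmetic. The following is a minimal sketch, not the training code used for this model: it assumes a single device and no warmup (neither is stated in this card), and the helper names `total_train_batch_size` and `linear_lr` are hypothetical.

```python
# Values copied from the hyperparameter list above.
LEARNING_RATE = 5e-05
TRAIN_BATCH_SIZE = 1024
GRADIENT_ACCUMULATION_STEPS = 2
# The results table logs one optimizer step per epoch for 20 epochs.
TOTAL_OPTIMIZER_STEPS = 20


def total_train_batch_size(per_step_batch: int,
                           accumulation_steps: int,
                           num_devices: int = 1) -> int:
    """Effective number of examples per optimizer update.

    num_devices=1 is an assumption; the card does not state the device count.
    """
    return per_step_batch * accumulation_steps * num_devices


def linear_lr(step: int, base_lr: float, total_steps: int,
              warmup_steps: int = 0) -> float:
    """Learning rate under a linear schedule that decays to zero.

    warmup_steps=0 is an assumption; no warmup is listed in the card.
    """
    if warmup_steps > 0 and step < warmup_steps:
        # Linear ramp from 0 up to base_lr during warmup.
        return base_lr * step / warmup_steps
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    print(total_train_batch_size(TRAIN_BATCH_SIZE, GRADIENT_ACCUMULATION_STEPS))
    for step in (0, 10, 20):
        print(step, linear_lr(step, LEARNING_RATE, TOTAL_OPTIMIZER_STEPS))
```

Under these assumptions, 1024 examples per step times 2 accumulation steps gives the listed total of 2048, and the learning rate falls linearly from 5e-05 at step 0 to zero at step 20.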

Training results

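| Training Loss | Epoch | Step | Validation Loss | Bleu   | Gen Len |
|---------------|-------|------|-----------------|--------|---------|
| No log        | 1.0   | 1    | nan             | 0.0114 | 3.3338  |
| No log        | 2.0   | 2    | nan             | 0.0114 | 3.3338  |
| No log        | 3.0   | 3    | nan             | 0.0114 | 3.3338  |
| No log        | 4.0   | 4    | nan             | 0.0114 | 3.3338  |
| No log        | 5.0   | 5    | nan             | 0.0114 | 3.3338  |
| No log        | 6.0   | 6    | nan             | 0.0114 | 3.3338  |
| No log        | 7.0   | 7    | nan             | 0.0114 | 3.3338  |
| No log        | 8.0   | 8    | nan             | 0.0114 | 3.3338  |
| No log        | 9.0   | 9    | nan             | 0.0114 | 3.3338  |
| No log        | 10.0  | 10   | nan             | 0.0114 | 3.3338  |
| No log        | 11.0  | 11   | nan             | 0.0114 | 3.3338  |
| No log        | 12.0  | 12   | nan             | 0.0114 | 3.3338  |
| No log        | 13.0  | 13   | nan             | 0.0114 | 3.3338  |
| No log        | 14.0  | 14   | nan             | 0.0114 | 3.3338  |
| No log        | 15.0  | 15   | nan             | 0.0114 | 3.3338  |
| No log        | 16.0  | 16   | nan             | 0.0114 | 3.3338  |
| No log        | 17.0  | 17   | nan             | 0.0114 | 3.3338  |
| No log        | 18.0  | 18   | nan             | 0.0114 | 3.3338  |
| No log        | 19.0  | 19   | nan             | 0.0114 | 3.3338  |
| No log        | 20.0  | 20   | nan             | 0.0114 | 3.3338  |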
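
The validation loss is nan at every logged step, and Bleu and Gen Len never change, which suggests the run had already diverged by the first evaluation. The sketch below shows one way to detect this from logged values; the helper `first_non_finite_step` and the `(step, validation_loss)` row layout are hypothetical and not part of any library. One possible cause, which this card does not confirm, is numeric overflow under mixed-precision training, an issue commonly reported for T5-family models in fp16.

```python
import math
from typing import Iterable, Optional, Tuple


def first_non_finite_step(rows: Iterable[Tuple[int, float]]) -> Optional[int]:
    """Return the first step whose validation loss is nan or infinite.

    Returns None when every logged loss is finite.
    """
    for step, validation_loss in rows:
        if not math.isfinite(validation_loss):
            return step
    return None


# Rows mirroring the results table: steps 1..20, all with a nan validation loss.
logged_rows = [(step, float("nan")) for step in range(1, 21)]

if __name__ == "__main__":
    print(first_non_finite_step(logged_rows))
```

A check like this could be run after each evaluation so that a diverged run is stopped early instead of training for all 20 epochs.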

Framework versions

  • Transformers 4.8.2
  • Pytorch 1.8.1+cu111
  • Datasets 1.9.0
  • Tokenizers 0.10.3
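
To reproduce the run, it may help to compare the local environment against the versions listed above. This is a minimal sketch using only the Python standard library (3.8 or later); the `compare_versions` helper is hypothetical, and the distribution names are assumed to be the usual PyPI names (`torch` for Pytorch).

```python
from importlib.metadata import PackageNotFoundError, version

# Versions copied from the list above, keyed by assumed PyPI distribution name.
EXPECTED_VERSIONS = {
    "transformers": "4.8.2",
    "torch": "1.8.1+cu111",
    "datasets": "1.9.0",
    "tokenizers": "0.10.3",
}


def compare_versions(expected):
    """Map each package to (expected, installed, matches).

    installed is None when the package is not installed.
    """
    report = {}
    for name, wanted in expected.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            installed = None
        report[name] = (wanted, installed, installed == wanted)
    return report


if __name__ == "__main__":
    for name, (wanted, installed, ok) in compare_versions(EXPECTED_VERSIONS).items():
        status = "ok" if ok else "differs"
        print(f"{name}: expected {wanted}, installed {installed} ({status})")
```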