license: apache-2.0
base_model: t5-small
tags:
  - generated_from_trainer
metrics:
  - bleu
model-index:
  - name: my_awesome_english_to_nepali_tst
    results: []

my_awesome_english_to_nepali_tst

This model is a fine-tuned version of t5-small. The training dataset was not recorded in this card. After the final (10th) training epoch, it achieves the following results on the evaluation set:

  • Loss: 1.7758
  • Bleu: 4.076
  • Gen Len: 17.595
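As a rough illustration of what Gen Len measures, the sketch below averages the number of non-padding tokens per generated sequence. It assumes the metric was computed the way Trainer-based translation examples usually do it; the card itself does not say so. The helper name `mean_generation_length` and the toy token ids are hypothetical.

```python
# Hypothetical helper: mean number of non-padding tokens per generated sequence.
PAD_TOKEN_ID = 0  # T5 uses id 0 for padding; adjust for other tokenizers


def mean_generation_length(predictions, pad_token_id=PAD_TOKEN_ID):
    """Average count of non-pad token ids across generated sequences."""
    if not predictions:
        raise ValueError("predictions must not be empty")
    lengths = [sum(1 for tok in seq if tok != pad_token_id) for seq in predictions]
    return sum(lengths) / len(lengths)


# Toy batch of padded token-id sequences (made-up ids, not real model output).
toy_batch = [
    [0, 71, 1712, 5, 1, 0, 0],   # 4 non-pad tokens
    [0, 37, 1782, 19, 1389, 1],  # 5 non-pad tokens
]
print(mean_generation_length(toy_batch))  # 4.5
```

Under that assumption, a Gen Len of 17.595 would mean the model emitted about 17.6 tokens per evaluation example on average.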

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 32
  • eval_batch_size: 32
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 10
  • mixed_precision_training: Native AMP
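The listed hyperparameters could be expressed as `Seq2SeqTrainingArguments` roughly as follows. This is a sketch, not the original training script: it assumes a single device (so the per-device batch size equals the reported batch size), and `output_dir` and `predict_with_generate` are assumptions that do not appear in the list. `build_training_args` is a hypothetical helper.

```python
# Values copied from the hyperparameter list; keys are the
# Seq2SeqTrainingArguments fields they most plausibly map to.
TRAINING_KWARGS = {
    "output_dir": "my_awesome_english_to_nepali_tst",  # assumed: reuses the model name
    "learning_rate": 2e-5,
    "per_device_train_batch_size": 32,  # assumes training on one device
    "per_device_eval_batch_size": 32,
    "seed": 42,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 10,
    "fp16": True,  # "Native AMP"; requires a CUDA device
    "predict_with_generate": True,  # assumed: needed to score BLEU on generated text
}


def build_training_args(**overrides):
    """Build the arguments lazily so this file imports without transformers."""
    from transformers import Seq2SeqTrainingArguments

    return Seq2SeqTrainingArguments(**{**TRAINING_KWARGS, **overrides})
```

On a machine without a GPU, calling `build_training_args(fp16=False)` avoids the mixed-precision requirement.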

Training results

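| Training Loss | Epoch | Step | Validation Loss | Bleu   | Gen Len |
|---------------|-------|------|-----------------|--------|---------|
| No log        | 1.0   | 32   | 1.8790          | 3.86   | 17.665  |
| No log        | 2.0   | 64   | 1.8311          | 4.0878 | 17.645  |
| No log        | 3.0   | 96   | 1.8105          | 4.0976 | 17.615  |
| No log        | 4.0   | 128  | 1.7988          | 4.1081 | 17.615  |
| No log        | 5.0   | 160  | 1.7911          | 4.057  | 17.625  |
| No log        | 6.0   | 192  | 1.7854          | 4.0552 | 17.61   |
| No log        | 7.0   | 224  | 1.7812          | 4.0714 | 17.61   |
| No log        | 8.0   | 256  | 1.7780          | 4.085  | 17.595  |
| No log        | 9.0   | 288  | 1.7764          | 4.076  | 17.595  |
| No log        | 10.0  | 320  | 1.7758          | 4.076  | 17.595  |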
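A few quantities can be read off the results table with simple arithmetic. The sketch below finds the best epoch by BLEU and by validation loss, bounds the training-set size, and evaluates a linear learning-rate schedule. The size bounds assume one device, no gradient accumulation, and a partial final batch that is kept. The schedule assumes zero warmup steps, since none are listed. `linear_lr` is a hypothetical helper, not a library function.

```python
# (epoch, step, validation_loss, bleu, gen_len) copied from the results table.
RESULTS = [
    (1, 32, 1.8790, 3.86, 17.665),
    (2, 64, 1.8311, 4.0878, 17.645),
    (3, 96, 1.8105, 4.0976, 17.615),
    (4, 128, 1.7988, 4.1081, 17.615),
    (5, 160, 1.7911, 4.057, 17.625),
    (6, 192, 1.7854, 4.0552, 17.61),
    (7, 224, 1.7812, 4.0714, 17.61),
    (8, 256, 1.7780, 4.085, 17.595),
    (9, 288, 1.7764, 4.076, 17.595),
    (10, 320, 1.7758, 4.076, 17.595),
]
BATCH_SIZE = 32
BASE_LR = 2e-5

steps_per_epoch = RESULTS[0][1]
total_steps = RESULTS[-1][1]

best_bleu = max(RESULTS, key=lambda row: row[3])
lowest_loss = min(RESULTS, key=lambda row: row[2])

# With 32 optimizer steps per epoch at batch size 32, the training set has
# between 31 * 32 + 1 and 32 * 32 examples (under the stated assumptions).
min_examples = (steps_per_epoch - 1) * BATCH_SIZE + 1
max_examples = steps_per_epoch * BATCH_SIZE


def linear_lr(step, base_lr=BASE_LR, total=total_steps, warmup=0):
    """Linear warmup followed by linear decay to zero at `total` steps."""
    if step < warmup:
        return base_lr * step / max(1, warmup)
    return base_lr * max(0.0, (total - step) / max(1, total - warmup))


print("best BLEU at epoch", best_bleu[0])              # 4
print("lowest validation loss at epoch", lowest_loss[0])  # 10
print("training examples between", min_examples, "and", max_examples)  # 993 and 1024
print("learning rate at step 160:", linear_lr(160))
```

The reported headline numbers come from the final epoch, while the table shows that BLEU peaked at epoch 4 (4.1081) even as validation loss kept falling through epoch 10.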

Framework versions

  • Transformers 4.39.3
  • Pytorch 2.1.2
  • Datasets 2.18.0
  • Tokenizers 0.15.2