t5_large_baseline

This model is a fine-tuned version of t5-large on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 0.0010
  • Rouge1: 99.8958
  • Rouge2: 99.8696
  • Rougel: 99.8958
  • Rougelsum: 99.8958
  • Gen Len: 46.715
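
The ROUGE figures above are reported on a 0 to 100 scale. As a rough illustration of what Rouge1 measures, the sketch below computes a unigram-overlap F1 score in plain Python. It is a simplification under stated assumptions (lowercasing, whitespace tokenization, no stemming), and the function name `rouge1_f1` is hypothetical; it is not the scorer that produced the numbers in this card.

```python
from collections import Counter


def rouge1_f1(prediction: str, reference: str) -> float:
    """Simplified ROUGE-1 F1 on a 0-100 scale (illustrative only)."""
    # Assumption: lowercase + whitespace split stands in for real tokenization.
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    if not pred_tokens or not ref_tokens:
        return 0.0
    # Clipped unigram overlap: a token counts at most as often as it
    # appears in both the prediction and the reference.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 100.0 * 2 * precision * recall / (precision + recall)


# Hypothetical strings, chosen only to exercise the function.
print(rouge1_f1("the cat sat", "the cat sat on the mat"))
```

Under this simplified definition, a score close to 100 means the generated text shares nearly all of its unigrams with the reference, in both directions.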

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • optimizer: Adafactor
  • lr_scheduler_type: linear
  • num_epochs: 3.0
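
To make the `linear` scheduler setting concrete, here is a minimal sketch of a linearly decaying learning rate in plain Python. It assumes zero warmup steps (the card lists none) and a total of 450 optimizer steps, taken from the last row of the training results; `linear_lr` is a hypothetical helper, not a library function.

```python
def linear_lr(step: int, base_lr: float = 5e-05,
              total_steps: int = 450, warmup_steps: int = 0) -> float:
    """Learning rate at a given optimizer step under linear warmup and decay."""
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr during warmup.
        return base_lr * step / max(1, warmup_steps)
    # Linear decay from base_lr down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Learning rate at the end of each epoch (assumed 150 steps per epoch).
for step in (0, 150, 300, 450):
    print(step, linear_lr(step))
```

If the run did use warmup, passing a non-zero `warmup_steps` changes only the early part of the curve; the decay still reaches zero at the final step.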

Training results

Training Loss  Epoch  Step  Validation Loss  Rouge1   Rouge2   Rougel   Rougelsum  Gen Len
0.9852         0.33   50    0.1098           55.1421  49.8248  54.4294  54.7377    19.0
0.1186         0.67   100   0.0176           58.0994  54.8973  57.7383  57.9538    19.0
0.0417         1.0    150   0.0057           58.3685  55.7353  58.279   58.2729    19.0
0.0225         1.33   200   0.0029           58.8981  56.2457  58.8202  58.7906    19.0
0.0131         1.67   250   0.0024           58.8439  56.2535  58.7557  58.7218    19.0
0.0112         2.0    300   0.0013           58.9538  56.4749  58.9322  58.8817    19.0
0.0077         2.33   350   0.0013           58.9538  56.4749  58.9322  58.8817    19.0
0.0043         2.67   400   0.0010           59.0124  56.5806  58.9867  58.9342    19.0
0.0052         3.0    450   0.0010           59.0402  56.6982  59.0385  58.986     19.0
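
The Epoch and Step columns also allow a back-of-the-envelope estimate of the training set size. The sketch below assumes a single device, no gradient accumulation, and that the last partial batch is kept (none of these is stated in this card), so each optimizer step consumes one batch of up to 4 examples. The function and variable names are illustrative.

```python
import math
from typing import Tuple


def examples_per_epoch_range(steps_per_epoch: int, batch_size: int) -> Tuple[int, int]:
    """Smallest and largest dataset sizes that need exactly this many steps."""
    # The last batch may be partial, so the size is only known up to a range.
    smallest = (steps_per_epoch - 1) * batch_size + 1
    largest = steps_per_epoch * batch_size
    return smallest, largest


total_steps, num_epochs, train_batch_size = 450, 3, 4
steps_per_epoch = total_steps // num_epochs  # consistent with epoch 1.0 at step 150
low, high = examples_per_epoch_range(steps_per_epoch, train_batch_size)
print(f"{steps_per_epoch} steps/epoch -> between {low} and {high} training examples")

# Sanity check: both ends of the range yield the same number of steps.
assert all(math.ceil(n / train_batch_size) == steps_per_epoch for n in (low, high))
```

With multiple devices or gradient accumulation the effective batch size would be larger, and the estimate would scale accordingly.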

Framework versions

  • Transformers 4.10.0.dev0
  • Pytorch 1.9.0+cu111
  • Datasets 1.11.0
  • Tokenizers 0.10.3