
t5-base-pt-asqa-cb

This model is a fine-tuned version of din0s/t5-base-msmarco-nlgen-cb on an unspecified dataset. It achieves the following results on the evaluation set (a minimal usage sketch follows the results):

  • Loss: 2.7735
  • Rougelsum: 26.3056
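
The checkpoint can be loaded with the standard Transformers seq2seq classes. A minimal sketch, assuming the model is published under the id `din0s/t5-base-pt-asqa-cb` (inferred from the base model's namespace and this card's title; adjust if the repo lives elsewhere):

```python
# Minimal inference sketch; the repo id is an assumption based on the
# base model's namespace (din0s) and this card's title.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_id = "din0s/t5-base-pt-asqa-cb"  # hypothetical full repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# Placeholder input; not taken from the model's actual training data.
question = "Who lived longer, Nikola Tesla or Thomas Edison?"
inputs = tokenizer(question, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```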

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a sketch mapping them onto `Seq2SeqTrainingArguments` follows the list):

  • learning_rate: 1e-05
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 20
  • mixed_precision_training: Native AMP
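
As a sketch, these settings map onto the standard Transformers `Seq2SeqTrainingArguments` roughly as follows; dataset loading and `Trainer` wiring are omitted, and `output_dir` is illustrative rather than taken from the original training script:

```python
from transformers import Seq2SeqTrainingArguments

# Hyperparameters from the list above; output_dir is illustrative.
training_args = Seq2SeqTrainingArguments(
    output_dir="t5-base-pt-asqa-cb",
    learning_rate=1e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=8,
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=20,
    fp16=True,  # "Native AMP" mixed-precision training
)
```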

Training results

| Training Loss | Epoch | Step | Validation Loss | Rougelsum |
|:-------------:|:-----:|:----:|:---------------:|:---------:|
| No log        | 1.0   | 273  | 2.9031          | 24.6325   |
| 3.2031        | 2.0   | 546  | 2.8656          | 24.9190   |
| 3.2031        | 3.0   | 819  | 2.8442          | 25.1197   |
| 3.0839        | 4.0   | 1092 | 2.8303          | 25.2855   |
| 3.0839        | 5.0   | 1365 | 2.8189          | 25.4891   |
| 3.0276        | 6.0   | 1638 | 2.8099          | 25.6116   |
| 3.0276        | 7.0   | 1911 | 2.8036          | 25.7411   |
| 3.0043        | 8.0   | 2184 | 2.7976          | 25.8238   |
| 3.0043        | 9.0   | 2457 | 2.7930          | 25.9201   |
| 2.9791        | 10.0  | 2730 | 2.7890          | 26.0322   |
| 2.9545        | 11.0  | 3003 | 2.7851          | 26.0934   |
| 2.9545        | 12.0  | 3276 | 2.7826          | 26.1574   |
| 2.9344        | 13.0  | 3549 | 2.7802          | 26.2041   |
| 2.9344        | 14.0  | 3822 | 2.7785          | 26.2330   |
| 2.9252        | 15.0  | 4095 | 2.7769          | 26.2394   |
| 2.9252        | 16.0  | 4368 | 2.7756          | 26.2676   |
| 2.9109        | 17.0  | 4641 | 2.7747          | 26.2864   |
| 2.9109        | 18.0  | 4914 | 2.7740          | 26.3146   |
| 2.9103        | 19.0  | 5187 | 2.7736          | 26.2993   |
| 2.9103        | 20.0  | 5460 | 2.7735          | 26.3056   |
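
The Rougelsum column in the table above corresponds to the `rougeLsum` key of the ROUGE metric. A minimal sketch of computing such a score with the `evaluate` library (not necessarily the script used to produce this card):

```python
import evaluate

# Sketch of computing ROUGE-Lsum; the predictions/references below are
# placeholders, not ASQA data.
rouge = evaluate.load("rouge")
scores = rouge.compute(
    predictions=["the cat sat on the mat"],
    references=["a cat was sitting on the mat"],
)
# Recent versions of `evaluate` return a float in [0, 1];
# the card reports scores scaled by 100.
print(scores["rougeLsum"])
```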

Framework versions

  • Transformers 4.23.0.dev0
  • Pytorch 1.12.1+cu102
  • Datasets 2.4.0
  • Tokenizers 0.12.1
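
When trying to reproduce the results, it may help to compare your local environment against these versions; a minimal check:

```python
# Print local library versions to compare against the card's environment.
import datasets
import tokenizers
import torch
import transformers

print("Transformers:", transformers.__version__)  # card: 4.23.0.dev0
print("PyTorch:", torch.__version__)              # card: 1.12.1+cu102
print("Datasets:", datasets.__version__)          # card: 2.4.0
print("Tokenizers:", tokenizers.__version__)      # card: 0.12.1
```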