t5-small-samsum

This model is a fine-tuned version of google-t5/t5-small on the samsum dataset. It achieves the following results on the evaluation set:

  • Loss: 1.6507
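As a rough illustration of how the checkpoint might be used, the sketch below wraps the Transformers `summarization` pipeline. The repository id is the one shown on this page; the `format_dialogue` helper, the "Speaker: text" input layout, and the `max_length` value are assumptions made for the example, not settings documented by this card.

```python
# Hypothetical usage sketch; the helper names and defaults are illustrative only.
MODEL_ID = "Prikshit7766/t5-small-samsum"  # repository id as shown on this page


def format_dialogue(turns):
    """Join (speaker, utterance) pairs into one 'Speaker: text' line per turn.

    Assumption: the model expects chat-style input with one line per turn.
    """
    return "\n".join(f"{speaker}: {text}" for speaker, text in turns)


def summarize(dialogue, max_length=60):
    """Summarize a dialogue string with the fine-tuned checkpoint."""
    # Imported lazily so format_dialogue works without transformers installed.
    from transformers import pipeline

    summarizer = pipeline("summarization", model=MODEL_ID)
    # The pipeline returns a list with one dict per input.
    return summarizer(dialogue, max_length=max_length)[0]["summary_text"]
```

Calling `summarize(format_dialogue([("Amanda", "Are we still on for lunch?"), ("Tom", "Yes, 1pm works.")]))` downloads the weights on first use, so it needs network access and an installed `transformers` with a PyTorch backend.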

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 8
  • eval_batch_size: 32
  • seed: 42
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 32
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 500
  • num_epochs: 64
  • mixed_precision_training: Native AMP
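Two of the values above are derived rather than set directly, and a short sketch may make the relationship clearer. It assumes a single training device (so the total batch size is the per-device batch size times the accumulation steps) and a linear schedule that warms up from zero and then decays to zero at the last logged step, 29440. The function names are hypothetical.

```python
# Minimal sketch of the arithmetic behind the listed hyperparameters.
LEARNING_RATE = 2e-05
TRAIN_BATCH_SIZE = 8
GRAD_ACCUM_STEPS = 4
WARMUP_STEPS = 500
TOTAL_STEPS = 29440  # final optimizer step reported in the training log


def effective_batch_size(per_device, accum_steps, num_devices=1):
    """Examples consumed per optimizer step (assumes num_devices=1 here)."""
    return per_device * accum_steps * num_devices


def linear_warmup_lr(step, base_lr=LEARNING_RATE,
                     warmup=WARMUP_STEPS, total=TOTAL_STEPS):
    """Learning rate at a given optimizer step under linear warmup and decay."""
    if step < warmup:
        # Ramp linearly from 0 up to base_lr over the warmup steps.
        return base_lr * step / max(1, warmup)
    # Decay linearly from base_lr down to 0 at the final step.
    remaining = max(0, total - step)
    return base_lr * remaining / max(1, total - warmup)


print(effective_batch_size(TRAIN_BATCH_SIZE, GRAD_ACCUM_STEPS))
for s in (0, 250, 500, 15000, 29440):
    print(s, linear_warmup_lr(s))
```

With these numbers the warmup covers only about the first epoch of training, so almost the whole run takes place on the decaying part of the schedule.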

Training results

Training Loss | Epoch | Step | Validation Loss
No log | 1.0 | 460 | 1.9598
2.4944 | 2.0 | 921 | 1.8661
2.0902 | 3.0 | 1381 | 1.8210
2.0173 | 4.0 | 1842 | 1.8009
1.9623 | 5.0 | 2302 | 1.7787
1.9331 | 6.0 | 2763 | 1.7637
1.903 | 7.0 | 3223 | 1.7514
1.881 | 8.0 | 3684 | 1.7390
1.8648 | 9.0 | 4144 | 1.7350
1.8463 | 10.0 | 4605 | 1.7242
1.8302 | 11.0 | 5065 | 1.7189
1.8119 | 12.0 | 5526 | 1.7098
1.8119 | 13.0 | 5986 | 1.7076
1.8007 | 14.0 | 6447 | 1.7057
1.7903 | 15.0 | 6907 | 1.6984
1.778 | 16.0 | 7368 | 1.6944
1.7639 | 17.0 | 7828 | 1.6907
1.7596 | 18.0 | 8289 | 1.6896
1.746 | 19.0 | 8749 | 1.6861
1.7342 | 20.0 | 9210 | 1.6860
1.732 | 21.0 | 9670 | 1.6808
1.719 | 22.0 | 10131 | 1.6760
1.7152 | 23.0 | 10591 | 1.6778
1.7082 | 24.0 | 11052 | 1.6762
1.7003 | 25.0 | 11512 | 1.6707
1.7003 | 26.0 | 11973 | 1.6722
1.6952 | 27.0 | 12433 | 1.6701
1.6848 | 28.0 | 12894 | 1.6671
1.6814 | 29.0 | 13354 | 1.6668
1.6743 | 30.0 | 13815 | 1.6637
1.6742 | 31.0 | 14275 | 1.6640
1.6652 | 32.0 | 14736 | 1.6624
1.6582 | 33.0 | 15196 | 1.6606
1.6575 | 34.0 | 15657 | 1.6605
1.6499 | 35.0 | 16117 | 1.6617
1.6455 | 36.0 | 16578 | 1.6601
1.6506 | 37.0 | 17038 | 1.6594
1.6506 | 38.0 | 17499 | 1.6556
1.637 | 39.0 | 17959 | 1.6570
1.6374 | 40.0 | 18420 | 1.6558
1.6303 | 41.0 | 18880 | 1.6557
1.6311 | 42.0 | 19341 | 1.6553
1.6234 | 43.0 | 19801 | 1.6570
1.619 | 44.0 | 20262 | 1.6537
1.6214 | 45.0 | 20722 | 1.6529
1.6183 | 46.0 | 21183 | 1.6542
1.609 | 47.0 | 21643 | 1.6543
1.6159 | 48.0 | 22104 | 1.6530
1.6101 | 49.0 | 22564 | 1.6524
1.6083 | 50.0 | 23025 | 1.6515
1.6083 | 51.0 | 23485 | 1.6528
1.605 | 52.0 | 23946 | 1.6526
1.6011 | 53.0 | 24406 | 1.6515
1.6028 | 54.0 | 24867 | 1.6517
1.6015 | 55.0 | 25327 | 1.6512
1.601 | 56.0 | 25788 | 1.6504
1.6007 | 57.0 | 26248 | 1.6513
1.5948 | 58.0 | 26709 | 1.6511
1.5973 | 59.0 | 27169 | 1.6515
1.5929 | 60.0 | 27630 | 1.6514
1.5955 | 61.0 | 28090 | 1.6507
1.5931 | 62.0 | 28551 | 1.6507
1.5939 | 63.0 | 29011 | 1.6507
1.5939 | 63.93 | 29440 | 1.6507
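The validation loss flattens out over the last epochs, and a small sketch can make that easier to inspect. It copies the final ten rows of the log above, picks the epoch with the lowest validation loss, and converts a loss to perplexity. The perplexity conversion assumes the reported loss is a mean per-token cross-entropy in nats, which this card does not state explicitly; the names below are illustrative.

```python
import math

# (epoch, validation loss) pairs copied from the last ten rows of the log.
LAST_EPOCHS = [
    (55.0, 1.6512), (56.0, 1.6504), (57.0, 1.6513), (58.0, 1.6511),
    (59.0, 1.6515), (60.0, 1.6514), (61.0, 1.6507), (62.0, 1.6507),
    (63.0, 1.6507), (63.93, 1.6507),
]


def best_epoch(rows):
    """Return the (epoch, loss) pair with the lowest validation loss."""
    return min(rows, key=lambda row: row[1])


def perplexity(loss):
    """Perplexity under the assumption that loss is mean cross-entropy in nats."""
    return math.exp(loss)


epoch, loss = best_epoch(LAST_EPOCHS)
print(f"lowest validation loss {loss} at epoch {epoch}")
print(f"perplexity at the final loss: {perplexity(1.6507):.2f}")
```

The lowest value in the log, 1.6504 at epoch 56, differs from the final 1.6507 by only 0.0003, so the later epochs brought essentially no further improvement on the evaluation set.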

Framework versions

  • Transformers 4.39.1
  • PyTorch 2.2.1
  • Datasets 2.18.0
  • Tokenizers 0.15.2
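To compare a local environment against the versions listed above, something like the following could be used. It is a sketch built on the standard-library `importlib.metadata`; the distribution names (in particular `torch` for PyTorch) and the `check_versions` helper are assumptions for the example, and a mismatch does not necessarily mean the model will fail to load.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions listed in this card, keyed by the assumed PyPI distribution name.
PINNED = {
    "transformers": "4.39.1",
    "torch": "2.2.1",
    "datasets": "2.18.0",
    "tokenizers": "0.15.2",
}


def check_versions(pinned=PINNED):
    """Map each package to (wanted, installed, matches).

    installed is None when the package is not present in the environment.
    """
    report = {}
    for name, wanted in pinned.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            installed = None
        # Ignore local build suffixes such as "+cu121" when comparing.
        matches = installed is not None and installed.split("+")[0] == wanted
        report[name] = (wanted, installed, matches)
    return report


for name, (wanted, installed, ok) in check_versions().items():
    print(f"{name}: wanted {wanted}, installed {installed}, match={ok}")
```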

Model size: 60.5M params (Safetensors, tensor type F32)