t5-abs-2309-1054-lr-0.0001-bs-5-maxep-20
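This model is a fine-tuned version of google-t5/t5-base. The dataset name was not recorded (the card generator reported it as "None"). It achieves the following results on the evaluation set, matching the final epoch (epoch 20, step 1740) in the training results table: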
- Loss: 4.0305
- Rouge/rouge1: 0.4716
- Rouge/rouge2: 0.2252
- Rouge/rougel: 0.4006
- Rouge/rougelsum: 0.402
- Bertscore/bertscore-precision: 0.8972
- Bertscore/bertscore-recall: 0.8983
- Bertscore/bertscore-f1: 0.8976
- Meteor: 0.4354
- Gen Len: 41.2455
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.0001
- train_batch_size: 5
- eval_batch_size: 5
- seed: 42
- gradient_accumulation_steps: 2
- total_train_batch_size: 10
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 20
- mixed_precision_training: Native AMP
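As a rough illustration of how these settings combine, the sketch below derives the effective batch size, the total number of optimizer steps, and the warmup length, then evaluates a linear warmup and decay schedule. The figure of 87 steps per epoch comes from the training results table. The `linear_lr` helper is a hypothetical name written for this example, and the rounding of the warmup step count is an assumption that may differ slightly from what the training framework does internally.

```python
# Values copied from the hyperparameter list and the training results table.
learning_rate = 1e-4
train_batch_size = 5
gradient_accumulation_steps = 2
warmup_ratio = 0.1
num_epochs = 20
steps_per_epoch = 87  # the table reports step 87 at epoch 1.0

# Examples contributing to each optimizer update.
effective_batch_size = train_batch_size * gradient_accumulation_steps

# Total optimizer steps over the whole run.
total_steps = steps_per_epoch * num_epochs

# Assumption: warmup length is the warmup ratio applied to the total steps.
warmup_steps = round(total_steps * warmup_ratio)


def linear_lr(step: int) -> float:
    """Hypothetical helper: linear warmup to the peak rate, then linear decay to zero."""
    if step < warmup_steps:
        return learning_rate * step / max(1, warmup_steps)
    remaining = total_steps - step
    return learning_rate * max(0.0, remaining / max(1, total_steps - warmup_steps))


print(effective_batch_size, total_steps, warmup_steps)
for step in (0, 87, 174, 957, 1740):
    print(step, linear_lr(step))
```

Under these assumptions the effective batch size agrees with the reported total_train_batch_size of 10, and the learning rate would peak at the end of epoch 2 (step 174) before decaying to zero at step 1740.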
Training results
Training Loss | Epoch | Step | Validation Loss | Rouge/rouge1 | Rouge/rouge2 | Rouge/rougel | Rouge/rougelsum | Bertscore/bertscore-precision | Bertscore/bertscore-recall | Bertscore/bertscore-f1 | Meteor | Gen Len |
---|---|---|---|---|---|---|---|---|---|---|---|---|
0.0239 | 1.0 | 87 | 3.5307 | 0.4777 | 0.229 | 0.409 | 0.4106 | 0.8977 | 0.8993 | 0.8984 | 0.4382 | 41.0545 |
0.0141 | 2.0 | 174 | 3.6667 | 0.4765 | 0.2246 | 0.4059 | 0.4075 | 0.9001 | 0.8985 | 0.8991 | 0.429 | 39.2364 |
0.027 | 3.0 | 261 | 3.7158 | 0.4704 | 0.219 | 0.3992 | 0.3991 | 0.8956 | 0.8967 | 0.896 | 0.4319 | 40.8455 |
0.0247 | 4.0 | 348 | 3.7320 | 0.4663 | 0.2173 | 0.3945 | 0.3947 | 0.8959 | 0.8973 | 0.8965 | 0.4271 | 41.6 |
0.0225 | 5.0 | 435 | 3.8031 | 0.4767 | 0.2219 | 0.4017 | 0.4025 | 0.8975 | 0.8977 | 0.8975 | 0.4341 | 40.1 |
0.0196 | 6.0 | 522 | 3.8516 | 0.4703 | 0.2223 | 0.3989 | 0.3996 | 0.8958 | 0.8977 | 0.8967 | 0.4337 | 41.4 |
0.0168 | 7.0 | 609 | 3.9028 | 0.4747 | 0.227 | 0.4023 | 0.4029 | 0.8968 | 0.8987 | 0.8976 | 0.4378 | 41.3 |
0.0165 | 8.0 | 696 | 3.9116 | 0.4676 | 0.2224 | 0.3955 | 0.397 | 0.8965 | 0.8974 | 0.8968 | 0.4305 | 41.4727 |
0.0153 | 9.0 | 783 | 3.9268 | 0.4737 | 0.2288 | 0.4016 | 0.4025 | 0.8965 | 0.8984 | 0.8973 | 0.4411 | 41.4545 |
0.0149 | 10.0 | 870 | 3.9513 | 0.48 | 0.2329 | 0.4095 | 0.4101 | 0.8989 | 0.8997 | 0.8992 | 0.4438 | 41.0273 |
0.0142 | 11.0 | 957 | 3.9677 | 0.475 | 0.226 | 0.4037 | 0.4043 | 0.8949 | 0.8987 | 0.8967 | 0.4474 | 42.6182 |
0.0132 | 12.0 | 1044 | 3.9769 | 0.4703 | 0.2243 | 0.3977 | 0.3986 | 0.8967 | 0.8977 | 0.8971 | 0.4359 | 41.0182 |
0.0128 | 13.0 | 1131 | 3.9994 | 0.4695 | 0.2232 | 0.3987 | 0.3996 | 0.8958 | 0.8983 | 0.8969 | 0.4401 | 42.0545 |
0.012 | 14.0 | 1218 | 4.0018 | 0.471 | 0.2252 | 0.3992 | 0.3991 | 0.8963 | 0.8989 | 0.8975 | 0.4397 | 41.8909 |
0.0104 | 15.0 | 1305 | 4.0231 | 0.4799 | 0.2297 | 0.4066 | 0.4076 | 0.8975 | 0.8995 | 0.8984 | 0.446 | 41.6091 |
0.0104 | 16.0 | 1392 | 4.0239 | 0.4758 | 0.2309 | 0.4057 | 0.4059 | 0.8982 | 0.8994 | 0.8987 | 0.4439 | 41.3636 |
0.0094 | 17.0 | 1479 | 4.0272 | 0.4752 | 0.2275 | 0.4035 | 0.4045 | 0.8977 | 0.8991 | 0.8983 | 0.4404 | 41.6455 |
0.0093 | 18.0 | 1566 | 4.0272 | 0.4736 | 0.2264 | 0.4026 | 0.4036 | 0.8973 | 0.8988 | 0.8979 | 0.4394 | 41.7545 |
0.0098 | 19.0 | 1653 | 4.0307 | 0.4736 | 0.2258 | 0.4018 | 0.403 | 0.8971 | 0.8984 | 0.8976 | 0.4362 | 41.1455 |
0.0084 | 20.0 | 1740 | 4.0305 | 0.4716 | 0.2252 | 0.4006 | 0.402 | 0.8972 | 0.8983 | 0.8976 | 0.4354 | 41.2455 |
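In the table above, validation loss rises from 3.5307 at epoch 1 to 4.0305 at epoch 20 while training loss falls from 0.0239 to 0.0084, and Rouge/rouge1 stays between 0.4663 and 0.48. The final checkpoint is therefore not necessarily the strongest one. The sketch below scans the table for the best epoch under two criteria. The `rows` list is transcribed by hand from the table, and which criterion is appropriate is left open, since the card does not say how checkpoints were selected.

```python
# (epoch, validation_loss, rouge1) transcribed from the training results table.
rows = [
    (1, 3.5307, 0.4777), (2, 3.6667, 0.4765), (3, 3.7158, 0.4704),
    (4, 3.7320, 0.4663), (5, 3.8031, 0.4767), (6, 3.8516, 0.4703),
    (7, 3.9028, 0.4747), (8, 3.9116, 0.4676), (9, 3.9268, 0.4737),
    (10, 3.9513, 0.4800), (11, 3.9677, 0.4750), (12, 3.9769, 0.4703),
    (13, 3.9994, 0.4695), (14, 4.0018, 0.4710), (15, 4.0231, 0.4799),
    (16, 4.0239, 0.4758), (17, 4.0272, 0.4752), (18, 4.0272, 0.4736),
    (19, 4.0307, 0.4736), (20, 4.0305, 0.4716),
]

# Lowest validation loss and highest ROUGE-1 can point to different epochs.
best_by_loss = min(rows, key=lambda r: r[1])
best_by_rouge1 = max(rows, key=lambda r: r[2])

print("lowest validation loss: epoch", best_by_loss[0], "loss", best_by_loss[1])
print("highest rouge1: epoch", best_by_rouge1[0], "rouge1", best_by_rouge1[2])
```

The two criteria disagree here: validation loss is lowest after the first epoch, while Rouge/rouge1 peaks at epoch 10.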
Framework versions
- Transformers 4.44.0
- Pytorch 2.4.0
- Datasets 2.21.0
- Tokenizers 0.19.1
Model tree for roequitz/t5-abs-2309-1054-lr-0.0001-bs-5-maxep-20
Base model
google-t5/t5-base