---
license: apache-2.0
base_model: t5-small
tags:
- generated_from_keras_callback
model-index:
- name: Mitsuha21/t5-small-finetuned-xsum
  results: []
---

# Mitsuha21/t5-small-finetuned-xsum

This model is a fine-tuned version of [t5-small](https://huggingface.co/t5-small) on an unknown dataset.
It achieves the following results on the evaluation set:
- Train Loss: 1.8702
- Validation Loss: 1.4831
- Train Rouge1: 46.2436
- Train Rouge2: 26.6188
- Train Rougel: 42.7423
- Train Rougelsum: 42.6771
- Train Gen Len: 13.5220
- Epoch: 19

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- optimizer: {'name': 'AdamWeightDecay', 'learning_rate': 2e-05, 'decay': 0.0, 'beta_1': 0.9, 'beta_2': 0.999, 'epsilon': 1e-07, 'amsgrad': False, 'weight_decay_rate': 0.01}
- training_precision: float32

### Training results

| Train Loss | Validation Loss | Train Rouge1 | Train Rouge2 | Train Rougel | Train Rougelsum | Train Gen Len | Epoch |
|:----------:|:---------------:|:------------:|:------------:|:------------:|:---------------:|:-------------:|:-----:|
| 4.6083     | 3.6627          | 20.0685      | 4.3594       | 18.1869      | 18.1861         | 18.0610       | 0     |
| 3.7845     | 3.2563          | 24.9949      | 6.4551       | 22.9249      | 22.9713         | 15.6122       | 1     |
| 3.4628     | 3.0291          | 28.1125      | 7.7006       | 25.8046      | 25.8162         | 14.4415       | 2     |
| 3.2508     | 2.8548          | 29.8432      | 8.8578       | 27.4137      | 27.4718         | 14.2268       | 3     |
| 3.0963     | 2.7116          | 31.7427      | 9.8558       | 28.6398      | 28.6626         | 14.3049       | 4     |
| 2.9616     | 2.5825          | 33.8958      | 11.4084      | 30.4746      | 30.5051         | 13.8512       | 5     |
| 2.8430     | 2.4661          | 34.9508      | 12.9830      | 31.7061      | 31.7622         | 13.6707       | 6     |
| 2.7381     | 2.3567          | 36.7926      | 14.5760      | 33.1279      | 33.1544         | 13.4049       | 7     |
| 2.6359     | 2.2565          | 36.4180      | 15.1004      | 33.2257      | 33.2507         | 13.8537       | 8     |
| 2.5543     | 2.1625          | 38.5778      | 17.1077      | 35.0027      | 35.0407         | 13.6          | 9     |
| 2.4596     | 2.0833          | 38.6852      | 17.2964      | 35.4659      | 35.4887         | 13.6902       | 10    |
| 2.3775     | 1.9966          | 40.1427      | 19.2079      | 36.3993      | 36.4188         | 13.4610       | 11    |
| 2.3072     | 1.9227          | 40.9101      | 19.7985      | 36.9402      | 36.9218         | 14.0049       | 12    |
| 2.2272     | 1.8442          | 42.0126      | 20.7988      | 37.9708      | 37.9803         | 13.8780       | 13    |
| 2.1612     | 1.7821          | 42.8467      | 22.2188      | 39.2833      | 39.2590         | 13.7195       | 14    |
| 2.1033     | 1.7130          | 44.1141      | 23.5104      | 40.0780      | 40.0617         | 14.1171       | 15    |
| 2.0401     | 1.6523          | 44.4919      | 24.3293      | 40.8943      | 40.8605         | 13.5756       | 16    |
| 1.9850     | 1.5952          | 44.9013      | 24.5225      | 41.0805      | 41.0159         | 13.3683       | 17    |
| 1.9253     | 1.5343          | 45.4848      | 25.9386      | 42.1817      | 42.1357         | 13.7902       | 18    |
| 1.8702     | 1.4831          | 46.2436      | 26.6188      | 42.7423      | 42.6771         | 13.5220       | 19    |

### Framework versions

- Transformers 4.39.3
- TensorFlow 2.15.0
- Datasets 2.18.0
- Tokenizers 0.15.2
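
## Usage sketch

The card gives no inference example, so the following is a minimal sketch of how a checkpoint like this one might be loaded with the TensorFlow classes from Transformers, in line with the framework versions listed above. Several points are assumptions rather than facts from this card: that the repository hosts both TensorFlow weights and tokenizer files, that the model was trained with the `summarize: ` task prefix commonly used for T5 summarization, and that a 512-token input limit and 32 new tokens are sensible settings. The helper names `build_input` and `summarize` are hypothetical.

```python
MODEL_ID = "Mitsuha21/t5-small-finetuned-xsum"

# Assumption: the fine-tuning run used the usual T5 summarization prefix.
PREFIX = "summarize: "


def build_input(text: str) -> str:
    """Collapse whitespace and prepend the (assumed) task prefix."""
    return PREFIX + " ".join(text.split())


def summarize(text: str, max_new_tokens: int = 32) -> str:
    """Generate a short summary; downloads the checkpoint on first use."""
    # Imported lazily so the helper above works without TensorFlow installed.
    from transformers import AutoTokenizer, TFAutoModelForSeq2SeqLM

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = TFAutoModelForSeq2SeqLM.from_pretrained(MODEL_ID)

    # Assumption: 512 tokens is an acceptable input limit for this model.
    inputs = tokenizer(
        build_input(text),
        return_tensors="tf",
        truncation=True,
        max_length=512,
    )
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `summarize(article_text)` requires network access and a TensorFlow installation. The average generated length of roughly 13 to 14 tokens reported in the training results suggests that outputs will be single short sentences, so the `max_new_tokens` default leaves some headroom.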