---
license: apache-2.0
base_model: google/flan-t5-small
tags:
- generated_from_trainer
metrics:
- rouge
model-index:
- name: t5-summarization-zero-shot-headers-and-better-prompt-enriched
  results: []
---

# t5-summarization-zero-shot-headers-and-better-prompt-enriched

This model is a fine-tuned version of [google/flan-t5-small](https://huggingface.co/google/flan-t5-small) on an unknown dataset.
It achieves the following results on the evaluation set:
- Loss: 2.3132
- Rouge: {'rouge1': 0.426, 'rouge2': 0.195, 'rougeL': 0.2024, 'rougeLsum': 0.2024}
- Bert Score: 0.877
- Bleurt 20: -0.8149
- Gen Len: 13.66

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 0.0001
- train_batch_size: 7
- eval_batch_size: 7
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 20

### Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge | Bert Score | Bleurt 20 | Gen Len |
|:-------------:|:-----:|:----:|:---------------:|:-----:|:----------:|:---------:|:-------:|
| 2.8785 | 1.0 | 172 | 2.6476 | {'rouge1': 0.462, 'rouge2': 0.1848, 'rougeL': 0.1845, 'rougeLsum': 0.1845} | 0.8707 | -0.8319 | 15.17 |
| 2.6366 | 2.0 | 344 | 2.4685 | {'rouge1': 0.4501, 'rouge2': 0.1849, 'rougeL': 0.1933, 'rougeLsum': 0.1933} | 0.872 | -0.8531 | 14.545 |
| 2.3822 | 3.0 | 516 | 2.3766 | {'rouge1': 0.4217, 'rouge2': 0.1759, 'rougeL': 0.1867, 'rougeLsum': 0.1867} | 0.8719 | -0.8998 | 13.675 |
| 2.2235 | 4.0 | 688 | 2.3262 | {'rouge1': 0.4396, 'rouge2': 0.1832, 'rougeL': 0.1867, 'rougeLsum': 0.1867} | 0.8715 | -0.8847 | 14.38 |
| 2.0765 | 5.0 | 860 | 2.3122 | {'rouge1': 0.4143, 'rouge2': 0.1769, 'rougeL': 0.1907, 'rougeLsum': 0.1907} | 0.875 | -0.9206 | 13.37 |
| 2.0141 | 6.0 | 1032 | 2.2993 | {'rouge1': 0.4257, 'rouge2': 0.1867, 'rougeL': 0.1943, 'rougeLsum': 0.1943} | 0.8773 | -0.8751 | 13.555 |
| 1.9087 | 7.0 | 1204 | 2.2855 | {'rouge1': 0.4236, 'rouge2': 0.1858, 'rougeL': 0.1895, 'rougeLsum': 0.1895} | 0.8774 | -0.87 | 13.255 |
| 1.868 | 8.0 | 1376 | 2.2795 | {'rouge1': 0.4298, 'rouge2': 0.1896, 'rougeL': 0.1956, 'rougeLsum': 0.1956} | 0.877 | -0.8837 | 13.65 |
| 1.8063 | 9.0 | 1548 | 2.2802 | {'rouge1': 0.4427, 'rouge2': 0.1965, 'rougeL': 0.2011, 'rougeLsum': 0.2011} | 0.8779 | -0.8358 | 13.965 |
| 1.7161 | 10.0 | 1720 | 2.2685 | {'rouge1': 0.4146, 'rouge2': 0.1828, 'rougeL': 0.1918, 'rougeLsum': 0.1918} | 0.8795 | -0.8725 | 13.155 |
| 1.7027 | 11.0 | 1892 | 2.2824 | {'rouge1': 0.423, 'rouge2': 0.1871, 'rougeL': 0.1958, 'rougeLsum': 0.1958} | 0.8781 | -0.8476 | 13.49 |
| 1.6575 | 12.0 | 2064 | 2.2888 | {'rouge1': 0.4231, 'rouge2': 0.1847, 'rougeL': 0.1939, 'rougeLsum': 0.1939} | 0.878 | -0.8648 | 13.3 |
| 1.6046 | 13.0 | 2236 | 2.2946 | {'rouge1': 0.4387, 'rouge2': 0.1942, 'rougeL': 0.1987, 'rougeLsum': 0.1987} | 0.8771 | -0.8336 | 13.835 |
| 1.5638 | 14.0 | 2408 | 2.2961 | {'rouge1': 0.4225, 'rouge2': 0.1864, 'rougeL': 0.1973, 'rougeLsum': 0.1973} | 0.8774 | -0.8456 | 13.345 |
| 1.6015 | 15.0 | 2580 | 2.2937 | {'rouge1': 0.429, 'rouge2': 0.1947, 'rougeL': 0.2007, 'rougeLsum': 0.2007} | 0.8777 | -0.8402 | 13.655 |
| 1.5146 | 16.0 | 2752 | 2.3077 | {'rouge1': 0.4208, 'rouge2': 0.1869, 'rougeL': 0.1978, 'rougeLsum': 0.1978} | 0.8751 | -0.8221 | 13.695 |
| 1.5421 | 17.0 | 2924 | 2.3094 | {'rouge1': 0.4263, 'rouge2': 0.1938, 'rougeL': 0.202, 'rougeLsum': 0.202} | 0.8759 | -0.8207 | 13.67 |
| 1.5328 | 18.0 | 3096 | 2.3114 | {'rouge1': 0.4306, 'rouge2': 0.1927, 'rougeL': 0.2006, 'rougeLsum': 0.2006} | 0.8758 | -0.8284 | 13.755 |
| 1.5181 | 19.0 | 3268 | 2.3128 | {'rouge1': 0.4298, 'rouge2': 0.196, 'rougeL': 0.1997, 'rougeLsum': 0.1997} | 0.8764 | -0.8211 | 13.77 |
| 1.4926 | 20.0 | 3440 | 2.3132 | {'rouge1': 0.426, 'rouge2': 0.195, 'rougeL': 0.2024, 'rougeLsum': 0.2024} | 0.877 | -0.8149 | 13.66 |

### Framework versions

- Transformers 4.35.2
- Pytorch 2.1.0+cu121
- Datasets 2.16.1
- Tokenizers 0.15.0
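## How to use

This auto-generated card does not include a usage snippet. Below is a minimal inference sketch using the standard `transformers` seq2seq API. Note the assumptions: the prompt format used during fine-tuning ("zero-shot headers and better prompt") is not documented here, so the `summarize:` prefix is only a guess, and the example text is illustrative. The base model id is used as a stand-in; substitute the fine-tuned checkpoint's repository id to reproduce the reported results.

```python
# Sketch: summarization inference with a (Flan-)T5 seq2seq checkpoint.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Assumption: the base model id stands in for the fine-tuned checkpoint id.
model_id = "google/flan-t5-small"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

article = (
    "The James Webb Space Telescope has captured detailed images of a "
    "distant galaxy cluster, giving astronomers new data on early star "
    "formation."
)

# Assumption: the "summarize:" prefix; the actual fine-tuning prompt is
# not documented in this card.
inputs = tokenizer(
    "summarize: " + article,
    return_tensors="pt",
    truncation=True,
    max_length=512,
)
# Gen Len averaged ~13.7 tokens during evaluation, so a small
# max_new_tokens budget is sufficient for headline-style summaries.
output_ids = model.generate(**inputs, max_new_tokens=32, num_beams=4)
summary = tokenizer.decode(output_ids[0], skip_special_tokens=True)
print(summary)
```

The `num_beams=4` setting is a common default for summarization; the generation parameters used to produce the reported metrics are not recorded in this card.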