---
license: apache-2.0
base_model: google/flan-t5-small
tags:
- generated_from_trainer
metrics:
- rouge
model-index:
- name: flant5-small
  results: []
---

# flant5-small

This model is a fine-tuned version of [google/flan-t5-small](https://huggingface.co/google/flan-t5-small) on an unknown dataset.
It achieves the following results on the evaluation set:
- Loss: 0.2540
- Rouge1: 39.4088
- Rouge2: 17.6509
- Rougel: 34.241
- Rougelsum: 36.3257
- Gen Len: 19.97

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 4
- eval_batch_size: 4
- seed: 42
- gradient_accumulation_steps: 2
- total_train_batch_size: 8
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 30

### Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len |
|:-------------:|:-----:|:-----:|:---------------:|:-------:|:-------:|:-------:|:---------:|:-------:|
| 0.3676 | 1.0 | 1557 | 0.2753 | 36.6587 | 14.0807 | 31.2838 | 33.1779 | 19.942 |
| 0.3135 | 2.0 | 3115 | 0.2658 | 37.8343 | 15.2461 | 32.2261 | 34.1553 | 19.97 |
| 0.2992 | 3.0 | 4672 | 0.2596 | 38.3851 | 15.6982 | 32.5124 | 34.5772 | 19.942 |
| 0.2888 | 4.0 | 6230 | 0.2559 | 37.6648 | 15.1146 | 32.1953 | 34.2139 | 19.94 |
| 0.281 | 5.0 | 7787 | 0.2549 | 38.3654 | 15.8444 | 32.775 | 34.9156 | 19.952 |
| 0.2734 | 6.0 | 9345 | 0.2533 | 38.7474 | 16.0237 | 33.155 | 35.3048 | 19.954 |
| 0.2679 | 7.0 | 10902 | 0.2529 | 38.7094 | 16.1904 | 33.2149 | 35.2449 | 19.96 |
| 0.2619 | 8.0 | 12460 | 0.2528 | 39.034 | 16.4682 | 33.7757 | 35.82 | 19.968 |
| 0.2576 | 9.0 | 14017 | 0.2528 | 38.769 | 16.5015 | 33.3685 | 35.4211 | 19.948 |
| 0.253 | 10.0 | 15575 | 0.2523 | 38.5811 | 16.3423 | 33.2559 | 35.2143 | 19.956 |
| 0.2494 | 11.0 | 17132 | 0.2516 | 38.7084 | 16.5171 | 33.4486 | 35.5503 | 19.958 |
| 0.2456 | 12.0 | 18690 | 0.2514 | 38.3763 | 16.2338 | 33.1431 | 34.8647 | 19.964 |
| 0.2419 | 13.0 | 20247 | 0.2520 | 38.455 | 16.2491 | 32.9546 | 35.0263 | 19.972 |
| 0.2388 | 14.0 | 21805 | 0.2514 | 38.9372 | 17.1821 | 33.6449 | 35.5621 | 19.97 |
| 0.2363 | 15.0 | 23362 | 0.2530 | 38.9104 | 16.742 | 33.5194 | 35.3391 | 19.976 |
| 0.2336 | 16.0 | 24920 | 0.2519 | 38.8698 | 16.9396 | 33.7987 | 35.6173 | 19.958 |
| 0.2313 | 17.0 | 26477 | 0.2518 | 38.8774 | 17.0545 | 33.7151 | 35.6844 | 19.97 |
| 0.229 | 18.0 | 28035 | 0.2518 | 38.7073 | 16.7039 | 33.4976 | 35.4177 | 19.964 |
| 0.2272 | 19.0 | 29592 | 0.2522 | 39.0868 | 16.948 | 33.8953 | 35.8788 | 19.964 |
| 0.2252 | 20.0 | 31150 | 0.2527 | 38.7854 | 16.9882 | 33.8017 | 35.6314 | 19.968 |
| 0.2234 | 21.0 | 32707 | 0.2527 | 38.9196 | 17.1419 | 33.9139 | 35.8599 | 19.97 |
| 0.2217 | 22.0 | 34265 | 0.2532 | 38.9227 | 17.0561 | 33.8032 | 35.6876 | 19.968 |
| 0.2206 | 23.0 | 35822 | 0.2521 | 39.5234 | 17.6253 | 34.2157 | 36.2645 | 19.962 |
| 0.2198 | 24.0 | 37380 | 0.2532 | 39.6108 | 17.8336 | 34.3222 | 36.3369 | 19.964 |
| 0.2184 | 25.0 | 38937 | 0.2533 | 39.3052 | 17.2967 | 33.9684 | 36.0207 | 19.972 |
| 0.2173 | 26.0 | 40495 | 0.2536 | 39.019 | 17.3083 | 34.0561 | 35.9826 | 19.972 |
| 0.2166 | 27.0 | 42052 | 0.2532 | 39.2553 | 17.6306 | 34.1763 | 36.1479 | 19.974 |
| 0.2159 | 28.0 | 43610 | 0.2539 | 39.3659 | 17.6526 | 34.276 | 36.2856 | 19.972 |
| 0.2154 | 29.0 | 45167 | 0.2543 | 39.3868 | 17.5653 | 34.2637 | 36.2704 | 19.974 |
| 0.2152 | 29.99 | 46710 | 0.2540 | 39.4088 | 17.6509 | 34.241 | 36.3257 | 19.97 |

### Framework versions

- Transformers 4.36.1
- Pytorch 2.1.2
- Datasets 2.19.2
- Tokenizers 0.15.2