---
license: mit
tags:
- generated_from_trainer
datasets:
- it5/datasets
metrics:
- rouge
model-index:
- name: it5-efficient-small-el32-fst-i2f-0.0003
  results:
  - task:
      name: Summarization
      type: summarization
    dataset:
      name: it5/datasets fst
      type: it5/datasets
      args: fst
    metrics:
    - name: Rouge1
      type: rouge
      value: 56.585
---

# it5-efficient-small-el32-fst-i2f-0.0003

This model is a fine-tuned version of [stefan-it/it5-efficient-small-el32](https://huggingface.co/stefan-it/it5-efficient-small-el32) on the it5/datasets fst dataset.
It achieves the following results on the evaluation set:
- Loss: 2.2160
- Rouge1: 56.585
- Rouge2: 36.9335
- Rougel: 53.7782
- Rougelsum: 53.7779
- Gen Len: 13.0891

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 0.0003
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 10.0

### Training results

| Training Loss | Epoch | Step   | Validation Loss | Rouge1  | Rouge2  | Rougel  | Rougelsum | Gen Len |
|:-------------:|:-----:|:------:|:---------------:|:-------:|:-------:|:-------:|:---------:|:-------:|
| 2.9377        | 0.35  | 5000   | 2.5157          | 54.6148 | 35.1518 | 51.8908 | 51.8957   | 12.8717 |
| 2.803         | 0.7   | 10000  | 2.4086          | 55.641  | 36.1214 | 52.8683 | 52.8572   | 12.7513 |
| 2.5483        | 1.05  | 15000  | 2.3420          | 55.6604 | 36.0085 | 52.9599 | 52.9433   | 12.7754 |
| 2.4978        | 1.39  | 20000  | 2.3145          | 56.204  | 36.5896 | 53.338  | 53.3351   | 12.8804 |
| 2.5383        | 1.74  | 25000  | 2.2697          | 56.1356 | 36.6963 | 53.3579 | 53.3664   | 12.795  |
| 2.3368        | 2.09  | 30000  | 2.2603          | 56.0271 | 36.4249 | 53.3113 | 53.3272   | 12.7478 |
| 2.371         | 2.44  | 35000  | 2.2328          | 56.5041 | 36.8718 | 53.8064 | 53.7995   | 12.8243 |
| 2.3567        | 2.79  | 40000  | 2.2079          | 56.5318 | 36.9437 | 53.8359 | 53.8254   | 12.6851 |
| 2.1753        | 3.14  | 45000  | 2.2168          | 56.3831 | 36.8896 | 53.6542 | 53.6708   | 12.67   |
| 2.2069        | 3.48  | 50000  | 2.2055          | 56.7171 | 37.1665 | 53.9299 | 53.9259   | 12.8014 |
| 2.2396        | 3.83  | 55000  | 2.1801          | 56.936  | 37.5465 | 54.1064 | 54.1125   | 12.7989 |
| 2.0657        | 4.18  | 60000  | 2.1915          | 56.6312 | 37.1618 | 53.8646 | 53.8791   | 12.6987 |
| 2.0806        | 4.53  | 65000  | 2.1809          | 56.6599 | 37.1282 | 53.8838 | 53.8781   | 12.715  |
| 2.0933        | 4.88  | 70000  | 2.1771          | 56.5891 | 36.9461 | 53.8058 | 53.8087   | 12.6593 |
| 1.9949        | 5.23  | 75000  | 2.1932          | 56.4956 | 36.9679 | 53.7634 | 53.7731   | 12.6723 |
| 1.9954        | 5.57  | 80000  | 2.1813          | 56.4827 | 36.8319 | 53.6397 | 53.6399   | 12.6599 |
| 1.9912        | 5.92  | 85000  | 2.1755          | 56.6723 | 37.0432 | 53.8339 | 53.8233   | 12.7534 |
| 1.9068        | 6.27  | 90000  | 2.1849          | 56.6574 | 37.0691 | 53.9029 | 53.892    | 12.7037 |
| 1.9173        | 6.62  | 95000  | 2.1787          | 56.5701 | 36.861  | 53.6855 | 53.6699   | 12.6467 |
| 1.9131        | 6.97  | 100000 | 2.1862          | 56.7175 | 37.0749 | 53.8761 | 53.8794   | 12.7072 |
| 1.8164        | 7.32  | 105000 | 2.1999          | 56.6104 | 37.0809 | 53.8098 | 53.8216   | 12.6364 |
| 1.8489        | 7.66  | 110000 | 2.1945          | 56.6645 | 37.1267 | 53.9009 | 53.9008   | 12.5741 |
| 1.82          | 8.01  | 115000 | 2.2075          | 56.6075 | 37.0359 | 53.8792 | 53.8833   | 12.6428 |
| 1.772         | 8.36  | 120000 | 2.2067          | 56.4716 | 36.8675 | 53.6826 | 53.6742   | 12.6591 |
| 1.7795        | 8.71  | 125000 | 2.2056          | 56.4112 | 36.9011 | 53.6554 | 53.6495   | 12.608  |
| 1.72          | 9.06  | 130000 | 2.2197          | 56.4735 | 36.9255 | 53.6592 | 53.6463   | 12.6758 |
| 1.7174        | 9.41  | 135000 | 2.2169          | 56.4209 | 36.8139 | 53.5778 | 53.5685   | 12.6568 |
| 1.7466        | 9.75  | 140000 | 2.2165          | 56.3715 | 36.767  | 53.555  | 53.5468   | 12.6416 |

### Framework versions

- Transformers 4.15.0
- Pytorch 1.10.0+cu102
- Datasets 1.17.0
- Tokenizers 0.10.3
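
### Checkpoint comparison (illustrative)

The training results reported in this card can be scanned programmatically to see which logged step scored best on each metric. The sketch below is a reading aid, not part of the original training code: the `EvalRow` type and all variable names are hypothetical, and the numbers are copied from the Epoch, Step, Validation Loss, and Rouge1 columns. The steps-per-epoch figure is only an estimate, since the epoch column is rounded to two decimals.

```python
from typing import NamedTuple


class EvalRow(NamedTuple):
    """One evaluation row from the training-results table (hypothetical helper type)."""
    epoch: float
    step: int
    val_loss: float
    rouge1: float


# Values copied from the Epoch, Step, Validation Loss, and Rouge1 columns.
ROWS = [
    EvalRow(0.35, 5000, 2.5157, 54.6148),
    EvalRow(0.7, 10000, 2.4086, 55.641),
    EvalRow(1.05, 15000, 2.3420, 55.6604),
    EvalRow(1.39, 20000, 2.3145, 56.204),
    EvalRow(1.74, 25000, 2.2697, 56.1356),
    EvalRow(2.09, 30000, 2.2603, 56.0271),
    EvalRow(2.44, 35000, 2.2328, 56.5041),
    EvalRow(2.79, 40000, 2.2079, 56.5318),
    EvalRow(3.14, 45000, 2.2168, 56.3831),
    EvalRow(3.48, 50000, 2.2055, 56.7171),
    EvalRow(3.83, 55000, 2.1801, 56.936),
    EvalRow(4.18, 60000, 2.1915, 56.6312),
    EvalRow(4.53, 65000, 2.1809, 56.6599),
    EvalRow(4.88, 70000, 2.1771, 56.5891),
    EvalRow(5.23, 75000, 2.1932, 56.4956),
    EvalRow(5.57, 80000, 2.1813, 56.4827),
    EvalRow(5.92, 85000, 2.1755, 56.6723),
    EvalRow(6.27, 90000, 2.1849, 56.6574),
    EvalRow(6.62, 95000, 2.1787, 56.5701),
    EvalRow(6.97, 100000, 2.1862, 56.7175),
    EvalRow(7.32, 105000, 2.1999, 56.6104),
    EvalRow(7.66, 110000, 2.1945, 56.6645),
    EvalRow(8.01, 115000, 2.2075, 56.6075),
    EvalRow(8.36, 120000, 2.2067, 56.4716),
    EvalRow(8.71, 125000, 2.2056, 56.4112),
    EvalRow(9.06, 130000, 2.2197, 56.4735),
    EvalRow(9.41, 135000, 2.2169, 56.4209),
    EvalRow(9.75, 140000, 2.2165, 56.3715),
]

# Logged step with the lowest validation loss.
best_by_loss = min(ROWS, key=lambda r: r.val_loss)

# Logged step with the highest ROUGE-1 score.
best_by_rouge1 = max(ROWS, key=lambda r: r.rouge1)

# Rough optimizer steps per epoch, derived from the last logged row.
# The epoch column is rounded, so treat this as an approximation.
last = ROWS[-1]
steps_per_epoch = last.step / last.epoch

print(f"Lowest validation loss: {best_by_loss.val_loss} at step {best_by_loss.step}")
print(f"Highest Rouge1: {best_by_rouge1.rouge1} at step {best_by_rouge1.step}")
print(f"Approximate steps per epoch: {steps_per_epoch:.0f}")
```

On these numbers, validation loss is lowest around the middle of training while training loss keeps falling through epoch 10, so an intermediate checkpoint may be worth comparing against the final one if intermediate checkpoints were saved; the card does not say whether they were.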