---
license: apache-2.0
tags:
- generated_from_trainer
metrics:
- rouge
model-index:
- name: t5-v1_1-base-gramatika-final-e8-b16
  results: []
---

# t5-v1_1-base-gramatika-final-e8-b16

This model is a fine-tuned version of [google/t5-v1_1-base](https://huggingface.co/google/t5-v1_1-base) on an unspecified dataset.
It achieves the following results on the evaluation set:
- Loss: 0.1723
- Rouge1: 43.8331
- Rouge2: 34.7609
- Rougel: 43.5803
- Rougelsum: 43.5467
- Gen Len: 18.9287

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 0.001
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- optimizer: Adafactor
- lr_scheduler_type: linear
- num_epochs: 8

### Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1  | Rouge2  | Rougel  | Rougelsum | Gen Len |
|:-------------:|:-----:|:----:|:---------------:|:-------:|:-------:|:-------:|:---------:|:-------:|
| 2.6434        | 0.37  | 300  | 0.4530          | 38.4418 | 26.1528 | 37.8295 | 37.7894   | 18.9434 |
| 0.5551        | 0.73  | 600  | 0.3368          | 39.883  | 28.2471 | 39.2883 | 39.2822   | 18.9345 |
| 0.4523        | 1.1   | 900  | 0.2959          | 40.3084 | 29.2298 | 39.8742 | 39.8747   | 18.9350 |
| 0.4165        | 1.46  | 1200 | 0.2610          | 41.0422 | 30.4902 | 40.6542 | 40.6354   | 18.9350 |
| 0.3196        | 1.83  | 1500 | 0.2292          | 41.6111 | 31.1549 | 41.2572 | 41.2477   | 18.9355 |
| 0.2718        | 2.2   | 1800 | 0.2153          | 41.9295 | 31.6902 | 41.5757 | 41.5624   | 18.9334 |
| 0.2446        | 2.56  | 2100 | 0.2055          | 42.2918 | 32.4861 | 42.0541 | 42.0135   | 18.9324 |
| 0.2301        | 2.93  | 2400 | 0.2232          | 42.6172 | 33.0243 | 42.3474 | 42.3224   | 18.9334 |
| 0.1997        | 3.29  | 2700 | 0.1859          | 42.8442 | 33.4479 | 42.6294 | 42.6121   | 18.9350 |
| 0.186         | 3.66  | 3000 | 0.1816          | 42.9407 | 33.5872 | 42.7248 | 42.7125   | 18.9277 |
| 0.1736        | 4.02  | 3300 | 0.1771          | 43.1994 | 34.0513 | 43.0334 | 42.9982   | 18.9308 |
| 0.1439        | 4.39  | 3600 | 0.1818          | 43.2146 | 33.997  | 43.0221 | 42.9893   | 18.9282 |
| 0.1429        | 4.76  | 3900 | 0.1732          | 43.4458 | 34.377  | 43.3072 | 43.26     | 18.9277 |
| 0.132         | 5.12  | 4200 | 0.1795          | 43.7156 | 34.6069 | 43.4982 | 43.481    | 18.9292 |
| 0.1151        | 5.49  | 4500 | 0.1767          | 43.7618 | 34.7345 | 43.5565 | 43.5181   | 18.9287 |
| 0.1127        | 5.85  | 4800 | 0.1723          | 43.8331 | 34.7609 | 43.5803 | 43.5467   | 18.9287 |
| 0.0994        | 6.22  | 5100 | 0.1757          | 43.8866 | 34.9216 | 43.641  | 43.6214   | 18.9287 |
| 0.0892        | 6.59  | 5400 | 0.1779          | 43.9415 | 34.9905 | 43.7332 | 43.7063   | 18.9292 |
| 0.0914        | 6.95  | 5700 | 0.1725          | 43.9439 | 35.0456 | 43.7419 | 43.7266   | 18.9298 |
| 0.0772        | 7.32  | 6000 | 0.1776          | 44.1132 | 35.3173 | 43.9301 | 43.9135   | 18.9287 |
| 0.0755        | 7.68  | 6300 | 0.1778          | 44.0494 | 35.3179 | 43.8797 | 43.8587   | 18.9282 |

### Framework versions

- Transformers 4.30.1
- Pytorch 1.11.0a0+b6df043
- Datasets 2.12.0
- Tokenizers 0.13.3
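
As a sketch of how the hyperparameters above map onto the `transformers` API, the following shows a roughly equivalent `Seq2SeqTrainingArguments` configuration. This assumes the standard `Seq2SeqTrainer` was used; the dataset loading and tokenization steps are omitted because the card does not document them, and `eval_steps=300` is inferred from the evaluation interval visible in the results table.

```python
from transformers import Seq2SeqTrainingArguments

# Sketch of the reported hyperparameters as Seq2SeqTrainingArguments
# (assumption: the standard Seq2SeqTrainer was used; data pipeline not shown).
training_args = Seq2SeqTrainingArguments(
    output_dir="t5-v1_1-base-gramatika-final-e8-b16",
    learning_rate=1e-3,               # learning_rate: 0.001
    per_device_train_batch_size=16,   # train_batch_size: 16
    per_device_eval_batch_size=16,    # eval_batch_size: 16
    seed=42,
    optim="adafactor",                # optimizer: Adafactor
    lr_scheduler_type="linear",
    num_train_epochs=8,
    evaluation_strategy="steps",
    eval_steps=300,                   # inferred from the 300-step eval interval
    predict_with_generate=True,       # needed to compute ROUGE during evaluation
)
```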
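
The ROUGE figures above are on a 0-100 scale. A minimal sketch of computing comparable scores with the `evaluate` library follows; the predictions and references below are placeholders, not outputs from this model.

```python
import evaluate

rouge = evaluate.load("rouge")
predictions = ["model output goes here"]  # placeholder predictions
references = ["gold target goes here"]    # placeholder references
scores = rouge.compute(predictions=predictions, references=references)
# evaluate returns fractions in [0, 1]; scale by 100 to match the card.
print({k: round(v * 100, 4) for k, v in scores.items()})
```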
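
Finally, a minimal inference sketch, assuming the checkpoint is published under the model name above and accepts plain text input (the card does not document the task or prompt format):

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_id = "t5-v1_1-base-gramatika-final-e8-b16"  # assumed repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

inputs = tokenizer("your input sentence here", return_tensors="pt")
# Gen Len above averages ~19 tokens, so a 20-token budget is a plausible cap.
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```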