# flant5-small-1

This model is a fine-tuned version of [google/flan-t5-small](https://huggingface.co/google/flan-t5-small) on an unspecified dataset. It achieves the following results on the evaluation set:
- Loss: 0.1819
- Rouge1: 31.6091
- Rouge2: 12.3227
- Rougel: 27.7285
- Rougelsum: 28.7294
- Gen Len: 18.8933
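For reference, the Rouge1 score above measures unigram overlap between generated and reference summaries. A minimal sketch of the F1 variant of ROUGE-1 (an illustration, not the exact implementation used to produce these numbers):

```python
from collections import Counter

def rouge1_f1(candidate: str, reference: str) -> float:
    """Unigram-overlap F1 between a candidate and a reference summary."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    overlap = sum((cand & ref).values())  # clipped unigram matches
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)
```

The reported scores are averaged over the evaluation set and scaled by 100 (so 31.6091 corresponds to an average F1 of about 0.316).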
## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 1e-05
- train_batch_size: 4
- eval_batch_size: 4
- seed: 42
- gradient_accumulation_steps: 2
- total_train_batch_size: 8
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 10
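The effective batch size and the linear schedule above can be sketched in plain Python (assuming no warmup and decay to zero over all 46,320 optimizer steps, matching the behavior of transformers' linear scheduler with `num_warmup_steps=0`):

```python
def effective_batch_size(per_device: int, accumulation_steps: int, num_devices: int = 1) -> int:
    # total_train_batch_size reported above: 4 * 2 = 8
    return per_device * accumulation_steps * num_devices

def linear_lr(step: int, total_steps: int, base_lr: float = 1e-05, warmup_steps: int = 0) -> float:
    """Linear warmup, then linear decay from base_lr to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    return base_lr * max(0.0, (total_steps - step) / max(1, total_steps - warmup_steps))
```

With these settings, the learning rate starts at 1e-05 and reaches half that value at the midpoint of training (step 23,160).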
### Training results

| Training Loss | Epoch | Step  | Validation Loss | Rouge1  | Rouge2  | Rougel  | Rougelsum | Gen Len |
|:-------------:|:-----:|:-----:|:---------------:|:-------:|:-------:|:-------:|:---------:|:-------:|
| 0.2643        | 1.0   | 4632  | 0.1968          | 30.5996 | 11.8855 | 26.9671 | 28.0068   | 19.0619 |
| 0.2196        | 2.0   | 9265  | 0.1908          | 30.8354 | 12.4754 | 27.3122 | 28.332    | 18.9259 |
| 0.2122        | 3.0   | 13897 | 0.1869          | 30.5699 | 11.9517 | 26.9246 | 28.005    | 18.9308 |
| 0.2074        | 4.0   | 18530 | 0.1851          | 30.9689 | 11.878  | 27.2442 | 28.2223   | 18.8803 |
| 0.2041        | 5.0   | 23162 | 0.1842          | 31.1178 | 12.3175 | 27.345  | 28.4092   | 18.8746 |
| 0.2013        | 6.0   | 27795 | 0.1830          | 31.4978 | 12.3755 | 27.697  | 28.7429   | 18.8664 |
| 0.1993        | 7.0   | 32427 | 0.1825          | 31.6174 | 12.427  | 27.6361 | 28.7067   | 18.8453 |
| 0.1977        | 8.0   | 37060 | 0.1822          | 32.0368 | 12.6267 | 28.1028 | 29.1753   | 18.9235 |
| 0.1971        | 9.0   | 41692 | 0.1820          | 31.6689 | 12.4617 | 27.8217 | 28.8224   | 18.9178 |
| 0.1965        | 10.0  | 46320 | 0.1819          | 31.6091 | 12.3227 | 27.7285 | 28.7294   | 18.8933 |
### Framework versions
- Transformers 4.36.1
- Pytorch 2.1.2
- Datasets 2.20.0
- Tokenizers 0.15.2