led-base-16384-finetune-xsum
This model is a fine-tuned version of allenai/led-base-16384 on the XSum dataset. On the evaluation set, the final checkpoint (epoch 20, step 2500) achieves the following results:
- Loss: 3.3325
- Rouge1: 31.3157
- Rouge2: 9.2183
- Rougel: 23.7641
- Rougelsum: 23.8202
- Gen Len: 19.89
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 4
- eval_batch_size: 4
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 20
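As a rough illustration of what these settings imply, the sketch below computes the learning rate under a linear schedule and derives the number of steps per epoch. The total of 2500 steps comes from the last row of the training results table. The zero warmup, single device, and absence of gradient accumulation are assumptions, since the card does not state them, and `linear_lr` is a hypothetical helper rather than the trainer's own code.

```python
# Hyperparameters copied from the list above.
LEARNING_RATE = 2e-05
TRAIN_BATCH_SIZE = 4
NUM_EPOCHS = 20

# Total optimizer steps, taken from the final row of the training results table.
TOTAL_STEPS = 2500


def linear_lr(step, base_lr=LEARNING_RATE, total_steps=TOTAL_STEPS, warmup_steps=0):
    """Linear warmup followed by linear decay to zero.

    warmup_steps=0 is an assumption: the card does not list a warmup setting.
    """
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# 2500 steps over 20 epochs gives 125 optimizer steps per epoch.
steps_per_epoch = TOTAL_STEPS // NUM_EPOCHS

# Upper bound on examples seen per epoch, ASSUMING one device and
# no gradient accumulation (neither is stated in the card).
examples_per_epoch = steps_per_epoch * TRAIN_BATCH_SIZE

for step in (0, 1250, 2500):
    print(f"step {step:>4}: lr = {linear_lr(step):.2e}")
print("steps per epoch:", steps_per_epoch)
print("examples per epoch (upper bound):", examples_per_epoch)
```

Under those assumptions, each epoch covers at most 500 examples, which would suggest training on a small subset rather than the full XSum training split. The card does not confirm this.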
Training results
Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len |
---|---|---|---|---|---|---|---|---|
No log | 1.0 | 125 | 2.6311 | 32.5653 | 10.8601 | 25.3811 | 25.5187 | 19.84 |
No log | 2.0 | 250 | 2.7544 | 31.6321 | 9.9595 | 25.0264 | 25.0779 | 19.85 |
No log | 3.0 | 375 | 2.8261 | 32.0246 | 10.1415 | 25.2121 | 25.2632 | 19.89 |
0.1515 | 4.0 | 500 | 2.9240 | 31.6961 | 11.1892 | 25.0684 | 25.1019 | 19.92 |
0.1515 | 5.0 | 625 | 3.0229 | 31.1022 | 9.294 | 24.3075 | 24.309 | 19.9 |
0.1515 | 6.0 | 750 | 3.0900 | 31.7063 | 10.2344 | 25.1885 | 25.3359 | 19.89 |
0.1515 | 7.0 | 875 | 3.0958 | 31.6973 | 10.2856 | 25.5433 | 25.6242 | 19.91 |
0.0437 | 8.0 | 1000 | 3.1248 | 30.9445 | 10.3904 | 24.0861 | 24.109 | 19.91 |
0.0437 | 9.0 | 1125 | 3.1542 | 31.4694 | 9.4087 | 24.3248 | 24.4039 | 19.97 |
0.0437 | 10.0 | 1250 | 3.1986 | 30.428 | 9.6657 | 24.2568 | 24.4035 | 19.86 |
0.0437 | 11.0 | 1375 | 3.2040 | 32.3325 | 9.8754 | 25.117 | 25.1563 | 19.95 |
0.0229 | 12.0 | 1500 | 3.2044 | 30.8435 | 8.6959 | 23.4129 | 23.5211 | 19.99 |
0.0229 | 13.0 | 1625 | 3.2419 | 31.8807 | 9.6734 | 24.5748 | 24.6672 | 19.96 |
0.0229 | 14.0 | 1750 | 3.2926 | 31.8181 | 9.5238 | 24.3606 | 24.4569 | 19.88 |
0.0229 | 15.0 | 1875 | 3.2935 | 30.7551 | 8.9042 | 23.9581 | 24.1074 | 19.98 |
0.0107 | 16.0 | 2000 | 3.3219 | 31.3919 | 9.3308 | 24.1432 | 24.2162 | 19.93 |
0.0107 | 17.0 | 2125 | 3.3167 | 31.7918 | 9.4813 | 23.9672 | 24.0244 | 19.9 |
0.0107 | 18.0 | 2250 | 3.3281 | 31.0624 | 9.3608 | 23.6247 | 23.6658 | 19.89 |
0.0107 | 19.0 | 2375 | 3.3248 | 31.7832 | 9.5257 | 23.9738 | 24.0255 | 19.96 |
0.0063 | 20.0 | 2500 | 3.3325 | 31.3157 | 9.2183 | 23.7641 | 23.8202 | 19.89 |
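The table can be read as a checkpoint-selection problem. The sketch below copies the epoch, validation loss, and Rouge1 columns from the rows above and finds the best epoch under each criterion. It is only an analysis of the reported numbers, not part of the original training script, and the variable names are illustrative.

```python
# (epoch, validation_loss, rouge1) copied from the training results table.
RESULTS = [
    (1, 2.6311, 32.5653), (2, 2.7544, 31.6321), (3, 2.8261, 32.0246),
    (4, 2.9240, 31.6961), (5, 3.0229, 31.1022), (6, 3.0900, 31.7063),
    (7, 3.0958, 31.6973), (8, 3.1248, 30.9445), (9, 3.1542, 31.4694),
    (10, 3.1986, 30.428), (11, 3.2040, 32.3325), (12, 3.2044, 30.8435),
    (13, 3.2419, 31.8807), (14, 3.2926, 31.8181), (15, 3.2935, 30.7551),
    (16, 3.3219, 31.3919), (17, 3.3167, 31.7918), (18, 3.3281, 31.0624),
    (19, 3.3248, 31.7832), (20, 3.3325, 31.3157),
]

# Lower validation loss is better; higher Rouge1 is better.
best_by_loss = min(RESULTS, key=lambda row: row[1])
best_by_rouge1 = max(RESULTS, key=lambda row: row[2])
final = RESULTS[-1]

# How far the final checkpoint's validation loss is from the best one.
loss_gap = round(final[1] - best_by_loss[1], 4)

print("best epoch by validation loss:", best_by_loss[0])
print("best epoch by Rouge1:", best_by_rouge1[0])
print("final epoch:", final[0])
print("validation loss gap (final - best):", loss_gap)
```

Both criteria pick epoch 1, while the reported checkpoint is epoch 20. Validation loss rises from 2.6311 to 3.3325 as training loss falls to 0.0063, a pattern consistent with overfitting. If the run were repeated with the Transformers `Trainer`, the `load_best_model_at_end` and `metric_for_best_model` training arguments are one way to keep the best-scoring checkpoint instead of the last one.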
Framework versions
- Transformers 4.30.2
- Pytorch 2.0.1+cu118
- Datasets 2.13.1
- Tokenizers 0.13.3