switch-base-8-samsum
This model is a fine-tuned version of google/switch-base-8 on the samsum dataset. At the final logged checkpoint (step 9000), it achieves the following results on the evaluation set:
- Loss: 1.4462
- Rouge1: 47.7753
- Rouge2: 25.0191
- Rougel: 40.5513
- Rougelsum: 44.1931
- Gen Len: 17.0037
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 5.0
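
The linear schedule with a 0.1 warmup ratio listed above can be sketched in plain Python. This is a minimal illustration, not the training code itself: the function name `linear_lr` is hypothetical, and `total_steps=9210` is an assumption inferred from the results table (about 1842 optimizer steps per epoch over 5 epochs), which gives 921 warmup steps.

```python
def linear_lr(step, base_lr=5e-5, total_steps=9210, warmup_steps=921):
    """Learning rate at a given optimizer step for linear warmup + linear decay.

    total_steps and warmup_steps are assumptions inferred from the card,
    not values reported by the training run.
    """
    if step < warmup_steps:
        # Warmup: ramp linearly from 0 up to base_lr.
        return base_lr * step / max(1, warmup_steps)
    # Decay: ramp linearly from base_lr down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    for s in (0, 460, 921, 5000, 9210):
        print(s, linear_lr(s))
```

Under these assumptions the learning rate climbs from 0 to 5e-05 over the first 921 steps and then decays linearly to 0 at step 9210.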
Training results
Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len |
---|---|---|---|---|---|---|---|---|
2.1458 | 0.5429 | 1000 | 1.6802 | 42.9604 | 20.2079 | 35.7555 | 39.8487 | 17.4401 |
1.8151 | 1.0858 | 2000 | 1.5610 | 45.6037 | 22.3422 | 38.1902 | 42.2279 | 17.3631 |
1.727 | 1.6287 | 3000 | 1.5148 | 46.336 | 23.6519 | 39.0026 | 42.6412 | 16.9939 |
1.5627 | 2.1716 | 4000 | 1.4818 | 46.9902 | 23.6438 | 39.6679 | 43.3643 | 17.1944 |
1.6123 | 2.7144 | 5000 | 1.4564 | 46.6886 | 23.7798 | 39.5993 | 43.1788 | 16.6455 |
1.4284 | 3.2573 | 6000 | 1.4557 | 47.4032 | 24.8955 | 40.2679 | 43.9794 | 17.1711 |
1.4641 | 3.8002 | 7000 | 1.4513 | 47.3726 | 24.7001 | 40.3318 | 44.2062 | 17.2689 |
1.373 | 4.3431 | 8000 | 1.4473 | 47.5663 | 24.8397 | 40.1119 | 44.0327 | 17.0281 |
1.3706 | 4.8860 | 9000 | 1.4462 | 47.7753 | 25.0191 | 40.5513 | 44.1931 | 17.0037 |
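
The step and epoch columns above imply the number of optimizer steps per epoch, and the metric columns show which logged checkpoint scores best. A small sketch, using a subset of the columns copied from the table (the variable names are illustrative):

```python
# (step, epoch, validation_loss, rouge1, rougelsum) copied from the table above
rows = [
    (1000, 0.5429, 1.6802, 42.9604, 39.8487),
    (2000, 1.0858, 1.5610, 45.6037, 42.2279),
    (3000, 1.6287, 1.5148, 46.3360, 42.6412),
    (4000, 2.1716, 1.4818, 46.9902, 43.3643),
    (5000, 2.7144, 1.4564, 46.6886, 43.1788),
    (6000, 3.2573, 1.4557, 47.4032, 43.9794),
    (7000, 3.8002, 1.4513, 47.3726, 44.2062),
    (8000, 4.3431, 1.4473, 47.5663, 44.0327),
    (9000, 4.8860, 1.4462, 47.7753, 44.1931),
]

# Each row yields an estimate of steps per epoch; keep it only if all rows agree.
estimates = {round(step / epoch) for step, epoch, *_ in rows}
steps_per_epoch = estimates.pop() if len(estimates) == 1 else None

# Checkpoint with the lowest validation loss and the one with the highest Rougelsum.
best_by_loss = min(rows, key=lambda r: r[2])
best_by_rougelsum = max(rows, key=lambda r: r[4])

print("steps per epoch:", steps_per_epoch)
print("best step by validation loss:", best_by_loss[0])
print("best step by Rougelsum:", best_by_rougelsum[0])
```

With a train batch size of 8, 1842 steps per epoch corresponds to roughly 14.7k training examples, assuming a single device and no gradient accumulation (neither is stated in this card). The final checkpoint has the lowest validation loss, while Rougelsum peaks slightly earlier, at step 7000.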
Framework versions
- Transformers 4.41.2
- Pytorch 2.2.0
- Datasets 2.14.5
- Tokenizers 0.19.1