
switch-base-8-samsum-ba16-lr5e-05-top-1

This model is a fine-tuned version of google/switch-base-8 on the samsum dataset. It achieves the following results on the evaluation set:

  • Loss: 1.5938
  • Rouge1: 51.1516
  • Rouge2: 27.0867
  • Rougel: 42.9355
  • Rougelsum: 47.1843
  • Gen Len: 21.3032
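
Assuming the checkpoint follows the standard Hugging Face seq2seq interface (a sketch, not verified against this exact repo), it can be loaded for dialogue summarization roughly as follows:

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

MODEL_ID = "taehyunzzz/switch-base-8-samsum-ba16-lr5e-05-top-1"


def summarize(dialogue: str, max_new_tokens: int = 64) -> str:
    """Generate a summary for a samsum-style dialogue string."""
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_ID)
    inputs = tokenizer(dialogue, return_tensors="pt", truncation=True)
    summary_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(summary_ids[0], skip_special_tokens=True)


if __name__ == "__main__":
    # Example dialogue in the samsum style (illustrative input, not from the dataset).
    dialogue = (
        "Amanda: I baked cookies. Do you want some?\n"
        "Jerry: Sure!\n"
        "Amanda: I'll bring you some tomorrow :-)"
    )
    print(summarize(dialogue))
```

The generation length cap here is a hypothetical choice; the Gen Len metric above suggests summaries of roughly 21 tokens on average.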

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: constant_with_warmup
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 10.0
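
The interaction of constant_with_warmup with warmup_ratio can be sketched in plain Python: the learning rate ramps linearly from zero over the first 10% of training steps, then holds at the base value. (A minimal re-implementation for illustration, not the Trainer's actual scheduler code.)

```python
def constant_with_warmup(step: int, total_steps: int,
                         base_lr: float = 5e-05,
                         warmup_ratio: float = 0.1) -> float:
    """Learning rate at a given optimizer step under a
    constant-with-warmup schedule."""
    warmup_steps = int(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Linear ramp from 0 to base_lr over the warmup phase.
        return base_lr * step / max(1, warmup_steps)
    # Constant thereafter (no decay).
    return base_lr


# For a hypothetical 1000-step run, warmup covers the first 100 steps:
print(constant_with_warmup(50, 1000))   # 2.5e-05 (halfway through warmup)
print(constant_with_warmup(500, 1000))  # 5e-05 (full base rate)
```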

Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len |
|---|---|---|---|---|---|---|---|---|
| 3.0575 | 0.5429 | 500 | 2.2660 | 36.3629 | 16.0253 | 29.5441 | 33.347 | 17.2396 |
| 2.3175 | 1.0858 | 1000 | 1.7944 | 44.0034 | 20.5273 | 36.7621 | 40.5462 | 20.0098 |
| 2.0496 | 1.6287 | 1500 | 1.6804 | 46.2384 | 22.3121 | 38.5952 | 42.4289 | 19.9853 |
| 1.8365 | 2.1716 | 2000 | 1.6295 | 48.1931 | 23.8938 | 40.0649 | 44.5492 | 21.8142 |
| 1.8642 | 2.7144 | 2500 | 1.5847 | 48.0128 | 24.3988 | 40.5306 | 44.4491 | 19.813 |
| 1.6646 | 3.2573 | 3000 | 1.5817 | 48.8709 | 25.3067 | 41.5278 | 45.3437 | 20.5355 |
| 1.7515 | 3.8002 | 3500 | 1.5484 | 48.8515 | 25.1368 | 41.1741 | 45.5125 | 21.5477 |
| 1.578 | 4.3431 | 4000 | 1.5730 | 49.8513 | 25.9075 | 42.014 | 46.2678 | 21.8399 |
| 1.6383 | 4.8860 | 4500 | 1.5492 | 48.9509 | 25.1085 | 41.0289 | 45.1148 | 21.0147 |
| 1.4581 | 5.4289 | 5000 | 1.5503 | 50.168 | 26.1337 | 42.2038 | 46.4867 | 21.302 |
| 1.49 | 5.9718 | 5500 | 1.5279 | 49.8392 | 26.0813 | 41.6588 | 46.0955 | 21.8411 |
| 1.3401 | 6.5147 | 6000 | 1.5628 | 50.6382 | 26.2476 | 42.0006 | 46.7203 | 22.3924 |
| 1.26 | 7.0575 | 6500 | 1.5656 | 50.2393 | 26.7223 | 42.4855 | 46.6651 | 21.1834 |
| 1.2895 | 7.6004 | 7000 | 1.5618 | 50.5139 | 26.858 | 42.4964 | 46.7989 | 21.7555 |
| 1.1783 | 8.1433 | 7500 | 1.5804 | 50.8842 | 27.0902 | 42.6885 | 47.2009 | 21.7531 |
| 1.226 | 8.6862 | 8000 | 1.5716 | 51.2014 | 27.0081 | 42.5835 | 47.3725 | 22.7482 |
| 1.1623 | 9.2291 | 8500 | 1.5912 | 50.8301 | 26.924 | 42.4375 | 46.9359 | 22.0892 |
| 1.1607 | 9.7720 | 9000 | 1.5926 | 50.7159 | 26.7812 | 42.4478 | 46.9283 | 21.7311 |
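
The Epoch and Step columns are consistent with the size of the samsum train split (14,732 dialogues, per the dataset card) and the batch size of 16 — a quick sanity check:

```python
import math

# samsum train split size (from the dataset card) and this run's batch size.
train_examples = 14_732
batch_size = 16

# Optimizer steps per epoch with no gradient accumulation.
steps_per_epoch = math.ceil(train_examples / batch_size)
print(steps_per_epoch)        # 921

# Step 500 should land at the epoch value logged in the first table row.
print(round(500 / steps_per_epoch, 4))  # 0.5429
```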

Framework versions

  • Transformers 4.41.2
  • Pytorch 2.1.0+cu121
  • Datasets 2.20.0
  • Tokenizers 0.19.1

Finetuned from: google/switch-base-8

Dataset used to train taehyunzzz/switch-base-8-samsum-ba16-lr5e-05-top-1: samsum