
switch-base-16-samsum-ba16-lr1e-4-top-4-choose-1

This model is a fine-tuned version of google/switch-base-16 on the samsum dataset. It achieves the following results on the evaluation set:

  • Loss: 1.5674
  • Rouge1: 50.9015
  • Rouge2: 26.4267
  • Rougel: 42.4807
  • Rougelsum: 47.065
  • Gen Len: 24.2139
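
Since the checkpoint is a sequence-to-sequence model fine-tuned for dialogue summarization, a minimal inference sketch may help readers try it. It rests on several assumptions: that the repository ID below (taken from this page's repository listing) is the correct Hub path, that the repository ships a tokenizer, and that the checkpoint loads through the generic `AutoModelForSeq2SeqLM` class. The helper name `summarize`, the `prefix` parameter, and the `max_new_tokens` default are hypothetical; the card does not say whether a task prefix was used during fine-tuning.

```python
# Assumed Hub repository ID, as listed on this page.
MODEL_ID = "taehyunzzz/switch-base-16-samsum-ba16-lr1e-04-top-4-choose-1"


def summarize(dialogue: str, max_new_tokens: int = 64, prefix: str = "") -> str:
    """Summarize one dialogue with the fine-tuned checkpoint (illustrative helper)."""
    # Imported lazily so that defining this helper does not require transformers.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_ID)

    # `prefix` is empty by default because the card does not document a task prefix.
    inputs = tokenizer(prefix + dialogue, return_tensors="pt", truncation=True)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


# Example call (downloads roughly 1.07B F32 parameters on first use):
# print(summarize("Anna: Are we still on for lunch?\nBen: Yes, 12:30 at the usual place."))
```

The reported Gen Len of about 24 tokens suggests that a small `max_new_tokens` budget is usually enough, though the generation settings used for evaluation are not documented here.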

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0001
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: constant_with_warmup
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 5.0
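
To make the `constant_with_warmup` setting concrete, here is a minimal pure-Python sketch of how such a schedule typically behaves: the learning rate rises linearly from 0 to 0.0001 over the first 10% of training steps and then stays constant. The function name `constant_with_warmup_lr` is hypothetical, and the value of 921 steps per epoch is an assumption inferred from the step and epoch columns of the training log, not a documented setting.

```python
import math

LEARNING_RATE = 1e-4      # learning_rate from the list above
WARMUP_RATIO = 0.1        # lr_scheduler_warmup_ratio from the list above
NUM_EPOCHS = 5            # num_epochs from the list above
STEPS_PER_EPOCH = 921     # assumption: inferred from the logged step/epoch ratio


def constant_with_warmup_lr(step, total_steps,
                            base_lr=LEARNING_RATE, warmup_ratio=WARMUP_RATIO):
    """Linear warmup from 0 to base_lr, then a constant learning rate."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    return base_lr


total_steps = STEPS_PER_EPOCH * NUM_EPOCHS  # 4605 under the assumption above
for step in (0, 100, 461, 4400):
    print(step, constant_with_warmup_lr(step, total_steps))
```

Under these assumptions the warmup covers 461 steps, so the first logged evaluation at step 400 would still fall inside the warmup phase.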

Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len |
|---|---|---|---|---|---|---|---|---|
| 2.4108 | 0.4343 | 400 | 1.9721 | 42.5812 | 19.2812 | 35.4748 | 39.2541 | 19.2506 |
| 2.0404 | 0.8686 | 800 | 1.7618 | 46.3109 | 22.6628 | 39.0055 | 42.5976 | 19.055 |
| 1.9542 | 1.3029 | 1200 | 1.6832 | 47.8204 | 24.0498 | 40.4763 | 44.3711 | 20.6822 |
| 1.8495 | 1.7372 | 1600 | 1.6298 | 47.4526 | 24.2324 | 39.9469 | 43.7808 | 19.72 |
| 1.6194 | 2.1716 | 2000 | 1.6185 | 49.8848 | 25.1915 | 41.8312 | 46.4885 | 25.4976 |
| 1.652 | 2.6059 | 2400 | 1.6123 | 48.5854 | 24.7701 | 40.9614 | 44.9415 | 22.4682 |
| 1.4582 | 3.0402 | 2800 | 1.5927 | 49.7032 | 25.2983 | 41.5462 | 45.8574 | 22.9254 |
| 1.5009 | 3.4745 | 3200 | 1.5974 | 50.3499 | 26.122 | 42.1925 | 46.5237 | 24.0758 |
| 1.4458 | 3.9088 | 3600 | 1.5765 | 51.2501 | 26.6203 | 42.8644 | 47.4187 | 24.0061 |
| 1.3603 | 4.3431 | 4000 | 1.6234 | 51.5462 | 27.0733 | 43.1304 | 47.7257 | 25.4377 |
| 1.4493 | 4.7774 | 4400 | 1.5644 | 50.8852 | 26.6632 | 42.6825 | 47.0773 | 25.1797 |
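
The log above can be read programmatically. The sketch below copies the step, epoch, validation loss, and Rouge1 columns, estimates the number of optimizer steps per epoch, and finds the evaluation points with the highest Rouge1 and the lowest validation loss. The variable names are illustrative, and the examples-per-epoch figure assumes a single device with no gradient accumulation, which the card does not state.

```python
# (step, epoch, validation_loss, rouge1) copied from the training results log.
LOG = [
    (400, 0.4343, 1.9721, 42.5812),
    (800, 0.8686, 1.7618, 46.3109),
    (1200, 1.3029, 1.6832, 47.8204),
    (1600, 1.7372, 1.6298, 47.4526),
    (2000, 2.1716, 1.6185, 49.8848),
    (2400, 2.6059, 1.6123, 48.5854),
    (2800, 3.0402, 1.5927, 49.7032),
    (3200, 3.4745, 1.5974, 50.3499),
    (3600, 3.9088, 1.5765, 51.2501),
    (4000, 4.3431, 1.6234, 51.5462),
    (4400, 4.7774, 1.5644, 50.8852),
]
TRAIN_BATCH_SIZE = 16  # train_batch_size from the hyperparameter list

# Every row should imply the same number of steps per epoch.
steps_per_epoch = {round(step / epoch) for step, epoch, _, _ in LOG}

# Upper bound on examples per epoch, assuming one device and no gradient accumulation.
max_examples_per_epoch = max(steps_per_epoch) * TRAIN_BATCH_SIZE

best_rouge1 = max(LOG, key=lambda row: row[3])
lowest_val_loss = min(LOG, key=lambda row: row[2])

print("steps per epoch:", steps_per_epoch)
print("examples per epoch (at most):", max_examples_per_epoch)
print("best Rouge1 at step:", best_rouge1[0])
print("lowest validation loss at step:", lowest_val_loss[0])
```

In this log the highest Rouge1 (step 4000) and the lowest validation loss (step 4400) occur at different evaluation points, so which checkpoint looks best depends on the selection metric.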

Framework versions

  • Transformers 4.41.2
  • Pytorch 2.1.0+cu121
  • Datasets 2.20.0
  • Tokenizers 0.19.1

Model size

1.07B params, stored as Safetensors with tensor type F32.

Repository: taehyunzzz/switch-base-16-samsum-ba16-lr1e-04-top-4-choose-1