switch-base-8-samsum-ba16-lr0.0001-top-1

This model is a fine-tuned version of google/switch-base-8 on the samsum dataset. It achieves the following results on the evaluation set:

  • Loss: 1.5773
  • Rouge1: 50.2274
  • Rouge2: 26.1297
  • Rougel: 41.6305
  • Rougelsum: 46.212
  • Gen Len: 26.1064
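The card does not say which ROUGE implementation produced these scores. As a rough illustration of what Rouge1 and Rouge2 measure, the sketch below computes an n-gram overlap F1 score with plain whitespace tokenization. The function name `rouge_n_f1` and the example sentences are hypothetical, and the simplified tokenization (no stemming, no punctuation handling) means its numbers are not directly comparable to the ones listed above.

```python
from collections import Counter


def rouge_n_f1(reference: str, candidate: str, n: int = 1) -> float:
    """Simplified ROUGE-N F1: clipped n-gram overlap between two strings.

    Assumption: lowercased whitespace tokens, no stemming. Standard ROUGE
    tooling applies more careful tokenization.
    """
    def ngrams(text: str) -> Counter:
        tokens = text.lower().split()
        return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

    ref, cand = ngrams(reference), ngrams(candidate)
    overlap = sum((ref & cand).values())  # counts clipped to the reference
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


# Hypothetical reference/candidate pair, not taken from samsum.
reference = "amanda baked cookies and will bring jerry some tomorrow"
candidate = "amanda will bring jerry cookies tomorrow"

# Scaled by 100 to match the 0-100 range used in this card.
rouge1 = 100 * rouge_n_f1(reference, candidate, n=1)
rouge2 = 100 * rouge_n_f1(reference, candidate, n=2)
print(f"Rouge1: {rouge1:.2f}  Rouge2: {rouge2:.2f}")
```

Rouge1 counts shared unigrams and Rouge2 shared bigrams, which is why Rouge2 is consistently the lower of the two in the results above.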

Model description

More information needed

Intended uses & limitations

More information needed
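Since the model was fine-tuned on samsum, dialogue summarization is the natural use. The following is a minimal sketch of running it with Transformers, under several assumptions: the repository id is the one shown on the hub page for this card, the input needs no task prefix (the card does not document one, so `prefix` defaults to an empty string), and the 512-token input limit and 64-token generation budget are illustrative values rather than documented settings. The helper name `summarize` is hypothetical.

```python
MODEL_ID = "taehyunzzz/switch-base-8-samsum-ba16-lr1e-04-top-1"


def summarize(dialogue: str, prefix: str = "", max_new_tokens: int = 64) -> str:
    """Generate a summary for one dialogue (downloads the checkpoint on first use)."""
    # Imported lazily so that defining this helper does not require the
    # heavy dependencies to be loaded.
    from transformers import AutoTokenizer, SwitchTransformersForConditionalGeneration

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = SwitchTransformersForConditionalGeneration.from_pretrained(MODEL_ID)

    # Assumption: truncate long dialogues to 512 tokens.
    inputs = tokenizer(
        prefix + dialogue,
        return_tensors="pt",
        truncation=True,
        max_length=512,
    )
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

A call such as `summarize("Amanda: I baked cookies. Do you want some?\nJerry: Sure!")` should return a short summary string. If the training script did prepend a prefix such as "summarize: ", passing it through `prefix` may improve output quality.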

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0001
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: constant_with_warmup
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 5.0
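To make the `constant_with_warmup` schedule concrete, the sketch below reproduces its shape in plain Python: the learning rate rises linearly from zero during warmup and then stays at 0.0001. The step counts are estimates, not values stated in this card: steps per epoch is inferred from the first logged row of the training results (step 400 at epoch 0.4343), and the warmup length assumes the ratio is applied to the total step count and rounded up. The function name `constant_with_warmup_lr` is hypothetical.

```python
import math

# Hyperparameters as listed in this card.
HPARAMS = {
    "learning_rate": 1e-4,
    "train_batch_size": 16,
    "eval_batch_size": 16,
    "seed": 42,
    "lr_scheduler_type": "constant_with_warmup",
    "lr_scheduler_warmup_ratio": 0.1,
    "num_epochs": 5.0,
}


def constant_with_warmup_lr(step: int, base_lr: float, warmup_steps: int) -> float:
    """Linear warmup from 0 to base_lr, then constant at base_lr."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    return base_lr


# Assumption: inferred from the training log (step 400 at epoch 0.4343).
steps_per_epoch = round(400 / 0.4343)
total_steps = steps_per_epoch * int(HPARAMS["num_epochs"])
# Assumption: warmup steps = ceil(ratio * total steps).
warmup_steps = math.ceil(total_steps * HPARAMS["lr_scheduler_warmup_ratio"])

for step in (0, 100, warmup_steps, total_steps):
    lr = constant_with_warmup_lr(step, HPARAMS["learning_rate"], warmup_steps)
    print(f"step {step:>4}: lr = {lr:.2e}")
```

Under these assumptions warmup covers roughly the first half of epoch 1, so the first logged evaluation at step 400 would still fall inside the warmup phase.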

Training results

Training Loss | Epoch  | Step | Validation Loss | Rouge1  | Rouge2  | Rougel  | Rougelsum | Gen Len
2.5688        | 0.4343 | 400  | 2.0174          | 39.045  | 18.0039 | 32.4777 | 36.2922   | 20.7787
2.054         | 0.8686 | 800  | 1.7355          | 45.9504 | 22.2553 | 37.9856 | 42.4096   | 24.9254
1.9326        | 1.3029 | 1200 | 1.6762          | 46.4474 | 22.8747 | 39.0668 | 42.8255   | 20.0905
1.8121        | 1.7372 | 1600 | 1.6212          | 47.4383 | 23.9879 | 39.9156 | 44.1151   | 21.1174
1.6303        | 2.1716 | 2000 | 1.6068          | 49.5797 | 25.5351 | 41.2855 | 45.9999   | 24.8362
1.6817        | 2.6059 | 2400 | 1.5734          | 49.0904 | 24.9926 | 41.2085 | 45.4779   | 23.0905
1.4335        | 3.0402 | 2800 | 1.5943          | 49.5091 | 25.7276 | 41.7665 | 45.7573   | 22.1553
1.5042        | 3.4745 | 3200 | 1.5807          | 49.1947 | 25.6961 | 41.2511 | 45.5553   | 22.6149
1.4447        | 3.9088 | 3600 | 1.5747          | 50.1246 | 26.1223 | 41.8475 | 46.3095   | 25.5709
1.3638        | 4.3431 | 4000 | 1.6004          | 50.655  | 26.3528 | 42.3721 | 46.8937   | 25.0685
1.4508        | 4.7774 | 4400 | 1.5741          | 49.9176 | 25.9264 | 41.7646 | 46.0881   | 24.879
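The logged checkpoints do not all agree on which one is "best": validation loss and ROUGE peak at different steps. A short sketch that re-enters the step, validation loss, and Rouge1 columns from the table above and picks the best row under each criterion (the variable names are hypothetical, and the card does not say which criterion, if any, was used to select the final checkpoint):

```python
# (step, validation_loss, rouge1) copied from the training results table.
LOG = [
    (400, 2.0174, 39.045),
    (800, 1.7355, 45.9504),
    (1200, 1.6762, 46.4474),
    (1600, 1.6212, 47.4383),
    (2000, 1.6068, 49.5797),
    (2400, 1.5734, 49.0904),
    (2800, 1.5943, 49.5091),
    (3200, 1.5807, 49.1947),
    (3600, 1.5747, 50.1246),
    (4000, 1.6004, 50.655),
    (4400, 1.5741, 49.9176),
]

# Lowest validation loss versus highest Rouge1.
best_by_loss = min(LOG, key=lambda row: row[1])
best_by_rouge1 = max(LOG, key=lambda row: row[2])

print(f"lowest validation loss: {best_by_loss[1]} at step {best_by_loss[0]}")
print(f"highest Rouge1:         {best_by_rouge1[2]} at step {best_by_rouge1[0]}")
```

Validation loss bottoms out at step 2400 while Rouge1 peaks at step 4000, so the choice of selection metric matters when picking a checkpoint from this run.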

Framework versions

  • Transformers 4.41.2
  • Pytorch 2.1.0+cu121
  • Datasets 2.20.0
  • Tokenizers 0.19.1
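When reproducing the run, it can help to check the local environment against the versions listed above. The following is a small sketch using only the standard library; the helper name `compare_versions` is hypothetical, and it assumes the PyPI distribution names `transformers`, `torch`, `datasets`, and `tokenizers`.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions listed in this card, keyed by PyPI distribution name.
PINNED = {
    "transformers": "4.41.2",
    "torch": "2.1.0+cu121",
    "datasets": "2.20.0",
    "tokenizers": "0.19.1",
}


def compare_versions(pinned: dict) -> dict:
    """Map each package to (expected, installed); installed is None if missing."""
    report = {}
    for name, expected in pinned.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            installed = None
        report[name] = (expected, installed)
    return report


for name, (expected, installed) in compare_versions(PINNED).items():
    status = "ok" if installed == expected else "differs"
    print(f"{name}: expected {expected}, installed {installed} ({status})")
```

A mismatch is not necessarily a problem, but it is a useful first thing to rule out if results differ from those reported here.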

Model size

  • Parameters: 619M
  • Tensor type: F32
  • Format: Safetensors
