switch-base-8-samsum-ba16-lr1e-04-top-2-choose-1

This model is a fine-tuned version of google/switch-base-8 on the samsum dataset. It achieves the following results on the evaluation set:

  • Loss: 1.5912
  • Rouge1: 50.0097
  • Rouge2: 25.9731
  • Rougel: 41.7903
  • Rougelsum: 46.3722
  • Gen Len: 22.2347
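
A minimal usage sketch follows. The checkpoint id is taken from this repository; the sample dialogue and generation settings are illustrative, and the `summarize:` source prefix follows the common T5-style convention (whether it was used during fine-tuning is not stated on this card):

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Checkpoint id from this card; dialogue and generation settings are illustrative.
model_id = "taehyunzzz/switch-base-8-samsum-ba16-lr1e-04-top-2-choose-1"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

dialogue = (
    "Amanda: I baked cookies. Do you want some?\n"
    "Jerry: Sure!\n"
    "Amanda: I'll bring you some tomorrow :-)"
)
# The "summarize: " prefix is an assumption; drop it if outputs degrade.
inputs = tokenizer("summarize: " + dialogue, return_tensors="pt", truncation=True)
# Gen Len above averages ~22 tokens, so 64 new tokens leaves ample headroom.
summary_ids = model.generate(**inputs, max_new_tokens=64, num_beams=4)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```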

Model description

The base checkpoint, google/switch-base-8, is a Switch Transformers (Mixture-of-Experts) encoder-decoder with 8 experts per sparse layer; this variant fine-tunes it for abstractive dialogue summarization. Details of the routing configuration implied by the repository name (top-2-choose-1) are not documented here.

Intended uses & limitations

More information needed

Training and evaluation data

The model was fine-tuned and evaluated on the samsum dataset (SAMSum, messenger-style dialogues paired with human-written summaries), as noted above. The exact split usage is not documented here.

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0001
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: constant_with_warmup
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 5.0
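
For reference, a minimal sketch of how these settings map onto `Seq2SeqTrainingArguments` (the training script itself is not documented on this card; `output_dir` is a placeholder taken from the run name, and the 500-step eval cadence is inferred from the results table below):

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="switch-base-8-samsum-ba16-lr1e-04-top-2-choose-1",
    learning_rate=1e-4,
    per_device_train_batch_size=16,   # "ba16"; whether per-device or total is an assumption
    per_device_eval_batch_size=16,
    seed=42,
    optim="adamw_torch",              # the card lists Adam, betas=(0.9, 0.999), eps=1e-8
    lr_scheduler_type="constant_with_warmup",
    warmup_ratio=0.1,
    num_train_epochs=5.0,
    predict_with_generate=True,       # required to compute ROUGE during evaluation
    eval_strategy="steps",
    eval_steps=500,                   # matches the evaluation cadence in the table below
)
```

These arguments would be passed to a `Seq2SeqTrainer` together with the model, tokenizer, and tokenized samsum splits.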

Training results

| Training Loss | Epoch  | Step | Validation Loss | Rouge1  | Rouge2  | Rougel  | Rougelsum | Gen Len |
|:-------------:|:------:|:----:|:---------------:|:-------:|:-------:|:-------:|:---------:|:-------:|
| 2.2794        | 0.5429 | 500  | 1.9027          | 43.093  | 20.2653 | 36.0953 | 39.9306   | 19.1553 |
| 1.9316        | 1.0858 | 1000 | 1.7250          | 46.9261 | 22.2483 | 38.479  | 43.0242   | 22.0636 |
| 1.8271        | 1.6287 | 1500 | 1.6592          | 47.8811 | 24.297  | 39.9511 | 44.3462   | 21.8301 |
| 1.6252        | 2.1716 | 2000 | 1.6083          | 48.5006 | 24.4846 | 40.6387 | 44.8987   | 21.7775 |
| 1.6832        | 2.7144 | 2500 | 1.5831          | 47.9987 | 24.2297 | 40.1482 | 44.3065   | 19.3594 |
| 1.4585        | 3.2573 | 3000 | 1.6166          | 49.9246 | 25.9232 | 41.8637 | 46.2275   | 22.4242 |
| 1.5714        | 3.8002 | 3500 | 1.5817          | 49.4227 | 25.4827 | 41.4224 | 46.0025   | 22.3423 |
| 1.3661        | 4.3431 | 4000 | 1.6149          | 50.1332 | 26.002  | 42.11   | 46.4761   | 22.1222 |
| 1.4259        | 4.8860 | 4500 | 1.5912          | 50.0097 | 25.9731 | 41.7903 | 46.3722   | 22.2347 |
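
The ROUGE columns above can be computed with the `evaluate` library; the exact evaluation script is not documented here, so the snippet below is a sketch with a made-up prediction/reference pair:

```python
import evaluate

# Illustrative only: scores one generated summary against one reference.
rouge = evaluate.load("rouge")
predictions = ["Amanda baked cookies and will bring Jerry some tomorrow."]
references = ["Amanda baked cookies and will bring some to Jerry tomorrow."]
scores = rouge.compute(predictions=predictions, references=references, use_stemmer=True)
# Scale to percentages to match the table above (rouge1, rouge2, rougeL, rougeLsum).
print({k: round(v * 100, 4) for k, v in scores.items()})
```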

Framework versions

  • Transformers 4.41.2
  • Pytorch 2.1.0+cu121
  • Datasets 2.20.0
  • Tokenizers 0.19.1
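
To approximate this environment: `pip install transformers==4.41.2 datasets==2.20.0 tokenizers==0.19.1`, alongside PyTorch 2.1.0 built for CUDA 12.1.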
Model size

  • 619M parameters (Safetensors, F32)
