
# long-t5-scisumm-accelerate-v2

This model is a fine-tuned version of [pszemraj/long-t5-tglobal-base-sci-simplify](https://huggingface.co/pszemraj/long-t5-tglobal-base-sci-simplify) on an unknown dataset. It achieves the following results on the evaluation set:

- Loss: 1.0496
- Rouge1: 39.8436
- Rouge2: 14.763
- Rougel: 25.6676
- Rougelsum: 36.482
- Gen Len: 103.02
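
A minimal inference sketch with 🤗 Transformers is below. The repo id is a placeholder (this card does not state where the checkpoint is published), and the generation parameters are illustrative assumptions:

```python
# Minimal inference sketch. The repo id below is a placeholder --
# substitute the actual Hub path where this checkpoint lives.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_id = "your-username/long-t5-scisumm-accelerate-v2"  # hypothetical repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

text = "..."  # a long scientific document to summarize
inputs = tokenizer(text, return_tensors="pt", truncation=True)

# Beam count and output length are illustrative, not from the card.
summary_ids = model.generate(**inputs, max_new_tokens=128, num_beams=4)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```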

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (see the configuration sketch after this list):

- learning_rate: 4e-05
- train_batch_size: 2
- eval_batch_size: 1
- seed: 42
- gradient_accumulation_steps: 16
- total_train_batch_size: 32
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.01
- num_epochs: 3
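
For concreteness, these settings map onto `transformers.Seq2SeqTrainingArguments` roughly as follows. This is a reconstruction, not the author's script; values not listed above (e.g. `output_dir`, `predict_with_generate`) are assumptions:

```python
# Rough reconstruction of the training configuration.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="long-t5-scisumm-accelerate-v2",  # assumed
    learning_rate=4e-5,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=1,
    seed=42,
    gradient_accumulation_steps=16,  # effective train batch size: 2 * 16 = 32
    lr_scheduler_type="cosine",
    warmup_ratio=0.01,
    num_train_epochs=3,
    # The Adam betas/epsilon listed above are the transformers defaults.
    predict_with_generate=True,  # assumed, since ROUGE and Gen Len are reported
)
```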

### Training results

| Training Loss | Epoch  | Step | Validation Loss | Rouge1  | Rouge2  | Rougel  | Rougelsum | Gen Len |
|:-------------:|:------:|:----:|:---------------:|:-------:|:-------:|:-------:|:---------:|:-------:|
| 1.0771        | 0.9868 | 28   | 1.0594          | 39.2592 | 13.6097 | 25.1508 | 35.285    | 111.22  |
| 1.0193        | 1.9736 | 56   | 1.0530          | 39.1521 | 14.855  | 25.4706 | 35.3713   | 110.14  |
| 0.9545        | 2.9604 | 84   | 1.0496          | 39.8436 | 14.763  | 25.6676 | 36.482    | 103.02  |
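
The ROUGE columns follow the naming of the 🤗 `evaluate` rouge metric (reported here scaled by 100). A minimal sketch of how such scores are computed, with placeholder strings:

```python
# Sketch of computing ROUGE with the `evaluate` library; the
# predictions and references here are placeholders.
import evaluate

rouge = evaluate.load("rouge")
scores = rouge.compute(
    predictions=["the model's generated summary"],
    references=["the reference summary"],
)
print(scores)  # dict with keys rouge1, rouge2, rougeL, rougeLsum
```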

### Framework versions

- Transformers 4.41.0.dev0
- Pytorch 2.1.2
- Datasets 2.1.0
- Tokenizers 0.19.1