long-t5-tglobal-xl-sci-simplify-elife

This model is a fine-tuned version of google/long-t5-tglobal-xl on the pszemraj/scientific_lay_summarisation-elife-norm dataset. It achieves the following results on the evaluation set:

Loss: 1.6666
Rouge1: 47.1446
Rouge2: 14.2158
Rougel: 23.3524
Rougelsum: 44.6063
Gen Len: 431.22

Model description

More information needed

Intended uses & limitations

More information needed
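
As a starting point, the checkpoint can presumably be used like any other Long-T5 summarization model. The sketch below is a minimal example, not the author's documented usage: it assumes the model is published under the Hub id pszemraj/long-t5-tglobal-xl-sci-simplify-elife, that the `transformers` library is installed, and that a 1024-token generation cap (mirroring the training output limit) is a sensible default. The names `build_summarizer`, `simplify`, and `GENERATION_KWARGS` are illustrative.

```python
MODEL_ID = "pszemraj/long-t5-tglobal-xl-sci-simplify-elife"  # assumed Hub id

# Assumed generation settings; max_length mirrors the 1024-token output cap used in training.
GENERATION_KWARGS = {"max_length": 1024, "truncation": True}


def build_summarizer(model_id=MODEL_ID):
    """Create a summarization pipeline (downloads the XL checkpoint on first use)."""
    # Imported lazily so that simplify() below has no hard dependency on transformers.
    from transformers import pipeline

    return pipeline("summarization", model=model_id)


def simplify(text, summarizer, **overrides):
    """Return the generated lay summary for a single article."""
    kwargs = {**GENERATION_KWARGS, **overrides}
    return summarizer(text, **kwargs)[0]["summary_text"]
```

A typical call would be `simplify(article_text, build_summarizer())`. Because this is an XL checkpoint fed very long inputs, memory requirements are likely to be substantial.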

Training and evaluation data

The pszemraj/scientific_lay_summarisation-elife-norm dataset. Inputs are truncated to 16384 tokens and target summaries are truncated to 1024 tokens.
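
The truncation described above could be applied with a preprocessing function along the following lines. This is a sketch under assumptions rather than the actual training script: the column names `article` and `summary` are assumed, `preprocess` is a hypothetical helper, and `tokenizer` is expected to be a Hugging Face tokenizer (for example one loaded with `AutoTokenizer.from_pretrained("google/long-t5-tglobal-xl")`).

```python
MAX_INPUT_TOKENS = 16384   # input limit stated in this card
MAX_TARGET_TOKENS = 1024   # output limit stated in this card


def preprocess(example, tokenizer, source_key="article", target_key="summary"):
    """Tokenize one example, truncating the input and the target summary.

    source_key and target_key are assumed column names, not confirmed by the card.
    """
    model_inputs = tokenizer(
        example[source_key],
        max_length=MAX_INPUT_TOKENS,
        truncation=True,
    )
    labels = tokenizer(
        text_target=example[target_key],
        max_length=MAX_TARGET_TOKENS,
        truncation=True,
    )
    # Seq2seq training expects the target token ids under the "labels" key.
    model_inputs["labels"] = labels["input_ids"]
    return model_inputs
```

Such a function would normally be passed to `dataset.map(...)` with the tokenizer bound as an extra argument.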

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 8e-05
train_batch_size: 1
eval_batch_size: 1
seed: 6963
gradient_accumulation_steps: 8
total_train_batch_size: 8
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: cosine
lr_scheduler_warmup_ratio: 0.02
num_epochs: 2.0
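
To make the interaction between these values concrete, the sketch below reproduces the effective batch size and an approximate learning-rate curve. It assumes a single device, takes the total of 1086 optimizer steps from the training results, and assumes the common "linear warmup, then half-cosine decay to zero" shape with the warmup length rounded up. The function names are illustrative and the exact schedule used in training may differ in detail.

```python
import math

LEARNING_RATE = 8e-05
TRAIN_BATCH_SIZE = 1
GRAD_ACCUM_STEPS = 8
WARMUP_RATIO = 0.02
TOTAL_STEPS = 1086  # final optimizer step reported in the training results


def effective_batch_size(per_device, accum_steps, num_devices=1):
    """Examples contributing to each optimizer update (single device assumed)."""
    return per_device * accum_steps * num_devices


def warmup_steps(total_steps, ratio):
    """Number of warmup steps, rounded up (assumed rounding)."""
    return math.ceil(total_steps * ratio)


def lr_at(step, total_steps=TOTAL_STEPS, peak=LEARNING_RATE, ratio=WARMUP_RATIO):
    """Approximate learning rate at a given optimizer step."""
    warm = warmup_steps(total_steps, ratio)
    if step < warm:
        # Linear ramp from 0 up to the peak learning rate.
        return peak * step / max(1, warm)
    # Half-cosine decay from the peak down to 0 over the remaining steps.
    progress = (step - warm) / max(1, total_steps - warm)
    return peak * 0.5 * (1.0 + math.cos(math.pi * progress))
```

Under these assumptions, `effective_batch_size(TRAIN_BATCH_SIZE, GRAD_ACCUM_STEPS)` gives 8, matching total_train_batch_size, and the warmup covers 22 of the 1086 steps before the rate starts decaying from 8e-05.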

Training results

Training Loss	Epoch	Step	Validation Loss	Rouge1	Rouge2	Rougel	Rougelsum	Gen Len
1.7959	1.0	543	1.6770	44.4187	12.6752	22.4669	41.944	456.33
1.7578	2.0	1086	1.6666	47.1446	14.2158	23.3524	44.6063	431.22
