long-t5-tglobal-xl-16384-booksci-summary-plos-10k

This model is a fine-tuned version of pszemraj/long-t5-tglobal-xl-16384-book-summary on the pszemraj/scientific_lay_summarisation-plos-norm dataset. It achieves the following results on the evaluation set:

Loss: 1.5041
Rouge1: 44.3203
Rouge2: 11.0576
Rougel: 22.7584
Rougelsum: 40.1462
Gen Len: 256.66
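
For readers unfamiliar with the ROUGE rows above, the following is a minimal sketch of how an n-gram overlap F1 score can be computed. It is a simplified illustration (lowercasing and whitespace tokenization only, no stemming) and not the scorer that produced the numbers in this card, which is not stated here. The function names are hypothetical, and the 0 to 100 scaling is an assumption based on how the values above appear to be reported.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count all contiguous n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n_f1(reference, candidate, n=1):
    """Simplified ROUGE-N F1 on a 0-100 scale (assumed scale, see lead-in)."""
    ref = ngrams(reference.lower().split(), n)
    cand = ngrams(candidate.lower().split(), n)
    # Clipped overlap: each n-gram counts at most as often as it occurs in both.
    overlap = sum((ref & cand).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 100 * 2 * precision * recall / (precision + recall)


reference = "the cat sat on the mat"
candidate = "the cat lay on the mat"
rouge1 = rouge_n_f1(reference, candidate, n=1)
rouge2 = rouge_n_f1(reference, candidate, n=2)
```

ROUGE-L and ROUGE-Lsum differ in that they score the longest common subsequence rather than fixed-size n-grams, so they are not covered by this sketch.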

Model description

Another test of further fine-tuning booksum-based models. This one was fine-tuned on about 10k input examples from the PLOS subset of the lay-summaries dataset, which makes it roughly comparable to the companion checkpoint fine-tuned on the ELIFE subset for two epochs (also around 10k examples).
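
A minimal usage sketch, assuming the checkpoint is published on the Hugging Face Hub under the ID below and loads with the standard `transformers` seq2seq classes. The input limit is read off the "16384" in the model name, and the generation settings (`max_new_tokens`, `num_beams`) are illustrative assumptions, not values recommended by this card.

```python
MODEL_ID = "pszemraj/long-t5-tglobal-xl-16384-booksci-summary-plos-10k"
MAX_INPUT_TOKENS = 16384  # assumption: inferred from the model name


def summarize(text, max_new_tokens=512, num_beams=4):
    """Generate a lay summary for one document (illustrative defaults)."""
    # Imported lazily: the XL checkpoint is large, so nothing is downloaded
    # until this function is actually called.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_ID)

    # Truncate anything beyond the assumed maximum input length.
    inputs = tokenizer(
        text,
        return_tensors="pt",
        truncation=True,
        max_length=MAX_INPUT_TOKENS,
    )
    output_ids = model.generate(
        **inputs,
        max_new_tokens=max_new_tokens,
        num_beams=num_beams,
    )
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `summarize(article_text)` with the full text of a scientific article should return a single summary string. In practice an XL model with long inputs will likely need a GPU with substantial memory.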

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 3e-05
train_batch_size: 1
eval_batch_size: 1
seed: 165
gradient_accumulation_steps: 8
total_train_batch_size: 8
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: cosine
lr_scheduler_warmup_ratio: 0.02
num_epochs: 1.0
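
To make two of the settings above concrete, here is a small sketch of how the total batch size follows from the per-device batch size and gradient accumulation, and of the general shape of a cosine schedule with linear warmup. It assumes a single training device (consistent with 1 x 8 = 8) and uses the common warmup-then-cosine formula. The exact scheduler implementation used in training is not stated in this card, and the `total_steps` value in the usage lines is hypothetical.

```python
import math

LEARNING_RATE = 3e-05
WARMUP_RATIO = 0.02
TRAIN_BATCH_SIZE = 1
GRAD_ACCUM_STEPS = 8


def effective_batch_size(per_device=TRAIN_BATCH_SIZE,
                         accum_steps=GRAD_ACCUM_STEPS,
                         num_devices=1):
    """Examples consumed per optimizer step (num_devices=1 is an assumption)."""
    return per_device * accum_steps * num_devices


def lr_at_step(step, total_steps,
               peak_lr=LEARNING_RATE, warmup_ratio=WARMUP_RATIO):
    """Linear warmup to peak_lr, then cosine decay towards zero."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


total_steps = 1000  # hypothetical run length, for illustration only
batch = effective_batch_size()
schedule = [lr_at_step(s, total_steps) for s in range(total_steps + 1)]
```

With these settings the learning rate climbs during the first 2% of steps, peaks at 3e-05, and then decays smoothly for the remainder of the run.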

Training results

Training Loss	Epoch	Step	Validation Loss	Rouge1	Rouge2	Rougel	Rougelsum	Gen Len
1.7715	0.28	350	1.5310	43.4729	10.4616	22.1928	39.505	260.87
1.9307	0.56	700	1.5102	44.1634	10.9336	22.3896	40.2939	253.58
1.2981	0.84	1050	1.5046	44.2728	10.8455	22.4122	40.3019	261.29
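
The "about 10k examples" figure in the model description can be cross-checked against the table: dividing each step count by its epoch fraction gives the steps per epoch, and multiplying by the total train batch size of 8 gives the examples per epoch. The sketch below does that arithmetic. The epoch values are rounded to two decimals in the table, so the result is an estimate.

```python
TOTAL_TRAIN_BATCH_SIZE = 8

# (epoch fraction, optimizer step) pairs copied from the results table.
checkpoints = [(0.28, 350), (0.56, 700), (0.84, 1050)]


def examples_per_epoch(epoch_fraction, step, batch_size=TOTAL_TRAIN_BATCH_SIZE):
    """Estimate the number of training examples in one full epoch."""
    steps_per_epoch = step / epoch_fraction
    return steps_per_epoch * batch_size


estimates = [round(examples_per_epoch(e, s)) for e, s in checkpoints]
```

All three rows give an estimate of roughly 10,000 examples per epoch, which agrees with the description.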
