
Hyperparameters

learning_rate=2e-5
per_device_train_batch_size=14
per_device_eval_batch_size=14
weight_decay=0.01
save_total_limit=3
num_train_epochs=3
predict_with_generate=True
fp16=True
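
The values above map directly onto Hugging Face `Seq2SeqTrainingArguments`. Below is a minimal sketch of that configuration; the output directory name and the per-epoch evaluation/save strategy are assumptions not stated on this card.

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="pegasus-medpaper",      # assumed name, not from the card
    learning_rate=2e-5,
    per_device_train_batch_size=14,
    per_device_eval_batch_size=14,
    weight_decay=0.01,
    save_total_limit=3,
    num_train_epochs=3,
    predict_with_generate=True,         # needed for ROUGE/BLEU on generated text
    fp16=True,
    evaluation_strategy="epoch",        # assumed: per-epoch eval matches the results table
    save_strategy="epoch",              # assumed
)
```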

Training Output

global_step = 4248
training_loss = 2.4160910424988598
train_runtime = 14565.4519 s
train_samples_per_second = 4.082
train_steps_per_second = 0.292
total_flos = 1.7179021728232243e+17
epoch = 3.0
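
These figures are the return value of `Trainer.train()`, which yields a `TrainOutput` with `global_step`, `training_loss`, and a metrics dict. A short sketch, assuming `trainer` is the configured `Seq2SeqTrainer` (not shown on this card):

```python
# `trainer` is an assumed, already-configured Seq2SeqTrainer instance.
train_result = trainer.train()
print(train_result.global_step)    # 4248
print(train_result.training_loss)  # 2.4160910424988598
trainer.log_metrics("train", train_result.metrics)
trainer.save_metrics("train", train_result.metrics)
```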

Training Results

| Epoch | Training Loss | Validation Loss | ROUGE-1 | ROUGE-2 | ROUGE-L | ROUGE-Lsum | BLEU | Gen Len |
|------:|--------------:|----------------:|--------:|--------:|--------:|-----------:|-----:|--------:|
| 1 | 2.467100 | 2.303269 | 0.410900 | 0.136200 | 0.235900 | 0.235900 | 0.465700 | 182.332800 |
| 2 | 2.386700 | 2.281062 | 0.426300 | 0.142300 | 0.246800 | 0.246700 | 0.525200 | 143.990900 |
| 3 | 2.362000 | 2.274931 | 0.428400 | 0.143800 | 0.248300 | 0.248200 | 0.532000 | 139.585900 |
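
The card does not include the metric code. A plausible `compute_metrics` sketch using the `evaluate` library is shown below; the function name, the in-scope `tokenizer` variable, and the rounding are assumptions, but a function of this shape would produce the ROUGE, BLEU, and Gen Len columns above.

```python
import numpy as np
import evaluate

rouge = evaluate.load("rouge")
bleu = evaluate.load("bleu")

def compute_metrics(eval_pred):
    predictions, labels = eval_pred
    # Replace label padding (-100) before decoding; `tokenizer` is the model's
    # tokenizer and is assumed to be in scope.
    labels = np.where(labels != -100, labels, tokenizer.pad_token_id)
    decoded_preds = tokenizer.batch_decode(predictions, skip_special_tokens=True)
    decoded_labels = tokenizer.batch_decode(labels, skip_special_tokens=True)

    # ROUGE-1/2/L/Lsum as fractions, matching the table above.
    result = rouge.compute(predictions=decoded_preds, references=decoded_labels)
    result["bleu"] = bleu.compute(
        predictions=decoded_preds,
        references=[[ref] for ref in decoded_labels],
    )["bleu"]
    # Average generated length in tokens (the "Gen Len" column).
    result["gen_len"] = np.mean(
        [np.count_nonzero(pred != tokenizer.pad_token_id) for pred in predictions]
    )
    return {k: round(float(v), 4) for k, v in result.items()}
```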
Model Details

Model size: 571M params (Safetensors)
Tensor type: F32
