
## Hyperparameters

- `learning_rate=2e-5`
- `per_device_train_batch_size=14`
- `per_device_eval_batch_size=14`
- `weight_decay=0.01`
- `save_total_limit=3`
- `num_train_epochs=3`
- `predict_with_generate=True`
- `fp16=True`
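
The training script behind this checkpoint is not published. As a minimal sketch, the settings above would typically be passed to the Hugging Face `Seq2SeqTrainer` as shown below; the base checkpoint (`facebook/bart-large`, consistent with the ~406M parameter count), `output_dir`, evaluation strategy, and dataset variables are assumptions.

```python
# Minimal sketch, not the author's published script: assumes facebook/bart-large
# as the base checkpoint and per-epoch evaluation.
from transformers import (
    AutoModelForSeq2SeqLM,
    AutoTokenizer,
    DataCollatorForSeq2Seq,
    Seq2SeqTrainer,
    Seq2SeqTrainingArguments,
)

tokenizer = AutoTokenizer.from_pretrained("facebook/bart-large")
model = AutoModelForSeq2SeqLM.from_pretrained("facebook/bart-large")

training_args = Seq2SeqTrainingArguments(
    output_dir="Bart_MedPaper_model",  # assumed output directory
    evaluation_strategy="epoch",       # assumed; the metrics below are reported per epoch
    learning_rate=2e-5,
    per_device_train_batch_size=14,
    per_device_eval_batch_size=14,
    weight_decay=0.01,
    save_total_limit=3,
    num_train_epochs=3,
    predict_with_generate=True,
    fp16=True,
)

trainer = Seq2SeqTrainer(
    model=model,
    args=training_args,
    train_dataset=tokenized_dataset["train"],      # placeholder: the tokenized dataset is not published
    eval_dataset=tokenized_dataset["validation"],  # placeholder
    tokenizer=tokenizer,
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
)
```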

## Training Output

- `global_step`: 4248
- `train_loss`: 2.172659089111788
- `train_runtime`: 3371.7912 s
- `train_samples_per_second`: 17.633
- `train_steps_per_second`: 1.26
- `total_flos`: 1.2884303701396685e+17
- `epoch`: 3.0
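
These figures are the fields of the `TrainOutput` named tuple returned by `trainer.train()`. A sketch, reusing the hypothetical `trainer` object from the snippet above:

```python
# Sketch: producing and persisting the numbers reported above.
train_result = trainer.train()

print(train_result.global_step)     # 4248 optimizer steps over 3 epochs
print(train_result.training_loss)   # mean training loss, ~2.17

trainer.log_metrics("train", train_result.metrics)  # runtime, samples/s, total FLOs, ...
trainer.save_metrics("train", train_result.metrics)
trainer.save_model()                                # write the final checkpoint to output_dir
```

The throughput numbers imply about 14 samples per optimizer step (17.633 / 1.26), matching the per-device batch size, and roughly 19.8k training examples per epoch (4248 steps / 3 epochs × 14).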

## Training Results

| Epoch | Training Loss | Validation Loss | ROUGE-1 | ROUGE-2 | ROUGE-L | ROUGE-Lsum | BLEU | Gen Len |
|------:|--------------:|----------------:|--------:|--------:|--------:|-----------:|-----:|--------:|
| 1 | 2.318000 | 2.079500 | 0.128100 | 0.046700 | 0.104200 | 0.104200 | 0.001100 | 20.0 |
| 2 | 2.130000 | 2.043523 | 0.130200 | 0.047400 | 0.105400 | 0.105300 | 0.001300 | 20.0 |
| 3 | 2.047100 | 2.034664 | 0.130700 | 0.047800 | 0.105900 | 0.105900 | 0.001300 | 20.0 |
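
The evaluation code is likewise not published. A plausible `compute_metrics` that yields these ROUGE, BLEU, and Gen Len columns via the `evaluate` library is sketched below; it reuses the `tokenizer` from the first sketch, and the choice of metric implementations is an assumption.

```python
# Plausible compute_metrics for the columns above; an assumption, not the
# author's published code.
import numpy as np
import evaluate

rouge = evaluate.load("rouge")
bleu = evaluate.load("bleu")

def compute_metrics(eval_pred):
    predictions, labels = eval_pred
    decoded_preds = tokenizer.batch_decode(predictions, skip_special_tokens=True)

    # Labels use -100 for ignored positions; swap in the pad token before decoding.
    labels = np.where(labels != -100, labels, tokenizer.pad_token_id)
    decoded_labels = tokenizer.batch_decode(labels, skip_special_tokens=True)

    result = rouge.compute(predictions=decoded_preds, references=decoded_labels)
    result["bleu"] = bleu.compute(
        predictions=decoded_preds,
        references=[[ref] for ref in decoded_labels],
    )["bleu"]

    # Mean length of the generated sequences ("Gen Len").
    pred_lens = [np.count_nonzero(p != tokenizer.pad_token_id) for p in predictions]
    result["gen_len"] = float(np.mean(pred_lens))
    return {k: round(v, 4) for k, v in result.items()}
```

Gen Len is pinned at 20.0 in every epoch, which suggests generation during evaluation was left at the default `max_length` of 20 tokens; raising `generation_max_length` would likely change both Gen Len and the reported scores.
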
Model size: 406M parameters (F32, safetensors format)
