
Hyperparameters

learning_rate=2e-5
per_device_train_batch_size=14
per_device_eval_batch_size=14
weight_decay=0.01
save_total_limit=3
num_train_epochs=3
predict_with_generate=True
fp16=True
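
In code, these settings correspond to a Seq2SeqTrainingArguments configuration along the following lines (a minimal sketch; the output directory and the per-epoch evaluation/save/logging strategies are assumptions, inferred from the per-epoch results table below):

```python
from transformers import Seq2SeqTrainingArguments

# Hyperparameters listed above; output_dir and the strategies are assumed.
training_args = Seq2SeqTrainingArguments(
    output_dir="pegasus-govreport",      # assumed output directory
    learning_rate=2e-5,
    per_device_train_batch_size=14,
    per_device_eval_batch_size=14,
    weight_decay=0.01,
    save_total_limit=3,
    num_train_epochs=3,
    predict_with_generate=True,          # generate summaries during evaluation
    fp16=True,                           # mixed-precision training
    evaluation_strategy="epoch",         # assumed: matches the per-epoch results table
    save_strategy="epoch",               # assumed
    logging_strategy="epoch",            # assumed
)
```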

Training Output

global_step=3003,
training_loss=2.0113779983241042,
metrics={'train_runtime': 12268.4376,
'train_samples_per_second': 3.427,
'train_steps_per_second': 0.245,
'total_flos': 1.2147019450889011e+17,
'train_loss': 2.0113779983241042,
'epoch': 3.0}
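
The output above is the TrainOutput returned by trainer.train(). A sketch of the corresponding trainer setup, assuming google/pegasus-large as the base checkpoint (consistent with the ~571M-parameter model size) and an already tokenized GovReport dataset named tokenized_dataset:

```python
from transformers import (
    AutoTokenizer,
    AutoModelForSeq2SeqLM,
    DataCollatorForSeq2Seq,
    Seq2SeqTrainer,
)

checkpoint = "google/pegasus-large"            # assumed base checkpoint
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSeq2SeqLM.from_pretrained(checkpoint)
data_collator = DataCollatorForSeq2Seq(tokenizer, model=model)

trainer = Seq2SeqTrainer(
    model=model,
    args=training_args,                        # the Seq2SeqTrainingArguments above
    train_dataset=tokenized_dataset["train"],      # assumed dataset splits
    eval_dataset=tokenized_dataset["validation"],
    tokenizer=tokenizer,
    data_collator=data_collator,
    compute_metrics=compute_metrics,           # see the metrics sketch below
)

train_output = trainer.train()  # yields global_step, training_loss, and the metrics shown above
```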

Training Results

| Epoch | Training Loss | Validation Loss | Rouge1   | Rouge2   | RougeL   | RougeLsum | Bleu     | Gen Len    |
|-------|---------------|-----------------|----------|----------|----------|-----------|----------|------------|
| 1     | 2.035800      | 1.906599        | 0.365400 | 0.150500 | 0.243200 | 0.243500  | 0.366300 | 227.230300 |
| 2     | 1.976100      | 1.878923        | 0.393700 | 0.167800 | 0.263500 | 0.263800  | 0.423600 | 193.114200 |
| 3     | 1.956800      | 1.871454        | 0.409300 | 0.175100 | 0.273400 | 0.273600  | 0.457000 | 172.294500 |
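
The ROUGE, BLEU, and generation-length columns can be produced by a compute_metrics function along these lines (a sketch using the evaluate library; the exact implementation behind the numbers above is not shown in the card):

```python
import numpy as np
import evaluate

rouge = evaluate.load("rouge")
bleu = evaluate.load("bleu")

def compute_metrics(eval_pred):
    predictions, labels = eval_pred
    # Replace label padding (-100) before decoding.
    labels = np.where(labels != -100, labels, tokenizer.pad_token_id)
    decoded_preds = tokenizer.batch_decode(predictions, skip_special_tokens=True)
    decoded_labels = tokenizer.batch_decode(labels, skip_special_tokens=True)

    # ROUGE-1/2/L/Lsum, reported as fractions (matching the table above).
    result = rouge.compute(predictions=decoded_preds, references=decoded_labels)

    # Corpus BLEU against single references.
    bleu_result = bleu.compute(
        predictions=decoded_preds,
        references=[[label] for label in decoded_labels],
    )
    result["bleu"] = bleu_result["bleu"]

    # Average generated length in tokens (the "Gen Len" column).
    prediction_lens = [
        np.count_nonzero(pred != tokenizer.pad_token_id) for pred in predictions
    ]
    result["gen_len"] = np.mean(prediction_lens)
    return {k: round(float(v), 4) for k, v in result.items()}
```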