
## Hyperparameters

learning_rate=2e-5
per_device_train_batch_size=14
per_device_eval_batch_size=14
weight_decay=0.01
save_total_limit=3
num_train_epochs=3
predict_with_generate=True
fp16=True
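
These values map directly onto `Seq2SeqTrainingArguments` from the `transformers` library. A minimal sketch is shown below; `output_dir` and `evaluation_strategy` are assumptions and are not reported on this card.

```python
from transformers import Seq2SeqTrainingArguments

# Sketch only: output_dir and evaluation_strategy are assumed, not taken from this card.
training_args = Seq2SeqTrainingArguments(
    output_dir="Bart_GovReport_model",
    evaluation_strategy="epoch",
    learning_rate=2e-5,
    per_device_train_batch_size=14,
    per_device_eval_batch_size=14,
    weight_decay=0.01,
    save_total_limit=3,
    num_train_epochs=3,
    predict_with_generate=True,
    fp16=True,
)
```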

## Training Output

global_step=3003
train_loss=1.8524150695953217
train_runtime=2319.7329 (seconds)
train_samples_per_second=18.122
train_steps_per_second=1.295
total_flos=9.110291036818637e+16
epoch=3.0
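
These figures are the contents of the `TrainOutput` returned by `Seq2SeqTrainer.train()`. The sketch below shows the surrounding training call; the `facebook/bart-large` base checkpoint (consistent with the 406M-parameter size reported further down) and the `tokenized_dataset` splits are assumptions, not details stated on this card.

```python
from transformers import (
    AutoModelForSeq2SeqLM,
    AutoTokenizer,
    DataCollatorForSeq2Seq,
    Seq2SeqTrainer,
)

# Assumed base checkpoint; bart-large matches the 406M-parameter size reported below.
checkpoint = "facebook/bart-large"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSeq2SeqLM.from_pretrained(checkpoint)

trainer = Seq2SeqTrainer(
    model=model,
    args=training_args,                           # from the hyperparameter sketch above
    train_dataset=tokenized_dataset["train"],     # assumed preprocessed dataset splits
    eval_dataset=tokenized_dataset["validation"],
    tokenizer=tokenizer,
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
    compute_metrics=compute_metrics,              # see the metrics sketch below
)

# train() returns a TrainOutput whose .metrics dict holds the runtime,
# throughput, FLOPs, and loss figures shown above.
train_result = trainer.train()
trainer.log_metrics("train", train_result.metrics)
trainer.save_metrics("train", train_result.metrics)
trainer.save_model()
```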

## Training Results

| Epoch | Training Loss | Validation Loss | Rouge1 | Rouge2 | RougeL | RougeLsum | Bleu | Gen Len |
|-------|---------------|-----------------|--------|--------|--------|-----------|------|---------|
| 1 | 1.969100 | 1.756651 | 0.159100 | 0.088300 | 0.138800 | 0.138900 | 0.001600 | 20.000000 |
| 2 | 1.794000 | 1.699691 | 0.158500 | 0.090300 | 0.139500 | 0.139600 | 0.001400 | 20.000000 |
| 3 | 1.713700 | 1.687554 | 0.162700 | 0.091900 | 0.141800 | 0.141900 | 0.001660 | 20.000000 |
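
The Rouge, Bleu, and Gen Len columns are the kind of values produced by a `compute_metrics` callback built on the `evaluate` library when `predict_with_generate=True`. The sketch below is one such implementation, assuming the tokenizer from the trainer sketch above; it illustrates how these columns can be computed and is not necessarily the exact function used for this card.

```python
import numpy as np
import evaluate

rouge = evaluate.load("rouge")
bleu = evaluate.load("bleu")

def compute_metrics(eval_pred):
    preds, labels = eval_pred
    if isinstance(preds, tuple):
        preds = preds[0]
    # Label positions set to -100 are ignored by the loss; restore pad ids before decoding.
    labels = np.where(labels != -100, labels, tokenizer.pad_token_id)
    decoded_preds = tokenizer.batch_decode(preds, skip_special_tokens=True)
    decoded_labels = tokenizer.batch_decode(labels, skip_special_tokens=True)

    result = rouge.compute(predictions=decoded_preds, references=decoded_labels)
    result["bleu"] = bleu.compute(
        predictions=decoded_preds,
        references=[[ref] for ref in decoded_labels],
    )["bleu"]
    # "Gen Len" is the mean number of non-padding tokens in the generated ids.
    result["gen_len"] = np.mean(
        [np.count_nonzero(p != tokenizer.pad_token_id) for p in preds]
    )
    return {k: round(float(v), 4) for k, v in result.items()}
```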
Model size: 406M params · Tensor type: F32 · Format: Safetensors
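
Usage sketch (the generation parameters below are illustrative, not values taken from this card):

```python
from transformers import pipeline

summarizer = pipeline("summarization", model="usakha/Bart_GovReport_model")

report_text = "..."  # a long government report passage
summary = summarizer(report_text, max_length=128, min_length=32, truncation=True)
print(summary[0]["summary_text"])
```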

## Dataset used to train usakha/Bart_GovReport_model