
Hyperparameters

learning_rate=2e-5
per_device_train_batch_size=14
per_device_eval_batch_size=14
weight_decay=0.01
save_total_limit=3
num_train_epochs=3
predict_with_generate=True
fp16=True
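The hyperparameters above correspond one-to-one to Hugging Face `Seq2SeqTrainingArguments` fields. A minimal sketch of how they would be passed in (assumption: the standard `transformers` Seq2SeqTrainer setup; `output_dir` is an illustrative placeholder, since the actual training script is not shown here):

```python
# Hyperparameters from this card, collected as keyword arguments.
# Assumption: these feed a Hugging Face Seq2SeqTrainingArguments;
# the exact training script is not part of this card.
training_kwargs = {
    "learning_rate": 2e-5,
    "per_device_train_batch_size": 14,
    "per_device_eval_batch_size": 14,
    "weight_decay": 0.01,
    "save_total_limit": 3,        # keep at most 3 checkpoints on disk
    "num_train_epochs": 3,
    "predict_with_generate": True, # generate summaries during evaluation
    "fp16": True,                  # mixed-precision training
}

# With transformers installed, usage would typically look like:
# from transformers import Seq2SeqTrainingArguments
# args = Seq2SeqTrainingArguments(output_dir="out", **training_kwargs)
```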

Training Output

global_step = 7710
train_loss = 2.1297
train_runtime = 6059.0418 s
train_samples_per_second = 17.813
train_steps_per_second = 1.272
total_flos = 2.339e17
epoch = 3.0
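The reported throughput numbers are internally consistent, which can be checked with a little arithmetic (all values taken from the training output above; the dataset-size figure is only an estimate implied by them, not a stated fact):

```python
# Sanity check: steps/sec * runtime should roughly reproduce global_step,
# and samples/sec * runtime / epochs gives the implied examples per epoch.
train_runtime = 6059.0418            # seconds
train_steps_per_second = 1.272
train_samples_per_second = 17.813
num_epochs = 3

approx_steps = train_runtime * train_steps_per_second
# ~7707, close to the reported global_step of 7710

examples_per_epoch = train_runtime * train_samples_per_second / num_epochs
# ~35,977 examples per epoch implied by the throughput figures
```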

Training Results

Epoch  Training Loss  Validation Loss  Rouge1  Rouge2  RougeL  RougeLsum  Bleu    Gen Len
1      2.2231         2.0386           0.1474  0.0548  0.1135  0.1135     0.0014  20.0
2      2.0781         2.0096           0.1529  0.0578  0.1170  0.1170     0.0016  20.0
3      1.9890         2.0060           0.1529  0.0573  0.1167  0.1167     0.0017  20.0
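For readers unfamiliar with the metrics in the table, Rouge1 is an F1 score over unigram overlap between the generated and reference summaries. A simplified illustration of the idea (real evaluations typically use the `rouge_score`/`evaluate` packages, which add stemming and other normalization; this bare-bones version is only for intuition):

```python
from collections import Counter

def rouge1_f1(candidate: str, reference: str) -> float:
    """Unigram-overlap ROUGE-1 F1, without stemming or tokenizer tricks.

    Illustrative only; not the exact metric implementation used above.
    """
    cand = candidate.lower().split()
    ref = reference.lower().split()
    # Clipped unigram overlap: each reference token is matched at most once.
    overlap = sum((Counter(cand) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(cand)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)
```

A perfect match scores 1.0; disjoint texts score 0.0.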
Model size: 406M params (Safetensors, F32 tensors)

Dataset used to train usakha/Bart_multiNews_model