Hyperparameters
learning_rate=2e-5
per_device_train_batch_size=14
per_device_eval_batch_size=14
weight_decay=0.01
save_total_limit=3
num_train_epochs=3
predict_with_generate=True
fp16=True
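As a rough sketch, the values above could be passed to `Seq2SeqTrainingArguments` from the `transformers` library along these lines. The `output_dir` value and the `build_training_args` helper are hypothetical and do not come from this card. Note that `fp16=True` typically requires a CUDA-capable GPU.

```python
# Hyperparameters exactly as listed above.
TRAINING_KWARGS = {
    "learning_rate": 2e-5,
    "per_device_train_batch_size": 14,
    "per_device_eval_batch_size": 14,
    "weight_decay": 0.01,
    "save_total_limit": 3,
    "num_train_epochs": 3,
    "predict_with_generate": True,
    "fp16": True,
}


def build_training_args(output_dir="./results"):
    """Hypothetical helper: output_dir is an assumption, not from the card."""
    # Imported lazily so the dict above can be inspected
    # even where transformers is not installed.
    from transformers import Seq2SeqTrainingArguments

    return Seq2SeqTrainingArguments(output_dir=output_dir, **TRAINING_KWARGS)
```

Keeping the values in a plain dict makes it easy to log or compare them against another run before constructing the arguments object.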
Training Output
global_step=3003,
training_loss=2.5178213735600132,
metrics={'train_runtime': 8703.174,
'train_samples_per_second': 4.83,
'train_steps_per_second': 0.345,
'total_flos': 9.272950245870797e+16,
'train_loss': 2.5178213735600132,
'epoch': 3.0}
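The reported numbers can be cross-checked with a little arithmetic. The sketch below assumes a single device and no gradient accumulation (neither is stated in this card), so that one optimizer step consumes at most 14 samples.

```python
# Values copied from the training output and hyperparameters above.
global_step = 3003
train_runtime = 8703.174       # seconds
samples_per_second = 4.83
num_train_epochs = 3
per_device_batch_size = 14     # assumes one device, no gradient accumulation

# Optimizer steps in each epoch.
steps_per_epoch = global_step // num_train_epochs

# Should agree with the reported train_steps_per_second (0.345).
steps_per_second = global_step / train_runtime

# Approximate number of training samples seen per epoch.
samples_per_epoch = samples_per_second * train_runtime / num_train_epochs

# Upper bound on samples per epoch if every batch were full.
max_samples_per_epoch = steps_per_epoch * per_device_batch_size

# Wall-clock training time in hours.
runtime_hours = train_runtime / 3600

print(steps_per_epoch, round(steps_per_second, 3), round(samples_per_epoch))
print(max_samples_per_epoch, round(runtime_hours, 2))
```

Under those assumptions the figures are mutually consistent: the throughput implies a training set of roughly 14,000 examples, just under the full-batch upper bound, and a total runtime of a little over two hours.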
Training Results
| Epoch | Training Loss | Validation Loss | Rouge1 | Rouge2 | RougeL | RougeLsum | Bleu | Gen Len |
|---|---|---|---|---|---|---|---|---|
| 1 | 2.661100 | 2.469111 | 0.451300 | 0.185200 | 0.279000 | 0.278900 | 0.553300 | 141.720300 |
| 2 | 2.434100 | 2.403647 | 0.456900 | 0.192800 | 0.284500 | 0.284500 | 0.556800 | 141.763100 |
| 3 | 2.313700 | 2.393932 | 0.459500 | 0.194400 | 0.286300 | 0.286200 | 0.559200 | 141.571600 |
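To make the epoch-to-epoch trend in the table easier to read, the sketch below computes the change in validation loss and Rouge1 between consecutive epochs. The perplexity line assumes the validation loss is a mean per-token cross-entropy in nats, which is not stated in this card.

```python
import math

# Values copied from the results table above (epochs 1 to 3).
val_loss = [2.469111, 2.403647, 2.393932]
rouge1 = [0.4513, 0.4569, 0.4595]


def deltas(values):
    """Return the change between each pair of consecutive epochs."""
    return [round(b - a, 6) for a, b in zip(values, values[1:])]


val_loss_deltas = deltas(val_loss)
rouge1_deltas = deltas(rouge1)

# Perplexity, assuming the loss is mean token-level cross-entropy in nats.
perplexity = [math.exp(loss) for loss in val_loss]

print("validation loss change per epoch:", val_loss_deltas)
print("Rouge1 change per epoch:", rouge1_deltas)
print("perplexity per epoch:", [round(p, 2) for p in perplexity])
```

Both metrics improve by a smaller amount from epoch 2 to epoch 3 than from epoch 1 to epoch 2, which suggests diminishing returns by the third epoch.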