Hyperparameters
learning_rate=2e-5
per_device_train_batch_size=14
per_device_eval_batch_size=14
weight_decay=0.01
save_total_limit=3
num_train_epochs=3
predict_with_generate=True
fp16=True
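The names above match keyword arguments of `Seq2SeqTrainingArguments` in the `transformers` library, so the configuration could plausibly be reconstructed as in the sketch below. The card does not show the original training script, so this is an assumption about how the values were used, not the author's code; `output_dir` and the helper `build_training_args` are hypothetical names introduced only for illustration.

```python
# Values copied from the Hyperparameters list above.
HYPERPARAMETERS = {
    "learning_rate": 2e-5,
    "per_device_train_batch_size": 14,
    "per_device_eval_batch_size": 14,
    "weight_decay": 0.01,
    "save_total_limit": 3,          # keep at most 3 checkpoints on disk
    "num_train_epochs": 3,
    "predict_with_generate": True,  # run generate() during evaluation
    "fp16": True,                   # mixed precision; typically needs a CUDA GPU
}


def build_training_args(output_dir: str = "outputs"):
    """Build Seq2SeqTrainingArguments from the listed values.

    `output_dir` is a hypothetical placeholder, not taken from the card.
    """
    # Imported lazily so the dict can be inspected without transformers installed.
    from transformers import Seq2SeqTrainingArguments

    return Seq2SeqTrainingArguments(output_dir=output_dir, **HYPERPARAMETERS)
```

Any argument not listed on the card (evaluation schedule, logging, generation length, and so on) would fall back to the library defaults in this sketch, which may differ from what the original run used.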
Training Output
global_step=7710,
training_loss=2.8554159399445727,
metrics={'train_runtime': 21924.7566,
'train_samples_per_second': 4.923,
'train_steps_per_second': 0.352,
'total_flos': 2.3807388210639667e+17,
'train_loss': 2.8554159399445727,
'epoch': 3.0}
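The reported throughput numbers can be cross-checked against the hyperparameters with a little arithmetic. The sketch below assumes a single device, no gradient accumulation, and no dropped final batch; none of these is stated on the card, so the derived dataset size is an estimate only.

```python
# Values copied from the Training Output above.
global_step = 7710
num_train_epochs = 3
train_runtime_s = 21924.7566
train_samples_per_second = 4.923
train_steps_per_second = 0.352
per_device_train_batch_size = 14  # from the Hyperparameters list

# Optimizer steps taken in each epoch.
steps_per_epoch = global_step // num_train_epochs

# Samples consumed per optimizer step; should be close to the batch size
# if the single-device, no-accumulation assumption holds.
effective_batch = train_samples_per_second / train_steps_per_second

# Approximate samples seen per epoch, from throughput and wall-clock time.
samples_per_epoch = train_samples_per_second * train_runtime_s / num_train_epochs

# Training-set sizes N consistent with ceil(N / batch) == steps_per_epoch.
min_examples = (steps_per_epoch - 1) * per_device_train_batch_size + 1
max_examples = steps_per_epoch * per_device_train_batch_size

runtime_hours = train_runtime_s / 3600

print(f"steps per epoch:     {steps_per_epoch}")
print(f"effective batch:     {effective_batch:.2f}")
print(f"samples per epoch:   {samples_per_epoch:.0f}")
print(f"implied dataset size: {min_examples} to {max_examples}")
print(f"runtime:             {runtime_hours:.2f} h")
```

Under those assumptions the effective batch comes out very close to 14, which is consistent with `per_device_train_batch_size=14` on one device, and the run took roughly six hours.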
Training Results
Epoch | Training Loss | Validation Loss | Rouge1 | Rouge2 | RougeL | RougeLsum | Bleu | Gen Len |
---|---|---|---|---|---|---|---|---|
1 | 2.981200 | 2.831641 | 0.414500 | 0.147000 | 0.230700 | 0.230600 | 0.512800 | 140.734900 |
2 | 2.800900 | 2.789402 | 0.417300 | 0.148400 | 0.231800 | 0.231700 | 0.516000 | 141.158200 |
3 | 2.680300 | 2.780862 | 0.418300 | 0.148400 | 0.232200 | 0.232100 | 0.516800 | 140.872300 |
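To make the epoch-to-epoch trend in the table easier to read, the short sketch below computes the change in validation loss and ROUGE-1 between consecutive epochs. The numbers are copied from the table; the variable names are illustrative only.

```python
# (epoch, training loss, validation loss, ROUGE-1, ROUGE-2, ROUGE-L) from the table above.
rows = [
    (1, 2.9812, 2.831641, 0.4145, 0.1470, 0.2307),
    (2, 2.8009, 2.789402, 0.4173, 0.1484, 0.2318),
    (3, 2.6803, 2.780862, 0.4183, 0.1484, 0.2322),
]

# Change in validation loss and ROUGE-1 from each epoch to the next.
val_deltas = [curr[2] - prev[2] for prev, curr in zip(rows, rows[1:])]
rouge1_deltas = [curr[3] - prev[3] for prev, curr in zip(rows, rows[1:])]

for (prev, curr), dv, dr in zip(zip(rows, rows[1:]), val_deltas, rouge1_deltas):
    print(f"epoch {prev[0]} -> {curr[0]}: val loss {dv:+.6f}, ROUGE-1 {dr:+.4f}")

# Epoch with the lowest validation loss.
best_epoch = min(rows, key=lambda r: r[2])[0]
print(f"lowest validation loss at epoch {best_epoch}")
```

Validation loss still improves at every epoch, but the second improvement is much smaller than the first and ROUGE-2 is unchanged between epochs 2 and 3, which may suggest diminishing returns from further epochs at this learning rate.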