
model_v2

This model is a fine-tuned version of facebook/bart-large on an unspecified dataset. It achieves the following results on the evaluation set (a brief usage sketch follows the list):

  • Loss: 1.2418
  • SacreBLEU: 66.7409
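
Since the card itself does not include usage instructions, here is a minimal inference sketch. The repository id "your-username/model_v2" is a placeholder for the actual Hub id or a local checkpoint path, and the input sentence is illustrative only.

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Placeholder repo id; substitute the real Hub id or a local path.
tokenizer = AutoTokenizer.from_pretrained("your-username/model_v2")
model = AutoModelForSeq2SeqLM.from_pretrained("your-username/model_v2")

inputs = tokenizer("An example source sentence.", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```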

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):

  • learning_rate: 5e-05
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 64
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 30
  • mixed_precision_training: Native AMP
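
For readers who want to reproduce this setup, the hyperparameters above map onto transformers training arguments roughly as sketched below. The output directory and any surrounding Trainer wiring are assumptions, not part of the card.

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="model_v2",          # hypothetical output directory
    learning_rate=5e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=42,
    gradient_accumulation_steps=4,  # effective train batch size: 16 * 4 = 64
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=30,
    fp16=True,                      # native AMP mixed-precision training
)
```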

Training results

| Training Loss | Epoch | Step | Validation Loss | SacreBLEU |
|:-------------:|:-----:|:----:|:---------------:|:---------:|
| No log        | 1.0   | 218  | 0.6656          | 66.6707   |
| No log        | 2.0   | 437  | 0.5851          | 66.5767   |
| No log        | 3.0   | 656  | 0.6062          | 66.4734   |
| No log        | 4.0   | 875  | 0.7029          | 66.5944   |
| No log        | 5.0   | 1093 | 0.6852          | 66.0086   |
| No log        | 6.0   | 1312 | 0.7471          | 66.0534   |
| No log        | 7.0   | 1531 | 0.8938          | 66.1986   |
| No log        | 8.0   | 1750 | 0.8834          | 66.4626   |
| No log        | 9.0   | 1968 | 0.8895          | 66.4292   |
| No log        | 10.0  | 2187 | 0.8824          | 66.0577   |
| No log        | 11.0  | 2406 | 0.8781          | 66.5076   |
| No log        | 12.0  | 2625 | 0.9870          | 66.5564   |
| No log        | 13.0  | 2843 | 1.1580          | 66.5116   |
| No log        | 14.0  | 3062 | 0.9797          | 66.3801   |
| No log        | 15.0  | 3281 | 1.0680          | 66.2748   |
| No log        | 16.0  | 3500 | 1.0113          | 66.5282   |
| No log        | 17.0  | 3718 | 1.0023          | 66.5794   |
| No log        | 18.0  | 3937 | 1.0753          | 66.2935   |
| No log        | 19.0  | 4156 | 1.0462          | 66.5036   |
| No log        | 20.0  | 4375 | 1.0934          | 66.7931   |
| No log        | 21.0  | 4593 | 1.1732          | 66.5171   |
| No log        | 22.0  | 4812 | 1.1892          | 66.4821   |
| No log        | 23.0  | 5031 | 1.2766          | 66.5913   |
| No log        | 24.0  | 5250 | 1.2392          | 66.5476   |
| No log        | 25.0  | 5468 | 1.3452          | 66.5616   |
| No log        | 26.0  | 5687 | 1.1427          | 66.7916   |
| No log        | 27.0  | 5906 | 1.1809          | 66.9823   |
| No log        | 28.0  | 6125 | 1.2310          | 66.7958   |
| No log        | 29.0  | 6343 | 1.2147          | 66.7948   |
| No log        | 29.9  | 6540 | 1.2418          | 66.7409   |
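
The SacreBLEU column reports corpus-level scores on the 0-100 scale. A score of this kind can be computed with the evaluate library roughly as sketched below; the prediction and reference strings are placeholders.

```python
import evaluate

sacrebleu = evaluate.load("sacrebleu")
predictions = ["The model generated this sentence."]      # illustrative only
references = [["The model generated this sentence."]]     # one list of references per prediction
result = sacrebleu.compute(predictions=predictions, references=references)
print(result["score"])  # corpus-level BLEU, 0-100
```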

Framework versions

  • Transformers 4.39.3
  • PyTorch 2.1.2
  • Datasets 2.18.0
  • Tokenizers 0.15.2