model_v2_v2

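This model is a fine-tuned version of facebook/bart-large. The training dataset was not recorded (the auto-generated card lists it as "None"). It achieves the following results on the evaluation set, which match the epoch 1.0 row of the training results table: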

  • Loss: 0.5576
  • Sacrebleu: 66.4785

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 3e-05
  • train_batch_size: 32
  • eval_batch_size: 32
  • seed: 42
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 128
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 40
  • mixed_precision_training: Native AMP
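
The relationship between the batch settings and the linear schedule listed above can be sketched in a few lines. This is an illustrative calculation, not the training script: it assumes a single device (consistent with 32 × 4 = 128), assumes no warmup because the card lists none, and takes the total of 4360 optimizer steps from the last row of the training results table. The helper names `effective_batch_size` and `linear_lr` are hypothetical.

```python
# Values copied from the hyperparameter list above.
TRAIN_BATCH_SIZE = 32
GRAD_ACCUM_STEPS = 4
BASE_LR = 3e-05
TOTAL_STEPS = 4360  # final step reported in the training results table


def effective_batch_size(per_device, accum_steps, num_devices=1):
    """Examples consumed per optimizer update (num_devices=1 is an assumption)."""
    return per_device * accum_steps * num_devices


def linear_lr(step, base_lr=BASE_LR, total_steps=TOTAL_STEPS, warmup_steps=0):
    """Learning rate under a linear decay schedule with optional linear warmup.

    warmup_steps=0 is an assumption; the card does not list a warmup setting.
    """
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    print("effective batch size:", effective_batch_size(TRAIN_BATCH_SIZE, GRAD_ACCUM_STEPS))
    for step in (0, 1095, 2180, 4360):
        print(step, linear_lr(step))
```

Under these assumptions the learning rate starts at 3e-05, falls to half that value at step 2180, and reaches zero at step 4360.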

Training results

Training Loss   Epoch   Step   Validation Loss   Sacrebleu
No log          1.0     109    0.5576            66.4785
No log          2.0     219    0.5702            67.0151
No log          3.0     328    0.6206            66.8798
No log          4.0     438    0.5778            66.8869
No log          5.0     547    0.6484            66.8013
No log          6.0     657    0.6747            66.6138
No log          7.0     766    0.7132            66.6173
No log          8.0     876    0.6951            66.4205
No log          9.0     985    0.7322            66.3405
No log          10.0    1095   0.7953            66.5709
No log          11.0    1204   0.8137            66.5324
No log          12.0    1314   0.8207            66.4973
No log          13.0    1423   0.8155            66.4712
No log          14.0    1533   0.8471            66.3456
No log          15.0    1642   0.8629            66.5794
No log          16.0    1752   0.9267            66.4444
No log          17.0    1861   0.9317            66.5137
No log          18.0    1971   0.9020            66.6691
No log          19.0    2080   0.9256            66.6756
No log          20.0    2190   0.9645            66.5470
No log          21.0    2299   1.0415            66.7197
No log          22.0    2409   1.1270            66.7086
No log          23.0    2518   1.0326            66.7326
No log          24.0    2628   1.0989            66.7648
No log          25.0    2737   1.0835            66.4847
No log          26.0    2847   1.1915            66.7088
No log          27.0    2956   1.0516            66.6612
No log          28.0    3066   1.1104            66.6799
No log          29.0    3175   1.1811            66.6797
No log          30.0    3285   1.1143            66.7554
No log          31.0    3394   1.0420            66.6538
No log          32.0    3504   1.0547            66.6668
No log          33.0    3613   1.0992            66.5995
No log          34.0    3723   1.0931            66.6379
No log          35.0    3832   1.0891            66.7616
No log          36.0    3942   1.1421            66.7893
No log          37.0    4051   1.1487            66.7630
No log          38.0    4161   1.1538            66.7861
No log          39.0    4270   1.1793            66.7983
No log          39.82   4360   1.1620            66.7433
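
In the table, validation loss roughly doubles over training while Sacrebleu stays between about 66.3 and 67.0, so the choice of checkpoint depends on which metric is used. The sketch below parses the rows and picks the best epoch under each criterion. The function name `parse_results` and the tuple layout are illustrative choices, and the rows are copied from the table with the "No log" column omitted.

```python
# Columns: epoch, step, validation loss, Sacrebleu (copied from the table above).
RESULTS = """
1.0 109 0.5576 66.4785
2.0 219 0.5702 67.0151
3.0 328 0.6206 66.8798
4.0 438 0.5778 66.8869
5.0 547 0.6484 66.8013
6.0 657 0.6747 66.6138
7.0 766 0.7132 66.6173
8.0 876 0.6951 66.4205
9.0 985 0.7322 66.3405
10.0 1095 0.7953 66.5709
11.0 1204 0.8137 66.5324
12.0 1314 0.8207 66.4973
13.0 1423 0.8155 66.4712
14.0 1533 0.8471 66.3456
15.0 1642 0.8629 66.5794
16.0 1752 0.9267 66.4444
17.0 1861 0.9317 66.5137
18.0 1971 0.9020 66.6691
19.0 2080 0.9256 66.6756
20.0 2190 0.9645 66.5470
21.0 2299 1.0415 66.7197
22.0 2409 1.1270 66.7086
23.0 2518 1.0326 66.7326
24.0 2628 1.0989 66.7648
25.0 2737 1.0835 66.4847
26.0 2847 1.1915 66.7088
27.0 2956 1.0516 66.6612
28.0 3066 1.1104 66.6799
29.0 3175 1.1811 66.6797
30.0 3285 1.1143 66.7554
31.0 3394 1.0420 66.6538
32.0 3504 1.0547 66.6668
33.0 3613 1.0992 66.5995
34.0 3723 1.0931 66.6379
35.0 3832 1.0891 66.7616
36.0 3942 1.1421 66.7893
37.0 4051 1.1487 66.7630
38.0 4161 1.1538 66.7861
39.0 4270 1.1793 66.7983
39.82 4360 1.1620 66.7433
"""


def parse_results(text):
    """Turn whitespace-separated rows into (epoch, step, val_loss, sacrebleu) tuples."""
    rows = []
    for line in text.strip().splitlines():
        epoch, step, loss, bleu = line.split()
        rows.append((float(epoch), int(step), float(loss), float(bleu)))
    return rows


rows = parse_results(RESULTS)
best_by_loss = min(rows, key=lambda r: r[2])  # lowest validation loss
best_by_bleu = max(rows, key=lambda r: r[3])  # highest Sacrebleu

# Average optimizer steps per epoch, measured between epoch 1.0 and epoch 39.0.
steps_per_epoch = (rows[-2][1] - rows[0][1]) / (rows[-2][0] - rows[0][0])

if __name__ == "__main__":
    print("lowest validation loss:", best_by_loss)
    print("highest Sacrebleu:", best_by_bleu)
    print("steps per epoch:", steps_per_epoch)
```

On these rows the lowest validation loss falls at epoch 1.0 (0.5576) and the highest Sacrebleu at epoch 2.0 (67.0151), with about 109.5 optimizer steps per epoch.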

Framework versions

  • Transformers 4.39.3
  • Pytorch 2.1.2
  • Datasets 2.18.0
  • Tokenizers 0.15.2
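
To reproduce results with the same library versions, the installed packages can be compared against the list above. The following is a minimal sketch using only the standard library; the helper name `find_mismatches` is hypothetical, and it assumes the Pytorch distribution is installed under its usual package name `torch`.

```python
from importlib import metadata

# Versions copied from the framework list above.
EXPECTED = {
    "transformers": "4.39.3",
    "torch": "2.1.2",
    "datasets": "2.18.0",
    "tokenizers": "0.15.2",
}


def find_mismatches(expected, get_version=metadata.version):
    """Return {package: (expected, installed_or_None)} for each package that differs."""
    mismatches = {}
    for name, wanted in expected.items():
        try:
            installed = get_version(name)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed at all
        # Ignore local build suffixes such as "+cu121" when comparing.
        if installed is None or installed.split("+")[0] != wanted:
            mismatches[name] = (wanted, installed)
    return mismatches


if __name__ == "__main__":
    for name, (wanted, installed) in find_mismatches(EXPECTED).items():
        print(f"{name}: expected {wanted}, found {installed}")
```

An empty result means every listed package matches the version recorded on the card.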

Model size: 406M params (Safetensors format, tensor type F32)