LLM_Teached_Bart_From_Scratch

This model is a fine-tuned version of facebook/bart-large on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 1.6350
  • Rouge1: 0.4471
  • Rouge2: 0.2259
  • RougeL: 0.3846
  • RougeLsum: 0.3845
  • Gen Len: 19.9087
  • Precision: 0.9156
  • Recall: 0.8915
  • F1: 0.9033
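
The ROUGE scores come from the standard ROUGE metric; the precision/recall/F1 triplet is consistent with BERTScore, though the card does not name the metric that produced it. Below is a minimal sketch of how such numbers could be recomputed with the evaluate library, assuming BERTScore for the last three values:

```python
# Minimal evaluation sketch. Assumes the precision/recall/F1 values above
# come from BERTScore; the card does not state which metric produced them.
import evaluate

rouge = evaluate.load("rouge")
bertscore = evaluate.load("bertscore")

predictions = ["a generated summary from the model"]  # placeholder output
references = ["the corresponding reference summary"]  # placeholder target

print(rouge.compute(predictions=predictions, references=references))
print(bertscore.compute(predictions=predictions, references=references, lang="en"))
```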

Model description

More information needed

Intended uses & limitations

More information needed
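
Given the ROUGE-based evaluation and the BART backbone, the model is presumably intended for abstractive summarization. A minimal usage sketch under that assumption, with a placeholder repository id (the actual id is not given in the card):

```python
# Hedged usage sketch: the repo id below is a placeholder, and the
# summarization task is inferred from the ROUGE metrics, not stated.
from transformers import pipeline

summarizer = pipeline("summarization", model="your-username/LLM_Teached_Bart_From_Scratch")
article = "Paste a long input document here ..."
print(summarizer(article, max_length=64, min_length=10)[0]["summary_text"])
```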

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 24
  • eval_batch_size: 16
  • seed: 42
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 96
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 30
  • mixed_precision_training: Native AMP
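
A minimal sketch of Seq2SeqTrainingArguments mirroring the hyperparameters above; the dataset, tokenizer, and Trainer wiring are not documented in the card, so everything beyond the listed values is an assumption:

```python
# Sketch of training arguments matching the listed hyperparameters.
# output_dir is assumed; Adam betas/epsilon match the transformers defaults.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="LLM_Teached_Bart_From_Scratch",  # assumed output path
    learning_rate=2e-5,
    per_device_train_batch_size=24,
    per_device_eval_batch_size=16,
    seed=42,
    gradient_accumulation_steps=4,   # 24 * 4 = 96 total train batch size
    lr_scheduler_type="linear",
    num_train_epochs=30,
    fp16=True,                       # "Native AMP" mixed precision
    evaluation_strategy="epoch",     # inferred from the per-epoch results table
    predict_with_generate=True,      # generate summaries during evaluation
)
```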

Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | RougeL | RougeLsum | Gen Len | Precision | Recall | F1 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1.836 | 1.0 | 521 | 1.5560 | 0.4155 | 0.2028 | 0.3561 | 0.3559 | 19.9745 | 0.9105 | 0.8843 | 0.8971 |
| 1.5951 | 2.0 | 1042 | 1.5004 | 0.4333 | 0.2136 | 0.3695 | 0.3694 | 19.9353 | 0.9115 | 0.8886 | 0.8997 |
| 1.469 | 3.0 | 1563 | 1.4691 | 0.4355 | 0.2176 | 0.3729 | 0.3728 | 19.9385 | 0.912 | 0.8888 | 0.9001 |
| 1.373 | 4.0 | 2084 | 1.4658 | 0.4311 | 0.2164 | 0.3706 | 0.3704 | 19.9647 | 0.9137 | 0.8877 | 0.9003 |
| 1.2902 | 5.0 | 2605 | 1.4542 | 0.4368 | 0.2218 | 0.3762 | 0.376 | 19.9498 | 0.9136 | 0.8887 | 0.9008 |
| 1.222 | 6.0 | 3126 | 1.4584 | 0.4407 | 0.223 | 0.3802 | 0.3798 | 19.9425 | 0.914 | 0.8902 | 0.9018 |
| 1.1655 | 7.0 | 3647 | 1.4709 | 0.4404 | 0.2246 | 0.3806 | 0.3803 | 19.9327 | 0.9145 | 0.89 | 0.9019 |
| 1.11 | 8.0 | 4168 | 1.4724 | 0.4435 | 0.2269 | 0.383 | 0.3828 | 19.9084 | 0.9153 | 0.8906 | 0.9026 |
| 1.0629 | 9.0 | 4689 | 1.4853 | 0.4431 | 0.2273 | 0.3832 | 0.383 | 19.928 | 0.9155 | 0.8908 | 0.9028 |
| 1.023 | 10.0 | 5210 | 1.5033 | 0.4409 | 0.2247 | 0.3819 | 0.3818 | 19.944 | 0.9152 | 0.8897 | 0.9021 |
| 0.9862 | 11.0 | 5731 | 1.5074 | 0.4479 | 0.2278 | 0.3862 | 0.386 | 19.9124 | 0.9158 | 0.8916 | 0.9034 |
| 0.957 | 12.0 | 6252 | 1.5184 | 0.4461 | 0.2264 | 0.3846 | 0.3847 | 19.9033 | 0.9159 | 0.8909 | 0.903 |
| 0.9315 | 13.0 | 6773 | 1.5269 | 0.4473 | 0.2284 | 0.386 | 0.3858 | 19.9084 | 0.9156 | 0.8912 | 0.9031 |
| 0.9093 | 14.0 | 7294 | 1.5311 | 0.4453 | 0.2273 | 0.3846 | 0.3843 | 19.9135 | 0.9155 | 0.8909 | 0.9029 |
| 0.8927 | 15.0 | 7815 | 1.5351 | 0.4457 | 0.2267 | 0.3842 | 0.384 | 19.9065 | 0.9156 | 0.8909 | 0.9029 |
| 0.8773 | 16.0 | 8336 | 1.5440 | 0.4427 | 0.225 | 0.382 | 0.382 | 19.9425 | 0.9151 | 0.8905 | 0.9025 |
| 0.8806 | 17.0 | 8857 | 1.5510 | 0.4495 | 0.2279 | 0.3868 | 0.3869 | 19.8851 | 0.9159 | 0.8919 | 0.9036 |
| 0.8683 | 18.0 | 9378 | 1.5679 | 0.4473 | 0.2282 | 0.3856 | 0.3857 | 19.8829 | 0.9161 | 0.8921 | 0.9038 |
| 0.8413 | 19.0 | 9899 | 1.5745 | 0.4492 | 0.2282 | 0.3861 | 0.3864 | 19.9135 | 0.9159 | 0.8918 | 0.9035 |
| 0.8257 | 20.0 | 10420 | 1.5835 | 0.4471 | 0.2266 | 0.3852 | 0.3853 | 19.8996 | 0.9153 | 0.8915 | 0.9031 |
| 0.8097 | 21.0 | 10941 | 1.5957 | 0.4472 | 0.2271 | 0.3856 | 0.3856 | 19.9073 | 0.9156 | 0.8919 | 0.9034 |
| 0.7926 | 22.0 | 11462 | 1.5956 | 0.4479 | 0.2282 | 0.3855 | 0.3857 | 19.892 | 0.9159 | 0.8916 | 0.9034 |
| 0.7841 | 23.0 | 11983 | 1.5990 | 0.4444 | 0.2261 | 0.3833 | 0.3834 | 19.912 | 0.9155 | 0.8908 | 0.9028 |
| 0.7669 | 24.0 | 12504 | 1.6097 | 0.4491 | 0.2284 | 0.3872 | 0.387 | 19.9007 | 0.9162 | 0.892 | 0.9037 |
| 0.7733 | 25.0 | 13025 | 1.6060 | 0.4442 | 0.2257 | 0.3827 | 0.3828 | 19.9178 | 0.9154 | 0.8906 | 0.9027 |
| 0.7631 | 26.0 | 13546 | 1.6187 | 0.4472 | 0.2276 | 0.3861 | 0.3861 | 19.9175 | 0.9154 | 0.8915 | 0.9031 |
| 0.7505 | 27.0 | 14067 | 1.6208 | 0.4463 | 0.227 | 0.3852 | 0.3851 | 19.8967 | 0.9155 | 0.8914 | 0.9031 |
| 0.7413 | 28.0 | 14588 | 1.6237 | 0.4468 | 0.2273 | 0.3854 | 0.3853 | 19.9153 | 0.9159 | 0.8912 | 0.9032 |
| 0.7348 | 29.0 | 15109 | 1.6312 | 0.4482 | 0.2268 | 0.3858 | 0.3858 | 19.8938 | 0.9158 | 0.8918 | 0.9035 |
| 0.7286 | 30.0 | 15630 | 1.6350 | 0.4471 | 0.2259 | 0.3846 | 0.3845 | 19.9087 | 0.9156 | 0.8915 | 0.9033 |

Framework versions

  • Transformers 4.36.0
  • Pytorch 2.0.1+cu117
  • Datasets 2.14.5
  • Tokenizers 0.15.0
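
To reproduce results, the exact versions above can be asserted at runtime; a trivial check, assuming all four packages are installed:

```python
# Version pin check matching the framework versions listed above.
import datasets
import tokenizers
import torch
import transformers

assert transformers.__version__ == "4.36.0"
assert torch.__version__ == "2.0.1+cu117"
assert datasets.__version__ == "2.14.5"
assert tokenizers.__version__ == "0.15.0"
```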

Model size

  • 406M params (F32 tensors, Safetensors format)