
R-facebook-bart-base-full-ft-with-tum-nlp-german-gpt2_easy-prior-pp-no-ls-4c77

This model is a fine-tuned version of facebook/bart-base on an unspecified dataset. It achieves the following results on the evaluation set (a hedged usage sketch follows the metrics list):

  • Loss: 4.1506
  • Sacrebleu: 7.6134
  • Bleu: 0.0761
  • Rouge1: 0.3006
  • Rouge2: 0.1038
  • Rougel: 0.2079
  • Sari: 39.5909
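
The card does not document the intended task, but the checkpoint name and the SARI metric suggest German text simplification with a BART seq2seq model. The sketch below shows one way to load and query such a checkpoint with the Transformers auto classes; the model path and the German input sentence are placeholders, not taken from the card.

```python
# Hedged usage sketch: MODEL_PATH is a placeholder (local path or Hub id with its
# namespace, which the card does not state); the input sentence is invented.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

MODEL_PATH = "./R-facebook-bart-base-full-ft-with-tum-nlp-german-gpt2_easy-prior-pp-no-ls-4c77"

tokenizer = AutoTokenizer.from_pretrained(MODEL_PATH)
model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_PATH)

text = "Die Verwaltung hat ein neues Verfahren zur Beantragung von Ausweisdokumenten eingeführt."
inputs = tokenizer(text, return_tensors="pt", truncation=True)
output_ids = model.generate(**inputs, num_beams=4, max_length=128)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```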

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a sketch of the corresponding Seq2SeqTrainingArguments follows this list):

  • learning_rate: 5e-05
  • train_batch_size: 4
  • eval_batch_size: 1
  • seed: 42
  • gradient_accumulation_steps: 8
  • total_train_batch_size: 32
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 100
  • num_epochs: 15
  • mixed_precision_training: Native AMP
  • label_smoothing_factor: 0.1
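
As a rough guide, the hyperparameters above map onto transformers Seq2SeqTrainingArguments as sketched below (Transformers 4.29.x). The output_dir and the 100-step evaluation interval (inferred from the results table) are assumptions, and the Adam betas/epsilon listed above are the optimizer defaults.

```python
# Hedged mapping of the listed hyperparameters to Seq2SeqTrainingArguments.
# output_dir is a placeholder; eval every 100 steps is inferred from the results table.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="bart-base-german-simplification",  # placeholder name
    learning_rate=5e-5,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=1,
    gradient_accumulation_steps=8,   # 4 x 8 = effective train batch size of 32
    num_train_epochs=15,
    lr_scheduler_type="linear",
    warmup_steps=100,
    seed=42,
    fp16=True,                       # "Native AMP" mixed precision
    label_smoothing_factor=0.1,
    evaluation_strategy="steps",
    eval_steps=100,                  # assumed from the 100-step rows in the results table
    predict_with_generate=True,      # needed to compute BLEU/ROUGE/SARI during eval
)
```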

Training results

| Training Loss | Epoch | Step | Validation Loss | Sacrebleu | Bleu | Rouge1 | Rouge2 | Rougel | Sari |
|---------------|-------|------|-----------------|-----------|------|--------|--------|--------|------|
| 6.9721 | 0.25 | 100 | 4.1739 | 1.8048 | 0.0180 | 0.1980 | 0.0611 | 0.1541 | 37.1235 |
| 3.8977 | 0.5 | 200 | 4.0984 | 1.2756 | 0.0128 | 0.2076 | 0.0678 | 0.1581 | 37.6186 |
| 4.035 | 0.75 | 300 | 4.0622 | 2.6499 | 0.0265 | 0.2271 | 0.0740 | 0.1741 | 38.1373 |
| 8.2055 | 0.99 | 400 | 4.0561 | 2.7363 | 0.0274 | 0.2332 | 0.0804 | 0.1716 | 38.0851 |
| 3.6957 | 1.24 | 500 | 4.0262 | 3.5110 | 0.0351 | 0.2560 | 0.0852 | 0.1852 | 37.9403 |
| 3.0846 | 1.49 | 600 | 4.0121 | 3.2967 | 0.0330 | 0.2471 | 0.0815 | 0.1799 | 37.5590 |
| 3.283 | 1.74 | 700 | 4.0510 | 3.8512 | 0.0385 | 0.2602 | 0.0917 | 0.1951 | 38.0037 |
| 4.7429 | 1.99 | 800 | 4.0048 | 3.4891 | 0.0349 | 0.2524 | 0.0850 | 0.1877 | 38.0324 |
| 3.024 | 2.24 | 900 | 3.9860 | 3.9202 | 0.0392 | 0.2633 | 0.0844 | 0.1891 | 37.9931 |
| 5.6861 | 2.49 | 1000 | 4.0493 | 4.4801 | 0.0448 | 0.2622 | 0.0878 | 0.1926 | 38.2052 |
| 3.6185 | 2.74 | 1100 | 4.0394 | 3.6710 | 0.0367 | 0.2608 | 0.0857 | 0.1866 | 37.9620 |
| 3.3582 | 2.98 | 1200 | 4.0004 | 5.1257 | 0.0513 | 0.2695 | 0.0922 | 0.1956 | 38.4845 |
| 5.0036 | 3.23 | 1300 | 4.0223 | 5.3256 | 0.0533 | 0.2752 | 0.0938 | 0.1975 | 38.6943 |
| 3.9904 | 3.48 | 1400 | 4.0040 | 5.0070 | 0.0501 | 0.2744 | 0.0927 | 0.1951 | 38.5338 |
| 3.1496 | 3.73 | 1500 | 4.0282 | 5.9234 | 0.0592 | 0.2803 | 0.0907 | 0.2002 | 38.2119 |
| 3.9604 | 3.98 | 1600 | 4.0253 | 5.1875 | 0.0519 | 0.2658 | 0.0864 | 0.1920 | 38.2336 |
| 2.9813 | 4.23 | 1700 | 4.0148 | 5.9589 | 0.0596 | 0.2891 | 0.0976 | 0.2028 | 38.8216 |
| 3.5448 | 4.48 | 1800 | 4.0071 | 5.2759 | 0.0528 | 0.2736 | 0.0867 | 0.1894 | 37.8800 |
| 3.6836 | 4.72 | 1900 | 4.0105 | 5.1414 | 0.0514 | 0.2750 | 0.0894 | 0.1982 | 38.3898 |
| 4.0471 | 4.97 | 2000 | 3.9788 | 5.5747 | 0.0557 | 0.2792 | 0.0932 | 0.1973 | 38.5705 |
| 3.3437 | 5.22 | 2100 | 4.0057 | 5.3969 | 0.0540 | 0.2827 | 0.0926 | 0.1978 | 38.3453 |
| 3.1657 | 5.47 | 2200 | 4.0439 | 5.4820 | 0.0548 | 0.2861 | 0.0946 | 0.2071 | 38.4004 |
| 2.5486 | 5.72 | 2300 | 4.0315 | 6.1738 | 0.0617 | 0.2896 | 0.0966 | 0.2048 | 38.5404 |
| 3.6148 | 5.97 | 2400 | 4.0056 | 6.5570 | 0.0656 | 0.2941 | 0.1046 | 0.2072 | 39.0698 |
| 3.1477 | 6.22 | 2500 | 4.0612 | 6.2221 | 0.0622 | 0.2806 | 0.0932 | 0.1998 | 38.5211 |
| 3.175 | 6.47 | 2600 | 4.0126 | 6.6920 | 0.0669 | 0.2916 | 0.1037 | 0.2122 | 39.1438 |
| 4.6616 | 6.71 | 2700 | 4.0467 | 6.0344 | 0.0603 | 0.2804 | 0.0953 | 0.1983 | 38.4171 |
| 3.109 | 6.96 | 2800 | 4.0420 | 5.8656 | 0.0587 | 0.2864 | 0.0983 | 0.2034 | 38.7225 |
| 3.0659 | 7.21 | 2900 | 4.0613 | 5.6029 | 0.0560 | 0.2839 | 0.0938 | 0.1980 | 38.7136 |
| 2.658 | 7.46 | 3000 | 4.0726 | 6.2791 | 0.0628 | 0.2824 | 0.0947 | 0.1972 | 38.6330 |
| 3.178 | 7.71 | 3100 | 4.0437 | 6.4351 | 0.0644 | 0.2924 | 0.0956 | 0.2032 | 38.6577 |
| 4.0606 | 7.96 | 3200 | 4.0644 | 6.6271 | 0.0663 | 0.2966 | 0.1019 | 0.2088 | 39.1513 |
| 3.664 | 8.21 | 3300 | 4.0615 | 6.3354 | 0.0634 | 0.2961 | 0.0981 | 0.2024 | 38.6904 |
| 2.8457 | 8.46 | 3400 | 4.0861 | 7.4278 | 0.0743 | 0.2975 | 0.1025 | 0.2017 | 39.0452 |
| 3.3883 | 8.7 | 3500 | 4.1037 | 6.4498 | 0.0645 | 0.2826 | 0.0955 | 0.2008 | 38.5961 |
| 5.4189 | 8.95 | 3600 | 4.1099 | 6.0065 | 0.0601 | 0.2946 | 0.0952 | 0.2020 | 38.6177 |
| 3.2093 | 9.2 | 3700 | 4.1074 | 6.2514 | 0.0625 | 0.2933 | 0.0942 | 0.2014 | 38.7227 |
| 3.9625 | 9.45 | 3800 | 4.0937 | 6.6653 | 0.0667 | 0.2912 | 0.0970 | 0.2020 | 38.4853 |
| 2.7172 | 9.7 | 3900 | 4.1130 | 6.1736 | 0.0617 | 0.2860 | 0.0898 | 0.1948 | 38.5064 |
| 2.4973 | 9.95 | 4000 | 4.0737 | 7.4889 | 0.0749 | 0.2986 | 0.1023 | 0.2060 | 39.2124 |
| 2.7371 | 10.2 | 4100 | 4.1032 | 6.4897 | 0.0649 | 0.2985 | 0.0990 | 0.2031 | 38.3514 |
| 3.9244 | 10.44 | 4200 | 4.0880 | 6.7268 | 0.0673 | 0.2906 | 0.1006 | 0.2012 | 38.6404 |
| 3.2153 | 10.69 | 4300 | 4.0961 | 6.7780 | 0.0678 | 0.2953 | 0.0977 | 0.2008 | 38.7091 |
| 3.0715 | 10.94 | 4400 | 4.1005 | 7.1435 | 0.0714 | 0.2870 | 0.0937 | 0.1950 | 38.5542 |
| 2.7833 | 11.19 | 4500 | 4.1112 | 7.5856 | 0.0759 | 0.3008 | 0.1037 | 0.2063 | 38.8659 |
| 5.6278 | 11.44 | 4600 | 4.0988 | 7.8870 | 0.0789 | 0.2962 | 0.1019 | 0.2025 | 38.8174 |
| 4.3557 | 11.69 | 4700 | 4.1049 | 7.9121 | 0.0791 | 0.3105 | 0.1076 | 0.2106 | 39.2476 |
| 3.4938 | 11.94 | 4800 | 4.1067 | 7.1602 | 0.0716 | 0.2961 | 0.1009 | 0.2039 | 38.9165 |
| 5.6848 | 12.19 | 4900 | 4.1140 | 7.8746 | 0.0787 | 0.2951 | 0.0996 | 0.2005 | 38.7719 |
| 3.4738 | 12.43 | 5000 | 4.0969 | 7.8672 | 0.0787 | 0.3055 | 0.1087 | 0.2092 | 39.0808 |
| 2.9039 | 12.68 | 5100 | 4.1185 | 7.6696 | 0.0767 | 0.3033 | 0.1071 | 0.2092 | 39.0788 |
| 4.4091 | 12.93 | 5200 | 4.1346 | 7.9896 | 0.0799 | 0.3014 | 0.1046 | 0.2070 | 39.2032 |
| 3.102 | 13.18 | 5300 | 4.1308 | 7.2969 | 0.0730 | 0.3030 | 0.1032 | 0.2039 | 39.1031 |
| 2.9972 | 13.43 | 5400 | 4.1518 | 7.7779 | 0.0778 | 0.3017 | 0.1053 | 0.2090 | 39.4092 |
| 2.7672 | 13.68 | 5500 | 4.1515 | 7.7545 | 0.0775 | 0.3010 | 0.1079 | 0.2091 | 39.0093 |
| 3.7358 | 13.93 | 5600 | 4.1360 | 7.5980 | 0.0760 | 0.2970 | 0.1036 | 0.2080 | 39.0873 |
| 3.4363 | 14.17 | 5700 | 4.1367 | 7.2901 | 0.0729 | 0.3013 | 0.1057 | 0.2084 | 39.3389 |
| 3.3451 | 14.42 | 5800 | 4.1500 | 7.5605 | 0.0756 | 0.2984 | 0.0979 | 0.2074 | 39.0107 |
| 2.8616 | 14.67 | 5900 | 4.1447 | 7.8204 | 0.0782 | 0.3020 | 0.1059 | 0.2127 | 39.7465 |
| 3.1149 | 14.92 | 6000 | 4.1506 | 7.6134 | 0.0761 | 0.3006 | 0.1038 | 0.2079 | 39.5909 |
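
The SacreBLEU, BLEU, ROUGE, and SARI columns above correspond to the standard implementations available through the Hugging Face evaluate library. Below is a minimal sketch of the metric calls; the source, prediction, and reference strings are invented purely to show the expected input shapes, not data from the card's evaluation set.

```python
# Metric-computation sketch with the `evaluate` library; all strings are invented examples.
import evaluate

sources = ["Die Behörde verlangt eine fristgerechte Einreichung der Unterlagen."]
predictions = ["Die Behörde will die Unterlagen pünktlich bekommen."]
references = [["Sie müssen die Unterlagen rechtzeitig abgeben."]]  # one or more references per prediction

sacrebleu = evaluate.load("sacrebleu")
rouge = evaluate.load("rouge")
sari = evaluate.load("sari")

print(sacrebleu.compute(predictions=predictions, references=references)["score"])
print(rouge.compute(predictions=predictions, references=[r[0] for r in references]))
print(sari.compute(sources=sources, predictions=predictions, references=references)["sari"])
```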

Framework versions

  • Transformers 4.29.2
  • Pytorch 2.0.0+cu117
  • Datasets 2.12.0
  • Tokenizers 0.13.3