---
license: apache-2.0
tags:
- generated_from_trainer
metrics:
- sacrebleu
- bleu
- rouge
model-index:
- name: R-facebook-bart-base-full-ft-without-tum-nlp-german-gpt2_easy-prior-pp-no_ls-f135
  results: []
---

<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->

# R-facebook-bart-base-full-ft-without-tum-nlp-german-gpt2_easy-prior-pp-no_ls-f135

This model is a fine-tuned version of [facebook/bart-base](https://huggingface.co/facebook/bart-base) on an unknown dataset.
It achieves the following results on the evaluation set:
- Loss: 4.2584
- Sacrebleu: 8.2960 (0 to 100 scale)
- Bleu: 0.0830 (0 to 1 scale)
- Rouge1: 0.2929
- Rouge2: 0.0997
- Rougel: 0.2048
- Sari: 39.1931
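
The Sacrebleu and Bleu figures above appear to be the same score on two scales: the Bleu value matches the Sacrebleu value divided by 100, rounded to four decimals. The card does not state this, so the sketch below only checks the relationship against a few rows copied from the training results table. The helper name `same_score` and its tolerance are illustrative assumptions.

```python
# (sacrebleu, bleu) pairs copied from this card's reported results.
pairs = [
    (8.2960, 0.0830),  # final evaluation, step 12000
    (2.7687, 0.0277),  # step 100
    (8.8624, 0.0886),  # step 6600
]


def same_score(sacrebleu: float, bleu: float, tol: float = 5e-5) -> bool:
    """Return True if bleu equals sacrebleu / 100 up to 4-decimal rounding."""
    # The small extra slack absorbs floating-point error at the rounding boundary.
    return abs(sacrebleu / 100.0 - bleu) <= tol + 1e-12


consistent = all(same_score(s, b) for s, b in pairs)
print(consistent)
```

If the check holds for every row, the two columns carry the same information and only one of them needs to be tracked when comparing checkpoints.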

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 2
- eval_batch_size: 1
- seed: 42
- gradient_accumulation_steps: 8
- total_train_batch_size: 16
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 100
- num_epochs: 15
- mixed_precision_training: Native AMP
- label_smoothing_factor: 0.1
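
The batch-size and scheduler entries above can be related with a little arithmetic. The sketch below assumes a single device, which is consistent with 2 x 8 = 16 but is not stated in the card, and a total of 12,060 optimizer steps, which is only an estimate from the results table (roughly 804 steps per epoch over 15 epochs). The function names are illustrative and are not part of any library.

```python
def effective_batch_size(per_device: int, accumulation_steps: int, num_devices: int = 1) -> int:
    """Examples consumed per optimizer step."""
    return per_device * accumulation_steps * num_devices


def linear_lr(step: int, base_lr: float = 5e-05, warmup_steps: int = 100,
              total_steps: int = 12_060) -> float:
    """Linear warmup to base_lr, then linear decay to zero.

    total_steps is an assumed value estimated from the results table.
    """
    if step < warmup_steps:
        return base_lr * step / warmup_steps
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


print(effective_batch_size(2, 8))          # train_batch_size x gradient_accumulation_steps
for step in (0, 50, 100, 6_000, 12_060):
    print(step, linear_lr(step))
```

Under these assumptions the learning rate peaks at 5e-05 after 100 steps and then falls linearly, so the later checkpoints in the results table were trained with a much smaller step size than the early ones.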

### Training results

| Training Loss | Epoch | Step  | Validation Loss | Sacrebleu | Bleu   | Rouge1 | Rouge2 | Rougel | Sari    |
|:-------------:|:-----:|:-----:|:---------------:|:---------:|:------:|:------:|:------:|:------:|:-------:|
| 2.3838        | 0.12  | 100   | 4.1901          | 2.7687    | 0.0277 | 0.2195 | 0.0672 | 0.1600 | 36.8973 |
| 2.2981        | 0.25  | 200   | 4.0797          | 2.0475    | 0.0205 | 0.2190 | 0.0703 | 0.1660 | 37.7972 |
| 2.2176        | 0.37  | 300   | 4.1482          | 3.1045    | 0.0310 | 0.2389 | 0.0772 | 0.1771 | 37.4184 |
| 2.1516        | 0.5   | 400   | 4.0546          | 3.1815    | 0.0318 | 0.2417 | 0.0794 | 0.1802 | 37.7797 |
| 2.1023        | 0.62  | 500   | 4.0191          | 2.7271    | 0.0273 | 0.2312 | 0.0746 | 0.1750 | 37.7930 |
| 2.1247        | 0.75  | 600   | 3.9677          | 2.5983    | 0.0260 | 0.2356 | 0.0763 | 0.1727 | 37.9143 |
| 2.0458        | 0.87  | 700   | 4.0814          | 2.3494    | 0.0235 | 0.2220 | 0.0732 | 0.1677 | 37.3185 |
| 2.0463        | 0.99  | 800   | 4.0572          | 3.7528    | 0.0375 | 0.2566 | 0.0876 | 0.1882 | 38.3515 |
| 1.9574        | 1.12  | 900   | 3.9546          | 4.5851    | 0.0459 | 0.2640 | 0.0823 | 0.1870 | 38.1778 |
| 1.921         | 1.24  | 1000  | 4.0235          | 4.4996    | 0.0450 | 0.2615 | 0.0879 | 0.1917 | 38.3405 |
| 1.9052        | 1.37  | 1100  | 4.0168          | 4.9832    | 0.0498 | 0.2804 | 0.0956 | 0.2023 | 38.2440 |
| 1.9286        | 1.49  | 1200  | 4.0049          | 4.9955    | 0.0500 | 0.2635 | 0.0868 | 0.1835 | 38.2221 |
| 1.9191        | 1.62  | 1300  | 3.9732          | 4.1180    | 0.0412 | 0.2568 | 0.0789 | 0.1825 | 37.8479 |
| 1.85          | 1.74  | 1400  | 4.0051          | 4.5305    | 0.0453 | 0.2567 | 0.0835 | 0.1855 | 38.2152 |
| 1.8769        | 1.87  | 1500  | 3.9861          | 5.0763    | 0.0508 | 0.2611 | 0.0850 | 0.1880 | 37.9740 |
| 1.8972        | 1.99  | 1600  | 4.0281          | 4.8333    | 0.0483 | 0.2573 | 0.0878 | 0.1905 | 38.2159 |
| 1.7643        | 2.11  | 1700  | 4.0955          | 5.3967    | 0.0540 | 0.2648 | 0.0883 | 0.1859 | 37.8259 |
| 1.7762        | 2.24  | 1800  | 4.0478          | 4.9561    | 0.0496 | 0.2649 | 0.0857 | 0.1904 | 37.9307 |
| 1.783         | 2.36  | 1900  | 4.0079          | 5.3380    | 0.0534 | 0.2763 | 0.0893 | 0.1928 | 38.3892 |
| 1.7744        | 2.49  | 2000  | 4.0219          | 5.6769    | 0.0568 | 0.2768 | 0.0926 | 0.2031 | 38.0914 |
| 1.7641        | 2.61  | 2100  | 3.9933          | 5.1400    | 0.0514 | 0.2696 | 0.0839 | 0.1944 | 37.8093 |
| 1.7682        | 2.74  | 2200  | 4.0418          | 4.7739    | 0.0477 | 0.2656 | 0.0840 | 0.1905 | 37.9627 |
| 1.7778        | 2.86  | 2300  | 4.0027          | 5.4326    | 0.0543 | 0.2738 | 0.0921 | 0.1945 | 38.0805 |
| 1.7106        | 2.98  | 2400  | 4.0066          | 6.2237    | 0.0622 | 0.2798 | 0.1018 | 0.2028 | 38.8314 |
| 1.7087        | 3.11  | 2500  | 4.0495          | 6.2109    | 0.0621 | 0.2855 | 0.0963 | 0.2029 | 38.4817 |
| 1.7253        | 3.23  | 2600  | 4.0248          | 5.3354    | 0.0534 | 0.2873 | 0.0957 | 0.1982 | 38.7256 |
| 1.7143        | 3.36  | 2700  | 3.9905          | 5.6144    | 0.0561 | 0.2743 | 0.0935 | 0.1959 | 38.6462 |
| 1.7731        | 3.48  | 2800  | 3.9773          | 5.0439    | 0.0504 | 0.2743 | 0.0878 | 0.1946 | 38.8186 |
| 1.6946        | 3.61  | 2900  | 4.0200          | 5.5291    | 0.0553 | 0.2818 | 0.0928 | 0.1960 | 38.3806 |
| 1.7104        | 3.73  | 3000  | 4.0039          | 5.7966    | 0.0580 | 0.2797 | 0.0942 | 0.1942 | 38.4275 |
| 1.7429        | 3.85  | 3100  | 3.9536          | 5.4509    | 0.0545 | 0.2708 | 0.0906 | 0.1940 | 38.4027 |
| 1.6642        | 3.98  | 3200  | 3.9716          | 5.5049    | 0.0550 | 0.2725 | 0.0884 | 0.1934 | 38.5143 |
| 1.6227        | 4.1   | 3300  | 4.0434          | 5.6225    | 0.0562 | 0.2876 | 0.0952 | 0.2023 | 38.6488 |
| 1.6334        | 4.23  | 3400  | 4.0302          | 6.1075    | 0.0611 | 0.2823 | 0.0934 | 0.1984 | 38.3430 |
| 1.604         | 4.35  | 3500  | 4.0565          | 5.4071    | 0.0541 | 0.2762 | 0.0898 | 0.1928 | 37.9436 |
| 1.6126        | 4.48  | 3600  | 4.0730          | 5.4640    | 0.0546 | 0.2717 | 0.0879 | 0.1953 | 38.0136 |
| 1.6703        | 4.6   | 3700  | 4.0610          | 5.9317    | 0.0593 | 0.2841 | 0.0906 | 0.1987 | 38.3703 |
| 1.6476        | 4.72  | 3800  | 4.0361          | 5.7700    | 0.0577 | 0.2764 | 0.0857 | 0.1917 | 38.3045 |
| 1.6838        | 4.85  | 3900  | 4.0013          | 6.2475    | 0.0625 | 0.2899 | 0.0950 | 0.2031 | 38.7013 |
| 1.6498        | 4.97  | 4000  | 4.0097          | 5.8688    | 0.0587 | 0.2804 | 0.0897 | 0.1953 | 38.5862 |
| 1.6005        | 5.1   | 4100  | 4.0600          | 6.3918    | 0.0639 | 0.2958 | 0.0942 | 0.2028 | 38.6827 |
| 1.6064        | 5.22  | 4200  | 4.0780          | 6.8747    | 0.0687 | 0.2907 | 0.0956 | 0.2022 | 38.3931 |
| 1.5612        | 5.35  | 4300  | 4.0645          | 6.2556    | 0.0626 | 0.2792 | 0.0867 | 0.1950 | 38.2156 |
| 1.5775        | 5.47  | 4400  | 4.0382          | 6.4081    | 0.0641 | 0.2922 | 0.0980 | 0.2053 | 38.7928 |
| 1.619         | 5.6   | 4500  | 4.0033          | 6.0250    | 0.0603 | 0.2866 | 0.0884 | 0.1997 | 38.2987 |
| 1.6027        | 5.72  | 4600  | 4.0215          | 7.0061    | 0.0701 | 0.2816 | 0.0960 | 0.1973 | 38.5188 |
| 1.5837        | 5.84  | 4700  | 4.0735          | 6.6794    | 0.0668 | 0.2846 | 0.0953 | 0.1983 | 38.3129 |
| 1.5743        | 5.97  | 4800  | 4.0566          | 6.9267    | 0.0693 | 0.2791 | 0.0920 | 0.1944 | 38.5447 |
| 1.5427        | 6.09  | 4900  | 4.0553          | 6.5612    | 0.0656 | 0.2861 | 0.0946 | 0.2002 | 38.7175 |
| 1.554         | 6.22  | 5000  | 4.0995          | 7.5212    | 0.0752 | 0.2916 | 0.1022 | 0.2043 | 38.6034 |
| 1.5205        | 6.34  | 5100  | 4.0716          | 7.3604    | 0.0736 | 0.2975 | 0.1032 | 0.2077 | 38.6330 |
| 1.5357        | 6.47  | 5200  | 4.0734          | 7.0090    | 0.0701 | 0.2834 | 0.0918 | 0.1937 | 38.3315 |
| 1.5401        | 6.59  | 5300  | 4.0569          | 7.2066    | 0.0721 | 0.2984 | 0.1007 | 0.2089 | 38.7153 |
| 1.5533        | 6.71  | 5400  | 4.0381          | 8.2701    | 0.0827 | 0.2942 | 0.1012 | 0.2048 | 38.9153 |
| 1.5758        | 6.84  | 5500  | 4.0514          | 7.7094    | 0.0771 | 0.2909 | 0.0976 | 0.2032 | 38.7672 |
| 1.5517        | 6.96  | 5600  | 4.0227          | 7.1626    | 0.0716 | 0.2859 | 0.0946 | 0.2013 | 38.9612 |
| 1.583         | 7.09  | 5700  | 4.0696          | 7.3099    | 0.0731 | 0.3068 | 0.1040 | 0.2079 | 38.9724 |
| 1.5426        | 7.21  | 5800  | 4.0742          | 7.7215    | 0.0772 | 0.2912 | 0.0993 | 0.1982 | 38.6200 |
| 1.5312        | 7.34  | 5900  | 4.0981          | 7.4710    | 0.0747 | 0.2918 | 0.1007 | 0.2005 | 38.6598 |
| 1.5297        | 7.46  | 6000  | 4.0783          | 8.1777    | 0.0818 | 0.3014 | 0.1051 | 0.2091 | 39.1750 |
| 1.5507        | 7.58  | 6100  | 4.0805          | 8.7263    | 0.0873 | 0.3077 | 0.1062 | 0.2123 | 39.0997 |
| 1.5468        | 7.71  | 6200  | 4.0709          | 7.3451    | 0.0735 | 0.2881 | 0.0979 | 0.2034 | 38.5349 |
| 1.5329        | 7.83  | 6300  | 4.0625          | 8.1881    | 0.0819 | 0.2976 | 0.1023 | 0.2056 | 38.9322 |
| 1.5859        | 7.96  | 6400  | 4.0743          | 8.3942    | 0.0839 | 0.2952 | 0.1048 | 0.2118 | 39.0793 |
| 1.4119        | 8.08  | 6500  | 4.0952          | 7.5693    | 0.0757 | 0.3094 | 0.1097 | 0.2182 | 39.0750 |
| 1.4344        | 8.21  | 6600  | 4.1497          | 8.8624    | 0.0886 | 0.3005 | 0.1041 | 0.2103 | 38.9099 |
| 1.4668        | 8.33  | 6700  | 4.1204          | 7.9935    | 0.0799 | 0.2987 | 0.1012 | 0.2060 | 39.0226 |
| 1.4787        | 8.46  | 6800  | 4.1036          | 8.2780    | 0.0828 | 0.2978 | 0.1037 | 0.2081 | 38.8047 |
| 1.4639        | 8.58  | 6900  | 4.0993          | 7.8695    | 0.0787 | 0.2927 | 0.0983 | 0.2009 | 38.6767 |
| 1.4997        | 8.7   | 7000  | 4.0572          | 7.8299    | 0.0783 | 0.2897 | 0.0968 | 0.2026 | 38.6392 |
| 1.4656        | 8.83  | 7100  | 4.1112          | 7.5026    | 0.0750 | 0.3045 | 0.1052 | 0.2100 | 39.1639 |
| 1.4423        | 8.95  | 7200  | 4.1133          | 7.3459    | 0.0735 | 0.2999 | 0.1034 | 0.2076 | 38.9450 |
| 1.3401        | 9.08  | 7300  | 4.1719          | 7.9625    | 0.0796 | 0.2916 | 0.0989 | 0.2036 | 38.8932 |
| 1.3586        | 9.2   | 7400  | 4.1550          | 7.5577    | 0.0756 | 0.2964 | 0.1003 | 0.2079 | 39.0236 |
| 1.3459        | 9.33  | 7500  | 4.1359          | 7.2886    | 0.0729 | 0.2941 | 0.0948 | 0.2013 | 39.0120 |
| 1.3972        | 9.45  | 7600  | 4.1412          | 7.2976    | 0.0730 | 0.2821 | 0.0943 | 0.2019 | 38.9448 |
| 1.4024        | 9.57  | 7700  | 4.1360          | 6.9379    | 0.0694 | 0.2891 | 0.0925 | 0.1978 | 38.9563 |
| 1.3936        | 9.7   | 7800  | 4.1180          | 7.4721    | 0.0747 | 0.2932 | 0.0979 | 0.2033 | 39.0185 |
| 1.3813        | 9.82  | 7900  | 4.1485          | 7.9716    | 0.0797 | 0.2933 | 0.1026 | 0.2060 | 39.2937 |
| 1.3519        | 9.95  | 8000  | 4.1221          | 7.9693    | 0.0797 | 0.2973 | 0.1031 | 0.2090 | 39.4926 |
| 1.2558        | 10.07 | 8100  | 4.2222          | 6.8651    | 0.0687 | 0.2855 | 0.1000 | 0.2060 | 38.9237 |
| 1.2456        | 10.2  | 8200  | 4.1953          | 6.7560    | 0.0676 | 0.2788 | 0.0918 | 0.2002 | 38.6121 |
| 1.2781        | 10.32 | 8300  | 4.2009          | 6.8235    | 0.0682 | 0.2826 | 0.0967 | 0.2042 | 39.2030 |
| 1.27          | 10.44 | 8400  | 4.2159          | 7.2854    | 0.0729 | 0.2774 | 0.0929 | 0.1976 | 39.0060 |
| 1.3036        | 10.57 | 8500  | 4.2087          | 6.3116    | 0.0631 | 0.2827 | 0.0940 | 0.2010 | 39.1980 |
| 1.2934        | 10.69 | 8600  | 4.2011          | 7.4083    | 0.0741 | 0.2880 | 0.0951 | 0.2028 | 39.0879 |
| 1.2928        | 10.82 | 8700  | 4.1859          | 7.4265    | 0.0743 | 0.2830 | 0.0996 | 0.2030 | 38.8993 |
| 1.2935        | 10.94 | 8800  | 4.1976          | 8.2571    | 0.0826 | 0.2984 | 0.1071 | 0.2190 | 39.5344 |
| 1.1764        | 11.07 | 8900  | 4.2697          | 7.0769    | 0.0708 | 0.2776 | 0.0946 | 0.1968 | 39.0592 |
| 1.2216        | 11.19 | 9000  | 4.2470          | 6.8849    | 0.0688 | 0.2821 | 0.0938 | 0.2009 | 39.0743 |
| 1.2152        | 11.31 | 9100  | 4.2621          | 7.8078    | 0.0781 | 0.2912 | 0.0986 | 0.2051 | 39.2673 |
| 1.2263        | 11.44 | 9200  | 4.2377          | 8.0541    | 0.0805 | 0.2850 | 0.1039 | 0.2068 | 39.0468 |
| 1.1959        | 11.56 | 9300  | 4.2244          | 7.6790    | 0.0768 | 0.2886 | 0.0993 | 0.2064 | 39.0468 |
| 1.1951        | 11.69 | 9400  | 4.2357          | 7.4380    | 0.0744 | 0.2952 | 0.1020 | 0.2080 | 39.2009 |
| 1.2181        | 11.81 | 9500  | 4.2293          | 7.6378    | 0.0764 | 0.2929 | 0.1026 | 0.2079 | 39.2786 |
| 1.2182        | 11.94 | 9600  | 4.2261          | 7.3868    | 0.0739 | 0.2886 | 0.0999 | 0.2079 | 39.2408 |
| 1.1386        | 12.06 | 9700  | 4.2615          | 7.3600    | 0.0736 | 0.2842 | 0.0936 | 0.2011 | 38.6407 |
| 1.1219        | 12.19 | 9800  | 4.2410          | 8.2778    | 0.0828 | 0.2905 | 0.1010 | 0.2083 | 39.5071 |
| 1.1763        | 12.31 | 9900  | 4.2356          | 7.7087    | 0.0771 | 0.2894 | 0.1001 | 0.2038 | 39.0565 |
| 1.1723        | 12.43 | 10000 | 4.2308          | 7.1490    | 0.0715 | 0.2823 | 0.0939 | 0.2036 | 39.1788 |
| 1.1212        | 12.56 | 10100 | 4.2457          | 7.7867    | 0.0779 | 0.2901 | 0.1016 | 0.2031 | 39.4189 |
| 1.1285        | 12.68 | 10200 | 4.2474          | 7.6008    | 0.0760 | 0.2886 | 0.0973 | 0.2034 | 38.9518 |
| 1.14          | 12.81 | 10300 | 4.2269          | 7.3776    | 0.0738 | 0.2864 | 0.0940 | 0.1995 | 38.9967 |
| 1.1698        | 12.93 | 10400 | 4.2179          | 7.7488    | 0.0775 | 0.2934 | 0.0989 | 0.2049 | 39.2568 |
| 1.111         | 13.06 | 10500 | 4.2544          | 7.6406    | 0.0764 | 0.2979 | 0.1009 | 0.2075 | 39.1464 |
| 1.134         | 13.18 | 10600 | 4.2493          | 7.5843    | 0.0758 | 0.2914 | 0.0977 | 0.2030 | 38.7354 |
| 1.1309        | 13.3  | 10700 | 4.2578          | 7.7002    | 0.0770 | 0.2910 | 0.0979 | 0.2042 | 39.1543 |
| 1.1817        | 13.43 | 10800 | 4.2485          | 7.7934    | 0.0779 | 0.2950 | 0.0989 | 0.2071 | 38.9693 |
| 1.1296        | 13.55 | 10900 | 4.2536          | 7.3443    | 0.0734 | 0.2897 | 0.0947 | 0.2027 | 38.5840 |
| 1.1457        | 13.68 | 11000 | 4.2430          | 7.2824    | 0.0728 | 0.2844 | 0.0927 | 0.1989 | 38.5460 |
| 1.169         | 13.8  | 11100 | 4.2319          | 7.6855    | 0.0769 | 0.2926 | 0.0966 | 0.2021 | 38.8614 |
| 1.1712        | 13.93 | 11200 | 4.2432          | 7.5547    | 0.0755 | 0.2880 | 0.0958 | 0.2008 | 38.7928 |
| 1.1777        | 14.05 | 11300 | 4.2374          | 8.0068    | 0.0801 | 0.2920 | 0.0987 | 0.2064 | 39.2165 |
| 1.1784        | 14.17 | 11400 | 4.2686          | 8.0437    | 0.0804 | 0.2938 | 0.1005 | 0.2077 | 39.1538 |
| 1.1555        | 14.3  | 11500 | 4.2601          | 7.6743    | 0.0767 | 0.2867 | 0.0963 | 0.2004 | 39.1612 |
| 1.1849        | 14.42 | 11600 | 4.2531          | 7.3441    | 0.0734 | 0.2861 | 0.0924 | 0.1975 | 38.8140 |
| 1.2111        | 14.55 | 11700 | 4.2460          | 7.9645    | 0.0796 | 0.2888 | 0.0966 | 0.2035 | 38.9464 |
| 1.1611        | 14.67 | 11800 | 4.2580          | 8.1329    | 0.0813 | 0.2898 | 0.0979 | 0.2041 | 38.9930 |
| 1.1866        | 14.8  | 11900 | 4.2536          | 8.1866    | 0.0819 | 0.2936 | 0.0990 | 0.2050 | 39.1494 |
| 1.1876        | 14.92 | 12000 | 4.2584          | 8.2960    | 0.0830 | 0.2929 | 0.0997 | 0.2048 | 39.1931 |
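
Because validation loss and the generation metrics do not move together in this table, it can help to pick checkpoints per metric rather than relying on the last row. The following is a minimal sketch that parses a Markdown table in the format above and reports the best step for each criterion. Only five rows are copied here to keep the example short; the function name `parse_markdown_table` is illustrative.

```python
# A few rows copied from the training results table; paste the full table to analyse every checkpoint.
TABLE = """\
| Training Loss | Epoch | Step  | Validation Loss | Sacrebleu | Bleu   | Rouge1 | Rouge2 | Rougel | Sari    |
|:-------------:|:-----:|:-----:|:---------------:|:---------:|:------:|:------:|:------:|:------:|:-------:|
| 2.3838        | 0.12  | 100   | 4.1901          | 2.7687    | 0.0277 | 0.2195 | 0.0672 | 0.1600 | 36.8973 |
| 1.7429        | 3.85  | 3100  | 3.9536          | 5.4509    | 0.0545 | 0.2708 | 0.0906 | 0.1940 | 38.4027 |
| 1.4344        | 8.21  | 6600  | 4.1497          | 8.8624    | 0.0886 | 0.3005 | 0.1041 | 0.2103 | 38.9099 |
| 1.2935        | 10.94 | 8800  | 4.1976          | 8.2571    | 0.0826 | 0.2984 | 0.1071 | 0.2190 | 39.5344 |
| 1.1876        | 14.92 | 12000 | 4.2584          | 8.2960    | 0.0830 | 0.2929 | 0.0997 | 0.2048 | 39.1931 |
"""


def parse_markdown_table(text: str) -> list[dict[str, float]]:
    """Parse a pipe-delimited Markdown table of numbers into a list of row dicts."""
    lines = [ln.strip() for ln in text.strip().splitlines() if ln.strip()]
    header = [cell.strip() for cell in lines[0].strip("|").split("|")]
    rows = []
    for ln in lines[2:]:  # skip the header row and the alignment row
        cells = [cell.strip() for cell in ln.strip("|").split("|")]
        rows.append({name: float(value) for name, value in zip(header, cells)})
    return rows


rows = parse_markdown_table(TABLE)
best_loss = min(rows, key=lambda r: r["Validation Loss"])
best_sari = max(rows, key=lambda r: r["Sari"])
best_bleu = max(rows, key=lambda r: r["Sacrebleu"])

print("lowest validation loss at step", int(best_loss["Step"]))
print("highest SARI at step", int(best_sari["Step"]))
print("highest SacreBLEU at step", int(best_bleu["Step"]))
```

On these sample rows the lowest validation loss, the highest SARI, and the highest SacreBLEU fall on three different steps, none of which is the final checkpoint. Which criterion should drive checkpoint selection depends on the intended use, which the card does not specify.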


### Framework versions

- Transformers 4.29.2
- Pytorch 2.0.0+cu117
- Datasets 2.12.0
- Tokenizers 0.13.3
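
To compare a local environment against the versions listed above, a small check with the standard library may be enough. This is a sketch: it assumes the PyPI distribution names `transformers`, `torch`, `datasets`, and `tokenizers`, and the helper `check_versions` is illustrative. A mismatch does not necessarily mean the model will fail to load.

```python
from importlib import metadata

# Versions as listed in this card; "torch" is the distribution name for PyTorch.
EXPECTED = {
    "transformers": "4.29.2",
    "torch": "2.0.0+cu117",
    "datasets": "2.12.0",
    "tokenizers": "0.13.3",
}


def check_versions(expected: dict[str, str] = EXPECTED) -> dict[str, tuple]:
    """Map each package to (expected, installed or None, exact match)."""
    report = {}
    for package, wanted in expected.items():
        try:
            installed = metadata.version(package)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed in this environment
        report[package] = (wanted, installed, installed == wanted)
    return report


for package, (wanted, installed, ok) in check_versions().items():
    status = "ok" if ok else "differs"
    print(f"{package}: expected {wanted}, found {installed} ({status})")
```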