---
language:
  - de
tags:
  - question-generation
  - german
  - text2text-generation
  - generated_from_trainer
datasets:
  - lmqg/qg_dequad
metrics:
  - bleu4
  - f1
  - rouge
  - exact_match
model-index:
  - name: german-jeopardy-mt5-base-256
    results:
      - task:
          name: Sequence-to-sequence Language Modeling
          type: text2text-generation
        dataset:
          name: lmqg/qg_dequad
          type: default
          args: default
        metrics:
          - name: BLEU-4
            type: bleu4
            value: 13.70
          - name: F1
            type: f1
            value: 37.79
          - name: ROUGE-1
            type: rouge1
            value: 38.80
          - name: ROUGE-2
            type: rouge2
            value: 20.27
          - name: ROUGE-L
            type: rougel
            value: 37.34
          - name: ROUGE-Lsum
            type: rougelsum
            value: 37.32
          - name: Exact Match
            type: exact_match
            value: 2.81
---

# german-jeopardy-mt5-base-256

This model is a fine-tuned version of [google/mt5-base](https://huggingface.co/google/mt5-base) on the [lmqg/qg_dequad](https://huggingface.co/datasets/lmqg/qg_dequad) dataset.
It achieves the following results on the evaluation set:
- Loss: 1.51
- Brevity Penalty: 0.8658
- System Length: 18174
- Reference Length: 20793
- ROUGE-1: 38.80
- ROUGE-2: 20.27
- ROUGE-L: 37.34
- ROUGE-Lsum: 37.32
- Exact Match: 2.81
- BLEU: 13.70
- F1: 37.79
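
The brevity penalty above follows the standard BLEU definition: BP = exp(1 − r/c) when the candidate output (length c) is shorter than the reference (length r), and 1 otherwise. As a quick sanity check, the system and reference lengths reported above reproduce the reported value:

```python
import math

def brevity_penalty(sys_len: int, ref_len: int) -> float:
    """BLEU brevity penalty: exp(1 - r/c) when the candidate is shorter than the reference."""
    if sys_len >= ref_len:
        return 1.0
    return math.exp(1.0 - ref_len / sys_len)

# System and reference lengths from the evaluation results above.
bp = brevity_penalty(18174, 20793)
print(round(bp, 4))  # 0.8658, matching the reported brevity penalty
```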

## Model description

See [google/mt5-base](https://huggingface.co/google/mt5-base) for the model architecture.  
The model was fine-tuned on a single NVIDIA RTX 3090 GPU with 24 GB of VRAM.

## Intended uses & limitations

This model can be used for question generation on German text.
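
A minimal inference sketch using the `transformers` library. The lmqg-style `<hl>` answer highlighting and the `generate question:` prefix are assumptions based on common lmqg preprocessing; check the actual training script for the exact input format, and replace the hypothetical checkpoint path with the real model location:

```python
def build_input(context: str, answer: str) -> str:
    """lmqg-style input formatting (an assumption -- verify against the
    training preprocessing): highlight the answer span inside the context."""
    highlighted = context.replace(answer, f"<hl> {answer} <hl>", 1)
    return f"generate question: {highlighted}"

def generate_question(model_path: str, context: str, answer: str) -> str:
    # Imported lazily so build_input stays usable without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_path)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_path)
    inputs = tokenizer(build_input(context, answer), return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=32)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)

# Example call (path is hypothetical; point it at the downloaded checkpoint):
# generate_question("./german-jeopardy-mt5-base-256",
#                   "Die Hauptstadt von Deutschland ist Berlin.", "Berlin")
```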

## Training and evaluation data

See [lmqg/qg_dequad](https://huggingface.co/datasets/lmqg/qg_dequad).

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 0.0001
- train_batch_size: 4
- eval_batch_size: 4
- seed: 7
- gradient_accumulation_steps: 64
- total_train_batch_size: 256
- optimizer: Adafactor
- lr_scheduler_type: constant
- num_epochs: 20
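
The hyperparameters above can be collected into a config dict (a reconstruction for illustration; the actual training script is not part of this card). Note that the effective batch size of 256 is not set directly: it results from accumulating 64 micro-batches of size 4 before each optimizer step:

```python
# Hypothetical reconstruction of the training configuration above,
# using Hugging Face TrainingArguments-style key names.
training_config = {
    "learning_rate": 1e-4,
    "per_device_train_batch_size": 4,
    "per_device_eval_batch_size": 4,
    "seed": 7,
    "gradient_accumulation_steps": 64,
    "optim": "adafactor",
    "lr_scheduler_type": "constant",
    "num_train_epochs": 20,
}

# Effective batch size = micro-batch size x accumulation steps.
effective_batch = (
    training_config["per_device_train_batch_size"]
    * training_config["gradient_accumulation_steps"]
)
print(effective_batch)  # 256
```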

### Training results

| Training Loss | Epoch | Step | Validation Loss | Counts 1 | Counts 2 | Counts 3 | Counts 4 | Totals 1 | Totals 2 | Totals 3 | Totals 4 | Precisions 1 | Precisions 2 | Precisions 3 | Precisions 4 | Brevity Penalty | System Length | Reference Length | ROUGE-1 | ROUGE-2 | ROUGE-L | ROUGE-Lsum | Exact Match |  BLEU   | Mean Generated Length |   F1   |
|:-------------:|:-----:|:----:|:---------------:|:--------:|:--------:|:--------:|:--------:|:--------:|:--------:|:--------:|:--------:|:------------:|:------------:|:------------:|:------------:|:---------------:|:-------------:|:----------------:|:-------:|:-------:|:-------:|:----------:|:-----------:|:-------:|:---------------------:|:------:|
|    8.9608     | 0.99  |  36  |     2.8883      |   2306   |    50    |    12    |    2     |  17876   |  15672   |  13468   |  11264   |     12.9     |    0.319     |    0.0891    |    0.0178    |      0.828      |     17876     |      21250       | 0.0081  | 0.0022  | 0.0078  |   0.0078   |     0.0     | 0.2352  |        3.1969         | 0.0092 |
|    3.2364     | 1.98  |  72  |     1.9242      |   6125   |   1727   |   687    |   277    |  21152   |  18948   |  16744   |  14540   |   28.9571    |    9.1144    |    4.103     |    1.9051    |     0.9954      |     21152     |      21250       | 0.2457  | 0.1026  | 0.2345  |   0.2346   |   0.0018    | 6.7083  |        11.8072        | 0.2514 |
|    2.4963     |  3.0  | 109  |     1.6558      |   6903   |   2271   |   975    |   409    |  16537   |  14333   |  12129   |   9925   |   41.7428    |   15.8446    |    8.0386    |    4.1209    |      0.752      |     16537     |      21250       | 0.2966  | 0.1415  | 0.2854  |   0.2852   |    0.01     | 9.1493  |        12.176         | 0.2909 |
|    2.2314     | 3.98  | 145  |     1.5771      |   7160   |   2440   |   1098   |   501    |  16627   |  14423   |  12219   |  10015   |   43.0625    |   16.9174    |    8.986     |    5.0025    |     0.7573      |     16627     |      21250       |  0.314  | 0.1535  | 0.3028  |   0.3028   |   0.0136    | 10.187  |        12.157         | 0.3069 |
|    2.0578     | 4.97  | 181  |     1.5347      |   7447   |   2625   |   1214   |   566    |  17305   |  15101   |  12897   |  10693   |   43.0338    |    17.383    |    9.413     |    5.2932    |     0.7961      |     17305     |      21250       | 0.3286  | 0.1628  | 0.3146  |   0.3146   |   0.0163    | 11.0621 |        12.5585        |  0.32  |
|    1.8928     | 5.99  | 218  |     1.5128      |   7396   |   2659   |   1257   |   611    |  16598   |  14394   |  12190   |   9986   |   44.5596    |    18.473    |   10.3117    |    6.1186    |     0.7556      |     16598     |      21250       | 0.3326  | 0.1684  | 0.3198  |   0.3198   |   0.0177    | 11.4063 |        12.1692        | 0.3234 |
|    1.8573     | 6.98  | 254  |     1.4736      |   7531   |   2758   |   1313   |   641    |  16728   |  14524   |  12320   |  10116   |   45.0203    |   18.9893    |   10.6575    |    6.3365    |     0.7631      |     16728     |      21250       | 0.3349  | 0.1717  | 0.3216  |   0.3216   |   0.0163    | 11.8292 |        12.3035        | 0.327  |
|    1.7361     |  8.0  | 291  |     1.4544      |   7658   |   2849   |   1368   |   668    |  16928   |  14724   |  12520   |  10316   |   45.2387    |   19.3494    |   10.9265    |    6.4754    |     0.7747      |     16928     |      21250       | 0.3414  | 0.1762  | 0.3283  |   0.3284   |   0.0181    | 12.2208 |        12.4628        | 0.3334 |
|    1.7162     | 8.99  | 327  |     1.4459      |   7703   |   2891   |   1390   |   694    |  16795   |  14591   |  12387   |  10183   |   45.8648    |   19.8136    |   11.2214    |    6.8153    |      0.767      |     16795     |      21250       | 0.3454  | 0.1785  | 0.3325  |   0.3323   |   0.0159    | 12.4536 |        12.4174        | 0.3374 |
|    1.6589     | 9.98  | 363  |     1.4383      |   7889   |   2983   |   1449   |   719    |  17376   |  15172   |  12968   |  10764   |   45.4017    |   19.6612    |   11.1737    |    6.6797    |     0.8002      |     17376     |      21250       | 0.3519  | 0.1816  | 0.3375  |   0.3372   |   0.0172    | 12.8553 |        12.7101        | 0.3435 |
|    1.5571     | 10.99 | 400  |     1.4214      |   7889   |   2994   |   1457   |   736    |  17185   |  14981   |  12777   |  10573   |   45.9063    |   19.9853    |   11.4033    |    6.9611    |     0.7894      |     17185     |      21250       | 0.3529  | 0.1845  | 0.3392  |   0.3393   |    0.02     | 12.9671 |        12.6466        | 0.3457 |
|    1.5502     | 11.98 | 436  |     1.4135      |   7930   |   3008   |   1477   |   741    |  16868   |  14664   |  12460   |  10256   |   47.0121    |   20.5128    |   11.8539    |    7.225     |     0.7712      |     16868     |      21250       | 0.3619  |  0.189  | 0.3492  |   0.3491   |   0.0213    | 13.0741 |        12.4483        | 0.3541 |
|    1.4564     | 13.0  | 473  |     1.3943      |   8268   |   3200   |   1616   |   837    |  17929   |  15725   |  13521   |  11317   |   46.1152    |   20.3498    |   11.9518    |    7.396     |     0.8309      |     17929     |      21250       | 0.3729  | 0.1974  | 0.3578  |   0.3576   |   0.0218    | 14.1014 |        13.2441        | 0.3647 |
|    1.4522     | 13.99 | 509  |     1.3953      |   8047   |   3130   |   1564   |   811    |  16789   |  14585   |  12381   |  10177   |   47.9302    |   21.4604    |   12.6323    |    7.9689    |     0.7667      |     16789     |      21250       | 0.3712  |  0.197  | 0.3582  |   0.3581   |   0.0227    | 13.7526 |        12.515         | 0.3627 |
|     1.407     | 14.98 | 545  |     1.3759      |   8498   |   3358   |   1703   |   877    |  17923   |  15719   |  13515   |  11311   |   47.4139    |   21.3627    |   12.6008    |    7.7535    |     0.8306      |     17923     |      21250       | 0.3856  | 0.2063  | 0.3709  |   0.3706   |   0.0213    | 14.7315 |        13.2849        | 0.3772 |
|    1.3294     | 15.99 | 582  |     1.3776      |   8481   |   3407   |   1721   |   883    |  17451   |  15247   |  13043   |  10839   |   48.5989    |   22.3454    |   13.1948    |    8.1465    |     0.8044      |     17451     |      21250       | 0.3907  |  0.211  | 0.3766  |   0.3766   |    0.024    | 14.868  |        12.9142        | 0.3822 |
|    1.3294     | 16.98 | 618  |     1.3803      |   8633   |   3464   |   1767   |   923    |  18004   |  15800   |  13596   |  11392   |   47.9505    |   21.9241    |   12.9965    |    8.1022    |      0.835      |     18004     |      21250       | 0.3946  | 0.2133  | 0.3801  |   0.3798   |   0.0263    | 15.2312 |        13.3103        | 0.3868 |
|    1.2605     | 18.0  | 655  |     1.3710      |   8560   |   3376   |   1695   |   880    |  17830   |  15626   |  13422   |  11218   |    48.009    |    21.605    |   12.6285    |    7.8445    |     0.8255      |     17830     |      21250       | 0.3922  | 0.2092  | 0.3778  |   0.3775   |   0.0231    | 14.779  |        13.1665        | 0.3846 |
|    1.2667     | 18.99 | 691  |     1.3694      |   8664   |   3455   |   1733   |   882    |  17834   |  15630   |  13426   |  11222   |   48.5814    |   22.1049    |   12.9078    |    7.8596    |     0.8257      |     17834     |      21250       | 0.3987  | 0.2138  | 0.3853  |   0.3851   |   0.0227    | 15.0008 |        13.2232        | 0.3906 |
|    1.2074     | 19.79 | 720  |     1.3658      |   8770   |   3465   |   1737   |   880    |  18039   |  15835   |  13631   |  11427   |   48.6169    |   21.8819    |    12.743    |    7.7011    |     0.8369      |     18039     |      21250       | 0.4025  |  0.215  | 0.3883  |   0.3879   |   0.0227    | 15.0442 |        13.4424        | 0.3941 |


### Framework versions

- Transformers 4.32.1
- Pytorch 2.1.0
- Datasets 2.12.0
- Tokenizers 0.13.3