---
language:
  - de
tags:
  - question-generation
  - german
  - text2text-generation
  - generated_from_trainer
datasets:
  - lmqg/qg_dequad
metrics:
  - bleu4
  - f1
  - rouge
  - exact_match
model-index:
  - name: german-jeopardy-mt5-base-128
    results:
      - task:
          name: Sequence-to-sequence Language Modeling
          type: text2text-generation
        dataset:
          name: lmqg/qg_dequad
          type: default
          args: default
        metrics:
          - name: BLEU-4
            type: bleu4
            value: 14.62
          - name: F1
            type: f1
            value: 39.47
          - name: ROUGE-1
            type: rouge1
            value: 40.45
          - name: ROUGE-2
            type: rouge2
            value: 21.49
          - name: ROUGE-L
            type: rougel
            value: 39.02
          - name: ROUGE-Lsum
            type: rougelsum
            value: 39.01
          - name: Exact Match
            type: exact_match
            value: 2.68
---

# german-jeopardy-mt5-base-128

This model is a fine-tuned version of [google/mt5-base](https://huggingface.co/google/mt5-base) on the [lmqg/qg_dequad](https://huggingface.co/datasets/lmqg/qg_dequad) dataset.
It achieves the following results on the evaluation set:
- Loss: 1.56
- Brevity Penalty: 0.8709
- System Length: 18267
- Reference Length: 20793
- ROUGE-1: 40.45
- ROUGE-2: 21.49
- ROUGE-L: 39.02
- ROUGE-Lsum: 39.01
- Exact Match: 2.68
- BLEU: 14.62
- F1: 39.47
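The reported brevity penalty follows directly from the system and reference lengths above. A minimal check, using the standard BLEU definition (BP = exp(1 − ref_len / sys_len) when the system output is shorter than the reference):

```python
import math

# Lengths taken from the evaluation results above.
sys_len = 18267  # total length of generated output
ref_len = 20793  # total length of references

# BLEU brevity penalty: penalize outputs shorter than the reference.
bp = math.exp(1 - ref_len / sys_len) if sys_len < ref_len else 1.0
print(round(bp, 4))  # ≈ 0.8709
```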

## Model description

This model follows the [google/mt5-base](https://huggingface.co/google/mt5-base) architecture; see that model card for details.  
It was fine-tuned on a single NVIDIA RTX 3090 GPU with 24 GB of VRAM.

## Intended uses & limitations

This model can be used for question generation on German text: given a context passage, it generates a German question answerable from that passage. Note that the low exact-match score on the evaluation set (2.68) means generated questions rarely match the reference questions verbatim, so outputs should be reviewed before downstream use.
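A usage sketch with the `transformers` library. The answer-highlighting input format (`generate question:` prefix and `<hl>` markers) is an assumption based on common lmqg question-generation conventions, and the checkpoint path is a placeholder; check the training preprocessing and the actual Hub path before relying on them.

```python
def build_input(context: str, answer: str) -> str:
    # Assumed lmqg-style preprocessing: wrap the answer span in <hl> markers
    # so the model knows which span the question should target.
    highlighted = context.replace(answer, f"<hl> {answer} <hl>", 1)
    return f"generate question: {highlighted}"

def generate_question(model_id: str, context: str, answer: str) -> str:
    # Heavy imports kept local so build_input stays dependency-free.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id)
    inputs = tokenizer(build_input(context, answer), return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=64, num_beams=4)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)

# Example (downloads the checkpoint; replace with the actual Hub path):
# print(generate_question("german-jeopardy-mt5-base-128",
#                         "Goethe wurde 1749 in Frankfurt am Main geboren.",
#                         "1749"))
```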

## Training and evaluation data

See [lmqg/qg_dequad](https://huggingface.co/datasets/lmqg/qg_dequad).

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 0.0001
- train_batch_size: 4
- eval_batch_size: 4
- seed: 7
- gradient_accumulation_steps: 32
- total_train_batch_size: 128
- optimizer: Adafactor
- lr_scheduler_type: constant
- num_epochs: 20
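The effective batch size of 128 arises from gradient accumulation on the single GPU, as a quick sanity check shows:

```python
# Hyperparameters as listed above; single-GPU training (one RTX 3090).
per_device_train_batch_size = 4
gradient_accumulation_steps = 32
num_gpus = 1

# Gradients from 32 micro-batches of 4 are accumulated before each
# optimizer step, giving the listed total_train_batch_size.
total_train_batch_size = (
    per_device_train_batch_size * gradient_accumulation_steps * num_gpus
)
print(total_train_batch_size)  # 128
```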

### Training results

| Training Loss | Epoch | Step | Validation Loss | Counts 1 | Counts 2 | Counts 3 | Counts 4 | Totals 1 | Totals 2 | Totals 3 | Totals 4 | Precisions 1 | Precisions 2 | Precisions 3 | Precisions 4 | Brevity Penalty | System Length | Reference Length | ROUGE-1 | ROUGE-2 | ROUGE-L | ROUGE-Lsum | Exact Match |  BLEU   | Mean Generated Length |   F1   |
|:-------------:|:-----:|:----:|:---------------:|:--------:|:--------:|:--------:|:--------:|:--------:|:--------:|:--------:|:--------:|:------------:|:------------:|:------------:|:------------:|:---------------:|:-------------:|:----------------:|:-------:|:-------:|:-------:|:----------:|:-----------:|:-------:|:---------------------:|:------:|
|    6.6905     | 0.99  |  72  |     2.0972      |   5515   |   1394   |   522    |   191    |  28172   |  25968   |  23764   |  21560   |   19.5762    |    5.3681    |    2.1966    |    0.8859    |       1.0       |     28172     |      21250       | 0.1942  | 0.0761  | 0.1837  |   0.1841   |     0.0     | 3.7816  |        11.2786        | 0.2106 |
|    2.4978     | 1.99  | 145  |     1.6211      |   7079   |   2339   |   1027   |   446    |  16544   |  14340   |  12136   |   9932   |   42.7889    |    16.311    |    8.4624    |    4.4905    |     0.7524      |     16544     |      21250       | 0.3097  | 0.1455  | 0.2971  |   0.2969   |    0.01     | 9.6021  |        12.0159        | 0.3032 |
|    2.1021     |  3.0  | 218  |     1.5342      |   7507   |   2637   |   1222   |   575    |  17211   |  15007   |  12803   |  10599   |   43.6175    |   17.5718    |    9.5446    |    5.425     |     0.7908      |     17211     |      21250       | 0.3304  | 0.1642  | 0.3172  |   0.3171   |   0.0141    | 11.162  |        12.6375        | 0.3228 |
|    1.9208     |  4.0  | 291  |     1.4862      |   7599   |   2755   |   1296   |   620    |  16871   |  14667   |  12463   |  10259   |   45.0418    |   18.7837    |   10.3988    |    6.0435    |     0.7714      |     16871     |      21250       | 0.3377  | 0.1721  | 0.3232  |   0.3229   |    0.015    | 11.7136 |        12.3938        |  0.33  |
|    1.8135     | 4.99  | 363  |     1.4626      |   7831   |   2955   |   1424   |   694    |  17184   |  14980   |  12776   |  10572   |   45.5715    |   19.7263    |   11.1459    |    6.5645    |     0.7893      |     17184     |      21250       | 0.3497  | 0.1837  | 0.3358  |   0.3354   |   0.0177    | 12.6402 |        12.6366        | 0.3417 |
|    1.6907     | 5.99  | 436  |     1.4392      |   7872   |   3023   |   1482   |   740    |  16907   |  14703   |  12499   |  10295   |   46.5606    |   20.5604    |   11.8569    |    7.188     |     0.7735      |     16907     |      21250       | 0.3566  | 0.1896  | 0.3432  |   0.343    |   0.0177    | 13.0722 |        12.564         | 0.3483 |
|    1.6159     | 6.99  | 509  |     1.4288      |   7981   |   3128   |   1542   |   773    |  17016   |  14812   |  12608   |  10404   |   46.9029    |    21.118    |   12.2303    |    7.4298    |     0.7797      |     17016     |      21250       |  0.363  | 0.1952  | 0.3504  |   0.3502   |   0.0191    | 13.5053 |        12.5749        | 0.3543 |
|     1.556     |  8.0  | 582  |     1.4132      |   8014   |   3046   |   1496   |   748    |  17320   |  15116   |  12912   |  10708   |   46.2702    |   20.1508    |   11.5861    |    6.9854    |      0.797      |     17320     |      21250       | 0.3632  | 0.1903  | 0.3489  |   0.3491   |   0.0222    | 13.2095 |        12.7641        | 0.355  |
|    1.4951     |  9.0  | 655  |     1.3926      |   8342   |   3271   |   1622   |   819    |  17178   |  14974   |  12770   |  10566   |   48.5621    |   21.8445    |   12.7016    |    7.7513    |      0.789      |     17178     |      21250       | 0.3843  | 0.2059  | 0.3704  |   0.3704   |   0.0218    | 14.1831 |        12.7654        | 0.3769 |
|    1.4522     | 9.99  | 727  |     1.3769      |   8639   |   3449   |   1740   |   891    |  17708   |  15504   |  13300   |  11096   |   48.7859    |   22.2459    |   13.0827    |    8.0299    |     0.8187      |     17708     |      21250       | 0.3972  | 0.2129  | 0.3821  |   0.3823   |    0.024    | 15.0442 |        13.1016        | 0.3895 |
|    1.3663     | 10.99 | 800  |     1.3677      |   8736   |   3468   |   1747   |   924    |  17674   |  15470   |  13266   |  11062   |   49.4285    |   22.4176    |    13.169    |    8.3529    |     0.8168      |     17674     |      21250       | 0.4027  |  0.215  | 0.3871  |   0.387    |   0.0245    | 15.2622 |        13.0399        | 0.3946 |
|    1.3122     | 11.99 | 873  |     1.3521      |   8833   |   3533   |   1780   |   915    |  17927   |  15723   |  13519   |  11315   |    49.272    |   22.4703    |   13.1667    |    8.0866    |     0.8308      |     17927     |      21250       | 0.4055  |  0.219  | 0.3915  |   0.3915   |   0.0222    | 15.3943 |        13.3494        | 0.3975 |
|    1.2641     | 13.0  | 946  |     1.3494      |   9048   |   3668   |   1864   |   989    |  18242   |  16038   |  13834   |  11630   |   49.5998    |   22.8707    |    13.474    |    8.5039    |      0.848      |     18242     |      21250       | 0.4165  | 0.2265  | 0.4011  |   0.401    |   0.0268    | 16.1011 |        13.5508        | 0.408  |
|    1.2359     | 13.99 | 1018 |     1.3488      |   9075   |   3709   |   1907   |   1013   |  18098   |  15894   |  13690   |  11486   |   50.1437    |   23.3359    |   13.9299    |    8.8194    |     0.8402      |     18098     |      21250       | 0.4195  | 0.2298  | 0.4041  |   0.4038   |   0.0259    | 16.3595 |        13.5681        | 0.4113 |
|    1.1754     | 14.99 | 1091 |     1.3482      |   9182   |   3777   |   1957   |   1048   |  18366   |  16162   |  13958   |  11754   |   49.9946    |   23.3696    |   14.0206    |    8.9161    |     0.8547      |     18366     |      21250       | 0.4227  | 0.2314  |  0.406  |   0.4058   |   0.0268    | 16.7083 |        13.6534        | 0.4145 |
|    1.1367     | 15.99 | 1164 |     1.3501      |   9164   |   3761   |   1935   |   1033   |  18310   |  16106   |  13902   |  11698   |   50.0492    |   23.3515    |   13.9189    |    8.8306    |     0.8517      |     18310     |      21250       | 0.4225  | 0.2316  | 0.4078  |   0.4079   |   0.0245    | 16.5803 |        13.6152        | 0.4147 |
|     1.096     | 17.0  | 1237 |     1.3586      |   9126   |   3712   |   1922   |   1050   |  18277   |  16073   |  13869   |  11665   |   49.9316    |   23.0946    |   13.8582    |    9.0013    |     0.8499      |     18277     |      21250       | 0.4217  | 0.2304  | 0.4066  |   0.4066   |   0.0295    | 16.5513 |        13.6325        | 0.4141 |
|    1.0571     | 18.0  | 1310 |     1.3658      |   9087   |   3707   |   1923   |   1033   |  18179   |  15975   |  13771   |  11567   |   49.9862    |    23.205    |   13.9641    |    8.9306    |     0.8446      |     18179     |      21250       | 0.4196  | 0.2301  | 0.4049  |   0.4049   |    0.029    | 16.4708 |        13.5172        | 0.4116 |
|     1.036     | 18.99 | 1382 |     1.3672      |   9206   |   3806   |   1976   |   1059   |  18332   |  16128   |  13924   |  11720   |   50.2182    |   23.5987    |   14.1913    |    9.0358    |     0.8528      |     18332     |      21250       | 0.4254  | 0.2348  | 0.4106  |   0.4107   |   0.0309    | 16.8386 |        13.7205        | 0.4174 |
|    0.9785     | 19.79 | 1440 |     1.3819      |   9180   |   3796   |   1973   |   1059   |  18164   |  15960   |  13756   |  11552   |   50.5395    |   23.7845    |   14.3428    |    9.1672    |     0.8438      |     18164     |      21250       | 0.4254  | 0.2344  | 0.4116  |   0.4117   |   0.0327    | 16.8234 |        13.5113        | 0.4172 |


### Framework versions

- Transformers 4.32.1
- Pytorch 2.1.0
- Datasets 2.12.0
- Tokenizers 0.13.3