---
language:
  - de
tags:
  - question-generation
  - german
  - text2text-generation
  - generated_from_trainer
datasets:
  - lmqg/qg_dequad
metrics:
  - bleu4
  - f1
  - rouge
  - exact_match
model-index:
  - name: german-jeopardy-mt5-large-128
    results:
      - task:
          name: Sequence-to-sequence Language Modeling
          type: text2text-generation
        dataset:
          name: lmqg/qg_dequad
          type: default
          args: default
        metrics:
          - name: BLEU-4
            type: bleu4
            value: 16.06
          - name: F1
            type: f1
            value: 42.29
          - name: ROUGE-1
            type: rouge1
            value: 43.40
          - name: ROUGE-2
            type: rouge2
            value: 23.68
          - name: ROUGE-L
            type: rougel
            value: 41.78
          - name: ROUGE-Lsum
            type: rougelsum
            value: 41.79
          - name: Exact Match
            type: exact_match
            value: 3.18
---

# german-jeopardy-mt5-large-128

This model is a fine-tuned version of [google/mt5-large](https://huggingface.co/google/mt5-large) on the [lmqg/qg_dequad](https://huggingface.co/datasets/lmqg/qg_dequad) dataset.
It achieves the following results on the evaluation set:
- Loss: 1.5487
- Brevity Penalty: 0.9115
- System Length: 19029
- Reference Length: 20793
- ROUGE-1: 43.40
- ROUGE-2: 23.68
- ROUGE-L: 41.78
- ROUGE-Lsum: 41.79
- Exact Match: 3.18
- BLEU: 16.06
- F1: 42.29
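
The reported brevity penalty follows directly from the system and reference lengths above. As a quick sanity check, the standard BLEU brevity penalty formula reproduces the reported value:

```python
import math

# BLEU brevity penalty: BP = exp(1 - ref_len / sys_len) when the system
# output is shorter than the reference, and 1.0 otherwise.
def brevity_penalty(sys_len: int, ref_len: int) -> float:
    if sys_len >= ref_len:
        return 1.0
    return math.exp(1.0 - ref_len / sys_len)

# Lengths from the evaluation results above.
bp = brevity_penalty(19029, 20793)
print(f"{bp:.4f}")  # 0.9115, matching the reported Brevity Penalty
```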

## Model description

See [google/mt5-large](https://huggingface.co/google/mt5-large) for the model architecture.  
The model was fine-tuned on a single NVIDIA RTX 3090 GPU with 24 GB of VRAM.

## Intended uses & limitations

This model can be used for question generation on German text: given a German context passage, it generates a corresponding question. It has only been trained and evaluated on [lmqg/qg_dequad](https://huggingface.co/datasets/lmqg/qg_dequad), so quality on other domains or text genres may be lower.

## Training and evaluation data

See [lmqg/qg_dequad](https://huggingface.co/datasets/lmqg/qg_dequad).

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 0.0001
- train_batch_size: 1
- eval_batch_size: 1
- seed: 7
- gradient_accumulation_steps: 128
- total_train_batch_size: 128
- optimizer: Adafactor
- lr_scheduler_type: constant
- num_epochs: 20
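
The hyperparameters above can be expressed as a `Seq2SeqTrainingArguments` configuration from the Transformers library. This is a reconstruction, not the original training script; `output_dir` is a placeholder:

```python
from transformers import Seq2SeqTrainingArguments

# Reconstruction of the hyperparameters listed above (output_dir is a
# placeholder). A per-device batch size of 1 combined with 128 gradient
# accumulation steps yields the effective total train batch size of 128.
training_args = Seq2SeqTrainingArguments(
    output_dir="german-jeopardy-mt5-large-128",  # placeholder
    learning_rate=1e-4,
    per_device_train_batch_size=1,
    per_device_eval_batch_size=1,
    gradient_accumulation_steps=128,
    optim="adafactor",
    lr_scheduler_type="constant",
    num_train_epochs=20,
    seed=7,
)
```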

### Training results

| Training Loss | Epoch | Step | Validation Loss | Counts 1 | Counts 2 | Counts 3 | Counts 4 | Totals 1 | Totals 2 | Totals 3 | Totals 4 | Precisions 1 | Precisions 2 | Precisions 3 | Precisions 4 | Brevity Penalty | System Length | Reference Length | ROUGE-1 | ROUGE-2 | ROUGE-L | ROUGE-Lsum | Exact Match |  BLEU   | Mean Generated Length |   F1   |
|:-------------:|:-----:|:----:|:---------------:|:--------:|:--------:|:--------:|:--------:|:--------:|:--------:|:--------:|:--------:|:------------:|:------------:|:------------:|:------------:|:---------------:|:-------------:|:----------------:|:-------:|:-------:|:-------:|:----------:|:-----------:|:-------:|:---------------------:|:------:|
|    3.9659     | 0.99  |  72  |     1.4145      |   7244   |   2547   |   1183   |   565    |  16296   |  14092   |  11888   |   9684   |   44.4526    |   18.0741    |    9.9512    |    5.8344    |     0.7379      |     16296     |      21250       | 0.3213  | 0.1608  | 0.3091  |   0.309    |   0.0136    | 10.8438 |        11.7786        | 0.3139 |
|    1.7081     | 1.99  | 145  |     1.2632      |   7865   |   3037   |   1498   |   759    |  16841   |  14637   |  12433   |  10229   |   46.7015    |   20.7488    |   12.0486    |    7.4201    |     0.7697      |     16841     |      21250       | 0.3577  |  0.189  | 0.3438  |   0.3439   |   0.0181    | 13.2044 |        12.225         | 0.3481 |
|    1.4856     |  3.0  | 218  |     1.1974      |   8608   |   3519   |   1818   |   969    |  17627   |  15423   |  13219   |  11015   |   48.8342    |   22.8166    |   13.7529    |    8.7971    |     0.8142      |     17627     |      21250       | 0.3969  | 0.2181  |  0.381  |   0.3812   |   0.0268    | 15.6014 |        13.0027        | 0.3882 |
|    1.3277     |  4.0  | 291  |     1.1394      |   9018   |   3702   |   1907   |   1029   |  17465   |  15261   |  13057   |  10853   |   51.6347    |   24.2579    |   14.6052    |    9.4812    |     0.8052      |     17465     |      21250       |  0.424  | 0.2321  | 0.4087  |   0.4085   |   0.0313    | 16.4313 |        12.8716        | 0.4156 |
|    1.2314     | 4.99  | 363  |     1.1193      |   9240   |   3869   |   1994   |   1076   |  17794   |  15590   |  13386   |  11182   |   51.9276    |   24.8172    |   14.8962    |    9.6226    |     0.8235      |     17794     |      21250       | 0.4336  | 0.2413  | 0.4183  |   0.418    |   0.0363    | 17.0718 |        13.2137        | 0.4256 |
|    1.1264     | 5.99  | 436  |     1.1086      |   9263   |   3908   |   2055   |   1127   |  17502   |  15298   |  13094   |  10890   |   52.9254    |   25.5458    |   15.6942    |   10.3489    |     0.8072      |     17502     |      21250       | 0.4383  | 0.2452  | 0.4239  |   0.4237   |   0.0372    | 17.4744 |        13.034         | 0.4309 |
|    1.0469     |  7.0  | 509  |     1.1038      |   9434   |   4034   |   2146   |   1189   |  18028   |  15824   |  13620   |  11416   |   52.3297    |   25.4929    |   15.7562    |   10.4152    |     0.8363      |     18028     |      21250       | 0.4433  | 0.2505  | 0.4286  |   0.4282   |    0.039    | 18.0906 |        13.422         | 0.4348 |
|    0.9874     |  8.0  | 582  |     1.0990      |   9746   |   4265   |   2287   |   1285   |  18351   |  16147   |  13943   |  11739   |   53.1088    |   26.4136    |   16.4025    |   10.9464    |     0.8539      |     18351     |      21250       |  0.457  | 0.2627  | 0.4417  |   0.4416   |   0.0454    | 19.1287 |        13.6466        | 0.4498 |
|    0.9488     | 8.99  | 654  |     1.1175      |   9484   |   4062   |   2158   |   1197   |  17831   |  15627   |  13423   |  11219   |   53.1883    |   25.9935    |   16.0769    |   10.6694    |     0.8255      |     17831     |      21250       | 0.4482  | 0.2548  | 0.4338  |   0.4333   |   0.0431    | 18.2172 |        13.2763        | 0.4399 |
|    0.8893     | 9.99  | 727  |     1.1222      |   9650   |   4205   |   2289   |   1289   |  18017   |  15813   |  13609   |  11405   |   53.5605    |    26.592    |   16.8198    |   11.3021    |     0.8357      |     18017     |      21250       | 0.4543  |  0.262  | 0.4396  |   0.4394   |   0.0463    | 19.064  |        13.4251        | 0.4472 |
|    0.8362     | 10.99 | 800  |     1.1342      |   9706   |   4232   |   2279   |   1281   |  18232   |  16028   |  13824   |  11620   |   53.2361    |   26.4038    |   16.4858    |   11.0241    |     0.8474      |     18232     |      21250       | 0.4551  | 0.2632  | 0.4395  |   0.4393   |   0.0472    | 19.052  |        13.6021        | 0.4473 |
|    0.7835     | 12.0  | 873  |     1.1427      |   9802   |   4280   |   2292   |   1285   |  18491   |  16287   |  14083   |  11879   |   53.0096    |   26.2786    |   16.2749    |   10.8174    |     0.8614      |     18491     |      21250       |  0.458  | 0.2634  | 0.4414  |   0.4412   |   0.0472    | 19.169  |        14.0168        | 0.4497 |
|    0.7441     | 12.99 | 945  |     1.1669      |   9816   |   4323   |   2334   |   1294   |  18498   |  16294   |  14090   |  11886   |   53.0652    |   26.5312    |   16.5649    |   10.8868    |     0.8618      |     18498     |      21250       | 0.4577  | 0.2659  | 0.4418  |   0.4417   |   0.0463    | 19.3443 |        13.8348        | 0.4493 |
|    0.7012     | 13.99 | 1018 |     1.1740      |   9856   |   4364   |   2375   |   1360   |  18537   |  16333   |  14129   |  11925   |   53.1693    |   26.7189    |   16.8094    |   11.4046    |     0.8639      |     18537     |      21250       | 0.4591  | 0.2653  |  0.443  |   0.4428   |   0.0476    | 19.7341 |        13.976         | 0.4514 |
|    0.6597     | 14.99 | 1091 |     1.1987      |   9780   |   4292   |   2336   |   1302   |  18468   |  16264   |  14060   |  11856   |   52.9565    |   26.3896    |   16.6145    |   10.9818    |     0.8602      |     18468     |      21250       |  0.457  | 0.2633  | 0.4418  |   0.4416   |   0.0485    | 19.3289 |        13.8802        | 0.4492 |
|    0.6236     | 16.0  | 1164 |     1.2135      |   9931   |   4388   |   2390   |   1359   |  18717   |  16513   |  14309   |  12105   |   53.0587    |    26.573    |   16.7028    |   11.2268    |     0.8734      |     18717     |      21250       | 0.4618  | 0.2682  | 0.4452  |   0.445    |   0.0495    | 19.8055 |        14.044         | 0.4538 |
|    0.5933     | 17.0  | 1237 |     1.2305      |   9806   |   4316   |   2366   |   1348   |  18566   |  16362   |  14158   |  11954   |    52.817    |   26.3782    |   16.7114    |   11.2766    |     0.8654      |     18566     |      21250       | 0.4571  | 0.2628  | 0.4407  |   0.4409   |    0.049    | 19.5893 |        14.0622        | 0.4485 |
|    0.5622     | 17.99 | 1309 |     1.2796      |   9787   |   4306   |   2346   |   1338   |  18559   |  16355   |  14151   |  11947   |   52.7345    |   26.3283    |   16.5783    |   11.1995    |      0.865      |     18559     |      21250       | 0.4549  | 0.2609  | 0.4383  |   0.4382   |   0.0476    | 19.4914 |        13.7763        | 0.447  |
|    0.5275     | 18.99 | 1382 |     1.2833      |   9918   |   4363   |   2374   |   1355   |  18950   |  16746   |  14542   |  12338   |   52.3377    |    26.054    |   16.3251    |   10.9823    |     0.8857      |     18950     |      21250       | 0.4573  | 0.2624  |  0.441  |   0.4408   |   0.0508    | 19.6947 |        14.1647        | 0.4499 |
|    0.4986     | 19.79 | 1440 |     1.3059      |   9879   |   4315   |   2347   |   1324   |  18931   |  16727   |  14523   |  12319   |   52.1842    |   25.7966    |   16.1606    |   10.7476    |     0.8847      |     18931     |      21250       | 0.4564  | 0.2622  | 0.4407  |   0.4403   |   0.0495    | 19.4544 |        14.2827        | 0.4478 |


### Framework versions

- Transformers 4.32.1
- Pytorch 2.1.0
- Datasets 2.12.0
- Tokenizers 0.13.3