---
language:
- de
tags:
- question-generation
- german
- text2text-generation
- generated_from_trainer
datasets:
- lmqg/qg_dequad
metrics:
- bleu4
- f1
- rouge
- exact_match
model-index:
- name: german-jeopardy-mt5-large-128
  results:
  - task:
      name: Sequence-to-sequence Language Modeling
      type: text2text-generation
    dataset:
      name: lmqg/qg_dequad
      type: default
      args: default
    metrics:
    - name: BLEU-4
      type: bleu4
      value: 16.06
    - name: F1
      type: f1
      value: 42.29
    - name: ROUGE-1
      type: rouge1
      value: 43.40
    - name: ROUGE-2
      type: rouge2
      value: 23.68
    - name: ROUGE-L
      type: rougel
      value: 41.78
    - name: ROUGE-Lsum
      type: rougelsum
      value: 41.79
    - name: Exact Match
      type: exact_match
      value: 3.18
---
# german-jeopardy-mt5-large-128
This model is a fine-tuned version of [google/mt5-large](https://huggingface.co/google/mt5-large) on the [lmqg/qg_dequad](https://huggingface.co/datasets/lmqg/qg_dequad) dataset.
It achieves the following results on the evaluation set:
- Loss: 1.5487
- Brevity Penalty: 0.9115
- System Length: 19029
- Reference Length: 20793
- ROUGE-1: 43.40
- ROUGE-2: 23.68
- ROUGE-L: 41.78
- ROUGE-Lsum: 41.79
- Exact Match: 3.18
- BLEU: 16.06
- F1: 42.29
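
The brevity penalty listed above can be cross-checked against the two lengths reported next to it. The following is a minimal sketch, assuming the standard BLEU definition (a penalty of `exp(1 - r/c)` when the system output length `c` is shorter than the reference length `r`); the function name is illustrative and not part of the training code.

```python
import math


def brevity_penalty(system_length: int, reference_length: int) -> float:
    """Standard BLEU brevity penalty; assumes system_length > 0."""
    if system_length >= reference_length:
        return 1.0
    return math.exp(1.0 - reference_length / system_length)


# Lengths copied from the results list above.
bp = brevity_penalty(system_length=19029, reference_length=20793)
print(round(bp, 4))  # 0.9115
```

A value below 1 indicates that the generated questions are, in total, shorter than the references.
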
## Model description
See [google/mt5-large](https://huggingface.co/google/mt5-large) for the model architecture.
The model was trained on a single NVIDIA RTX 3090 GPU with 24 GB of VRAM.
## Intended uses & limitations
This model can be used for question generation on German text.
## Training and evaluation data
See [lmqg/qg_dequad](https://huggingface.co/datasets/lmqg/qg_dequad).
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.0001
- train_batch_size: 1
- eval_batch_size: 1
- seed: 7
- gradient_accumulation_steps: 128
- total_train_batch_size: 128
- optimizer: Adafactor
- lr_scheduler_type: constant
- num_epochs: 20
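
The relationship between the per-device batch size, gradient accumulation, and the total batch size can be sketched in a few lines. This is plain arithmetic under the assumption of a single device (as stated in the model description); the per-epoch figure at the end is a rough estimate inferred from the last logged row of the training results table, not a reported dataset size.

```python
# Values copied from the hyperparameter list above.
train_batch_size = 1
gradient_accumulation_steps = 128
num_devices = 1  # assumption: single GPU, per the model description

# Number of examples contributing to each optimizer step.
total_train_batch_size = train_batch_size * gradient_accumulation_steps * num_devices
print(total_train_batch_size)  # 128

# Rough estimate only: the last logged row pairs optimizer step 1440 with epoch 19.79.
steps_per_epoch = 1440 / 19.79
approx_examples_per_epoch = steps_per_epoch * total_train_batch_size
print(round(steps_per_epoch, 1), round(approx_examples_per_epoch))
```

With a per-device batch size of 1, gradient accumulation is what brings the total batch size to 128 on a single 24 GB GPU.
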
### Training results
In the table below, ROUGE, Exact Match, and F1 are fractions in [0, 1], while the n-gram precisions and BLEU are on a 0 to 100 scale. The summary at the top of this card reports all of these metrics on a 0 to 100 scale.

| Training Loss | Epoch | Step | Validation Loss | Counts 1 | Counts 2 | Counts 3 | Counts 4 | Totals 1 | Totals 2 | Totals 3 | Totals 4 | Precisions 1 | Precisions 2 | Precisions 3 | Precisions 4 | Brevity Penalty | System Length | Reference Length | ROUGE-1 | ROUGE-2 | ROUGE-L | ROUGE-Lsum | Exact Match | BLEU | Mean Generated Length | F1 |
|:-------------:|:-----:|:----:|:---------------:|:--------:|:--------:|:--------:|:--------:|:--------:|:--------:|:--------:|:--------:|:------------:|:------------:|:------------:|:------------:|:---------------:|:-------------:|:----------------:|:-------:|:-------:|:-------:|:----------:|:-----------:|:-------:|:---------------------:|:------:|
| 3.9659 | 0.99 | 72 | 1.4145 | 7244 | 2547 | 1183 | 565 | 16296 | 14092 | 11888 | 9684 | 44.4526 | 18.0741 | 9.9512 | 5.8344 | 0.7379 | 16296 | 21250 | 0.3213 | 0.1608 | 0.3091 | 0.309 | 0.0136 | 10.8438 | 11.7786 | 0.3139 |
| 1.7081 | 1.99 | 145 | 1.2632 | 7865 | 3037 | 1498 | 759 | 16841 | 14637 | 12433 | 10229 | 46.7015 | 20.7488 | 12.0486 | 7.4201 | 0.7697 | 16841 | 21250 | 0.3577 | 0.189 | 0.3438 | 0.3439 | 0.0181 | 13.2044 | 12.225 | 0.3481 |
| 1.4856 | 3.0 | 218 | 1.1974 | 8608 | 3519 | 1818 | 969 | 17627 | 15423 | 13219 | 11015 | 48.8342 | 22.8166 | 13.7529 | 8.7971 | 0.8142 | 17627 | 21250 | 0.3969 | 0.2181 | 0.381 | 0.3812 | 0.0268 | 15.6014 | 13.0027 | 0.3882 |
| 1.3277 | 4.0 | 291 | 1.1394 | 9018 | 3702 | 1907 | 1029 | 17465 | 15261 | 13057 | 10853 | 51.6347 | 24.2579 | 14.6052 | 9.4812 | 0.8052 | 17465 | 21250 | 0.424 | 0.2321 | 0.4087 | 0.4085 | 0.0313 | 16.4313 | 12.8716 | 0.4156 |
| 1.2314 | 4.99 | 363 | 1.1193 | 9240 | 3869 | 1994 | 1076 | 17794 | 15590 | 13386 | 11182 | 51.9276 | 24.8172 | 14.8962 | 9.6226 | 0.8235 | 17794 | 21250 | 0.4336 | 0.2413 | 0.4183 | 0.418 | 0.0363 | 17.0718 | 13.2137 | 0.4256 |
| 1.1264 | 5.99 | 436 | 1.1086 | 9263 | 3908 | 2055 | 1127 | 17502 | 15298 | 13094 | 10890 | 52.9254 | 25.5458 | 15.6942 | 10.3489 | 0.8072 | 17502 | 21250 | 0.4383 | 0.2452 | 0.4239 | 0.4237 | 0.0372 | 17.4744 | 13.034 | 0.4309 |
| 1.0469 | 7.0 | 509 | 1.1038 | 9434 | 4034 | 2146 | 1189 | 18028 | 15824 | 13620 | 11416 | 52.3297 | 25.4929 | 15.7562 | 10.4152 | 0.8363 | 18028 | 21250 | 0.4433 | 0.2505 | 0.4286 | 0.4282 | 0.039 | 18.0906 | 13.422 | 0.4348 |
| 0.9874 | 8.0 | 582 | 1.0990 | 9746 | 4265 | 2287 | 1285 | 18351 | 16147 | 13943 | 11739 | 53.1088 | 26.4136 | 16.4025 | 10.9464 | 0.8539 | 18351 | 21250 | 0.457 | 0.2627 | 0.4417 | 0.4416 | 0.0454 | 19.1287 | 13.6466 | 0.4498 |
| 0.9488 | 8.99 | 654 | 1.1175 | 9484 | 4062 | 2158 | 1197 | 17831 | 15627 | 13423 | 11219 | 53.1883 | 25.9935 | 16.0769 | 10.6694 | 0.8255 | 17831 | 21250 | 0.4482 | 0.2548 | 0.4338 | 0.4333 | 0.0431 | 18.2172 | 13.2763 | 0.4399 |
| 0.8893 | 9.99 | 727 | 1.1222 | 9650 | 4205 | 2289 | 1289 | 18017 | 15813 | 13609 | 11405 | 53.5605 | 26.592 | 16.8198 | 11.3021 | 0.8357 | 18017 | 21250 | 0.4543 | 0.262 | 0.4396 | 0.4394 | 0.0463 | 19.064 | 13.4251 | 0.4472 |
| 0.8362 | 10.99 | 800 | 1.1342 | 9706 | 4232 | 2279 | 1281 | 18232 | 16028 | 13824 | 11620 | 53.2361 | 26.4038 | 16.4858 | 11.0241 | 0.8474 | 18232 | 21250 | 0.4551 | 0.2632 | 0.4395 | 0.4393 | 0.0472 | 19.052 | 13.6021 | 0.4473 |
| 0.7835 | 12.0 | 873 | 1.1427 | 9802 | 4280 | 2292 | 1285 | 18491 | 16287 | 14083 | 11879 | 53.0096 | 26.2786 | 16.2749 | 10.8174 | 0.8614 | 18491 | 21250 | 0.458 | 0.2634 | 0.4414 | 0.4412 | 0.0472 | 19.169 | 14.0168 | 0.4497 |
| 0.7441 | 12.99 | 945 | 1.1669 | 9816 | 4323 | 2334 | 1294 | 18498 | 16294 | 14090 | 11886 | 53.0652 | 26.5312 | 16.5649 | 10.8868 | 0.8618 | 18498 | 21250 | 0.4577 | 0.2659 | 0.4418 | 0.4417 | 0.0463 | 19.3443 | 13.8348 | 0.4493 |
| 0.7012 | 13.99 | 1018 | 1.1740 | 9856 | 4364 | 2375 | 1360 | 18537 | 16333 | 14129 | 11925 | 53.1693 | 26.7189 | 16.8094 | 11.4046 | 0.8639 | 18537 | 21250 | 0.4591 | 0.2653 | 0.443 | 0.4428 | 0.0476 | 19.7341 | 13.976 | 0.4514 |
| 0.6597 | 14.99 | 1091 | 1.1987 | 9780 | 4292 | 2336 | 1302 | 18468 | 16264 | 14060 | 11856 | 52.9565 | 26.3896 | 16.6145 | 10.9818 | 0.8602 | 18468 | 21250 | 0.457 | 0.2633 | 0.4418 | 0.4416 | 0.0485 | 19.3289 | 13.8802 | 0.4492 |
| 0.6236 | 16.0 | 1164 | 1.2135 | 9931 | 4388 | 2390 | 1359 | 18717 | 16513 | 14309 | 12105 | 53.0587 | 26.573 | 16.7028 | 11.2268 | 0.8734 | 18717 | 21250 | 0.4618 | 0.2682 | 0.4452 | 0.445 | 0.0495 | 19.8055 | 14.044 | 0.4538 |
| 0.5933 | 17.0 | 1237 | 1.2305 | 9806 | 4316 | 2366 | 1348 | 18566 | 16362 | 14158 | 11954 | 52.817 | 26.3782 | 16.7114 | 11.2766 | 0.8654 | 18566 | 21250 | 0.4571 | 0.2628 | 0.4407 | 0.4409 | 0.049 | 19.5893 | 14.0622 | 0.4485 |
| 0.5622 | 17.99 | 1309 | 1.2796 | 9787 | 4306 | 2346 | 1338 | 18559 | 16355 | 14151 | 11947 | 52.7345 | 26.3283 | 16.5783 | 11.1995 | 0.865 | 18559 | 21250 | 0.4549 | 0.2609 | 0.4383 | 0.4382 | 0.0476 | 19.4914 | 13.7763 | 0.447 |
| 0.5275 | 18.99 | 1382 | 1.2833 | 9918 | 4363 | 2374 | 1355 | 18950 | 16746 | 14542 | 12338 | 52.3377 | 26.054 | 16.3251 | 10.9823 | 0.8857 | 18950 | 21250 | 0.4573 | 0.2624 | 0.441 | 0.4408 | 0.0508 | 19.6947 | 14.1647 | 0.4499 |
| 0.4986 | 19.79 | 1440 | 1.3059 | 9879 | 4315 | 2347 | 1324 | 18931 | 16727 | 14523 | 12319 | 52.1842 | 25.7966 | 16.1606 | 10.7476 | 0.8847 | 18931 | 21250 | 0.4564 | 0.2622 | 0.4407 | 0.4403 | 0.0495 | 19.4544 | 14.2827 | 0.4478 |
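
The BLEU column can be reconstructed from the Precisions and Brevity Penalty columns. The sketch below assumes the usual corpus-level BLEU formula (brevity penalty times the geometric mean of the 1- to 4-gram precisions); the helper name is illustrative, and the inputs are the rounded values from the final row of the table, so the result only matches the logged BLEU up to rounding.

```python
import math


def bleu_from_columns(precisions, brevity_penalty):
    """Combine n-gram precisions (0-100 scale) and a brevity penalty into BLEU (0-100 scale)."""
    if min(precisions) <= 0:
        return 0.0
    # Geometric mean computed in log space.
    log_mean = sum(math.log(p) for p in precisions) / len(precisions)
    return brevity_penalty * math.exp(log_mean)


# Final row of the table (step 1440).
bleu = bleu_from_columns(
    precisions=[52.1842, 25.7966, 16.1606, 10.7476],
    brevity_penalty=0.8847,
)
print(round(bleu, 2))
```

The printed value should land close to the 19.4544 logged for that row.
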
### Framework versions
- Transformers 4.32.1
- PyTorch 2.1.0
- Datasets 2.12.0
- Tokenizers 0.13.3