---
language:
- de
tags:
- question-generation
- german
- text2text-generation
- generated_from_trainer
datasets:
- lmqg/qg_dequad
metrics:
- bleu4
- f1
- rouge
- exact_match
model-index:
- name: german-jeopardy-longt5-base-128
  results:
  - task:
      name: Sequence-to-sequence Language Modeling
      type: text2text-generation
    dataset:
      name: lmqg/qg_dequad
      type: default
      args: default
    metrics:
    - name: BLEU-4
      type: bleu4
      value: 10.73
    - name: F1
      type: f1
      value: 34.55
    - name: ROUGE-1
      type: rouge1
      value: 35.34
    - name: ROUGE-2
      type: rouge2
      value: 16.82
    - name: ROUGE-L
      type: rougel
      value: 34.13
    - name: ROUGE-Lsum
      type: rougelsum
      value: 34.14
    - name: Exact Match
      type: exact_match
      value: 1.41
---
# german-jeopardy-longt5-base-128
This model is a fine-tuned version of [google/long-t5-tglobal-base](https://huggingface.co/google/long-t5-tglobal-base) on the [lmqg/qg_dequad](https://huggingface.co/datasets/lmqg/qg_dequad) dataset.
It achieves the following results on the evaluation set:
- Loss: 1.8010
- Brevity Penalty: 0.8577
- System Length: 18026
- Reference Length: 20793
- ROUGE-1: 35.34
- ROUGE-2: 16.82
- ROUGE-L: 34.13
- ROUGE-Lsum: 34.14
- Exact Match: 1.41
- BLEU: 10.73
- F1: 34.55
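
The brevity penalty reported above can be reproduced from the two length figures. The snippet below is a minimal sketch that assumes the standard BLEU definition (penalty of `exp(1 - reference_length / system_length)` when the system output is shorter than the reference); the card does not state which BLEU implementation was used, and the function name is illustrative.

```python
import math


def brevity_penalty(system_length: int, reference_length: int) -> float:
    """Standard BLEU brevity penalty (assumed definition, not confirmed by the card)."""
    if system_length >= reference_length:
        return 1.0
    return math.exp(1.0 - reference_length / system_length)


# Lengths taken from the evaluation results listed above.
bp = brevity_penalty(system_length=18026, reference_length=20793)
print(f"Brevity penalty: {bp:.4f}")
```

A value below 1 means the generated questions are, in total, shorter than the references, which lowers BLEU relative to the raw n-gram precisions.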
## Model description
See [google/long-t5-tglobal-base](https://huggingface.co/google/long-t5-tglobal-base) for more information about the
model architecture.
The model was trained on a single NVIDIA RTX 3090 GPU with 24 GB of VRAM.
## Intended uses & limitations
This model can be used for question generation on German text.
## Training and evaluation data
See [lmqg/qg_dequad](https://huggingface.co/datasets/lmqg/qg_dequad).
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.0001
- train_batch_size: 8
- eval_batch_size: 4
- seed: 7
- gradient_accumulation_steps: 16
- total_train_batch_size: 128
- optimizer: Adafactor
- lr_scheduler_type: constant
- num_epochs: 20
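
For reference, `total_train_batch_size` follows from the per-device batch size, the gradient accumulation steps, and the number of devices. The sketch below writes the listed values as a plain dictionary keyed by the argument names of `Seq2SeqTrainingArguments` in Transformers; this mapping is an assumption, since the card does not show the original training script, and the optimizer in particular may have been configured differently.

```python
# Hyperparameters from the list above, keyed by Transformers argument names.
# The mapping to these names is an assumption, not the author's script.
training_kwargs = {
    "learning_rate": 1e-4,
    "per_device_train_batch_size": 8,
    "per_device_eval_batch_size": 4,
    "seed": 7,
    "gradient_accumulation_steps": 16,
    "optim": "adafactor",
    "lr_scheduler_type": "constant",
    "num_train_epochs": 20,
}

NUM_DEVICES = 1  # the model description mentions a single GPU

# Effective batch size per optimizer step.
effective_batch_size = (
    training_kwargs["per_device_train_batch_size"]
    * training_kwargs["gradient_accumulation_steps"]
    * NUM_DEVICES
)
print(f"Effective train batch size: {effective_batch_size}")
```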
### Training results
| Training Loss | Epoch | Step | Validation Loss | Counts 1 | Counts 2 | Counts 3 | Counts 4 | Totals 1 | Totals 2 | Totals 3 | Totals 4 | Precisions 1 | Precisions 2 | Precisions 3 | Precisions 4 | Brevity Penalty | System Length | Reference Length | ROUGE-1 | ROUGE-2 | ROUGE-L | ROUGE-Lsum | Exact Match | BLEU | Mean Generated Length | F1 |
|:-------------:|:-----:|:----:|:---------------:|:--------:|:--------:|:--------:|:--------:|:--------:|:--------:|:--------:|:--------:|:------------:|:------------:|:------------:|:------------:|:---------------:|:-------------:|:----------------:|:-------:|:-------:|:-------:|:----------:|:-----------:|:-------:|:---------------------:|:------:|
| 3.458 | 0.99 | 72 | 2.3696 | 5618 | 1383 | 463 | 116 | 15080 | 12876 | 10672 | 8468 | 37.2546 | 10.7409 | 4.3385 | 1.3699 | 0.6642 | 15080 | 21250 | 0.2266 | 0.0841 | 0.2197 | 0.2196 | 0.0005 | 4.6384 | 11.3013 | 0.2226 |
| 2.7548 | 1.99 | 145 | 2.1310 | 6361 | 1807 | 700 | 254 | 16130 | 13926 | 11722 | 9518 | 39.4358 | 12.9757 | 5.9717 | 2.6686 | 0.728 | 16130 | 21250 | 0.2706 | 0.1122 | 0.2596 | 0.2596 | 0.0036 | 6.9183 | 12.206 | 0.2635 |
| 2.5084 | 2.99 | 218 | 2.0244 | 6758 | 2001 | 780 | 285 | 16871 | 14667 | 12463 | 10259 | 40.0569 | 13.6429 | 6.2585 | 2.778 | 0.7714 | 16871 | 21250 | 0.2888 | 0.1258 | 0.2766 | 0.2767 | 0.0045 | 7.616 | 12.8825 | 0.2832 |
| 2.3562 | 4.0 | 291 | 1.9501 | 7011 | 2193 | 908 | 360 | 16796 | 14592 | 12388 | 10184 | 41.7421 | 15.0288 | 7.3297 | 3.535 | 0.7671 | 16796 | 21250 | 0.303 | 0.1375 | 0.2892 | 0.2894 | 0.0077 | 8.6611 | 12.9142 | 0.2978 |
| 2.2383 | 5.0 | 364 | 1.8874 | 7245 | 2386 | 1015 | 435 | 16708 | 14504 | 12300 | 10096 | 43.3625 | 16.4506 | 8.252 | 4.3086 | 0.762 | 16708 | 21250 | 0.3198 | 0.1498 | 0.3077 | 0.3079 | 0.0113 | 9.6159 | 12.8417 | 0.3155 |
| 2.1576 | 5.99 | 436 | 1.8593 | 7378 | 2382 | 997 | 429 | 17014 | 14810 | 12606 | 10402 | 43.3643 | 16.0837 | 7.9089 | 4.1242 | 0.7796 | 17014 | 21250 | 0.326 | 0.1497 | 0.3132 | 0.3132 | 0.0109 | 9.5745 | 13.2187 | 0.3215 |
| 2.0356 | 6.99 | 509 | 1.8133 | 7570 | 2520 | 1097 | 482 | 16999 | 14795 | 12591 | 10387 | 44.532 | 17.0328 | 8.7126 | 4.6404 | 0.7787 | 16999 | 21250 | 0.3384 | 0.158 | 0.3258 | 0.3257 | 0.0123 | 10.3053 | 13.0368 | 0.3339 |
| 1.9575 | 7.99 | 582 | 1.7856 | 7764 | 2637 | 1175 | 545 | 17379 | 15175 | 12971 | 10767 | 44.6746 | 17.3773 | 9.0587 | 5.0618 | 0.8003 | 17379 | 21250 | 0.345 | 0.1625 | 0.3322 | 0.3324 | 0.0136 | 10.993 | 13.4719 | 0.3407 |
| 1.8889 | 9.0 | 655 | 1.7666 | 7766 | 2644 | 1184 | 532 | 17102 | 14898 | 12694 | 10490 | 45.4099 | 17.7473 | 9.3272 | 5.0715 | 0.7846 | 17102 | 21250 | 0.3487 | 0.1636 | 0.3348 | 0.335 | 0.0123 | 10.9637 | 13.2164 | 0.3438 |
| 1.8201 | 10.0 | 728 | 1.7415 | 7737 | 2680 | 1238 | 587 | 17156 | 14952 | 12748 | 10544 | 45.0979 | 17.924 | 9.7113 | 5.5671 | 0.7877 | 17156 | 21250 | 0.3453 | 0.1666 | 0.3332 | 0.3333 | 0.0163 | 11.3891 | 13.1388 | 0.3406 |
| 1.7882 | 10.99 | 800 | 1.7331 | 7859 | 2722 | 1241 | 572 | 17364 | 15160 | 12956 | 10752 | 45.2603 | 17.9551 | 9.5786 | 5.3199 | 0.7995 | 17364 | 21250 | 0.3524 | 0.1673 | 0.3387 | 0.3385 | 0.0145 | 11.4047 | 13.4052 | 0.3473 |
| 1.7095 | 11.99 | 873 | 1.7194 | 7968 | 2783 | 1292 | 625 | 17467 | 15263 | 13059 | 10855 | 45.6175 | 18.2336 | 9.8936 | 5.7577 | 0.8053 | 17467 | 21250 | 0.3547 | 0.1708 | 0.3418 | 0.3414 | 0.0154 | 11.8807 | 13.4437 | 0.3495 |
| 1.6619 | 12.99 | 946 | 1.7032 | 8011 | 2796 | 1286 | 604 | 17433 | 15229 | 13025 | 10821 | 45.9531 | 18.3597 | 9.8733 | 5.5817 | 0.8034 | 17433 | 21250 | 0.3584 | 0.1736 | 0.3454 | 0.3454 | 0.0154 | 11.7968 | 13.4964 | 0.3526 |
| 1.6103 | 13.99 | 1019 | 1.7028 | 8154 | 2891 | 1347 | 636 | 17665 | 15461 | 13257 | 11053 | 46.1591 | 18.6987 | 10.1607 | 5.7541 | 0.8163 | 17665 | 21250 | 0.3659 | 0.1795 | 0.3509 | 0.3508 | 0.015 | 12.235 | 13.7223 | 0.3602 |
| 1.565 | 15.0 | 1092 | 1.6955 | 8135 | 2897 | 1362 | 665 | 17530 | 15326 | 13122 | 10918 | 46.4062 | 18.9025 | 10.3795 | 6.0909 | 0.8088 | 17530 | 21250 | 0.3668 | 0.1808 | 0.3518 | 0.3516 | 0.02 | 12.4116 | 13.6107 | 0.3603 |
| 1.522 | 16.0 | 1165 | 1.6793 | 8271 | 2982 | 1414 | 697 | 17946 | 15742 | 13538 | 11334 | 46.0883 | 18.943 | 10.4447 | 6.1496 | 0.8318 | 17946 | 21250 | 0.3695 | 0.1828 | 0.354 | 0.354 | 0.0191 | 12.8008 | 13.9192 | 0.3632 |
| 1.5022 | 16.99 | 1237 | 1.6849 | 8244 | 2967 | 1392 | 680 | 17510 | 15306 | 13102 | 10898 | 47.0817 | 19.3846 | 10.6243 | 6.2397 | 0.8077 | 17510 | 21250 | 0.3728 | 0.184 | 0.3569 | 0.3569 | 0.0191 | 12.6672 | 13.6243 | 0.366 |
| 1.4359 | 17.99 | 1310 | 1.6862 | 8328 | 3050 | 1448 | 717 | 17873 | 15669 | 13465 | 11261 | 46.5954 | 19.4652 | 10.7538 | 6.3671 | 0.8278 | 17873 | 21250 | 0.3742 | 0.1866 | 0.3582 | 0.3583 | 0.0181 | 13.0683 | 13.7255 | 0.3671 |
| 1.3994 | 18.99 | 1383 | 1.6775 | 8272 | 2998 | 1417 | 704 | 17645 | 15441 | 13237 | 11033 | 46.8801 | 19.4158 | 10.7048 | 6.3809 | 0.8152 | 17645 | 21250 | 0.3739 | 0.1866 | 0.3583 | 0.3581 | 0.0213 | 12.8728 | 13.6956 | 0.3673 |
| 1.3609 | 19.78 | 1440 | 1.6884 | 8347 | 3062 | 1465 | 723 | 17823 | 15619 | 13415 | 11211 | 46.8327 | 19.6043 | 10.9206 | 6.449 | 0.8251 | 17823 | 21250 | 0.3761 | 0.1886 | 0.3601 | 0.3596 | 0.0204 | 13.1569 | 13.7328 | 0.3692 |
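
The BLEU column can be cross-checked against the other columns of the same row. The sketch below uses the final row (step 1440) and assumes standard corpus-level BLEU with uniform weights over 1- to 4-grams: each precision is `counts / totals`, and BLEU is the brevity penalty times the geometric mean of the four precisions. The card does not name the metric implementation, so treat this as an illustration rather than the exact evaluation code.

```python
import math

# Values copied from the final row of the table above (step 1440).
counts = [8347, 3062, 1465, 723]          # matched n-grams, n = 1..4
totals = [17823, 15619, 13415, 11211]     # generated n-grams, n = 1..4
brevity_penalty = 0.8251                  # as reported in the table

# Modified n-gram precisions, expressed in percent like the table.
precisions = [100.0 * c / t for c, t in zip(counts, totals)]

# Geometric mean of the precisions (uniform weights assumed).
geo_mean = math.exp(sum(math.log(p) for p in precisions) / len(precisions))

bleu = brevity_penalty * geo_mean
print("Precisions:", [round(p, 4) for p in precisions])
print(f"BLEU: {bleu:.2f}")
```

Small differences from the table in the last decimals are expected because the brevity penalty is taken from the table in rounded form.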
### Framework versions
- Transformers 4.32.1
- PyTorch 2.1.0
- Datasets 2.12.0
- Tokenizers 0.13.3