---
language:
- de
tags:
- question-generation
- german
- text2text-generation
- generated_from_trainer
datasets:
- lmqg/qg_dequad
metrics:
- bleu4
- f1
- rouge
- exact_match
model-index:
- name: german-jeopardy-mt5-base-256
results:
- task:
name: Sequence-to-sequence Language Modeling
type: text2text-generation
dataset:
name: lmqg/qg_dequad
type: default
args: default
metrics:
- name: BLEU-4
type: bleu4
value: 13.70
- name: F1
type: f1
value: 37.79
- name: ROUGE-1
type: rouge1
value: 38.80
- name: ROUGE-2
type: rouge2
value: 20.27
- name: ROUGE-L
type: rougel
value: 37.34
- name: ROUGE-Lsum
type: rougelsum
value: 37.32
- name: Exact Match
type: exact_match
value: 2.81
---
# german-jeopardy-mt5-base-256
This model is a fine-tuned version of [google/mt5-base](https://huggingface.co/google/mt5-base) on the [lmqg/qg_dequad](https://huggingface.co/datasets/lmqg/qg_dequad) dataset.
It achieves the following results on the evaluation set:
- Loss: 1.51
- Brevity Penalty: 0.8658
- System Length: 18174
- Reference Length: 20793
- ROUGE-1: 38.80
- ROUGE-2: 20.27
- ROUGE-L: 37.34
- ROUGE-Lsum: 37.32
- Exact Match: 2.81
- BLEU-4: 13.70
- F1: 37.79
## Model description
See [google/mt5-base](https://huggingface.co/google/mt5-base) for the model architecture.
The model was fine-tuned on a single NVIDIA RTX 3090 GPU with 24 GB of VRAM.
## Intended uses & limitations
This model is intended for question generation on German text: given a context passage (typically with a highlighted answer span), it generates a corresponding German question. It has not been evaluated on languages other than German.
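A minimal inference sketch is shown below. The input format follows the lmqg convention of wrapping the answer span in `<hl>` tokens with a `generate question:` prefix; both the exact prompt format and the checkpoint path are assumptions not confirmed by this card, so verify them against the training script before use.

```python
def build_qg_input(context: str, answer: str) -> str:
    """Wrap the answer span in <hl> markers inside the context and add the
    task prefix (lmqg-style highlighting; the format is an assumption)."""
    highlighted = context.replace(answer, f"<hl> {answer} <hl>", 1)
    return f"generate question: {highlighted}"

# Hypothetical usage with transformers (checkpoint path is a placeholder):
# from transformers import pipeline
# qg = pipeline("text2text-generation", model="german-jeopardy-mt5-base-256")
# print(qg(build_qg_input(context, answer))[0]["generated_text"])

context = "Die Donau ist der zweitlängste Fluss Europas."
answer = "Donau"
print(build_qg_input(context, answer))
# -> generate question: Die <hl> Donau <hl> ist der zweitlängste Fluss Europas.
```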
## Training and evaluation data
See [lmqg/qg_dequad](https://huggingface.co/datasets/lmqg/qg_dequad).
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.0001
- train_batch_size: 4
- eval_batch_size: 4
- seed: 7
- gradient_accumulation_steps: 64
- total_train_batch_size: 256
- optimizer: Adafactor
- lr_scheduler_type: constant
- num_epochs: 20
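The hyperparameters above can be reconstructed as a training configuration; in particular, the total train batch size of 256 is the per-device batch size times the gradient-accumulation steps. The argument names below mirror `transformers.Seq2SeqTrainingArguments`, but this is a sketch inferred from the listed values, not the author's original script.

```python
# Reconstruction of the training configuration from the listed hyperparameters.
# Keys mirror transformers.Seq2SeqTrainingArguments; this is a sketch, not the
# original training script.
training_args = {
    "learning_rate": 1e-4,
    "per_device_train_batch_size": 4,
    "per_device_eval_batch_size": 4,
    "seed": 7,
    "gradient_accumulation_steps": 64,
    "optim": "adafactor",
    "lr_scheduler_type": "constant",
    "num_train_epochs": 20,
}

# The effective (total) train batch size follows from gradient accumulation:
effective_bs = (training_args["per_device_train_batch_size"]
                * training_args["gradient_accumulation_steps"])
print(effective_bs)  # 256
```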
### Training results
| Training Loss | Epoch | Step | Validation Loss | Counts 1 | Counts 2 | Counts 3 | Counts 4 | Totals 1 | Totals 2 | Totals 3 | Totals 4 | Precisions 1 | Precisions 2 | Precisions 3 | Precisions 4 | Brevity Penalty | System Length | Reference Length | ROUGE-1 | ROUGE-2 | ROUGE-L | ROUGE-Lsum | Exact Match | BLEU | Mean Generated Length | F1 |
|:-------------:|:-----:|:----:|:---------------:|:--------:|:--------:|:--------:|:--------:|:--------:|:--------:|:--------:|:--------:|:------------:|:------------:|:------------:|:------------:|:---------------:|:-------------:|:----------------:|:-------:|:-------:|:-------:|:----------:|:-----------:|:-------:|:---------------------:|:------:|
| 8.9608 | 0.99 | 36 | 2.8883 | 2306 | 50 | 12 | 2 | 17876 | 15672 | 13468 | 11264 | 12.9 | 0.319 | 0.0891 | 0.0178 | 0.828 | 17876 | 21250 | 0.0081 | 0.0022 | 0.0078 | 0.0078 | 0.0 | 0.2352 | 3.1969 | 0.0092 |
| 3.2364 | 1.98 | 72 | 1.9242 | 6125 | 1727 | 687 | 277 | 21152 | 18948 | 16744 | 14540 | 28.9571 | 9.1144 | 4.103 | 1.9051 | 0.9954 | 21152 | 21250 | 0.2457 | 0.1026 | 0.2345 | 0.2346 | 0.0018 | 6.7083 | 11.8072 | 0.2514 |
| 2.4963 | 3.0 | 109 | 1.6558 | 6903 | 2271 | 975 | 409 | 16537 | 14333 | 12129 | 9925 | 41.7428 | 15.8446 | 8.0386 | 4.1209 | 0.752 | 16537 | 21250 | 0.2966 | 0.1415 | 0.2854 | 0.2852 | 0.01 | 9.1493 | 12.176 | 0.2909 |
| 2.2314 | 3.98 | 145 | 1.5771 | 7160 | 2440 | 1098 | 501 | 16627 | 14423 | 12219 | 10015 | 43.0625 | 16.9174 | 8.986 | 5.0025 | 0.7573 | 16627 | 21250 | 0.314 | 0.1535 | 0.3028 | 0.3028 | 0.0136 | 10.187 | 12.157 | 0.3069 |
| 2.0578 | 4.97 | 181 | 1.5347 | 7447 | 2625 | 1214 | 566 | 17305 | 15101 | 12897 | 10693 | 43.0338 | 17.383 | 9.413 | 5.2932 | 0.7961 | 17305 | 21250 | 0.3286 | 0.1628 | 0.3146 | 0.3146 | 0.0163 | 11.0621 | 12.5585 | 0.32 |
| 1.8928 | 5.99 | 218 | 1.5128 | 7396 | 2659 | 1257 | 611 | 16598 | 14394 | 12190 | 9986 | 44.5596 | 18.473 | 10.3117 | 6.1186 | 0.7556 | 16598 | 21250 | 0.3326 | 0.1684 | 0.3198 | 0.3198 | 0.0177 | 11.4063 | 12.1692 | 0.3234 |
| 1.8573 | 6.98 | 254 | 1.4736 | 7531 | 2758 | 1313 | 641 | 16728 | 14524 | 12320 | 10116 | 45.0203 | 18.9893 | 10.6575 | 6.3365 | 0.7631 | 16728 | 21250 | 0.3349 | 0.1717 | 0.3216 | 0.3216 | 0.0163 | 11.8292 | 12.3035 | 0.327 |
| 1.7361 | 8.0 | 291 | 1.4544 | 7658 | 2849 | 1368 | 668 | 16928 | 14724 | 12520 | 10316 | 45.2387 | 19.3494 | 10.9265 | 6.4754 | 0.7747 | 16928 | 21250 | 0.3414 | 0.1762 | 0.3283 | 0.3284 | 0.0181 | 12.2208 | 12.4628 | 0.3334 |
| 1.7162 | 8.99 | 327 | 1.4459 | 7703 | 2891 | 1390 | 694 | 16795 | 14591 | 12387 | 10183 | 45.8648 | 19.8136 | 11.2214 | 6.8153 | 0.767 | 16795 | 21250 | 0.3454 | 0.1785 | 0.3325 | 0.3323 | 0.0159 | 12.4536 | 12.4174 | 0.3374 |
| 1.6589 | 9.98 | 363 | 1.4383 | 7889 | 2983 | 1449 | 719 | 17376 | 15172 | 12968 | 10764 | 45.4017 | 19.6612 | 11.1737 | 6.6797 | 0.8002 | 17376 | 21250 | 0.3519 | 0.1816 | 0.3375 | 0.3372 | 0.0172 | 12.8553 | 12.7101 | 0.3435 |
| 1.5571 | 10.99 | 400 | 1.4214 | 7889 | 2994 | 1457 | 736 | 17185 | 14981 | 12777 | 10573 | 45.9063 | 19.9853 | 11.4033 | 6.9611 | 0.7894 | 17185 | 21250 | 0.3529 | 0.1845 | 0.3392 | 0.3393 | 0.02 | 12.9671 | 12.6466 | 0.3457 |
| 1.5502 | 11.98 | 436 | 1.4135 | 7930 | 3008 | 1477 | 741 | 16868 | 14664 | 12460 | 10256 | 47.0121 | 20.5128 | 11.8539 | 7.225 | 0.7712 | 16868 | 21250 | 0.3619 | 0.189 | 0.3492 | 0.3491 | 0.0213 | 13.0741 | 12.4483 | 0.3541 |
| 1.4564 | 13.0 | 473 | 1.3943 | 8268 | 3200 | 1616 | 837 | 17929 | 15725 | 13521 | 11317 | 46.1152 | 20.3498 | 11.9518 | 7.396 | 0.8309 | 17929 | 21250 | 0.3729 | 0.1974 | 0.3578 | 0.3576 | 0.0218 | 14.1014 | 13.2441 | 0.3647 |
| 1.4522 | 13.99 | 509 | 1.3953 | 8047 | 3130 | 1564 | 811 | 16789 | 14585 | 12381 | 10177 | 47.9302 | 21.4604 | 12.6323 | 7.9689 | 0.7667 | 16789 | 21250 | 0.3712 | 0.197 | 0.3582 | 0.3581 | 0.0227 | 13.7526 | 12.515 | 0.3627 |
| 1.407 | 14.98 | 545 | 1.3759 | 8498 | 3358 | 1703 | 877 | 17923 | 15719 | 13515 | 11311 | 47.4139 | 21.3627 | 12.6008 | 7.7535 | 0.8306 | 17923 | 21250 | 0.3856 | 0.2063 | 0.3709 | 0.3706 | 0.0213 | 14.7315 | 13.2849 | 0.3772 |
| 1.3294 | 15.99 | 582 | 1.3776 | 8481 | 3407 | 1721 | 883 | 17451 | 15247 | 13043 | 10839 | 48.5989 | 22.3454 | 13.1948 | 8.1465 | 0.8044 | 17451 | 21250 | 0.3907 | 0.211 | 0.3766 | 0.3766 | 0.024 | 14.868 | 12.9142 | 0.3822 |
| 1.3294 | 16.98 | 618 | 1.3803 | 8633 | 3464 | 1767 | 923 | 18004 | 15800 | 13596 | 11392 | 47.9505 | 21.9241 | 12.9965 | 8.1022 | 0.835 | 18004 | 21250 | 0.3946 | 0.2133 | 0.3801 | 0.3798 | 0.0263 | 15.2312 | 13.3103 | 0.3868 |
| 1.2605 | 18.0 | 655 | 1.3710 | 8560 | 3376 | 1695 | 880 | 17830 | 15626 | 13422 | 11218 | 48.009 | 21.605 | 12.6285 | 7.8445 | 0.8255 | 17830 | 21250 | 0.3922 | 0.2092 | 0.3778 | 0.3775 | 0.0231 | 14.779 | 13.1665 | 0.3846 |
| 1.2667 | 18.99 | 691 | 1.3694 | 8664 | 3455 | 1733 | 882 | 17834 | 15630 | 13426 | 11222 | 48.5814 | 22.1049 | 12.9078 | 7.8596 | 0.8257 | 17834 | 21250 | 0.3987 | 0.2138 | 0.3853 | 0.3851 | 0.0227 | 15.0008 | 13.2232 | 0.3906 |
| 1.2074 | 19.79 | 720 | 1.3658 | 8770 | 3465 | 1737 | 880 | 18039 | 15835 | 13631 | 11427 | 48.6169 | 21.8819 | 12.743 | 7.7011 | 0.8369 | 18039 | 21250 | 0.4025 | 0.215 | 0.3883 | 0.3879 | 0.0227 | 15.0442 | 13.4424 | 0.3941 |
### Framework versions
- Transformers 4.32.1
- PyTorch 2.1.0
- Datasets 2.12.0
- Tokenizers 0.13.3