---
language:
- de
tags:
- question-generation
- german
- text2text-generation
- generated_from_trainer
datasets:
- lmqg/qg_dequad
metrics:
- bleu4
- f1
- rouge
- exact_match
model-index:
- name: german-jeopardy-mt5-large-256
  results:
  - task:
      name: Sequence-to-sequence Language Modeling
      type: text2text-generation
    dataset:
      name: lmqg/qg_dequad
      type: default
      args: default
    metrics:
    - name: BLEU-4
      type: bleu4
      value: 16.43
    - name: F1
      type: f1
      value: 42.48
    - name: ROUGE-1
      type: rouge1
      value: 43.56
    - name: ROUGE-2
      type: rouge2
      value: 23.78
    - name: ROUGE-L
      type: rougel
      value: 41.81
    - name: ROUGE-Lsum
      type: rougelsum
      value: 41.80
    - name: Exact Match
      type: exact_match
      value: 3.13
---

# german-jeopardy-mt5-large-256

This model is a fine-tuned version of [google/mt5-large](https://huggingface.co/google/mt5-large) on the [lmqg/qg_dequad](https://huggingface.co/datasets/lmqg/qg_dequad) dataset.
It achieves the following results on the evaluation set:

- Loss: 1.3943
- Brevity Penalty: 0.9201
- System Length: 19195
- Reference Length: 20793
- ROUGE-1: 43.56
- ROUGE-2: 23.78
- ROUGE-L: 41.81
- ROUGE-Lsum: 41.80
- Exact Match: 3.13
- BLEU: 16.43
- F1: 42.48

## Model description

See [google/mt5-large](https://huggingface.co/google/mt5-large) for the model architecture.
The model was trained on a single NVIDIA RTX 3090 GPU with 24 GB of VRAM.

## Intended uses & limitations

This model can be used for question generation on German text.

## Training and evaluation data

See [lmqg/qg_dequad](https://huggingface.co/datasets/lmqg/qg_dequad).
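
The brevity penalty in the evaluation results at the top of this card is consistent with the standard BLEU definition applied to the reported system and reference lengths. The following is a minimal sketch of that check; the card does not say which BLEU implementation was used, so the function name `brevity_penalty` and the formula are assumptions based on the standard definition, not the author's code.

```python
import math


def brevity_penalty(system_length: int, reference_length: int) -> float:
    """Standard BLEU brevity penalty (assumed; not taken from the author's code).

    Returns 1.0 when the system output is at least as long as the reference,
    otherwise exp(1 - reference_length / system_length).
    """
    if system_length <= 0:
        return 0.0
    if system_length >= reference_length:
        return 1.0
    return math.exp(1.0 - reference_length / system_length)


# Lengths taken from the evaluation results listed at the top of this card.
print(round(brevity_penalty(19195, 20793), 4))  # 0.9201
```

A value below 1 means the generated questions are, in total, shorter than the references, which lowers BLEU accordingly.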

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:

- learning_rate: 0.0001
- train_batch_size: 1
- eval_batch_size: 1
- seed: 7
- gradient_accumulation_steps: 256
- total_train_batch_size: 256
- optimizer: Adafactor
- lr_scheduler_type: constant
- num_epochs: 20

### Training results

In the table below, the Precisions and BLEU columns are on a 0-100 scale, while ROUGE, Exact Match, and F1 are fractions between 0 and 1 (the summary at the top of this card reports them as percentages).

| Training Loss | Epoch | Step | Validation Loss | Counts 1 | Counts 2 | Counts 3 | Counts 4 | Totals 1 | Totals 2 | Totals 3 | Totals 4 | Precisions 1 | Precisions 2 | Precisions 3 | Precisions 4 | Brevity Penalty | System Length | Reference Length | ROUGE-1 | ROUGE-2 | ROUGE-L | ROUGE-Lsum | Exact Match | BLEU | Mean Generated Length | F1 |
|:-------------:|:-----:|:----:|:---------------:|:--------:|:--------:|:--------:|:--------:|:--------:|:--------:|:--------:|:--------:|:------------:|:------------:|:------------:|:------------:|:---------------:|:-------------:|:----------------:|:-------:|:-------:|:-------:|:----------:|:-----------:|:-------:|:---------------------:|:------:|
| 5.932 | 0.99 | 36 | 2.4510 | 5614 | 1426 | 527 | 204 | 28835 | 26631 | 24427 | 22223 | 19.4694 | 5.3547 | 2.1574 | 0.918 | 1.0 | 28835 | 21250 | 0.1946 | 0.0763 | 0.1843 | 0.1843 | 0.0 | 3.7906 | 11.4306 | 0.2127 |
| 2.3089 | 1.98 | 72 | 1.3964 | 7578 | 2696 | 1244 | 580 | 17203 | 14999 | 12795 | 10591 | 44.0505 | 17.9745 | 9.7225 | 5.4763 | 0.7904 | 17203 | 21250 | 0.3312 | 0.1655 | 0.316 | 0.3162 | 0.01 | 11.3254 | 12.6583 | 0.3246 |
| 1.6778 | 3.0 | 109 | 1.2660 | 7961 | 3020 | 1480 | 747 | 17067 | 14863 | 12659 | 10455 | 46.6456 | 20.3189 | 11.6913 | 7.1449 | 0.7826 | 17067 | 21250 | 0.3608 | 0.1881 | 0.3456 | 0.3454 | 0.0195 | 13.128 | 12.4682 | 0.3517 |
| 1.5383 | 3.99 | 145 | 1.2212 | 7948 | 3121 | 1558 | 796 | 16694 | 14490 | 12286 | 10082 | 47.6099 | 21.539 | 12.6811 | 7.8953 | 0.7612 | 16694 | 21250 | 0.3663 | 0.1989 | 0.3523 | 0.352 | 0.024 | 13.625 | 12.221 | 0.3554 |
| 1.423 | 4.97 | 181 | 1.1706 | 8746 | 3590 | 1840 | 963 | 17765 | 15561 | 13357 | 11153 | 49.2316 | 23.0705 | 13.7755 | 8.6344 | 0.8219 | 17765 | 21250 | 0.4033 | 0.2224 | 0.3876 | 0.3874 | 0.0304 | 15.7567 | 13.0277 | 0.3941 |
| 1.2861 | 5.99 | 218 | 1.1327 | 8885 | 3646 | 1864 | 1005 | 17406 | 15202 | 12998 | 10794 | 51.0456 | 23.9837 | 14.3407 | 9.3107 | 0.8018 | 17406 | 21250 | 0.4181 | 0.2295 | 0.4022 | 0.402 | 0.0331 | 16.123 | 12.9142 | 0.4092 |
| 1.2372 | 6.98 | 254 | 1.1248 | 9122 | 3824 | 1997 | 1084 | 17310 | 15106 | 12902 | 10698 | 52.6979 | 25.3144 | 15.4782 | 10.1327 | 0.7964 | 17310 | 21250 | 0.4313 | 0.239 | 0.4175 | 0.4172 | 0.0358 | 17.0334 | 12.8412 | 0.4236 |
| 1.1307 | 8.0 | 291 | 1.0998 | 9423 | 4019 | 2136 | 1190 | 18074 | 15870 | 13666 | 11462 | 52.1357 | 25.3245 | 15.63 | 10.3821 | 0.8389 | 18074 | 21250 | 0.441 | 0.249 | 0.4255 | 0.4252 | 0.0404 | 18.0474 | 13.4138 | 0.4327 |
| 1.0982 | 8.99 | 327 | 1.1052 | 9450 | 4003 | 2147 | 1184 | 18145 | 15941 | 13737 | 11533 | 52.0805 | 25.1113 | 15.6293 | 10.2662 | 0.8427 | 18145 | 21250 | 0.4427 | 0.2492 | 0.4266 | 0.4261 | 0.0426 | 18.0367 | 13.4465 | 0.4344 |
| 1.0449 | 9.98 | 363 | 1.0996 | 9471 | 4036 | 2149 | 1180 | 18067 | 15863 | 13659 | 11455 | 52.4215 | 25.4429 | 15.7332 | 10.3012 | 0.8385 | 18067 | 21250 | 0.4422 | 0.2477 | 0.4261 | 0.4257 | 0.0404 | 18.0793 | 13.333 | 0.4341 |
| 0.9686 | 10.99 | 400 | 1.1012 | 9612 | 4165 | 2240 | 1233 | 17983 | 15779 | 13575 | 11371 | 53.4505 | 26.3958 | 16.5009 | 10.8434 | 0.8339 | 17983 | 21250 | 0.4534 | 0.2591 | 0.4381 | 0.4378 | 0.0449 | 18.6914 | 13.3534 | 0.4458 |
| 0.9465 | 11.98 | 436 | 1.1027 | 9670 | 4154 | 2229 | 1239 | 18217 | 16013 | 13809 | 11605 | 53.0823 | 25.9414 | 16.1416 | 10.6764 | 0.8466 | 18217 | 21250 | 0.4531 | 0.258 | 0.4377 | 0.4374 | 0.0445 | 18.6863 | 13.5912 | 0.4452 |
| 0.9025 | 12.97 | 472 | 1.1124 | 9627 | 4155 | 2241 | 1247 | 18076 | 15872 | 13668 | 11464 | 53.2585 | 26.1782 | 16.396 | 10.8775 | 0.839 | 18076 | 21250 | 0.4531 | 0.2583 | 0.4386 | 0.4382 | 0.0436 | 18.7344 | 13.5259 | 0.4452 |
| 0.8402 | 13.99 | 509 | 1.1392 | 9425 | 4071 | 2176 | 1207 | 17339 | 15135 | 12931 | 10727 | 54.3572 | 26.8979 | 16.8278 | 11.252 | 0.7981 | 17339 | 21250 | 0.4495 | 0.2568 | 0.4365 | 0.4358 | 0.0445 | 18.3062 | 12.9129 | 0.4417 |
| 0.8282 | 14.98 | 545 | 1.1227 | 9803 | 4274 | 2316 | 1305 | 18652 | 16448 | 14244 | 12040 | 52.5574 | 25.9849 | 16.2595 | 10.8389 | 0.87 | 18652 | 21250 | 0.4573 | 0.2627 | 0.4418 | 0.4414 | 0.0463 | 19.2695 | 14.0104 | 0.4496 |
| 0.7694 | 16.0 | 582 | 1.1394 | 9740 | 4240 | 2299 | 1296 | 18281 | 16077 | 13873 | 11669 | 53.2794 | 26.3731 | 16.5718 | 11.1064 | 0.8501 | 18281 | 21250 | 0.4572 | 0.2629 | 0.4411 | 0.4412 | 0.0476 | 19.1704 | 13.6475 | 0.4492 |
| 0.7589 | 16.99 | 618 | 1.1497 | 9663 | 4140 | 2214 | 1232 | 18412 | 16208 | 14004 | 11800 | 52.4821 | 25.5429 | 15.8098 | 10.4407 | 0.8572 | 18412 | 21250 | 0.4515 | 0.2561 | 0.4359 | 0.4358 | 0.044 | 18.5906 | 13.7926 | 0.4432 |
| 0.724 | 17.98 | 654 | 1.1680 | 9743 | 4246 | 2316 | 1300 | 18402 | 16198 | 13994 | 11790 | 52.9453 | 26.2131 | 16.5499 | 11.0263 | 0.8566 | 18402 | 21250 | 0.4562 | 0.2625 | 0.4408 | 0.441 | 0.0472 | 19.2167 | 13.7214 | 0.4474 |
| 0.6755 | 18.99 | 691 | 1.1874 | 9722 | 4266 | 2351 | 1341 | 18272 | 16068 | 13864 | 11660 | 53.2071 | 26.5497 | 16.9576 | 11.5009 | 0.8496 | 18272 | 21250 | 0.4559 | 0.2639 | 0.4417 | 0.4413 | 0.0495 | 19.4647 | 13.6071 | 0.4469 |
| 0.657 | 19.79 | 720 | 1.1845 | 9920 | 4361 | 2402 | 1373 | 18884 | 16680 | 14476 | 12272 | 52.5312 | 26.1451 | 16.593 | 11.1881 | 0.8822 | 18884 | 21250 | 0.4594 | 0.2647 | 0.4423 | 0.4421 | 0.0467 | 19.8248 | 14.2001 | 0.4508 |

### Framework versions

- Transformers 4.32.1
- Pytorch 2.1.0
- Datasets 2.12.0
- Tokenizers 0.13.3
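
The BLEU column in the training results table can be reconstructed from the Counts, Totals, and Brevity Penalty columns. The sketch below assumes the usual BLEU-4 definition (brevity penalty times the geometric mean of the four n-gram precisions, scaled to 0-100); the helper `bleu_from_counts` is a hypothetical name introduced for illustration and is not part of the training code.

```python
import math


def bleu_from_counts(counts, totals, brevity_penalty):
    """BLEU on a 0-100 scale from matched n-gram counts and n-gram totals.

    Assumes the standard definition: BP * exp(mean(log(p_n))) with
    p_n = counts[n] / totals[n]. Hypothetical helper, not the author's code.
    """
    precisions = [c / t for c, t in zip(counts, totals)]
    if min(precisions) == 0:
        # The geometric mean is zero if any n-gram order has no matches.
        return 0.0
    log_mean = sum(math.log(p) for p in precisions) / len(precisions)
    return 100.0 * brevity_penalty * math.exp(log_mean)


# Values from the last row of the training results table (step 720).
counts = [9920, 4361, 2402, 1373]
totals = [18884, 16680, 14476, 12272]
score = bleu_from_counts(counts, totals, brevity_penalty=0.8822)
print(f"{score:.2f}")  # 19.82
```

The result agrees with the reported 19.8248 up to the rounding of the brevity penalty, which suggests the Precisions columns are simply `100 * Counts / Totals` for each n-gram order.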