
german-jeopardy-longt5-base

This model is a fine-tuned version of google/long-t5-tglobal-base on the lmqg/qg_dequad dataset. It achieves the following results on the evaluation set:

  • Loss: 1.8533
  • Brevity Penalty: 0.8910 (see the worked check after this list)
  • System Length: 18642
  • Reference Length: 20793
  • ROUGE-1: 35.31
  • ROUGE-2: 16.35
  • ROUGE-L: 33.91
  • ROUGE-Lsum: 33.96
  • Exact Match: 1.36
  • BLEU: 10.80
  • F1: 34.41
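
The brevity penalty is the standard BLEU length penalty: because the system output is shorter than the reference, the score is scaled down. The reported value is consistent with the system and reference lengths above:

$$\mathrm{BP} = \exp\!\left(1 - \frac{\text{reference length}}{\text{system length}}\right) = \exp\!\left(1 - \frac{20793}{18642}\right) \approx 0.8910$$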

Model description

See google/long-t5-tglobal-base for more information about the model architecture.
The model was fine-tuned on a single NVIDIA RTX 3090 GPU with 24 GB of VRAM.

Intended uses & limitations

This model can be used for question generation on German text.
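
A minimal generation sketch with the Transformers library follows. The answer-highlighting input format (`<hl>` markers around the answer span) follows the lmqg convention used by the training data and is an assumption here; check lmqg/qg_dequad for the exact format the model was trained on.

```python
# A minimal usage sketch. The <hl> answer-highlighting format follows the
# lmqg convention of the training data; verify against lmqg/qg_dequad.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_id = "GiantTreeG/german-jeopardy-longt5-base"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# German context with the answer span ("1791") marked by <hl> tokens.
text = (
    "Das Brandenburger Tor wurde <hl> 1791 <hl> fertiggestellt und ist "
    "eines der bekanntesten Wahrzeichen Berlins."
)
inputs = tokenizer(text, return_tensors="pt")
output_ids = model.generate(**inputs, max_new_tokens=48, num_beams=4)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```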

Training and evaluation data

See lmqg/qg_dequad.
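
A quick way to inspect the data, assuming the `datasets` library (field names come from the dataset itself and are not documented here):

```python
# Load and inspect the question-generation dataset from the Hugging Face Hub.
from datasets import load_dataset

dataset = load_dataset("lmqg/qg_dequad")
print(dataset)              # split names and sizes
print(dataset["train"][0])  # one raw training record
```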

Training procedure

Training hyperparameters

The following hyperparameters were used during training (the sketch after this list shows one way to express them with the Hugging Face Trainer API):

  • learning_rate: 0.0001
  • train_batch_size: 8
  • eval_batch_size: 4
  • seed: 7
  • gradient_accumulation_steps: 8
  • total_train_batch_size: 64
  • optimizer: Adafactor
  • lr_scheduler_type: constant
  • num_epochs: 20
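
The training script itself is not included in this card; the following is an illustrative mapping of the settings above onto `Seq2SeqTrainingArguments`, not the original code:

```python
# Illustrative only: the hyperparameters above expressed with the Hugging Face
# Trainer API; the original training script may have been structured differently.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="german-jeopardy-longt5-base",
    learning_rate=1e-4,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=4,
    gradient_accumulation_steps=8,  # effective train batch size: 8 * 8 = 64
    num_train_epochs=20,
    optim="adafactor",
    lr_scheduler_type="constant",
    seed=7,
)
```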

Training results

| Training Loss | Epoch | Step | BLEU | Brevity Penalty | Counts 1 | Counts 2 | Counts 3 | Counts 4 | Exact Match | F1 | Gen Len | Validation Loss | Precisions 1 | Precisions 2 | Precisions 3 | Precisions 4 | Reference Length | ROUGE-1 | ROUGE-2 | ROUGE-L | ROUGE-Lsum | System Length | Totals 1 | Totals 2 | Totals 3 | Totals 4 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 3.1671 | 1.0 | 145 | 5.9441 | 0.7156 | 6177 | 1669 | 604 | 179 | 0.0023 | 0.2528 | 12.0218 | 2.1902 | 38.7954 | 12.1665 | 5.2458 | 1.9227 | 21250 | 0.2595 | 0.1035 | 0.2491 | 0.2492 | 15922 | 15922 | 13718 | 11514 | 9310 |
| 2.5597 | 2.0 | 291 | 7.7787 | 0.7556 | 6785 | 2044 | 804 | 293 | 0.0064 | 0.2864 | 12.6084 | 2.0164 | 40.876 | 14.1994 | 6.595 | 2.9338 | 21250 | 0.2931 | 0.1291 | 0.2817 | 0.2818 | 16599 | 16599 | 14395 | 12191 | 9987 |
| 2.3464 | 2.99 | 436 | 9.2407 | 0.7935 | 7251 | 2326 | 969 | 400 | 0.0073 | 0.3114 | 13.2296 | 1.9138 | 42.0129 | 15.45 | 7.5403 | 3.7569 | 21250 | 0.3162 | 0.1456 | 0.3031 | 0.3031 | 17259 | 17259 | 15055 | 12851 | 10647 |
| 2.1679 | 4.0 | 582 | 9.6363 | 0.7795 | 7382 | 2393 | 1006 | 434 | 0.0109 | 0.3226 | 13.1207 | 1.8524 | 43.3903 | 16.1591 | 7.981 | 4.1727 | 21250 | 0.3272 | 0.1504 | 0.3147 | 0.3149 | 17013 | 17013 | 14809 | 12605 | 10401 |
| 2.0454 | 5.0 | 728 | 10.3812 | 0.7665 | 7581 | 2555 | 1111 | 482 | 0.0132 | 0.3357 | 12.9782 | 1.7997 | 45.1599 | 17.5204 | 8.9749 | 4.7371 | 21250 | 0.3401 | 0.1606 | 0.3278 | 0.3279 | 16787 | 16787 | 14583 | 12379 | 10175 |
| 1.9502 | 5.99 | 873 | 10.7668 | 0.7992 | 7759 | 2618 | 1162 | 511 | 0.0127 | 0.3406 | 13.4841 | 1.7696 | 44.6973 | 17.2748 | 8.9723 | 4.7548 | 21250 | 0.3452 | 0.1631 | 0.3321 | 0.3319 | 17359 | 17359 | 15155 | 12951 | 10747 |
| 1.8414 | 7.0 | 1019 | 11.3408 | 0.7721 | 7791 | 2693 | 1236 | 570 | 0.015 | 0.347 | 13.0563 | 1.7472 | 46.147 | 18.3459 | 9.9078 | 5.5496 | 21250 | 0.3513 | 0.1679 | 0.3391 | 0.3391 | 16883 | 16883 | 14679 | 12475 | 10271 |
| 1.7614 | 8.0 | 1165 | 11.8447 | 0.8198 | 8024 | 2799 | 1296 | 610 | 0.0145 | 0.352 | 13.515 | 1.7203 | 45.2643 | 18.0313 | 9.7305 | 5.4881 | 21250 | 0.3565 | 0.1711 | 0.3422 | 0.3423 | 17727 | 17727 | 15523 | 13319 | 11115 |
| 1.6997 | 9.0 | 1310 | 11.9689 | 0.8027 | 8046 | 2835 | 1314 | 615 | 0.0168 | 0.3568 | 13.4306 | 1.7167 | 46.183 | 18.6293 | 10.0968 | 5.6892 | 21250 | 0.3613 | 0.1746 | 0.3466 | 0.3466 | 17422 | 17422 | 15218 | 13014 | 10810 |
| 1.6159 | 10.0 | 1456 | 12.5678 | 0.8182 | 8087 | 2928 | 1395 | 681 | 0.0181 | 0.3564 | 13.5268 | 1.6892 | 45.6944 | 18.8976 | 10.4966 | 6.1429 | 21250 | 0.3612 | 0.1795 | 0.3485 | 0.3482 | 17698 | 17698 | 15494 | 13290 | 11086 |
| 1.5681 | 10.99 | 1601 | 12.497 | 0.813 | 8154 | 2933 | 1383 | 664 | 0.0168 | 0.3605 | 13.6044 | 1.6923 | 46.3164 | 19.0442 | 10.4797 | 6.0402 | 21250 | 0.3654 | 0.1789 | 0.3506 | 0.3505 | 17605 | 17605 | 15401 | 13197 | 10993 |
| 1.4987 | 12.0 | 1747 | 12.8959 | 0.8169 | 8295 | 3011 | 1432 | 697 | 0.0181 | 0.3675 | 13.6134 | 1.6825 | 46.928 | 19.461 | 10.7929 | 6.2997 | 21250 | 0.3734 | 0.1846 | 0.3576 | 0.3577 | 17676 | 17676 | 15472 | 13268 | 11064 |
| 1.4461 | 13.0 | 1893 | 12.8688 | 0.8139 | 8246 | 3005 | 1424 | 700 | 0.0191 | 0.3658 | 13.5812 | 1.6784 | 46.7964 | 19.4915 | 10.7773 | 6.3584 | 21250 | 0.3725 | 0.1857 | 0.358 | 0.3576 | 17621 | 17621 | 15417 | 13213 | 11009 |
| 1.4002 | 13.99 | 2038 | 13.4526 | 0.8329 | 8457 | 3130 | 1504 | 745 | 0.02 | 0.3727 | 13.9179 | 1.6725 | 47.0749 | 19.8591 | 11.0939 | 6.5621 | 21250 | 0.3797 | 0.1915 | 0.3637 | 0.3634 | 17965 | 17965 | 15761 | 13557 | 11353 |
| 1.3391 | 15.0 | 2184 | 13.211 | 0.8283 | 8443 | 3091 | 1468 | 719 | 0.0204 | 0.3737 | 13.9133 | 1.6783 | 47.2177 | 19.7168 | 10.8959 | 6.3803 | 21250 | 0.3804 | 0.1901 | 0.3634 | 0.363 | 17881 | 17881 | 15677 | 13473 | 11269 |
| 1.2921 | 16.0 | 2330 | 13.4907 | 0.8373 | 8457 | 3147 | 1511 | 747 | 0.0195 | 0.3716 | 13.9882 | 1.6738 | 46.8662 | 19.8662 | 11.0801 | 6.5337 | 21250 | 0.3782 | 0.1902 | 0.3624 | 0.3624 | 18045 | 18045 | 15841 | 13637 | 11433 |
| 1.2572 | 17.0 | 2475 | 13.8581 | 0.8267 | 8473 | 3219 | 1561 | 783 | 0.02 | 0.3753 | 13.7618 | 1.6770 | 47.4598 | 20.57 | 11.6103 | 6.9656 | 21250 | 0.3821 | 0.1948 | 0.3669 | 0.3665 | 17853 | 17853 | 15649 | 13445 | 11241 |
| 1.199 | 18.0 | 2621 | 13.7496 | 0.8326 | 8484 | 3190 | 1551 | 771 | 0.0186 | 0.3745 | 13.8798 | 1.6934 | 47.2409 | 20.2475 | 11.4456 | 6.7947 | 21250 | 0.3812 | 0.1922 | 0.3657 | 0.3658 | 17959 | 17959 | 15755 | 13551 | 11347 |
| 1.1668 | 18.99 | 2766 | 13.7379 | 0.8395 | 8504 | 3179 | 1541 | 776 | 0.0204 | 0.376 | 13.9256 | 1.6926 | 47.0198 | 20.0164 | 11.2663 | 6.7631 | 21250 | 0.3828 | 0.1939 | 0.3665 | 0.3665 | 18086 | 18086 | 15882 | 13678 | 11474 |
| 1.1164 | 19.91 | 2900 | 14.1906 | 0.8529 | 8625 | 3250 | 1609 | 820 | 0.0204 | 0.3803 | 14.069 | 1.7026 | 47.0463 | 20.15 | 11.5548 | 6.996 | 21250 | 0.3874 | 0.1964 | 0.3716 | 0.3715 | 18333 | 18333 | 16129 | 13925 | 11721 |

Framework versions

  • Transformers 4.34.1
  • Pytorch 2.1.0
  • Datasets 2.12.0
  • Tokenizers 0.14.1