german-jeopardy-longt5-base-256

This model is a fine-tuned version of google/long-t5-tglobal-base on the lmqg/qg_dequad dataset. It achieves the following results on the evaluation set:

  • Loss: 1.7833
  • Brevity Penalty: 0.8244 (see the check after this list)
  • System Length: 17427
  • Reference Length: 20793
  • ROUGE-1: 34.80
  • ROUGE-2: 16.54
  • ROUGE-L: 33.69
  • ROUGE-Lsum: 33.70
  • Exact Match: 1.50
  • BLEU: 10.52
  • F1: 33.92
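The reported brevity penalty is consistent with the standard BLEU definition, BP = exp(1 - reference_length / system_length) when the system output is shorter than the reference. A quick sanity check, assuming that definition:

```python
import math

# BLEU brevity penalty: penalizes system output shorter than the reference.
sys_len, ref_len = 17427, 20793
bp = math.exp(1 - ref_len / sys_len) if sys_len < ref_len else 1.0
print(round(bp, 4))  # 0.8244, matching the value reported above
```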

Model description

See google/long-t5-tglobal-base for more information about the model architecture; the fine-tuned checkpoint has roughly 248M parameters, stored as F32 safetensors.
The model was trained on a single NVIDIA RTX 3090 GPU with 24 GB of VRAM.

Intended uses & limitations

This model can be used for question generation on German text.
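A minimal inference sketch using the transformers text2text-generation pipeline is shown below. The exact input format expected by this checkpoint (for example, whether the answer span must be highlighted, as in other lmqg-style question-generation models) is not documented here, so treat the plain-context prompt as an assumption:

```python
from transformers import pipeline

# Hypothetical usage sketch; the expected input serialization depends on
# how lmqg/qg_dequad examples were formatted during fine-tuning.
generator = pipeline(
    "text2text-generation",
    model="GiantTreeG/german-jeopardy-longt5-base-256",
)

context = (
    "Die Bundesrepublik Deutschland ist ein Bundesstaat in Mitteleuropa "
    "und besteht aus 16 Ländern."
)
print(generator(context, max_new_tokens=32)[0]["generated_text"])
```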

Training and evaluation data

See lmqg/qg_dequad.
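The dataset is hosted on the Hugging Face Hub and can be inspected with the datasets library:

```python
from datasets import load_dataset

# Load the German question-generation dataset used for fine-tuning.
dequad = load_dataset("lmqg/qg_dequad")
print(dequad)  # shows the available splits and their columns
```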

Training procedure

Training hyperparameters

The following hyperparameters were used during training; a sketch mapping them onto Seq2SeqTrainingArguments follows the list:

  • learning_rate: 0.0001
  • train_batch_size: 8
  • eval_batch_size: 4
  • seed: 7
  • gradient_accumulation_steps: 32
  • total_train_batch_size: 256
  • optimizer: Adafactor
  • lr_scheduler_type: constant
  • num_epochs: 20
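For reference, this is roughly how the values above map onto transformers Seq2SeqTrainingArguments. The output_dir is a hypothetical placeholder, and any settings not listed above (logging, evaluation strategy, and so on) are left at their defaults:

```python
from transformers import Seq2SeqTrainingArguments

# Sketch of the reported hyperparameters as TrainingArguments; only the
# values listed in the card are set explicitly.
args = Seq2SeqTrainingArguments(
    output_dir="german-jeopardy-longt5-base-256",  # hypothetical
    learning_rate=1e-4,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=4,
    seed=7,
    gradient_accumulation_steps=32,  # 8 x 32 = effective batch size of 256
    optim="adafactor",
    lr_scheduler_type="constant",
    num_train_epochs=20,
)
```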

Training results

| Training Loss | Epoch | Step | Validation Loss | Counts 1 | Counts 2 | Counts 3 | Counts 4 | Totals 1 | Totals 2 | Totals 3 | Totals 4 | Precisions 1 | Precisions 2 | Precisions 3 | Precisions 4 | Brevity Penalty | System Length | Reference Length | ROUGE-1 | ROUGE-2 | ROUGE-L | ROUGE-Lsum | Exact Match | BLEU | Mean Generated Length | F1 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 3.6024 | 0.99 | 36 | 2.4682 | 5645 | 1343 | 424 | 109 | 15388 | 13184 | 10980 | 8776 | 36.6844 | 10.1866 | 3.8616 | 1.242 | 0.6832 | 15388 | 21250 | 0.2285 | 0.0824 | 0.2192 | 0.2188 | 0.0005 | 4.4454 | 11.6338 | 0.2236 |
| 2.9671 | 1.98 | 72 | 2.2445 | 5988 | 1562 | 569 | 179 | 16094 | 13890 | 11686 | 9482 | 37.2064 | 11.2455 | 4.8691 | 1.8878 | 0.7259 | 16094 | 21250 | 0.2465 | 0.0971 | 0.2371 | 0.2371 | 0.0018 | 5.7163 | 12.314 | 0.2401 |
| 2.6324 | 2.99 | 109 | 2.1227 | 6539 | 1846 | 702 | 240 | 17173 | 14969 | 12765 | 10561 | 38.0772 | 12.3322 | 5.4994 | 2.2725 | 0.7887 | 17173 | 21250 | 0.2729 | 0.1154 | 0.2601 | 0.2604 | 0.0027 | 6.9028 | 13.2319 | 0.2663 |
| 2.5557 | 3.98 | 145 | 2.0357 | 6491 | 1923 | 752 | 275 | 15961 | 13757 | 11553 | 9349 | 40.6679 | 13.9783 | 6.5091 | 2.9415 | 0.7179 | 15961 | 21250 | 0.2783 | 0.1214 | 0.2676 | 0.2678 | 0.0059 | 7.3331 | 12.0962 | 0.2729 |
| 2.3785 | 5.0 | 182 | 1.9824 | 6808 | 2113 | 855 | 328 | 16439 | 14235 | 12031 | 9827 | 41.4137 | 14.8437 | 7.1066 | 3.3377 | 0.7463 | 16439 | 21250 | 0.2948 | 0.1326 | 0.2825 | 0.2825 | 0.0064 | 8.2007 | 12.6819 | 0.2892 |
| 2.3396 | 5.99 | 218 | 1.9449 | 7033 | 2194 | 886 | 364 | 16851 | 14647 | 12443 | 10239 | 41.7364 | 14.9792 | 7.1205 | 3.555 | 0.7702 | 16851 | 21250 | 0.3044 | 0.1373 | 0.292 | 0.2922 | 0.0086 | 8.639 | 13.0254 | 0.3 |
| 2.2557 | 6.98 | 254 | 1.8938 | 7167 | 2285 | 939 | 389 | 16529 | 14325 | 12121 | 9917 | 43.3602 | 15.9511 | 7.7469 | 3.9226 | 0.7515 | 16529 | 21250 | 0.3166 | 0.1428 | 0.3043 | 0.3046 | 0.0095 | 9.049 | 12.7119 | 0.3119 |
| 2.1168 | 7.99 | 291 | 1.8575 | 7347 | 2425 | 1021 | 425 | 16860 | 14656 | 12452 | 10248 | 43.5765 | 16.5461 | 8.1995 | 4.1472 | 0.7708 | 16860 | 21250 | 0.3258 | 0.1505 | 0.3137 | 0.3142 | 0.0104 | 9.6447 | 12.9374 | 0.3211 |
| 2.1105 | 8.98 | 327 | 1.8284 | 7460 | 2461 | 1061 | 449 | 17034 | 14830 | 12626 | 10422 | 43.7948 | 16.5947 | 8.4033 | 4.3082 | 0.7807 | 17034 | 21250 | 0.3317 | 0.1521 | 0.3187 | 0.3191 | 0.0095 | 9.9436 | 13.1828 | 0.3267 |
| 1.9913 | 10.0 | 364 | 1.8057 | 7547 | 2537 | 1105 | 487 | 17005 | 14801 | 12597 | 10393 | 44.3811 | 17.1407 | 8.7719 | 4.6858 | 0.7791 | 17005 | 21250 | 0.335 | 0.1566 | 0.323 | 0.3233 | 0.0113 | 10.3601 | 13.0358 | 0.3316 |
| 1.9943 | 10.99 | 400 | 1.7973 | 7629 | 2574 | 1131 | 496 | 16842 | 14638 | 12434 | 10230 | 45.2975 | 17.5844 | 9.096 | 4.8485 | 0.7697 | 16842 | 21250 | 0.343 | 0.1594 | 0.3296 | 0.33 | 0.0113 | 10.5378 | 13.0154 | 0.3385 |
| 1.941 | 11.98 | 436 | 1.7773 | 7681 | 2606 | 1164 | 528 | 17105 | 14901 | 12697 | 10493 | 44.905 | 17.4888 | 9.1675 | 5.0319 | 0.7848 | 17105 | 21250 | 0.3421 | 0.1607 | 0.3295 | 0.3294 | 0.0132 | 10.8273 | 13.1361 | 0.3385 |
| 1.8453 | 12.99 | 473 | 1.7595 | 7817 | 2700 | 1224 | 560 | 17324 | 15120 | 12916 | 10712 | 45.1224 | 17.8571 | 9.4766 | 5.2278 | 0.7972 | 17324 | 21250 | 0.3492 | 0.1662 | 0.3367 | 0.3367 | 0.0127 | 11.2687 | 13.5018 | 0.3447 |
| 1.85 | 13.98 | 509 | 1.7414 | 7792 | 2642 | 1182 | 537 | 17417 | 15213 | 13009 | 10805 | 44.7379 | 17.3667 | 9.086 | 4.9699 | 0.8025 | 17417 | 21250 | 0.3458 | 0.1632 | 0.3322 | 0.3322 | 0.0127 | 10.9825 | 13.5395 | 0.3416 |
| 1.7588 | 15.0 | 546 | 1.7346 | 7827 | 2702 | 1223 | 569 | 17265 | 15061 | 12857 | 10653 | 45.3345 | 17.9404 | 9.5123 | 5.3412 | 0.7939 | 17265 | 21250 | 0.3487 | 0.1661 | 0.3355 | 0.3354 | 0.015 | 11.3189 | 13.3026 | 0.3446 |
| 1.7663 | 15.99 | 582 | 1.7191 | 7946 | 2757 | 1245 | 581 | 17431 | 15227 | 13023 | 10819 | 45.5855 | 18.106 | 9.56 | 5.3702 | 0.8032 | 17431 | 21250 | 0.3544 | 0.1695 | 0.3418 | 0.3416 | 0.0154 | 11.5245 | 13.4515 | 0.3501 |
| 1.7317 | 16.98 | 618 | 1.7133 | 8068 | 2844 | 1325 | 633 | 17752 | 15548 | 13344 | 11140 | 45.4484 | 18.2917 | 9.9296 | 5.6822 | 0.8212 | 17752 | 21250 | 0.3575 | 0.1746 | 0.3445 | 0.3447 | 0.0163 | 12.0845 | 13.77 | 0.3527 |
| 1.6421 | 17.99 | 655 | 1.7198 | 8003 | 2823 | 1301 | 609 | 17535 | 15331 | 13127 | 10923 | 45.6401 | 18.4137 | 9.9109 | 5.5754 | 0.8091 | 17535 | 21250 | 0.3576 | 0.1737 | 0.3447 | 0.3448 | 0.015 | 11.877 | 13.4669 | 0.353 |
| 1.6543 | 18.98 | 691 | 1.7151 | 8031 | 2817 | 1294 | 612 | 17803 | 15599 | 13395 | 11191 | 45.1104 | 18.0588 | 9.6603 | 5.4687 | 0.824 | 17803 | 21250 | 0.3567 | 0.1734 | 0.3435 | 0.3431 | 0.015 | 11.8679 | 13.8648 | 0.351 |
| 1.5702 | 19.78 | 720 | 1.7079 | 7996 | 2850 | 1330 | 639 | 17275 | 15071 | 12867 | 10663 | 46.2865 | 18.9105 | 10.3365 | 5.9927 | 0.7945 | 17275 | 21250 | 0.3618 | 0.1769 | 0.3485 | 0.348 | 0.0168 | 12.1229 | 13.3367 | 0.3569 |

Framework versions

  • Transformers 4.32.1
  • PyTorch 2.1.0
  • Datasets 2.12.0
  • Tokenizers 0.13.3