bengali_qa_model_AGGRO_banglabert

This model is a fine-tuned version of csebuetnlp/banglabert on an unknown dataset. It achieves the following results on the evaluation set:

Loss: 0.2676
Exact Match: 98.5714
F1 Score: 99.0056
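
The card does not say how Exact Match and F1 were computed. As a rough illustration only, the sketch below assumes the common SQuAD-style definitions for extractive question answering: exact string match after whitespace normalization, and token-overlap F1, both averaged over the evaluation set and reported as percentages. The function names and the whitespace tokenization are assumptions made for this example, not the evaluation code behind the numbers above.

```python
from collections import Counter


def exact_match(prediction: str, reference: str) -> float:
    """Return 1.0 if the whitespace-normalized strings are identical, else 0.0."""
    return float(" ".join(prediction.split()) == " ".join(reference.split()))


def token_f1(prediction: str, reference: str) -> float:
    """Token-overlap F1 between a predicted and a reference answer span."""
    pred_tokens = prediction.split()
    ref_tokens = reference.split()
    if not pred_tokens or not ref_tokens:
        # Both empty counts as a match; exactly one empty counts as a miss.
        return float(pred_tokens == ref_tokens)
    # Multiset intersection counts each shared token at most as often as it
    # appears in both answers.
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


def corpus_scores(pairs):
    """Average both metrics over (prediction, reference) pairs, as percentages."""
    n = len(pairs)
    em = 100.0 * sum(exact_match(p, r) for p, r in pairs) / n
    f1 = 100.0 * sum(token_f1(p, r) for p, r in pairs) / n
    return em, f1
```

Under these definitions F1 is never lower than Exact Match, because a prediction that partially overlaps the reference earns partial F1 credit but no Exact Match credit. This is consistent with the two scores reported above.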

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 2e-05
train_batch_size: 4
eval_batch_size: 4
seed: 3407
gradient_accumulation_steps: 16
total_train_batch_size: 64
optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
lr_scheduler_type: cosine
lr_scheduler_warmup_ratio: 0.1
training_steps: 100
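
Two of the values above follow from the others, and a small sketch may make the relationship clearer. It assumes a single training device (the only count for which 4 * 16 gives the listed total of 64) and a cosine schedule with linear warmup and half a cosine cycle, which is the usual shape for this scheduler type. The rounding of the warmup length and the helper names are assumptions made for this example, not taken from the training script.

```python
import math

# Values copied from the hyperparameter list above.
LEARNING_RATE = 2e-05
TRAIN_BATCH_SIZE = 4
GRADIENT_ACCUMULATION_STEPS = 16
WARMUP_RATIO = 0.1
TRAINING_STEPS = 100

# Assumption: one device, consistent with total_train_batch_size = 64.
NUM_DEVICES = 1


def effective_batch_size() -> int:
    """Examples consumed per optimizer step."""
    return TRAIN_BATCH_SIZE * GRADIENT_ACCUMULATION_STEPS * NUM_DEVICES


def lr_at(step: int) -> float:
    """Learning rate at an optimizer step: linear warmup, then cosine decay."""
    warmup_steps = round(WARMUP_RATIO * TRAINING_STEPS)  # 10% of 100 steps
    if step < warmup_steps:
        # Linear ramp from 0 up to the peak learning rate.
        return LEARNING_RATE * step / max(1, warmup_steps)
    # Fraction of the decay phase completed, in [0, 1].
    progress = (step - warmup_steps) / max(1, TRAINING_STEPS - warmup_steps)
    return LEARNING_RATE * max(0.0, 0.5 * (1.0 + math.cos(math.pi * progress)))
```

Under these assumptions the learning rate ramps up over the first 10 steps, peaks at 2e-05 at step 10, and decays to zero at step 100.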

Training results

Training Loss	Epoch	Step	Validation Loss	Exact Match	F1 Score
6.0126	0.0053	1	5.9783	0.0	0.6103
6.0125	0.0107	2	5.9540	0.0	0.7848
5.9675	0.0160	3	5.9074	0.0	0.9597
5.9287	0.0214	4	5.8425	0.0	1.7507
5.8586	0.0267	5	5.7636	0.1504	4.2535
5.8206	0.0321	6	5.6740	0.4511	11.2628
5.7246	0.0374	7	5.5749	1.7293	23.3816
5.634	0.0428	8	5.4574	3.9850	37.7873
5.4963	0.0481	9	5.3105	5.7895	47.4987
5.2985	0.0535	10	5.1265	7.5940	52.0471
5.182	0.0588	11	4.8997	11.9549	54.5555
4.973	0.0641	12	4.6631	15.9398	56.5530
4.8353	0.0695	13	4.4348	19.6241	58.6313
4.6269	0.0748	14	4.2322	23.5338	60.7029
4.4238	0.0802	15	4.0467	28.0451	62.7494
4.1976	0.0855	16	3.8781	32.7068	64.6375
4.1302	0.0909	17	3.7200	35.7895	66.2513
3.9139	0.0962	18	3.5621	39.5489	67.7758
3.8521	0.1016	19	3.4019	43.0075	69.2899
3.7003	0.1069	20	3.2534	46.0150	70.6373
3.5972	0.1123	21	3.1168	48.8722	72.3043
3.5249	0.1176	22	2.9875	51.5038	73.2903
3.1756	0.1230	23	2.8600	53.6090	74.1609
3.2323	0.1283	24	2.7356	55.2632	74.8864
3.0696	0.1336	25	2.6150	56.8421	75.8938
2.9806	0.1390	26	2.5029	58.9474	77.3831
2.8261	0.1443	27	2.3997	61.1278	78.8467
2.8965	0.1497	28	2.3045	63.9098	80.8890
2.6622	0.1550	29	2.2151	66.0902	82.4263
2.5132	0.1604	30	2.1300	68.0451	83.8984
2.5076	0.1657	31	2.0482	70.9774	85.4846
2.2189	0.1711	32	1.9678	72.7068	86.2628
2.0851	0.1764	33	1.8883	75.8647	87.8992
2.1198	0.1818	34	1.8091	78.5714	89.4148
2.0272	0.1871	35	1.7300	80.8271	90.3877
1.9951	0.1924	36	1.6514	82.7068	91.2138
1.7741	0.1978	37	1.5736	84.9624	91.8920
1.9176	0.2031	38	1.4970	86.5414	92.3250
1.8599	0.2085	39	1.4219	87.5940	92.7578
1.8095	0.2138	40	1.3496	88.5714	93.0980
1.7814	0.2192	41	1.2790	90.0752	93.7737
1.4602	0.2245	42	1.2103	91.5038	94.6447
1.5147	0.2299	43	1.1431	92.1805	95.1039
1.4205	0.2352	44	1.0774	92.9323	95.4111
1.3222	0.2406	45	1.0127	93.9850	96.0199
1.2477	0.2459	46	0.9508	94.8120	96.5219
1.1406	0.2513	47	0.8936	95.2632	96.8391
1.1698	0.2566	48	0.8382	96.3158	97.5331
1.1359	0.2619	49	0.7847	97.0677	97.9841
1.1811	0.2673	50	0.7324	97.5940	98.4006
0.9734	0.2726	51	0.6814	97.7444	98.5321
0.928	0.2780	52	0.6318	97.8947	98.6140
0.8989	0.2833	53	0.5859	98.1203	98.7571
0.7784	0.2887	54	0.5430	98.3459	98.9243
1.0015	0.2940	55	0.5027	98.3459	98.8914
0.7509	0.2994	56	0.4656	98.5714	99.0811
0.6838	0.3047	57	0.4328	98.7970	99.1723
0.7336	0.3101	58	0.4042	98.8722	99.1327
0.5729	0.3154	59	0.3781	98.9474	99.2079
0.5891	0.3207	60	0.3538	99.0226	99.3362
0.6168	0.3261	61	0.3322	99.1729	99.4169
0.5503	0.3314	62	0.3130	99.1729	99.4169
0.5058	0.3368	63	0.2955	99.1729	99.4169
0.4065	0.3421	64	0.2788	99.3233	99.5000
0.4466	0.3475	65	0.2638	99.2481	99.4981
0.4727	0.3528	66	0.2496	99.2481	99.4981
0.45	0.3582	67	0.2365	99.2481	99.4981
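
The Epoch column gives a rough way to estimate the size of the undocumented training set. The sketch below divides the last logged step by its epoch fraction to get steps per epoch, then multiplies by the total train batch size of 64. The result is only approximate: the epoch values are rounded to four decimals, and the final batch of an epoch may be smaller than 64. The helper names are made up for this example.

```python
# Step and epoch taken from the last logged row of the table above.
LAST_STEP = 67
LAST_EPOCH = 0.3582

# total_train_batch_size from the hyperparameter list.
TOTAL_TRAIN_BATCH_SIZE = 64


def steps_per_epoch() -> float:
    """Optimizer steps needed for one full pass over the training data."""
    return LAST_STEP / LAST_EPOCH


def approx_train_examples() -> int:
    """Approximate training set size implied by the step/epoch ratio."""
    return round(steps_per_epoch()) * TOTAL_TRAIN_BATCH_SIZE
```

With the logged values this gives about 187 steps per epoch, or roughly 12,000 training examples. The logged steps therefore cover only about a third of one pass over the data.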

Framework versions

Transformers 4.46.3
Pytorch 2.4.0
Datasets 3.1.0
Tokenizers 0.20.3

Model repository: Mediocre-Judge/bengali_qa_model_AGGRO_banglabert