mDeBERTa-v3-base-xnli-multilingual-nli-2mil7

This model is a fine-tuned version of MoritzLaurer/mDeBERTa-v3-base-xnli-multilingual-nli-2mil7. The training dataset is not specified. Results on the evaluation and test sets are listed in the tables below.

Model description

More information needed

The hyperparameters used during training are not listed. Training and validation metrics were logged every 200 steps:

Training Loss	Epoch	Step	Validation Loss	F1 Macro	F1 Micro	Accuracy Balanced	Accuracy	Precision Macro	Recall Macro	Precision Micro	Recall Micro
0.4669	0.17	200	0.4194	0.8011	0.8015	0.8068	0.8015	0.8029	0.8068	0.8015	0.8015
0.3921	0.34	400	0.4010	0.8139	0.8205	0.8095	0.8205	0.8283	0.8095	0.8205	0.8205
0.3468	0.51	600	0.3457	0.8459	0.8486	0.8445	0.8486	0.8478	0.8445	0.8486	0.8486
0.3299	0.68	800	0.3523	0.8595	0.8613	0.8598	0.8613	0.8593	0.8598	0.8613	0.8613
0.3192	0.85	1000	0.3372	0.8570	0.8592	0.8563	0.8592	0.8578	0.8563	0.8592	0.8592
0.3063	1.02	1200	0.3502	0.8594	0.8602	0.8627	0.8602	0.8585	0.8627	0.8602	0.8602
0.2481	1.19	1400	0.3579	0.8600	0.8624	0.8589	0.8624	0.8615	0.8589	0.8624	0.8624
0.2447	1.35	1600	0.3617	0.8636	0.8650	0.8649	0.8650	0.8628	0.8649	0.8650	0.8650
0.2496	1.52	1800	0.3494	0.8658	0.8677	0.8654	0.8677	0.8661	0.8654	0.8677	0.8677
0.2444	1.69	2000	0.3345	0.8644	0.8666	0.8635	0.8666	0.8656	0.8635	0.8666	0.8666
0.2217	1.86	2200	0.3452	0.8714	0.8724	0.8737	0.8724	0.8703	0.8737	0.8724	0.8724
0.2149	2.03	2400	0.3673	0.8727	0.8740	0.8737	0.8740	0.8719	0.8737	0.8740	0.8740
0.1660	2.20	2600	0.3971	0.8731	0.8751	0.8723	0.8751	0.8741	0.8723	0.8751	0.8751
0.1685	2.37	2800	0.3884	0.8696	0.8714	0.8693	0.8714	0.8698	0.8693	0.8714	0.8714
0.1737	2.54	3000	0.3896	0.8674	0.8692	0.8672	0.8692	0.8676	0.8672	0.8692	0.8692
0.1667	2.71	3200	0.3950	0.8718	0.8735	0.8717	0.8735	0.8718	0.8717	0.8735	0.8735
0.1811	2.88	3400	0.3889	0.8707	0.8724	0.8708	0.8724	0.8707	0.8708	0.8724	0.8724
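
In the table above, F1 Micro, Accuracy, Precision Micro and Recall Micro are identical in every row, and Recall Macro matches Accuracy Balanced. For single-label classification this is expected rather than a logging error. A minimal Python sketch shows why, using an invented 3-class confusion matrix (the counts and the number of classes are assumptions for illustration, not outputs of this model):

```python
# Hypothetical confusion matrix: rows = true label, columns = predicted label.
# These counts are invented for illustration; they do not come from this model.
confusion = [
    [50, 5, 5],
    [4, 30, 6],
    [2, 8, 40],
]

n = len(confusion)
total = sum(sum(row) for row in confusion)
true_pos = [confusion[i][i] for i in range(n)]
actual = [sum(confusion[i]) for i in range(n)]  # samples per true class
predicted = [sum(confusion[r][c] for r in range(n)) for c in range(n)]  # samples per predicted class

accuracy = sum(true_pos) / total

# Micro averaging pools the counts of all classes before dividing.
# Every misclassified sample is a false positive for one class and a
# false negative for another, so the pooled FP and FN totals are equal.
false_pos = sum(predicted[i] - true_pos[i] for i in range(n))
false_neg = sum(actual[i] - true_pos[i] for i in range(n))
precision_micro = sum(true_pos) / (sum(true_pos) + false_pos)
recall_micro = sum(true_pos) / (sum(true_pos) + false_neg)
f1_micro = 2 * precision_micro * recall_micro / (precision_micro + recall_micro)

# Macro averaging computes the metric per class, then takes the plain mean.
recall_per_class = [true_pos[i] / actual[i] for i in range(n)]
precision_per_class = [true_pos[i] / predicted[i] for i in range(n)]
f1_per_class = [
    2 * p * r / (p + r) for p, r in zip(precision_per_class, recall_per_class)
]
recall_macro = sum(recall_per_class) / n
precision_macro = sum(precision_per_class) / n
f1_macro = sum(f1_per_class) / n

# Balanced accuracy is defined as the mean of per-class recall,
# which is the same quantity as macro recall.
balanced_accuracy = sum(recall_per_class) / n

print(f"accuracy={accuracy:.4f} f1_micro={f1_micro:.4f}")
print(f"recall_macro={recall_macro:.4f} balanced_accuracy={balanced_accuracy:.4f}")
print(f"precision_macro={precision_macro:.4f} f1_macro={f1_macro:.4f}")
```

Only the macro-averaged columns and the balanced accuracy carry information beyond plain accuracy: they weight each class equally, so they diverge from accuracy when the classes are imbalanced.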

Datasets	asadfgglie/nli-zh-tw-all/test	asadfgglie/BanBan_2024-10-17-facial_expressions-nli/test	eval_dataset	test_dataset
eval_loss	0.365	0.29	0.389	0.35
eval_f1_macro	0.875	0.911	0.87	0.881
eval_f1_micro	0.876	0.911	0.871	0.881
eval_accuracy_balanced	0.875	0.911	0.87	0.881
eval_accuracy	0.876	0.911	0.871	0.881
eval_precision_macro	0.875	0.912	0.87	0.881
eval_recall_macro	0.875	0.911	0.87	0.881
eval_precision_micro	0.876	0.911	0.871	0.881
eval_recall_micro	0.876	0.911	0.871	0.881
eval_runtime	232.017	4.063	51.192	204.15
eval_samples_per_second	36.635	232.844	36.9	37.017
eval_steps_per_second	0.289	1.969	0.293	0.294
Size of dataset	8500	946	1889	7557
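
The throughput rows can be cross-checked against the dataset sizes and runtimes in the same table. The sketch below recomputes samples per second as size divided by runtime, and steps per second under the assumption of an evaluation batch size of 128. That batch size is not stated anywhere in this card; it is inferred because it reproduces the reported step rates, and `ASSUMED_BATCH_SIZE` and `throughput` are names introduced here for illustration.

```python
import math

# Values copied from the table above:
# (dataset size, eval_runtime in seconds,
#  reported eval_samples_per_second, reported eval_steps_per_second)
runs = {
    "asadfgglie/nli-zh-tw-all/test": (8500, 232.017, 36.635, 0.289),
    "asadfgglie/BanBan_2024-10-17-facial_expressions-nli/test": (946, 4.063, 232.844, 1.969),
    "eval_dataset": (1889, 51.192, 36.9, 0.293),
    "test_dataset": (7557, 204.15, 37.017, 0.294),
}

# Assumption: not documented in the card, chosen because it matches the step rates.
ASSUMED_BATCH_SIZE = 128


def throughput(size, runtime, batch_size):
    """Return (samples per second, steps per second) for one evaluation run."""
    steps = math.ceil(size / batch_size)  # the last batch may be partial
    return size / runtime, steps / runtime


recomputed = {}
for name, (size, runtime, reported_samples, reported_steps) in runs.items():
    samples_per_s, steps_per_s = throughput(size, runtime, ASSUMED_BATCH_SIZE)
    recomputed[name] = (samples_per_s, steps_per_s)
    print(
        f"{name}: samples/s {samples_per_s:.3f} (reported {reported_samples}), "
        f"steps/s {steps_per_s:.3f} (reported {reported_steps})"
    )
```

Small differences in the last digits are expected because the reported runtimes are rounded. The much higher sample rate on the BanBan test set, at a similar batch size, may reflect shorter inputs, though the card does not say so.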