herbert-large-cased_nli

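This model is a fine-tuned version of allegro/herbert-large-cased on an unknown dataset. At the final epoch (epoch 40, step 25000) it reaches a validation loss of 2.0905 and an accuracy of 0.770 on the evaluation set; per-epoch results are given in the table at the end of this card.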

Model description

More information needed

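The training hyperparameters are not recorded here. Training ran for 40 epochs of 625 steps each (25,000 steps in total), with the following results on the evaluation set: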

Training Loss	Epoch	Step	Validation Loss	Accuracy
No log	1.0	625	0.6466	0.751
No log	2.0	1250	0.5856	0.79
0.5915	3.0	1875	0.6142	0.761
0.5915	4.0	2500	0.6803	0.78
0.4204	5.0	3125	0.7207	0.786
0.4204	6.0	3750	0.7956	0.777
0.4204	7.0	4375	0.7964	0.787
0.306	8.0	5000	0.7869	0.766
0.306	9.0	5625	0.8671	0.766
0.2192	10.0	6250	0.8832	0.778
0.2192	11.0	6875	0.9147	0.768
0.1595	12.0	7500	1.1113	0.756
0.1595	13.0	8125	1.0984	0.761
0.1595	14.0	8750	1.3107	0.758
0.1288	15.0	9375	1.2892	0.764
0.1288	16.0	10000	1.5291	0.741
0.1037	17.0	10625	1.2105	0.786
0.1037	18.0	11250	1.3468	0.78
0.1037	19.0	11875	1.5642	0.758
0.0864	20.0	12500	1.5304	0.768
0.0864	21.0	13125	1.4310	0.776
0.0728	22.0	13750	1.5636	0.762
0.0728	23.0	14375	1.5032	0.766
0.0583	24.0	15000	1.7275	0.763
0.0583	25.0	15625	1.6669	0.758
0.0583	26.0	16250	1.6029	0.767
0.0453	27.0	16875	1.6239	0.771
0.0453	28.0	17500	1.6007	0.781
0.0335	29.0	18125	1.7028	0.766
0.0335	30.0	18750	1.8058	0.776
0.0335	31.0	19375	1.7894	0.766
0.0267	32.0	20000	1.8930	0.765
0.0267	33.0	20625	1.8582	0.775
0.022	34.0	21250	1.9610	0.764
0.022	35.0	21875	2.0128	0.775
0.0163	36.0	22500	2.0248	0.773
0.0163	37.0	23125	2.0203	0.77
0.0163	38.0	23750	2.0615	0.77
0.0115	39.0	24375	2.0787	0.769
0.0115	40.0	25000	2.0905	0.77
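
The log above shows training loss falling from 0.5915 to 0.0115 while validation loss rises from its minimum of 0.5856 (epoch 2) to 2.0905 (epoch 40), which suggests that the later checkpoints overfit. The following is a minimal sketch for picking the best epoch from these numbers and for estimating where a patience-based early-stopping rule would have halted training. The names `LOG` and `early_stop_epoch` and the patience value of 3 are illustrative assumptions; they are not part of the original training script, which is not available here.

```python
# (epoch, validation_loss, accuracy) copied from the results table above.
LOG = [
    (1, 0.6466, 0.751), (2, 0.5856, 0.790), (3, 0.6142, 0.761), (4, 0.6803, 0.780),
    (5, 0.7207, 0.786), (6, 0.7956, 0.777), (7, 0.7964, 0.787), (8, 0.7869, 0.766),
    (9, 0.8671, 0.766), (10, 0.8832, 0.778), (11, 0.9147, 0.768), (12, 1.1113, 0.756),
    (13, 1.0984, 0.761), (14, 1.3107, 0.758), (15, 1.2892, 0.764), (16, 1.5291, 0.741),
    (17, 1.2105, 0.786), (18, 1.3468, 0.780), (19, 1.5642, 0.758), (20, 1.5304, 0.768),
    (21, 1.4310, 0.776), (22, 1.5636, 0.762), (23, 1.5032, 0.766), (24, 1.7275, 0.763),
    (25, 1.6669, 0.758), (26, 1.6029, 0.767), (27, 1.6239, 0.771), (28, 1.6007, 0.781),
    (29, 1.7028, 0.766), (30, 1.8058, 0.776), (31, 1.7894, 0.766), (32, 1.8930, 0.765),
    (33, 1.8582, 0.775), (34, 1.9610, 0.764), (35, 2.0128, 0.775), (36, 2.0248, 0.773),
    (37, 2.0203, 0.770), (38, 2.0615, 0.770), (39, 2.0787, 0.769), (40, 2.0905, 0.770),
]


def early_stop_epoch(log, patience):
    """Return the epoch at which training would stop if it halted after
    `patience` consecutive epochs without a new minimum validation loss.
    Hypothetical helper for illustration only."""
    best = float("inf")
    stale = 0
    for epoch, val_loss, _ in log:
        if val_loss < best:
            best, stale = val_loss, 0
        else:
            stale += 1
            if stale >= patience:
                return epoch
    # The rule never triggered, so training runs to the last logged epoch.
    return log[-1][0]


# Epoch with the lowest validation loss and epoch with the highest accuracy.
best_by_loss = min(LOG, key=lambda row: row[1])
best_by_acc = max(LOG, key=lambda row: row[2])

print("lowest validation loss:", best_by_loss)
print("highest accuracy:", best_by_acc)
print("early stop (patience=3) at epoch:", early_stop_epoch(LOG, patience=3))
```

On this log both criteria select epoch 2, so a checkpoint from that epoch, if one was saved, would likely be a better choice than the final one.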