toy

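This model is a fine-tuned version of distilbert-base-uncased on an unspecified dataset (recorded as None). The evaluation-set summary is missing from this card; the training log below shows a validation loss of 0.2124 after the final epoch (epoch 50, step 11550), and the lowest validation loss, 0.1586, at epoch 3.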

Model description

More information needed

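The training hyperparameters are not listed in this card. The log below gives the training loss and validation loss at the end of each of the 50 epochs (231 steps per epoch):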

Training Loss	Epoch	Step	Validation Loss
0.4798	1.0	231	0.2252
0.3378	2.0	462	0.1777
0.1024	3.0	693	0.1586
0.0736	4.0	924	0.1664
0.1237	5.0	1155	0.1692
0.1049	6.0	1386	0.1818
0.0239	7.0	1617	0.2127
0.0036	8.0	1848	0.1888
0.0051	9.0	2079	0.2061
0.0003	10.0	2310	0.1905
0.0005	11.0	2541	0.2011
0.0003	12.0	2772	0.1928
0.0029	13.0	3003	0.2563
0.0002	14.0	3234	0.2076
0.0002	15.0	3465	0.1980
0.0001	16.0	3696	0.2013
0.0001	17.0	3927	0.2089
0.0001	18.0	4158	0.1984
0.0001	19.0	4389	0.2017
0.0001	20.0	4620	0.2013
0.0001	21.0	4851	0.2142
0.0001	22.0	5082	0.1943
0.0001	23.0	5313	0.2003
0.0	24.0	5544	0.2015
0.0001	25.0	5775	0.2031
0.0002	26.0	6006	0.2600
0.0022	27.0	6237	0.2269
0.0	28.0	6468	0.2125
0.0	29.0	6699	0.2172
0.0	30.0	6930	0.2185
0.0	31.0	7161	0.2004
0.0	32.0	7392	0.2077
0.0	33.0	7623	0.2333
0.0003	34.0	7854	0.2102
0.0	35.0	8085	0.2095
0.0	36.0	8316	0.2030
0.0	37.0	8547	0.2038
0.0	38.0	8778	0.2062
0.0	39.0	9009	0.2080
0.0	40.0	9240	0.2083
0.0	41.0	9471	0.2063
0.0	42.0	9702	0.2146
0.0	43.0	9933	0.2168
0.0	44.0	10164	0.2112
0.0	45.0	10395	0.2109
0.0	46.0	10626	0.2116
0.0	47.0	10857	0.2122
0.0	48.0	11088	0.2122
0.0	49.0	11319	0.2124
0.0	50.0	11550	0.2124
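
The log suggests overfitting: training loss falls to roughly zero while validation loss bottoms out early and then drifts upward. As a minimal sketch (not part of the original training run), the snippet below copies the validation-loss column from the table and finds the epoch with the lowest value, along with the epoch at which a simple patience-based early-stopping rule would have halted. The helper names `best_epoch` and `early_stop_epoch` and the patience value of 3 are assumptions made for illustration only.

```python
# Validation loss per epoch, copied from the table above (epochs 1..50).
VAL_LOSS = [
    0.2252, 0.1777, 0.1586, 0.1664, 0.1692, 0.1818, 0.2127, 0.1888, 0.2061, 0.1905,
    0.2011, 0.1928, 0.2563, 0.2076, 0.1980, 0.2013, 0.2089, 0.1984, 0.2017, 0.2013,
    0.2142, 0.1943, 0.2003, 0.2015, 0.2031, 0.2600, 0.2269, 0.2125, 0.2172, 0.2185,
    0.2004, 0.2077, 0.2333, 0.2102, 0.2095, 0.2030, 0.2038, 0.2062, 0.2080, 0.2083,
    0.2063, 0.2146, 0.2168, 0.2112, 0.2109, 0.2116, 0.2122, 0.2122, 0.2124, 0.2124,
]
STEPS_PER_EPOCH = 231  # from the Step column: 231 steps at epoch 1, 462 at epoch 2, ...


def best_epoch(val_losses):
    """Return (epoch, loss) for the lowest validation loss; epochs are 1-indexed."""
    idx = min(range(len(val_losses)), key=val_losses.__getitem__)
    return idx + 1, val_losses[idx]


def early_stop_epoch(val_losses, patience):
    """Return the epoch at which training would stop if it halts after
    `patience` consecutive epochs without a new best validation loss.
    Hypothetical rule for illustration; the original run used all 50 epochs."""
    best = float("inf")
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            best = loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch
    return len(val_losses)


if __name__ == "__main__":
    epoch, loss = best_epoch(VAL_LOSS)
    print(f"best epoch: {epoch} (step {epoch * STEPS_PER_EPOCH}), val loss {loss:.4f}")
    print(f"early stop with patience=3 at epoch {early_stop_epoch(VAL_LOSS, 3)}")
```

On this log the minimum falls at epoch 3 (step 693), so a checkpoint from around that point would likely generalize at least as well as the final one, assuming validation loss is the metric of interest.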