bert-to-distilbert-NER

This model is a fine-tuned version of dslim/bert-base-NER on the conll2003 dataset. It achieves the following results on the evaluation set:

  • Loss: 43.2398
  • Precision: 0.0147
  • Recall: 0.0187
  • F1: 0.0165
  • Accuracy: 0.7599
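
For reference, a minimal inference sketch using the transformers token-classification pipeline. It assumes the checkpoint is available on the Hub under the repository id shown on this card; the example sentence and the aggregation setting are illustrative, not taken from the card.

```python
# Minimal inference sketch, assuming the checkpoint is public on the Hub.
from transformers import pipeline

# "Zacarage/bert-to-distilbert-NER" is the repo id shown on this card;
# aggregation_strategy="simple" merges word-piece predictions into entity spans.
ner = pipeline(
    "token-classification",
    model="Zacarage/bert-to-distilbert-NER",
    aggregation_strategy="simple",
)

print(ner("My name is Wolfgang and I live in Berlin."))
```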

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (see the configuration sketch after this list):

  • learning_rate: 6e-05
  • train_batch_size: 128
  • eval_batch_size: 128
  • seed: 33
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 15
  • mixed_precision_training: Native AMP
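
As a sketch of how these settings could map onto transformers.TrainingArguments; the output directory, the per-epoch evaluation strategy, and the fp16 flag standing in for "Native AMP" are assumptions, not taken from this card.

```python
# Hyperparameter sketch mirroring the list above; output_dir is a placeholder.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="bert-to-distilbert-NER",   # assumed output directory
    learning_rate=6e-05,
    per_device_train_batch_size=128,
    per_device_eval_batch_size=128,
    seed=33,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-08,
    lr_scheduler_type="linear",
    num_train_epochs=15,
    fp16=True,                              # "Native AMP" mixed precision
    evaluation_strategy="epoch",            # assumed from the per-epoch rows below
)
```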

Training results

Training Loss | Epoch | Step | Validation Loss | Precision | Recall | F1     | Accuracy
190.2685      | 1.0   | 110  | 127.2351        | 0.0157    | 0.0098 | 0.0120 | 0.7569
105.4389      | 2.0   | 220  | 97.1100         | 0.0281    | 0.0298 | 0.0289 | 0.7587
77.0337       | 3.0   | 330  | 76.9433         | 0.0136    | 0.0173 | 0.0152 | 0.7615
60.3477       | 4.0   | 440  | 65.9181         | 0.0130    | 0.0158 | 0.0143 | 0.7603
50.4086       | 5.0   | 550  | 58.5255         | 0.0170    | 0.0220 | 0.0192 | 0.7603
43.2980       | 6.0   | 660  | 54.5405         | 0.0144    | 0.0187 | 0.0163 | 0.7594
39.0911       | 7.0   | 770  | 52.4767         | 0.0155    | 0.0195 | 0.0172 | 0.7613
35.0700       | 8.0   | 880  | 49.1975         | 0.0170    | 0.0219 | 0.0192 | 0.7602
32.2150       | 9.0   | 990  | 47.4422         | 0.0144    | 0.0187 | 0.0163 | 0.7599
29.9923       | 10.0  | 1100 | 46.5558         | 0.0167    | 0.0212 | 0.0187 | 0.7606
28.3599       | 11.0  | 1210 | 45.6301         | 0.0171    | 0.0214 | 0.0190 | 0.7613
26.8163       | 12.0  | 1320 | 45.0483         | 0.0141    | 0.0177 | 0.0157 | 0.7606
25.7434       | 13.0  | 1430 | 44.0639         | 0.0176    | 0.0222 | 0.0196 | 0.7605
24.9853       | 14.0  | 1540 | 43.6618         | 0.0148    | 0.0187 | 0.0165 | 0.7606
24.3179       | 15.0  | 1650 | 43.2398         | 0.0147    | 0.0187 | 0.0165 | 0.7599
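
The card does not state how these metrics were computed; a common setup for conll2003 token classification scores entity-level precision, recall, and F1 with seqeval, so the sketch below is an assumed reconstruction rather than the exact code behind this table.

```python
# Hedged sketch: a typical compute_metrics for conll2003 NER using seqeval.
import numpy as np
import evaluate

seqeval = evaluate.load("seqeval")
# conll2003 label set (dataset ner_tags order).
label_list = ["O", "B-PER", "I-PER", "B-ORG", "I-ORG", "B-LOC", "I-LOC", "B-MISC", "I-MISC"]

def compute_metrics(eval_pred):
    logits, labels = eval_pred
    predictions = np.argmax(logits, axis=-1)

    # Drop special/subword positions (label id -100) before scoring.
    true_predictions = [
        [label_list[p] for p, l in zip(pred, lab) if l != -100]
        for pred, lab in zip(predictions, labels)
    ]
    true_labels = [
        [label_list[l] for p, l in zip(pred, lab) if l != -100]
        for pred, lab in zip(predictions, labels)
    ]

    results = seqeval.compute(predictions=true_predictions, references=true_labels)
    return {
        "precision": results["overall_precision"],
        "recall": results["overall_recall"],
        "f1": results["overall_f1"],
        "accuracy": results["overall_accuracy"],
    }
```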

Framework versions

  • Transformers 4.25.1
  • Pytorch 1.13.1+cu116
  • Datasets 2.8.0
  • Tokenizers 0.13.2