bert-to-distilbert-NER

This model is a fine-tuned version of dslim/bert-base-NER on the conll2003 dataset. It achieves the following results on the evaluation set:

  • Loss: 44.0386
  • Precision: 0.0145
  • Recall: 0.0185
  • F1: 0.0163
  • Accuracy: 0.7597
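
As a quick sanity check on the reported metrics, F1 is the harmonic mean of precision and recall, which reproduces the 0.0163 above from the precision and recall values:

```python
# Sanity check: F1 is the harmonic mean of precision and recall.
precision = 0.0145
recall = 0.0185

f1 = 2 * precision * recall / (precision + recall)
print(round(f1, 4))  # -> 0.0163, matching the reported F1
```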

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 6e-05
  • train_batch_size: 128
  • eval_batch_size: 128
  • seed: 33
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 15
  • mixed_precision_training: Native AMP
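
The linear scheduler above can be sketched in plain Python. This is a hedged illustration, not the exact training code: it assumes zero warmup steps and takes the total of 1650 optimizer steps (110 steps/epoch × 15 epochs) from the results table below.

```python
# Sketch of a linear LR schedule (no warmup assumed) over the
# 1650 optimizer steps implied by the training results table.
BASE_LR = 6e-5       # learning_rate from the hyperparameters
TOTAL_STEPS = 1650   # 110 steps/epoch x 15 epochs

def linear_lr(step: int) -> float:
    """Decay linearly from BASE_LR at step 0 to 0 at TOTAL_STEPS."""
    return BASE_LR * max(0.0, 1.0 - step / TOTAL_STEPS)

print(linear_lr(0))     # 6e-05 at the start of training
print(linear_lr(825))   # 3e-05 at the halfway point
print(linear_lr(1650))  # 0.0 at the final step
```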

Training results

Training Loss   Epoch   Step   Validation Loss   Precision   Recall   F1       Accuracy
201.4012        1.0      110   133.7231          0.0153      0.0106   0.0125   0.7539
106.9317        2.0      220   99.3629           0.0266      0.0305   0.0284   0.7593
81.3601         3.0      330   80.3763           0.0159      0.0214   0.0183   0.7604
63.8325         4.0      440   67.7620           0.0179      0.0244   0.0207   0.7599
52.0271         5.0      550   59.0806           0.0203      0.0268   0.0231   0.7598
44.4419         6.0      660   55.3208           0.0211      0.0278   0.0240   0.7603
39.2351         7.0      770   52.4510           0.0170      0.0222   0.0193   0.7598
35.3438         8.0      880   50.4576           0.0205      0.0268   0.0232   0.7604
32.7385         9.0      990   48.3418           0.0173      0.0227   0.0197   0.7595
30.6531        10.0     1100   46.7304           0.0147      0.0188   0.0165   0.7600
29.0811        11.0     1210   46.3386           0.0151      0.0190   0.0168   0.7599
27.9501        12.0     1320   45.4516           0.0163      0.0204   0.0181   0.7604
26.7452        13.0     1430   44.3425           0.0154      0.0199   0.0173   0.7592
25.5367        14.0     1540   44.0415           0.0146      0.0190   0.0165   0.7594
24.5507        15.0     1650   44.0386           0.0145      0.0185   0.0163   0.7597
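
The 110 steps per epoch in the table are consistent with the dataset and batch size. Assuming the conll2003 train split's 14,041 sentences (per the Hugging Face dataset card; an assumption here, not stated in this card):

```python
import math

# Hedged sanity check: 14,041 train sentences (assumed, from the
# Hugging Face conll2003 dataset card) at train_batch_size=128
# gives the 110 steps/epoch seen in the results table.
train_examples = 14_041
batch_size = 128

steps_per_epoch = math.ceil(train_examples / batch_size)
print(steps_per_epoch)       # 110 steps per epoch
print(steps_per_epoch * 15)  # 1650 total steps over 15 epochs
```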

Framework versions

  • Transformers 4.19.1
  • Pytorch 1.11.0+cu113
  • Datasets 2.2.1
  • Tokenizers 0.12.1