distilbert-classn-LinearAlg-finetuned-span-width-2

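This model is a fine-tuned version of dslim/distilbert-NER. The auto-generated card does not name the training dataset (it is recorded as "None"). It achieves the following results on the evaluation set, which match the final logged evaluation (step 1800) in the training results: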

Loss: 0.8927
Accuracy: 0.7698
F1: 0.7669
Precision: 0.7824
Recall: 0.7698
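
Accuracy and Recall are identical above (0.7698). That is what one would expect if precision, recall, and F1 are support-weighted averages over classes, but the card does not state the averaging mode, so treat this as an assumption. A minimal sketch with made-up labels (not taken from this model's data) showing why support-weighted recall always equals accuracy in single-label classification:

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of examples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def weighted_recall(y_true, y_pred):
    """Per-class recall, averaged with weights proportional to class support."""
    n = len(y_true)
    support = Counter(y_true)
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    # recall_c = correct[c] / support[c]; weight_c = support[c] / n.
    # The support terms cancel, leaving (total correct) / n, i.e. accuracy.
    return sum((support[c] / n) * (correct[c] / support[c]) for c in support)


# Toy labels, purely illustrative.
y_true = [0, 0, 0, 1, 1, 2, 2, 2, 2, 2]
y_pred = [0, 1, 0, 1, 2, 2, 2, 0, 2, 2]

print(accuracy(y_true, y_pred), weighted_recall(y_true, y_pred))
```

Precision and F1 do not share this identity, which is consistent with their differing from accuracy in the list above.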

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 1e-05
train_batch_size: 2
eval_batch_size: 2
seed: 42
gradient_accumulation_steps: 2
total_train_batch_size: 4
optimizer: AdamW (OptimizerNames.ADAMW_TORCH) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
lr_scheduler_type: linear
lr_scheduler_warmup_steps: 500
num_epochs: 25
mixed_precision_training: Native AMP
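
The batch-size and scheduler settings above interact in ways that are easy to miss, so here is a small sketch of the arithmetic. The constants are copied from the list; the figure of 73 optimizer steps per epoch is an assumption inferred from the training results (step 50 is logged at epoch 0.6849), and a single device is assumed. The function `linear_lr` is a hypothetical helper that mirrors the usual linear warmup and decay shape, not code taken from the training run.

```python
# Values copied from the hyperparameter list.
LEARNING_RATE = 1e-5
TRAIN_BATCH_SIZE = 2
GRAD_ACCUM_STEPS = 2
WARMUP_STEPS = 500
NUM_EPOCHS = 25

# Assumption: about 73 optimizer steps per epoch (50 / 0.6849), single device.
STEPS_PER_EPOCH = 73
TOTAL_STEPS = STEPS_PER_EPOCH * NUM_EPOCHS


def effective_batch_size(per_device=TRAIN_BATCH_SIZE, accum=GRAD_ACCUM_STEPS, devices=1):
    """Number of examples contributing to each optimizer step."""
    return per_device * accum * devices


def linear_lr(step, base_lr=LEARNING_RATE, warmup=WARMUP_STEPS, total=TOTAL_STEPS):
    """Linear warmup from 0 to base_lr, then linear decay to 0 at `total`."""
    if step < warmup:
        return base_lr * step / warmup
    return base_lr * max(0.0, (total - step) / (total - warmup))


print("effective batch size:", effective_batch_size())
print("total optimizer steps:", TOTAL_STEPS)
for step in (0, 250, 500, 1000, 1800):
    print(step, linear_lr(step))
```

Under these assumptions the warmup spans 500 of roughly 1825 steps, so the peak learning rate of 1e-05 is reached only around epoch 6.85.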

Training results

Training Loss	Epoch	Step	Validation Loss	Accuracy	F1	Precision	Recall
4.8367	0.6849	50	2.4596	0.0794	0.0714	0.0958	0.0794
4.9882	1.3699	100	2.4445	0.0794	0.0672	0.0879	0.0794
4.8852	2.0548	150	2.4040	0.0873	0.0904	0.1342	0.0873
4.7843	2.7397	200	2.3744	0.1429	0.1481	0.2396	0.1429
4.7520	3.4247	250	2.3612	0.1032	0.1062	0.1491	0.1032
4.6277	4.1096	300	2.3446	0.1587	0.1570	0.1976	0.1587
4.4488	4.7945	350	2.2895	0.1746	0.1760	0.2217	0.1746
4.4244	5.4795	400	2.2383	0.2302	0.2282	0.3192	0.2302
3.9882	6.1644	450	2.1156	0.2381	0.2338	0.2955	0.2381
3.7244	6.8493	500	1.9715	0.3730	0.3763	0.4472	0.3730
3.2134	7.5342	550	1.8718	0.4206	0.3950	0.4017	0.4206
2.9113	8.2192	600	1.7821	0.4127	0.4249	0.5411	0.4127
2.4754	8.9041	650	1.6155	0.4841	0.4828	0.5088	0.4841
1.9316	9.5890	700	1.4559	0.5714	0.5673	0.5759	0.5714
1.6141	10.2740	750	1.2770	0.6429	0.6300	0.6630	0.6429
1.1867	10.9589	800	1.1722	0.6508	0.6439	0.6649	0.6508
0.9252	11.6438	850	1.0998	0.6825	0.6830	0.7084	0.6825
0.7640	12.3288	900	1.0359	0.7143	0.7181	0.7575	0.7143
0.5821	13.0137	950	0.9742	0.7302	0.7288	0.7554	0.7302
0.4689	13.6986	1000	0.9252	0.7460	0.7459	0.7639	0.7460
0.3578	14.3836	1050	0.9470	0.7302	0.7281	0.7663	0.7302
0.2932	15.0685	1100	0.9157	0.7222	0.7181	0.7552	0.7222
0.2262	15.7534	1150	0.8814	0.7540	0.7525	0.7723	0.7540
0.2127	16.4384	1200	0.8926	0.7381	0.7349	0.7488	0.7381
0.1445	17.1233	1250	0.8955	0.7698	0.7672	0.7891	0.7698
0.1183	17.8082	1300	0.8903	0.7698	0.7648	0.8007	0.7698
0.0757	18.4932	1350	0.8743	0.7698	0.7656	0.7831	0.7698
0.0939	19.1781	1400	0.8584	0.8016	0.8032	0.8200	0.8016
0.0705	19.8630	1450	0.8636	0.7857	0.7849	0.7965	0.7857
0.0605	20.5479	1500	0.8750	0.7778	0.7743	0.7831	0.7778
0.0467	21.2329	1550	0.8834	0.7778	0.7762	0.7898	0.7778
0.0777	21.9178	1600	0.8909	0.7698	0.7668	0.7809	0.7698
0.0349	22.6027	1650	0.8852	0.7698	0.7669	0.7824	0.7698
0.0442	23.2877	1700	0.8873	0.7698	0.7669	0.7824	0.7698
0.0253	23.9726	1750	0.8917	0.7698	0.7669	0.7824	0.7698
0.0335	24.6575	1800	0.8927	0.7698	0.7669	0.7824	0.7698
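
In the table above, training loss keeps falling to 0.0335 while validation loss reaches its minimum of 0.8584 at step 1400 and then drifts back up to 0.8927, and the headline metrics correspond to the last evaluation rather than the best one. The card does not say which checkpoint the published weights come from. A minimal sketch of picking the best checkpoint from the log, using a subset of rows copied from the table (the helper names are hypothetical):

```python
# (step, training_loss, validation_loss, accuracy), copied from the table.
LOG = [
    (1300, 0.1183, 0.8903, 0.7698),
    (1350, 0.0757, 0.8743, 0.7698),
    (1400, 0.0939, 0.8584, 0.8016),
    (1450, 0.0705, 0.8636, 0.7857),
    (1500, 0.0605, 0.8750, 0.7778),
    (1800, 0.0335, 0.8927, 0.7698),
]


def best_by_validation_loss(log):
    """Row with the lowest validation loss."""
    return min(log, key=lambda row: row[2])


def best_by_accuracy(log):
    """Row with the highest evaluation accuracy."""
    return max(log, key=lambda row: row[3])


best = best_by_validation_loss(LOG)
final = LOG[-1]
print("best step:", best[0], "val loss:", best[2], "accuracy:", best[3])
print("final step:", final[0], "val loss:", final[2], "accuracy:", final[3])
```

With the Transformers Trainer, setting `load_best_model_at_end=True` together with `metric_for_best_model` in `TrainingArguments` is one way to keep the best checkpoint instead of the last; whether that was done for this run is not stated.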

Framework versions

Transformers 4.48.3
Pytorch 2.5.1+cu124
Datasets 3.3.2
Tokenizers 0.21.0

Repository: Heather-Driver/distilbert-classn-LinearAlg-finetuned-span-width-2
