
distilbert-base-uncased-finetuned-synthetic-finetuned-synthetic

This model is a fine-tuned version of Chrisantha/distilbert-base-uncased-finetuned-synthetic on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 1.4081

Model description

More information needed

Intended uses & limitations

More information needed
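
Pending documentation, the checkpoint can still be loaded for experimentation. The sketch below assumes a masked-language-modeling head (DistilBERT's pretraining objective); the card does not state the task, so swap the Auto class if the checkpoint actually carries a different head.

```python
from transformers import AutoModelForMaskedLM, AutoTokenizer

# Assumption: the checkpoint exposes a masked-LM head; the card does not
# document the task, so a different Auto class may be needed.
model_id = "Chrisantha/distilbert-base-uncased-finetuned-synthetic-finetuned-synthetic"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForMaskedLM.from_pretrained(model_id)

# Fill-mask example using DistilBERT's [MASK] token.
inputs = tokenizer("The capital of France is [MASK].", return_tensors="pt")
logits = model(**inputs).logits

# Decode the highest-scoring token at the masked position.
mask_pos = (inputs.input_ids == tokenizer.mask_token_id).nonzero(as_tuple=True)[1]
print(tokenizer.decode(logits[0, mask_pos].argmax(dim=-1)))
```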

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 100
  • mixed_precision_training: Native AMP
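
For reproduction, these settings map onto transformers.TrainingArguments roughly as sketched below. Only the values in the list above come from the card; the output directory and the per-epoch evaluation cadence (inferred from the per-epoch validation losses in the results table) are assumptions.

```python
from transformers import TrainingArguments

# Sketch only: values mirror the hyperparameter list above; everything
# else (output_dir, evaluation cadence) is an assumption.
training_args = TrainingArguments(
    output_dir="distilbert-finetuned-synthetic",  # placeholder
    learning_rate=2e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=100,
    fp16=True,  # "Native AMP" mixed-precision training
    evaluation_strategy="epoch",  # matches the per-epoch eval losses below
)
```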

Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| No log | 1.0 | 1 | 2.9242 |
| 0.5836 | 2.0 | 2 | 2.5911 |
| 0.5836 | 3.0 | 3 | 2.7194 |
| 0.782 | 4.0 | 4 | 2.3194 |
| 0.782 | 5.0 | 5 | 2.1952 |
| 1.3155 | 6.0 | 6 | 2.1321 |
| 1.3155 | 7.0 | 7 | 2.2769 |
| 0.596 | 8.0 | 8 | 2.2093 |
| 0.596 | 9.0 | 9 | 2.4133 |
| 0.817 | 10.0 | 10 | 2.4370 |
| 0.817 | 11.0 | 11 | 2.1859 |
| 0.7962 | 12.0 | 12 | 2.1760 |
| 0.7962 | 13.0 | 13 | 1.9116 |
| 0.7554 | 14.0 | 14 | 1.7670 |
| 0.7554 | 15.0 | 15 | 1.7386 |
| 0.4256 | 16.0 | 16 | 1.6506 |
| 0.4256 | 17.0 | 17 | 1.5478 |
| 0.6326 | 18.0 | 18 | 1.5998 |
| 0.6326 | 19.0 | 19 | 1.6936 |
| 0.493 | 20.0 | 20 | 1.6938 |
| 0.493 | 21.0 | 21 | 1.7659 |
| 0.5194 | 22.0 | 22 | 1.8872 |
| 0.5194 | 23.0 | 23 | 1.7004 |
| 0.4438 | 24.0 | 24 | 1.6653 |
| 0.4438 | 25.0 | 25 | 1.5889 |
| 0.5761 | 26.0 | 26 | 1.4914 |
| 0.5761 | 27.0 | 27 | 1.3813 |
| 0.395 | 28.0 | 28 | 1.4385 |
| 0.395 | 29.0 | 29 | 1.4067 |
| 0.4681 | 30.0 | 30 | 1.4021 |
| 0.4681 | 31.0 | 31 | 1.4172 |
| 0.6326 | 32.0 | 32 | 1.4502 |
| 0.6326 | 33.0 | 33 | 1.5628 |
| 0.3545 | 34.0 | 34 | 1.6276 |
| 0.3545 | 35.0 | 35 | 1.6164 |
| 0.4313 | 36.0 | 36 | 1.7040 |
| 0.4313 | 37.0 | 37 | 1.6950 |
| 0.3883 | 38.0 | 38 | 1.6429 |
| 0.3883 | 39.0 | 39 | 1.6180 |
| 0.5155 | 40.0 | 40 | 1.5417 |
| 0.5155 | 41.0 | 41 | 1.4499 |
| 0.3546 | 42.0 | 42 | 1.3885 |
| 0.3546 | 43.0 | 43 | 1.3061 |
| 0.2205 | 44.0 | 44 | 1.2986 |
| 0.2205 | 45.0 | 45 | 1.2861 |
| 0.2851 | 46.0 | 46 | 1.3785 |
| 0.2851 | 47.0 | 47 | 1.4008 |
| 0.3057 | 48.0 | 48 | 1.4402 |
| 0.3057 | 49.0 | 49 | 1.4538 |
| 0.3449 | 50.0 | 50 | 1.5073 |
| 0.3449 | 51.0 | 51 | 1.5050 |
| 0.1664 | 52.0 | 52 | 1.4939 |
| 0.1664 | 53.0 | 53 | 1.4691 |
| 0.1484 | 54.0 | 54 | 1.2829 |
| 0.1484 | 55.0 | 55 | 1.3112 |
| 0.3156 | 56.0 | 56 | 1.2328 |
| 0.3156 | 57.0 | 57 | 1.1700 |
| 0.379 | 58.0 | 58 | 1.1190 |
| 0.379 | 59.0 | 59 | 1.1429 |
| 0.2475 | 60.0 | 60 | 1.1544 |
| 0.2475 | 61.0 | 61 | 1.2303 |
| 0.2282 | 62.0 | 62 | 1.3118 |
| 0.2282 | 63.0 | 63 | 1.3701 |
| 0.2216 | 64.0 | 64 | 1.3705 |
| 0.2216 | 65.0 | 65 | 1.4848 |
| 0.1768 | 66.0 | 66 | 1.4744 |
| 0.1768 | 67.0 | 67 | 1.5796 |
| 0.1621 | 68.0 | 68 | 1.5674 |
| 0.1621 | 69.0 | 69 | 1.5873 |
| 0.3016 | 70.0 | 70 | 1.5756 |
| 0.3016 | 71.0 | 71 | 1.6496 |
| 0.2548 | 72.0 | 72 | 1.5922 |
| 0.2548 | 73.0 | 73 | 1.5911 |
| 0.2878 | 74.0 | 74 | 1.4912 |
| 0.2878 | 75.0 | 75 | 1.5303 |
| 0.2045 | 76.0 | 76 | 1.5293 |
| 0.2045 | 77.0 | 77 | 1.4076 |
| 0.219 | 78.0 | 78 | 1.4773 |
| 0.219 | 79.0 | 79 | 1.3878 |
| 0.1396 | 80.0 | 80 | 1.3349 |
| 0.1396 | 81.0 | 81 | 1.3670 |
| 0.166 | 82.0 | 82 | 1.4015 |
| 0.166 | 83.0 | 83 | 1.4132 |
| 0.2982 | 84.0 | 84 | 1.4478 |
| 0.2982 | 85.0 | 85 | 1.4803 |
| 0.1199 | 86.0 | 86 | 1.4667 |
| 0.1199 | 87.0 | 87 | 1.5402 |
| 0.1982 | 88.0 | 88 | 1.5515 |
| 0.1982 | 89.0 | 89 | 1.5189 |
| 0.1816 | 90.0 | 90 | 1.5545 |
| 0.1816 | 91.0 | 91 | 1.4814 |
| 0.1779 | 92.0 | 92 | 1.4943 |
| 0.1779 | 93.0 | 93 | 1.4430 |
| 0.0785 | 94.0 | 94 | 1.4865 |
| 0.0785 | 95.0 | 95 | 1.4919 |
| 0.1108 | 96.0 | 96 | 1.5035 |
| 0.1108 | 97.0 | 97 | 1.4088 |
| 0.2581 | 98.0 | 98 | 1.4104 |
| 0.2581 | 99.0 | 99 | 1.4549 |
| 0.1738 | 100.0 | 100 | 1.3761 |
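
The validation loss is noisy and bottoms out at 1.1190 around epoch 58 before drifting back up, so the final checkpoint is not the best one. When re-running this recipe, standard Trainer checkpoint selection can retain the best model; the flags below are generic transformers options, not settings recorded on this card.

```python
from transformers import TrainingArguments

# Generic best-checkpoint selection; not part of the original recipe.
training_args = TrainingArguments(
    output_dir="checkpoints",  # placeholder
    evaluation_strategy="epoch",
    save_strategy="epoch",
    load_best_model_at_end=True,
    metric_for_best_model="eval_loss",
    greater_is_better=False,  # lower eval_loss is better
)
```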

Framework versions

  • Transformers 4.41.1
  • Pytorch 2.3.0+cu121
  • Datasets 2.19.1
  • Tokenizers 0.19.1