
roberta-large-sst-2-16-13-smoothed

This model is a fine-tuned version of roberta-large on an unspecified dataset; the model name suggests a 16-example-per-class subset of SST-2 (seed 13) trained with label smoothing. It achieves the following results on the evaluation set:

  • Loss: 0.6487
  • Accuracy: 0.75
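
The card does not document intended use, but as a hedged sketch the checkpoint can be loaded for binary sentiment classification with the transformers pipeline. The repo id comes from the model name; the label mapping is an assumption, since the checkpoint config is not shown here.

```python
# Minimal usage sketch, assuming an SST-2-style binary sentiment head.
# The label names (LABEL_0/LABEL_1 vs. negative/positive) depend on the
# checkpoint's config and are not documented in this card.
from transformers import pipeline

classifier = pipeline(
    "text-classification",
    model="simonycl/roberta-large-sst-2-16-13-smoothed",
)

print(classifier("A gripping, beautifully shot film."))
# e.g. [{'label': 'LABEL_1', 'score': 0.93}]
```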

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (see the TrainingArguments sketch after this list):

  • learning_rate: 1e-05
  • train_batch_size: 32
  • eval_batch_size: 32
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 50
  • num_epochs: 75
  • label_smoothing_factor: 0.45
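
These settings map directly onto transformers TrainingArguments. The sketch below is a hedged reconstruction: the actual training script, dataset loading, and Trainer wiring are not published, so everything beyond the listed values is an assumption. Note that with label_smoothing_factor=0.45 and two labels, each one-hot target is softened to 0.775 for the true class and 0.225 for the other under the standard (1 − ε) · one-hot + ε/K formulation.

```python
# Hedged reconstruction of the documented hyperparameters; output_dir and
# the model/tokenizer setup are illustrative assumptions.
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    TrainingArguments,
)

model = AutoModelForSequenceClassification.from_pretrained(
    "roberta-large", num_labels=2  # binary SST-2-style head (assumption)
)
tokenizer = AutoTokenizer.from_pretrained("roberta-large")

training_args = TrainingArguments(
    output_dir="roberta-large-sst-2-16-13-smoothed",
    learning_rate=1e-5,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=32,
    seed=42,
    lr_scheduler_type="linear",
    warmup_steps=50,
    num_train_epochs=75,
    label_smoothing_factor=0.45,
    # Adam with betas=(0.9, 0.999) and eps=1e-8 matches the Trainer's
    # default optimizer, so no extra optimizer arguments are needed.
)
```

A Trainer built from these arguments would then need the few-shot train split and the evaluation set, neither of which is documented in this card.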

Training results

Each epoch corresponds to a single optimization step, consistent with a 32-example training set at batch size 32. "No log" means the training loss had not yet been recorded at that evaluation step; here it is logged every 10 steps.

| Training Loss | Epoch | Step | Validation Loss | Accuracy |
|:-------------:|:-----:|:----:|:---------------:|:--------:|
| No log | 1.0 | 1 | 0.7106 | 0.5 |
| No log | 2.0 | 2 | 0.7104 | 0.5 |
| No log | 3.0 | 3 | 0.7100 | 0.5 |
| No log | 4.0 | 4 | 0.7094 | 0.5 |
| No log | 5.0 | 5 | 0.7087 | 0.5 |
| No log | 6.0 | 6 | 0.7077 | 0.5 |
| No log | 7.0 | 7 | 0.7066 | 0.5 |
| No log | 8.0 | 8 | 0.7054 | 0.5 |
| No log | 9.0 | 9 | 0.7040 | 0.5 |
| 0.7172 | 10.0 | 10 | 0.7026 | 0.5 |
| 0.7172 | 11.0 | 11 | 0.7011 | 0.5 |
| 0.7172 | 12.0 | 12 | 0.6995 | 0.5 |
| 0.7172 | 13.0 | 13 | 0.6980 | 0.5 |
| 0.7172 | 14.0 | 14 | 0.6965 | 0.5312 |
| 0.7172 | 15.0 | 15 | 0.6951 | 0.5312 |
| 0.7172 | 16.0 | 16 | 0.6936 | 0.5312 |
| 0.7172 | 17.0 | 17 | 0.6921 | 0.5312 |
| 0.7172 | 18.0 | 18 | 0.6906 | 0.5312 |
| 0.7172 | 19.0 | 19 | 0.6895 | 0.5312 |
| 0.6997 | 20.0 | 20 | 0.6884 | 0.5312 |
| 0.6997 | 21.0 | 21 | 0.6874 | 0.5312 |
| 0.6997 | 22.0 | 22 | 0.6867 | 0.5625 |
| 0.6997 | 23.0 | 23 | 0.6860 | 0.5312 |
| 0.6997 | 24.0 | 24 | 0.6854 | 0.5938 |
| 0.6997 | 25.0 | 25 | 0.6846 | 0.6562 |
| 0.6997 | 26.0 | 26 | 0.6840 | 0.625 |
| 0.6997 | 27.0 | 27 | 0.6832 | 0.6562 |
| 0.6997 | 28.0 | 28 | 0.6826 | 0.6875 |
| 0.6997 | 29.0 | 29 | 0.6815 | 0.6875 |
| 0.6874 | 30.0 | 30 | 0.6804 | 0.6875 |
| 0.6874 | 31.0 | 31 | 0.6790 | 0.6875 |
| 0.6874 | 32.0 | 32 | 0.6772 | 0.6875 |
| 0.6874 | 33.0 | 33 | 0.6762 | 0.6562 |
| 0.6874 | 34.0 | 34 | 0.6753 | 0.6562 |
| 0.6874 | 35.0 | 35 | 0.6738 | 0.6875 |
| 0.6874 | 36.0 | 36 | 0.6725 | 0.6875 |
| 0.6874 | 37.0 | 37 | 0.6696 | 0.6875 |
| 0.6874 | 38.0 | 38 | 0.6687 | 0.6875 |
| 0.6874 | 39.0 | 39 | 0.6665 | 0.6875 |
| 0.6594 | 40.0 | 40 | 0.6643 | 0.6875 |
| 0.6594 | 41.0 | 41 | 0.6674 | 0.6875 |
| 0.6594 | 42.0 | 42 | 0.6733 | 0.6875 |
| 0.6594 | 43.0 | 43 | 0.6804 | 0.6875 |
| 0.6594 | 44.0 | 44 | 0.6731 | 0.6875 |
| 0.6594 | 45.0 | 45 | 0.6701 | 0.6875 |
| 0.6594 | 46.0 | 46 | 0.6687 | 0.6875 |
| 0.6594 | 47.0 | 47 | 0.6687 | 0.6562 |
| 0.6594 | 48.0 | 48 | 0.6757 | 0.625 |
| 0.6594 | 49.0 | 49 | 0.6739 | 0.6875 |
| 0.6089 | 50.0 | 50 | 0.6766 | 0.6875 |
| 0.6089 | 51.0 | 51 | 0.6724 | 0.6875 |
| 0.6089 | 52.0 | 52 | 0.6662 | 0.6875 |
| 0.6089 | 53.0 | 53 | 0.6664 | 0.6875 |
| 0.6089 | 54.0 | 54 | 0.6602 | 0.6875 |
| 0.6089 | 55.0 | 55 | 0.6505 | 0.6875 |
| 0.6089 | 56.0 | 56 | 0.6468 | 0.75 |
| 0.6089 | 57.0 | 57 | 0.6370 | 0.75 |
| 0.6089 | 58.0 | 58 | 0.6285 | 0.7812 |
| 0.6089 | 59.0 | 59 | 0.6267 | 0.7812 |
| 0.5694 | 60.0 | 60 | 0.6279 | 0.7812 |
| 0.5694 | 61.0 | 61 | 0.6364 | 0.7812 |
| 0.5694 | 62.0 | 62 | 0.6443 | 0.75 |
| 0.5694 | 63.0 | 63 | 0.6518 | 0.7812 |
| 0.5694 | 64.0 | 64 | 0.6634 | 0.7188 |
| 0.5694 | 65.0 | 65 | 0.6647 | 0.7188 |
| 0.5694 | 66.0 | 66 | 0.6679 | 0.7188 |
| 0.5694 | 67.0 | 67 | 0.6669 | 0.7188 |
| 0.5694 | 68.0 | 68 | 0.6626 | 0.7188 |
| 0.5694 | 69.0 | 69 | 0.6624 | 0.75 |
| 0.5618 | 70.0 | 70 | 0.6614 | 0.7188 |
| 0.5618 | 71.0 | 71 | 0.6592 | 0.75 |
| 0.5618 | 72.0 | 72 | 0.6571 | 0.75 |
| 0.5618 | 73.0 | 73 | 0.6541 | 0.75 |
| 0.5618 | 74.0 | 74 | 0.6499 | 0.75 |
| 0.5618 | 75.0 | 75 | 0.6487 | 0.75 |

Framework versions

  • Transformers 4.32.0.dev0
  • Pytorch 2.0.1+cu118
  • Datasets 2.4.0
  • Tokenizers 0.13.3