metadata

license: mit
base_model: roberta-large
tags:
  - generated_from_trainer
metrics:
  - accuracy
model-index:
  - name: roberta-large-sst-2-16-13
    results: []

roberta-large-sst-2-16-13

This model is a fine-tuned version of roberta-large. The training dataset is not recorded in this card. After the final epoch (epoch 150, step 150), it achieves the following results on the evaluation set:

Loss: 0.4022
Accuracy: 0.7812

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 1e-05
train_batch_size: 32
eval_batch_size: 32
seed: 42
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
lr_scheduler_warmup_steps: 500
num_epochs: 150
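
One consequence of these settings is worth spelling out. The training results table shows one optimizer step per epoch, so 150 epochs amount to 150 steps, fewer than the 500 configured warmup steps. Assuming the usual linear schedule with warmup (the rate rises linearly from 0 to the peak over the warmup steps, then decays linearly), the run ends while still warming up and never reaches 1e-05. The sketch below is a plain-Python illustration of that arithmetic, not the Trainer's own code; the function name `linear_warmup_lr` is hypothetical.

```python
def linear_warmup_lr(step, peak_lr=1e-05, warmup_steps=500, total_steps=150):
    """Learning rate after `step` optimizer steps.

    Assumption: linear warmup from 0 to peak_lr over warmup_steps,
    followed by linear decay to 0 at total_steps.
    """
    if step < warmup_steps:
        # Still warming up: scale the peak rate by the fraction of warmup done.
        return peak_lr * step / warmup_steps
    # Past warmup: decay linearly, never going below zero.
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)


# Schedule value at the end of the 150-step run described in this card.
final_lr = linear_warmup_lr(150)
print(f"{final_lr:.1e}")  # prints 3.0e-06
```

Under that assumption the learning rate tops out at roughly 3e-06, about 30% of the configured value, which may help explain why validation accuracy stays at 0.5 for the first several dozen epochs.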

Training results

Training Loss	Epoch	Step	Validation Loss	Accuracy
No log	1.0	1	0.6926	0.5
No log	2.0	2	0.6926	0.5
No log	3.0	3	0.6926	0.5
No log	4.0	4	0.6926	0.5
No log	5.0	5	0.6926	0.5
No log	6.0	6	0.6926	0.5
No log	7.0	7	0.6925	0.5
No log	8.0	8	0.6925	0.5
No log	9.0	9	0.6925	0.5
0.6898	10.0	10	0.6925	0.5
0.6898	11.0	11	0.6924	0.5
0.6898	12.0	12	0.6924	0.5
0.6898	13.0	13	0.6924	0.5
0.6898	14.0	14	0.6924	0.5
0.6898	15.0	15	0.6923	0.5
0.6898	16.0	16	0.6923	0.5
0.6898	17.0	17	0.6922	0.5
0.6898	18.0	18	0.6922	0.5
0.6898	19.0	19	0.6922	0.5
0.694	20.0	20	0.6921	0.5
0.694	21.0	21	0.6921	0.5
0.694	22.0	22	0.6920	0.5
0.694	23.0	23	0.6920	0.5
0.694	24.0	24	0.6920	0.5
0.694	25.0	25	0.6919	0.5
0.694	26.0	26	0.6919	0.5
0.694	27.0	27	0.6918	0.5
0.694	28.0	28	0.6918	0.5
0.694	29.0	29	0.6918	0.5
0.7021	30.0	30	0.6917	0.5
0.7021	31.0	31	0.6916	0.5
0.7021	32.0	32	0.6916	0.5
0.7021	33.0	33	0.6916	0.5
0.7021	34.0	34	0.6915	0.5
0.7021	35.0	35	0.6915	0.5
0.7021	36.0	36	0.6914	0.5
0.7021	37.0	37	0.6914	0.5
0.7021	38.0	38	0.6913	0.5
0.7021	39.0	39	0.6913	0.5
0.6798	40.0	40	0.6913	0.5
0.6798	41.0	41	0.6912	0.5
0.6798	42.0	42	0.6911	0.5
0.6798	43.0	43	0.6910	0.5
0.6798	44.0	44	0.6909	0.5
0.6798	45.0	45	0.6908	0.5
0.6798	46.0	46	0.6907	0.5
0.6798	47.0	47	0.6906	0.5
0.6798	48.0	48	0.6905	0.5
0.6798	49.0	49	0.6903	0.5
0.6874	50.0	50	0.6902	0.5
0.6874	51.0	51	0.6901	0.5
0.6874	52.0	52	0.6899	0.5
0.6874	53.0	53	0.6898	0.5
0.6874	54.0	54	0.6896	0.5
0.6874	55.0	55	0.6895	0.5
0.6874	56.0	56	0.6894	0.5
0.6874	57.0	57	0.6893	0.5
0.6874	58.0	58	0.6892	0.5
0.6874	59.0	59	0.6890	0.5
0.6878	60.0	60	0.6889	0.5
0.6878	61.0	61	0.6888	0.5
0.6878	62.0	62	0.6886	0.5
0.6878	63.0	63	0.6885	0.5
0.6878	64.0	64	0.6884	0.5
0.6878	65.0	65	0.6884	0.5
0.6878	66.0	66	0.6883	0.5
0.6878	67.0	67	0.6882	0.5
0.6878	68.0	68	0.6882	0.5
0.6878	69.0	69	0.6881	0.5
0.6805	70.0	70	0.6880	0.5312
0.6805	71.0	71	0.6878	0.5312
0.6805	72.0	72	0.6877	0.5312
0.6805	73.0	73	0.6874	0.5312
0.6805	74.0	74	0.6872	0.5312
0.6805	75.0	75	0.6870	0.5312
0.6805	76.0	76	0.6868	0.5312
0.6805	77.0	77	0.6865	0.5312
0.6805	78.0	78	0.6862	0.5
0.6805	79.0	79	0.6860	0.5
0.6675	80.0	80	0.6857	0.5
0.6675	81.0	81	0.6853	0.5312
0.6675	82.0	82	0.6849	0.5312
0.6675	83.0	83	0.6845	0.5312
0.6675	84.0	84	0.6840	0.5312
0.6675	85.0	85	0.6834	0.5625
0.6675	86.0	86	0.6827	0.5625
0.6675	87.0	87	0.6818	0.5625
0.6675	88.0	88	0.6809	0.5625
0.6675	89.0	89	0.6798	0.5625
0.65	90.0	90	0.6786	0.5625
0.65	91.0	91	0.6772	0.5625
0.65	92.0	92	0.6758	0.5625
0.65	93.0	93	0.6741	0.5625
0.65	94.0	94	0.6718	0.5625
0.65	95.0	95	0.6687	0.5625
0.65	96.0	96	0.6649	0.5625
0.65	97.0	97	0.6615	0.5625
0.65	98.0	98	0.6596	0.5625
0.65	99.0	99	0.6605	0.5625
0.611	100.0	100	0.6642	0.5625
0.611	101.0	101	0.6683	0.5625
0.611	102.0	102	0.6689	0.5625
0.611	103.0	103	0.6670	0.5625
0.611	104.0	104	0.6627	0.5312
0.611	105.0	105	0.6595	0.5312
0.611	106.0	106	0.6577	0.5625
0.611	107.0	107	0.6575	0.5938
0.611	108.0	108	0.6552	0.5938
0.611	109.0	109	0.6555	0.625
0.5787	110.0	110	0.6560	0.625
0.5787	111.0	111	0.6566	0.625
0.5787	112.0	112	0.6560	0.625
0.5787	113.0	113	0.6543	0.6562
0.5787	114.0	114	0.6530	0.6562
0.5787	115.0	115	0.6518	0.6562
0.5787	116.0	116	0.6512	0.6562
0.5787	117.0	117	0.6506	0.6562
0.5787	118.0	118	0.6500	0.6562
0.5787	119.0	119	0.6499	0.6875
0.5279	120.0	120	0.6497	0.6875
0.5279	121.0	121	0.6496	0.6875
0.5279	122.0	122	0.6494	0.6875
0.5279	123.0	123	0.6486	0.6875
0.5279	124.0	124	0.6472	0.6875
0.5279	125.0	125	0.6443	0.6875
0.5279	126.0	126	0.6397	0.6562
0.5279	127.0	127	0.6328	0.6562
0.5279	128.0	128	0.6238	0.6875
0.5279	129.0	129	0.6173	0.6875
0.4721	130.0	130	0.6138	0.6875
0.4721	131.0	131	0.6175	0.625
0.4721	132.0	132	0.6137	0.6562
0.4721	133.0	133	0.6101	0.6562
0.4721	134.0	134	0.6062	0.6562
0.4721	135.0	135	0.6027	0.6562
0.4721	136.0	136	0.6015	0.625
0.4721	137.0	137	0.5982	0.625
0.4721	138.0	138	0.6102	0.625
0.4721	139.0	139	0.5983	0.625
0.378	140.0	140	0.6020	0.625
0.378	141.0	141	0.5921	0.625
0.378	142.0	142	0.5790	0.625
0.378	143.0	143	0.5654	0.6562
0.378	144.0	144	0.5493	0.6562
0.378	145.0	145	0.5279	0.6562
0.378	146.0	146	0.5064	0.6562
0.378	147.0	147	0.4834	0.6875
0.378	148.0	148	0.4557	0.7188
0.378	149.0	149	0.4318	0.75
0.2537	150.0	150	0.4022	0.7812
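
The Accuracy column only takes values that are multiples of 1/32 (for example, 0.5312 is 17/32 and 0.7812 is 25/32 after four-decimal rounding), which hints at an evaluation set of 32 examples, or a multiple of 32. The card does not state the evaluation set size, so this is an inference, not a documented fact. A small Python sketch of the check follows; the helper `smallest_consistent_size` is hypothetical, and the tolerance is an assumption chosen to allow for four-decimal rounding.

```python
# Distinct accuracy values that appear in the training results table.
ACCURACIES = [0.5, 0.5312, 0.5625, 0.5938, 0.625,
              0.6562, 0.6875, 0.7188, 0.75, 0.7812]


def smallest_consistent_size(accuracies, max_size=1000, tol=1e-4):
    """Return the smallest n such that every accuracy is within tol of some k/n."""
    for n in range(1, max_size + 1):
        # round(a * n) is the closest count of correct predictions for size n.
        if all(abs(round(a * n) / n - a) <= tol for a in accuracies):
            return n
    return None


eval_size = smallest_consistent_size(ACCURACIES)
print(eval_size)  # prints 32
```

Read this way, the final accuracy of 0.7812 would correspond to 25 of 32 evaluation examples classified correctly, so each individual example moves the metric by about 3 percentage points.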

Framework versions

Transformers 4.32.0.dev0
PyTorch 2.0.1+cu118
Datasets 2.4.0
Tokenizers 0.13.3