scc_rm

This model is a fine-tuned version of FacebookAI/roberta-base on an unspecified dataset. After the final training epoch it achieves the following results on the evaluation set:

Loss: 0.0060
Mse: 0.0060

Model description

More information needed

The following hyperparameters were used during training:

learning_rate: 0.0001
train_batch_size: 128
eval_batch_size: 128
seed: 42
optimizer: AdamW (OptimizerNames.ADAMW_TORCH) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
lr_scheduler_type: linear
num_epochs: 30
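
The linear scheduler listed above can be illustrated with a small, dependency-free sketch. It assumes zero warmup steps (no warmup setting is listed in this card) and a total of 630 optimizer steps (30 epochs at 21 steps each, per the results table). The function name `linear_lr` and its parameters are illustrative only and are not part of any library.

```python
def linear_lr(step, base_lr=1e-4, total_steps=630, warmup_steps=0):
    """Learning rate at an optimizer step: linear warmup, then linear decay to zero.

    warmup_steps=0 is an assumption; the card does not list a warmup setting.
    """
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Learning rate at a few epoch boundaries (21 optimizer steps per epoch).
for epoch in (0, 10, 20, 30):
    step = epoch * 21
    print(f"epoch {epoch:2d}  step {step:3d}  lr {linear_lr(step):.2e}")
```

Under these assumptions the rate starts at 1e-4, is halved by step 315 (epoch 15), and reaches zero at step 630.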

Training Loss	Epoch	Step	Validation Loss	Mse
0.031	1.0	21	0.0183	0.0183
0.0208	2.0	42	0.0107	0.0107
0.0135	3.0	63	0.0092	0.0092
0.0122	4.0	84	0.0129	0.0129
0.0101	5.0	105	0.0067	0.0067
0.0084	6.0	126	0.0083	0.0083
0.0077	7.0	147	0.0057	0.0057
0.0065	8.0	168	0.0074	0.0074
0.0061	9.0	189	0.0077	0.0077
0.0062	10.0	210	0.0068	0.0068
0.0047	11.0	231	0.0058	0.0058
0.0042	12.0	252	0.0072	0.0072
0.0039	13.0	273	0.0065	0.0065
0.0039	14.0	294	0.0064	0.0064
0.004	15.0	315	0.0073	0.0073
0.0039	16.0	336	0.0090	0.0090
0.004	17.0	357	0.0066	0.0066
0.0035	18.0	378	0.0070	0.0070
0.0031	19.0	399	0.0082	0.0082
0.0032	20.0	420	0.0053	0.0053
0.0032	21.0	441	0.0055	0.0055
0.0034	22.0	462	0.0056	0.0056
0.0027	23.0	483	0.0065	0.0065
0.0024	24.0	504	0.0058	0.0058
0.0026	25.0	525	0.0060	0.0060
0.0027	26.0	546	0.0061	0.0061
0.0026	27.0	567	0.0068	0.0068
0.0025	28.0	588	0.0060	0.0060
0.0022	29.0	609	0.0063	0.0063
0.0023	30.0	630	0.0060	0.0060
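
The validation loss and the Mse column are identical in every row, which suggests (but does not confirm) a regression objective trained with mean squared error. The sketch below copies the validation MSE values from the table and locates the lowest one; it also converts MSE to RMSE for readability. Whether the published weights come from the best epoch or the final epoch is not stated in this card, so the snippet only describes the logged metrics.

```python
import math

# Validation MSE per epoch, copied from the table above (epochs 1 to 30).
val_mse = [
    0.0183, 0.0107, 0.0092, 0.0129, 0.0067, 0.0083, 0.0057, 0.0074, 0.0077, 0.0068,
    0.0058, 0.0072, 0.0065, 0.0064, 0.0073, 0.0090, 0.0066, 0.0070, 0.0082, 0.0053,
    0.0055, 0.0056, 0.0065, 0.0058, 0.0060, 0.0061, 0.0068, 0.0060, 0.0063, 0.0060,
]

# Epochs are 1-indexed in the table, so shift the list index by one.
best_epoch = min(range(1, len(val_mse) + 1), key=lambda e: val_mse[e - 1])
best_mse = val_mse[best_epoch - 1]
final_mse = val_mse[-1]

print(f"best epoch: {best_epoch}, MSE {best_mse:.4f}, RMSE {math.sqrt(best_mse):.4f}")
print(f"final epoch: {len(val_mse)}, MSE {final_mse:.4f}, RMSE {math.sqrt(final_mse):.4f}")
```

The lowest validation MSE (0.0053) occurs at epoch 20, while training loss keeps falling through epoch 30, so the later epochs do not improve held-out error.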

Safetensors

model_size: 0.1B params
tensor_type: F32