1_7e-3_5_0.9

This model is a fine-tuned version of bert-large-uncased on the super_glue dataset. After the final training epoch (epoch 100, step 59000), it achieves the following results on the evaluation set:

Loss: 0.8502
Accuracy: 0.7459

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.007
train_batch_size: 16
eval_batch_size: 8
seed: 11
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
num_epochs: 100.0
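
As a rough illustration of how these settings fit together, the sketch below collects the reported values under the matching `transformers.TrainingArguments` keyword names and computes the learning rate that a linear schedule would produce at a given step. Several things here are assumptions rather than facts from the card: single-device training (so `train_batch_size` maps to `per_device_train_batch_size`), zero warmup steps (the card lists none), and the helper name `linear_lr`, which is hypothetical. The 590 steps per epoch come from the Step column of the training results.

```python
# Values reported on the card, keyed by transformers.TrainingArguments keyword names.
# Assumption: single-device training, so the card's train/eval batch sizes are
# treated as per-device batch sizes.
TRAINING_KWARGS = {
    "learning_rate": 7e-3,
    "per_device_train_batch_size": 16,
    "per_device_eval_batch_size": 8,
    "seed": 11,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 100.0,
}

STEPS_PER_EPOCH = 590  # step count after epoch 1 in the training results
TOTAL_STEPS = int(STEPS_PER_EPOCH * TRAINING_KWARGS["num_train_epochs"])  # 59000


def linear_lr(step, base_lr=TRAINING_KWARGS["learning_rate"],
              total_steps=TOTAL_STEPS, warmup_steps=0):
    """Linear warmup (if any) followed by linear decay to zero.

    Hypothetical helper; warmup_steps=0 is an assumption, not a card value.
    """
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Print the scheduled learning rate at a few points in training.
for epoch in (0, 25, 50, 75, 100):
    step = epoch * STEPS_PER_EPOCH
    print(f"epoch {epoch:>3}  step {step:>5}  lr {linear_lr(step):.6f}")
```

If the `transformers` library is installed, the dictionary could be passed along as `TrainingArguments(output_dir="out", **TRAINING_KWARGS)`, where `"out"` is a placeholder path.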

Training results

Training Loss	Epoch	Step	Validation Loss	Accuracy
3.7211	1.0	590	2.9812	0.6211
3.5848	2.0	1180	2.9510	0.4997
3.6107	3.0	1770	2.7782	0.5235
3.0177	4.0	2360	2.2450	0.6483
2.8029	5.0	2950	2.7490	0.6462
2.6357	6.0	3540	1.8031	0.6554
2.6215	7.0	4130	2.3281	0.5838
2.329	8.0	4720	2.0869	0.6862
2.2143	9.0	5310	1.9625	0.6257
2.2128	10.0	5900	2.0803	0.6859
2.0857	11.0	6490	1.4649	0.6972
1.9328	12.0	7080	1.8434	0.6945
1.8594	13.0	7670	1.4225	0.6765
1.9315	14.0	8260	1.5322	0.7156
1.9249	15.0	8850	1.4720	0.7162
1.7274	16.0	9440	1.6171	0.6547
1.5474	17.0	10030	1.1592	0.7153
1.5032	18.0	10620	1.3276	0.7205
1.5738	19.0	11210	1.4631	0.6786
1.6749	20.0	11800	1.9620	0.6266
1.4133	21.0	12390	1.0952	0.7245
1.3552	22.0	12980	1.2053	0.7015
1.4104	23.0	13570	1.2010	0.7110
1.3108	24.0	14160	1.0470	0.7309
1.3339	25.0	14750	1.4671	0.7333
1.2143	26.0	15340	1.2387	0.6963
1.2473	27.0	15930	1.2540	0.7355
1.2602	28.0	16520	1.0843	0.7205
1.1832	29.0	17110	1.4378	0.6795
1.0999	30.0	17700	1.6722	0.7321
1.0803	31.0	18290	1.8755	0.7131
1.1358	32.0	18880	0.9925	0.7428
1.0867	33.0	19470	1.1163	0.7450
1.0661	34.0	20060	1.1009	0.7483
1.0572	35.0	20650	0.9747	0.7306
0.987	36.0	21240	1.1560	0.7440
1.0077	37.0	21830	1.0074	0.7086
0.9957	38.0	22420	0.9483	0.7291
0.9444	39.0	23010	1.0395	0.7248
0.9516	40.0	23600	1.0121	0.7315
0.9195	41.0	24190	0.9376	0.7398
0.9188	42.0	24780	1.1039	0.7135
0.9049	43.0	25370	1.2491	0.7391
0.9134	44.0	25960	0.9002	0.7346
0.8631	45.0	26550	1.1289	0.7419
0.8403	46.0	27140	1.0339	0.7416
0.8611	47.0	27730	1.2419	0.7443
0.84	48.0	28320	0.8991	0.7401
0.8795	49.0	28910	0.9157	0.7361
0.8211	50.0	29500	1.0039	0.7223
0.8124	51.0	30090	1.1785	0.7104
0.79	52.0	30680	0.9678	0.7385
0.7861	53.0	31270	0.9861	0.7330
0.7715	54.0	31860	0.9533	0.7419
0.8118	55.0	32450	1.0008	0.7125
0.7777	56.0	33040	0.9696	0.7278
0.738	57.0	33630	0.9313	0.7428
0.727	58.0	34220	1.3281	0.7410
0.7597	59.0	34810	1.0580	0.7498
0.7349	60.0	35400	0.8889	0.7343
0.7087	61.0	35990	0.8935	0.7370
0.7298	62.0	36580	0.9416	0.7511
0.7057	63.0	37170	0.8895	0.7428
0.704	64.0	37760	0.8649	0.7379
0.6907	65.0	38350	0.9054	0.7459
0.6721	66.0	38940	1.4102	0.7346
0.6932	67.0	39530	1.3254	0.7453
0.6944	68.0	40120	0.8969	0.7336
0.6504	69.0	40710	0.9343	0.7456
0.6984	70.0	41300	0.8656	0.7434
0.6804	71.0	41890	0.8744	0.7358
0.6684	72.0	42480	1.2043	0.7462
0.6591	73.0	43070	0.8612	0.7450
0.6259	74.0	43660	1.1547	0.7465
0.653	75.0	44250	0.9455	0.7474
0.6503	76.0	44840	0.8475	0.7391
0.65	77.0	45430	0.8667	0.7443
0.6442	78.0	46020	0.8617	0.7465
0.6237	79.0	46610	1.0127	0.7508
0.6149	80.0	47200	0.9956	0.7498
0.5893	81.0	47790	0.9385	0.7462
0.6139	82.0	48380	0.9122	0.7526
0.6117	83.0	48970	0.8413	0.7440
0.5999	84.0	49560	1.0049	0.7468
0.6091	85.0	50150	0.9213	0.7468
0.6049	86.0	50740	0.8642	0.7364
0.5976	87.0	51330	0.9368	0.7498
0.5927	88.0	51920	0.8736	0.7480
0.5698	89.0	52510	0.9112	0.7474
0.569	90.0	53100	0.8784	0.7437
0.5919	91.0	53690	0.8803	0.7456
0.5837	92.0	54280	0.8348	0.7413
0.5699	93.0	54870	0.8705	0.7477
0.5851	94.0	55460	0.8580	0.7471
0.5527	95.0	56050	0.8816	0.7495
0.5719	96.0	56640	0.8519	0.7495
0.5575	97.0	57230	0.8333	0.7450
0.5432	98.0	57820	0.8497	0.7446
0.5425	99.0	58410	0.8369	0.7474
0.5555	100.0	59000	0.8502	0.7459
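
The per-epoch log above can be scanned for the strongest checkpoints with a few lines of Python. This is a minimal sketch: the names `EvalRow` and `parse_rows` are hypothetical, and `SAMPLE_ROWS` holds only four rows copied from the table, not the full log.

```python
from typing import List, NamedTuple


class EvalRow(NamedTuple):
    train_loss: float
    epoch: float
    step: int
    val_loss: float
    accuracy: float


# Four rows copied from the training results (epochs 62, 82, 97, 100).
SAMPLE_ROWS = """\
0.7298 62.0 36580 0.9416 0.7511
0.6139 82.0 48380 0.9122 0.7526
0.5575 97.0 57230 0.8333 0.7450
0.5555 100.0 59000 0.8502 0.7459
"""


def parse_rows(text: str) -> List[EvalRow]:
    """Parse whitespace- or tab-separated rows; skip lines without 5 fields."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 5:
            continue  # e.g. the header line or blank lines
        t, e, s, v, a = fields
        rows.append(EvalRow(float(t), float(e), int(s), float(v), float(a)))
    return rows


rows = parse_rows(SAMPLE_ROWS)
best_acc = max(rows, key=lambda r: r.accuracy)
best_loss = min(rows, key=lambda r: r.val_loss)
print(f"highest accuracy: {best_acc.accuracy} at epoch {best_acc.epoch:g}")
print(f"lowest validation loss: {best_loss.val_loss} at epoch {best_loss.epoch:g}")
```

Across the full table, the highest accuracy is 0.7526 at epoch 82 and the lowest validation loss is 0.8333 at epoch 97, so the final-epoch checkpoint is best on neither metric.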

Framework versions

Transformers 4.30.0
Pytorch 2.0.1+cu117
Datasets 2.14.4
Tokenizers 0.13.3

Onutoa/1_7e-3_5_0.9