2_1e-2_10_0.9

This model is a fine-tuned version of bert-large-uncased on the super_glue dataset. It achieves the following results on the evaluation set:

  • Loss: 0.8066
  • Accuracy: 0.7550
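
Since the card reports an accuracy metric, the checkpoint presumably carries a sequence-classification head. Below is a minimal usage sketch, assuming the hub ID Onutoa/2_1e-2_10_0.9 (taken from this card) and a generic sentence-pair input; the card does not document which SuperGLUE subset was used, so the real input format and label names may differ.

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_id = "Onutoa/2_1e-2_10_0.9"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

# Placeholder sentence pair: the specific SuperGLUE task is not documented
# in this card, so treat this input format as an illustrative assumption.
inputs = tokenizer(
    "The city councilmen refused the demonstrators a permit.",
    "Did the councilmen grant a permit?",
    return_tensors="pt",
)
with torch.no_grad():
    logits = model(**inputs).logits
print(model.config.id2label[logits.argmax(dim=-1).item()])
```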

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.01
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 11
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 100.0
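
As a rough sketch, these settings map onto a transformers.TrainingArguments configuration along the following lines; the output directory and the per-epoch evaluation strategy are illustrative assumptions, while the remaining values mirror the list above.

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="2_1e-2_10_0.9",   # hypothetical output path
    learning_rate=1e-2,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=8,
    seed=11,
    adam_beta1=0.9,               # Adam with betas=(0.9, 0.999)
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=100.0,
    evaluation_strategy="epoch",  # assumption: the results table reports one eval per epoch
)
```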

Training results

| Training Loss | Epoch | Step  | Validation Loss | Accuracy |
|:-------------:|:-----:|:-----:|:---------------:|:--------:|
| 4.8797        | 1.0   | 590   | 3.6447          | 0.6217   |
| 4.0161        | 2.0   | 1180  | 3.3996          | 0.6291   |
| 3.6981        | 3.0   | 1770  | 2.6674          | 0.6297   |
| 3.2078        | 4.0   | 2360  | 2.9843          | 0.5676   |
| 2.8064        | 5.0   | 2950  | 2.1131          | 0.6804   |
| 2.4341        | 6.0   | 3540  | 3.3843          | 0.6673   |
| 2.403         | 7.0   | 4130  | 1.8655          | 0.7043   |
| 2.3212        | 8.0   | 4720  | 1.8492          | 0.7055   |
| 2.2831        | 9.0   | 5310  | 1.5678          | 0.7024   |
| 2.1715        | 10.0  | 5900  | 1.6676          | 0.7193   |
| 2.0967        | 11.0  | 6490  | 1.5610          | 0.7174   |
| 1.9909        | 12.0  | 7080  | 1.3225          | 0.7122   |
| 1.9391        | 13.0  | 7670  | 1.3815          | 0.7180   |
| 1.852         | 14.0  | 8260  | 1.4632          | 0.7260   |
| 1.8568        | 15.0  | 8850  | 1.3623          | 0.7101   |
| 1.776         | 16.0  | 9440  | 1.3193          | 0.7015   |
| 1.6984        | 17.0  | 10030 | 1.3270          | 0.7208   |
| 1.6811        | 18.0  | 10620 | 1.3129          | 0.7055   |
| 1.6857        | 19.0  | 11210 | 1.3154          | 0.7382   |
| 1.6594        | 20.0  | 11800 | 1.2337          | 0.7352   |
| 1.5595        | 21.0  | 12390 | 1.2297          | 0.7404   |
| 1.6112        | 22.0  | 12980 | 1.1512          | 0.7450   |
| 1.5746        | 23.0  | 13570 | 1.1148          | 0.7208   |
| 1.5216        | 24.0  | 14160 | 1.1788          | 0.7373   |
| 1.5245        | 25.0  | 14750 | 1.0049          | 0.7361   |
| 1.4803        | 26.0  | 15340 | 1.5312          | 0.6890   |
| 1.5122        | 27.0  | 15930 | 1.0611          | 0.7187   |
| 1.4459        | 28.0  | 16520 | 1.5559          | 0.7431   |
| 1.4638        | 29.0  | 17110 | 1.3813          | 0.7450   |
| 1.3627        | 30.0  | 17700 | 1.0913          | 0.7456   |
| 1.3834        | 31.0  | 18290 | 1.1301          | 0.7113   |
| 1.3657        | 32.0  | 18880 | 1.2116          | 0.7560   |
| 1.373         | 33.0  | 19470 | 1.0198          | 0.7339   |
| 1.3113        | 34.0  | 20060 | 1.1041          | 0.7563   |
| 1.3327        | 35.0  | 20650 | 0.9885          | 0.7446   |
| 1.3544        | 36.0  | 21240 | 1.2174          | 0.7508   |
| 1.3198        | 37.0  | 21830 | 1.0094          | 0.7498   |
| 1.3           | 38.0  | 22420 | 0.9895          | 0.7306   |
| 1.2688        | 39.0  | 23010 | 1.0118          | 0.7471   |
| 1.3101        | 40.0  | 23600 | 1.1384          | 0.7517   |
| 1.2849        | 41.0  | 24190 | 1.1154          | 0.7520   |
| 1.2455        | 42.0  | 24780 | 0.9685          | 0.7431   |
| 1.2155        | 43.0  | 25370 | 1.0038          | 0.7498   |
| 1.2078        | 44.0  | 25960 | 0.9498          | 0.7382   |
| 1.2362        | 45.0  | 26550 | 0.9510          | 0.7413   |
| 1.2271        | 46.0  | 27140 | 0.9461          | 0.7514   |
| 1.2351        | 47.0  | 27730 | 0.9943          | 0.7272   |
| 1.2383        | 48.0  | 28320 | 0.9020          | 0.7422   |
| 1.1625        | 49.0  | 28910 | 0.9276          | 0.7385   |
| 1.1711        | 50.0  | 29500 | 0.9250          | 0.7352   |
| 1.1454        | 51.0  | 30090 | 0.9967          | 0.7483   |
| 1.1319        | 52.0  | 30680 | 0.9347          | 0.7309   |
| 1.1622        | 53.0  | 31270 | 0.9274          | 0.7456   |
| 1.1189        | 54.0  | 31860 | 1.0497          | 0.7483   |
| 1.1265        | 55.0  | 32450 | 0.9079          | 0.7462   |
| 1.0948        | 56.0  | 33040 | 0.9022          | 0.7477   |
| 1.0921        | 57.0  | 33630 | 0.8855          | 0.7385   |
| 1.0819        | 58.0  | 34220 | 0.8766          | 0.7327   |
| 1.0894        | 59.0  | 34810 | 0.8820          | 0.7462   |
| 1.0512        | 60.0  | 35400 | 0.8711          | 0.7428   |
| 1.075         | 61.0  | 35990 | 0.8970          | 0.7336   |
| 1.0505        | 62.0  | 36580 | 0.8912          | 0.7401   |
| 1.0612        | 63.0  | 37170 | 0.8774          | 0.7428   |
| 1.0458        | 64.0  | 37760 | 0.8675          | 0.7532   |
| 1.043         | 65.0  | 38350 | 1.0193          | 0.7554   |
| 1.1037        | 66.0  | 38940 | 0.8751          | 0.7367   |
| 1.0246        | 67.0  | 39530 | 0.8489          | 0.7514   |
| 1.0428        | 68.0  | 40120 | 0.8590          | 0.7373   |
| 1.0486        | 69.0  | 40710 | 0.8615          | 0.7514   |
| 1.0103        | 70.0  | 41300 | 0.9673          | 0.7596   |
| 1.0363        | 71.0  | 41890 | 0.8328          | 0.7440   |
| 1.0077        | 72.0  | 42480 | 0.8548          | 0.7489   |
| 1.0046        | 73.0  | 43070 | 0.9124          | 0.7407   |
| 0.9814        | 74.0  | 43660 | 0.8423          | 0.7508   |
| 0.9962        | 75.0  | 44250 | 1.0146          | 0.7532   |
| 0.9867        | 76.0  | 44840 | 0.8612          | 0.7517   |
| 0.9623        | 77.0  | 45430 | 0.8438          | 0.7563   |
| 0.9448        | 78.0  | 46020 | 0.8514          | 0.7505   |
| 0.961         | 79.0  | 46610 | 0.9149          | 0.7566   |
| 0.9521        | 80.0  | 47200 | 0.8576          | 0.7560   |
| 0.9835        | 81.0  | 47790 | 0.8314          | 0.7498   |
| 0.9777        | 82.0  | 48380 | 0.8524          | 0.7572   |
| 0.9259        | 83.0  | 48970 | 0.8440          | 0.7529   |
| 0.9246        | 84.0  | 49560 | 0.8429          | 0.7557   |
| 0.9222        | 85.0  | 50150 | 0.8880          | 0.7563   |
| 0.9152        | 86.0  | 50740 | 0.8348          | 0.7587   |
| 0.9218        | 87.0  | 51330 | 0.8254          | 0.7538   |
| 0.9379        | 88.0  | 51920 | 0.8099          | 0.7514   |
| 0.9387        | 89.0  | 52510 | 0.8407          | 0.7575   |
| 0.9154        | 90.0  | 53100 | 0.8735          | 0.7575   |
| 0.9331        | 91.0  | 53690 | 0.8920          | 0.7593   |
| 0.892         | 92.0  | 54280 | 0.8117          | 0.7566   |
| 0.9002        | 93.0  | 54870 | 0.8450          | 0.7569   |
| 0.9134        | 94.0  | 55460 | 0.7989          | 0.7569   |
| 0.8965        | 95.0  | 56050 | 0.8088          | 0.7541   |
| 0.8834        | 96.0  | 56640 | 0.8058          | 0.7529   |
| 0.9075        | 97.0  | 57230 | 0.8254          | 0.7557   |
| 0.8821        | 98.0  | 57820 | 0.8172          | 0.7547   |
| 0.9119        | 99.0  | 58410 | 0.8069          | 0.7550   |
| 0.9082        | 100.0 | 59000 | 0.8066          | 0.7550   |

Framework versions

  • Transformers 4.30.0
  • Pytorch 2.0.1+cu117
  • Datasets 2.14.4
  • Tokenizers 0.13.3
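
To reproduce this environment, a pinned install along these lines should match; note that the exact CUDA 11.7 build of PyTorch comes from the PyTorch wheel index rather than plain PyPI.

```bash
pip install transformers==4.30.0 datasets==2.14.4 tokenizers==0.13.3
pip install torch==2.0.1 --index-url https://download.pytorch.org/whl/cu117
```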