
1_7e-3_1_0.9

This model is a fine-tuned version of bert-large-uncased on the super_glue dataset. It achieves the following results on the evaluation set:

  • Loss: 0.2572
  • Accuracy: 0.7505
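Since the card does not yet document usage, a minimal inference sketch follows, assuming the checkpoint carries a sequence-classification head and is hosted under the repo id Onutoa/1_7e-3_1_0.9 (the id associated with this card). The sentence pair is a placeholder; the card does not state which SuperGLUE subset was used.

```python
# Minimal inference sketch (assumptions: sequence-classification head,
# repo id "Onutoa/1_7e-3_1_0.9"; the input pair below is a placeholder).
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_id = "Onutoa/1_7e-3_1_0.9"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)
model.eval()

inputs = tokenizer(
    "Is the sky blue during the day?",       # placeholder first segment
    "The sky usually appears blue by day.",  # placeholder second segment
    return_tensors="pt",
    truncation=True,
)
with torch.no_grad():
    logits = model(**inputs).logits
print(logits.argmax(dim=-1).item())  # predicted class index
```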

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.007
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 11
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 100.0
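A sketch of how these values map onto `transformers.TrainingArguments` is shown below; `output_dir` and the per-epoch evaluation and logging cadence are assumptions inferred from the epoch-level results table rather than values recorded in the card.

```python
from transformers import TrainingArguments

# Hyperparameters from the list above. output_dir and the evaluation cadence
# are assumptions (the results table reports one validation row per epoch);
# train_batch_size 16 is mapped to per_device_train_batch_size, which matches
# only if training ran on a single device.
training_args = TrainingArguments(
    output_dir="1_7e-3_1_0.9",
    learning_rate=7e-3,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=8,
    seed=11,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=100.0,
    evaluation_strategy="epoch",
    logging_strategy="epoch",
)
```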

Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy |
|:-------------:|:-----:|:-----:|:---------------:|:--------:|
| 1.0455 | 1.0 | 590 | 1.6132 | 0.3786 |
| 0.9655 | 2.0 | 1180 | 0.6681 | 0.6217 |
| 0.7392 | 3.0 | 1770 | 0.5308 | 0.4557 |
| 0.7812 | 4.0 | 2360 | 0.4957 | 0.5654 |
| 0.7422 | 5.0 | 2950 | 1.2018 | 0.6217 |
| 0.7053 | 6.0 | 3540 | 0.7295 | 0.4804 |
| 0.7016 | 7.0 | 4130 | 1.1783 | 0.3804 |
| 0.6381 | 8.0 | 4720 | 0.3895 | 0.6541 |
| 0.5364 | 9.0 | 5310 | 0.5057 | 0.6768 |
| 0.5598 | 10.0 | 5900 | 0.3659 | 0.6798 |
| 0.5779 | 11.0 | 6490 | 0.5754 | 0.6740 |
| 0.4901 | 12.0 | 7080 | 0.3128 | 0.7055 |
| 0.5212 | 13.0 | 7670 | 0.2977 | 0.7083 |
| 0.479 | 14.0 | 8260 | 1.0718 | 0.6352 |
| 0.4701 | 15.0 | 8850 | 0.4170 | 0.7138 |
| 0.4286 | 16.0 | 9440 | 0.3207 | 0.6985 |
| 0.4164 | 17.0 | 10030 | 0.2996 | 0.7086 |
| 0.3649 | 18.0 | 10620 | 0.3665 | 0.6823 |
| 0.4102 | 19.0 | 11210 | 0.2847 | 0.7300 |
| 0.3819 | 20.0 | 11800 | 0.3577 | 0.6731 |
| 0.3755 | 21.0 | 12390 | 0.5441 | 0.6058 |
| 0.3373 | 22.0 | 12980 | 0.6394 | 0.5657 |
| 0.3512 | 23.0 | 13570 | 0.2683 | 0.7159 |
| 0.3124 | 24.0 | 14160 | 0.2775 | 0.7269 |
| 0.3029 | 25.0 | 14750 | 0.3565 | 0.7333 |
| 0.2864 | 26.0 | 15340 | 0.5595 | 0.6318 |
| 0.3107 | 27.0 | 15930 | 0.8309 | 0.5557 |
| 0.2674 | 28.0 | 16520 | 0.2615 | 0.7394 |
| 0.2927 | 29.0 | 17110 | 0.6786 | 0.7049 |
| 0.2672 | 30.0 | 17700 | 0.2945 | 0.7407 |
| 0.2595 | 31.0 | 18290 | 0.3927 | 0.7327 |
| 0.2646 | 32.0 | 18880 | 0.2765 | 0.7162 |
| 0.2604 | 33.0 | 19470 | 0.2854 | 0.7199 |
| 0.2364 | 34.0 | 20060 | 0.3032 | 0.7034 |
| 0.2465 | 35.0 | 20650 | 0.3092 | 0.7456 |
| 0.2334 | 36.0 | 21240 | 0.5941 | 0.7248 |
| 0.2392 | 37.0 | 21830 | 0.3794 | 0.6875 |
| 0.2303 | 38.0 | 22420 | 0.3033 | 0.7235 |
| 0.2258 | 39.0 | 23010 | 0.3078 | 0.7266 |
| 0.2189 | 40.0 | 23600 | 0.3052 | 0.7425 |
| 0.2126 | 41.0 | 24190 | 0.3418 | 0.7352 |
| 0.2213 | 42.0 | 24780 | 0.2660 | 0.7382 |
| 0.2115 | 43.0 | 25370 | 0.4016 | 0.7364 |
| 0.2109 | 44.0 | 25960 | 0.3010 | 0.7456 |
| 0.2391 | 45.0 | 26550 | 0.4426 | 0.7303 |
| 0.2115 | 46.0 | 27140 | 0.2762 | 0.7407 |
| 0.2014 | 47.0 | 27730 | 0.2864 | 0.7437 |
| 0.1925 | 48.0 | 28320 | 0.2657 | 0.7382 |
| 0.2017 | 49.0 | 28910 | 0.2866 | 0.7505 |
| 0.2145 | 50.0 | 29500 | 0.3055 | 0.7202 |
| 0.1933 | 51.0 | 30090 | 0.5254 | 0.6550 |
| 0.2115 | 52.0 | 30680 | 0.2996 | 0.7477 |
| 0.1893 | 53.0 | 31270 | 0.2759 | 0.7471 |
| 0.1834 | 54.0 | 31860 | 0.2543 | 0.7440 |
| 0.1828 | 55.0 | 32450 | 0.2676 | 0.7492 |
| 0.1801 | 56.0 | 33040 | 0.2680 | 0.7505 |
| 0.1699 | 57.0 | 33630 | 0.2554 | 0.7440 |
| 0.1748 | 58.0 | 34220 | 0.3117 | 0.7505 |
| 0.1842 | 59.0 | 34810 | 0.3374 | 0.7483 |
| 0.1684 | 60.0 | 35400 | 0.2781 | 0.7471 |
| 0.1695 | 61.0 | 35990 | 0.3007 | 0.7434 |
| 0.177 | 62.0 | 36580 | 0.2816 | 0.7443 |
| 0.1586 | 63.0 | 37170 | 0.2587 | 0.7422 |
| 0.1643 | 64.0 | 37760 | 0.2751 | 0.7450 |
| 0.1719 | 65.0 | 38350 | 0.2875 | 0.7489 |
| 0.167 | 66.0 | 38940 | 0.2729 | 0.7434 |
| 0.1644 | 67.0 | 39530 | 0.2623 | 0.7373 |
| 0.16 | 68.0 | 40120 | 0.2534 | 0.7407 |
| 0.156 | 69.0 | 40710 | 0.2525 | 0.7419 |
| 0.1549 | 70.0 | 41300 | 0.2565 | 0.7297 |
| 0.1598 | 71.0 | 41890 | 0.2479 | 0.7425 |
| 0.1666 | 72.0 | 42480 | 0.3158 | 0.7462 |
| 0.1498 | 73.0 | 43070 | 0.2722 | 0.7456 |
| 0.1495 | 74.0 | 43660 | 0.3985 | 0.7428 |
| 0.153 | 75.0 | 44250 | 0.3153 | 0.7477 |
| 0.1576 | 76.0 | 44840 | 0.3075 | 0.7459 |
| 0.1536 | 77.0 | 45430 | 0.2629 | 0.7468 |
| 0.1508 | 78.0 | 46020 | 0.2489 | 0.7434 |
| 0.1502 | 79.0 | 46610 | 0.2671 | 0.7523 |
| 0.1509 | 80.0 | 47200 | 0.2771 | 0.7523 |
| 0.1352 | 81.0 | 47790 | 0.2611 | 0.7425 |
| 0.1438 | 82.0 | 48380 | 0.2556 | 0.7388 |
| 0.1407 | 83.0 | 48970 | 0.2809 | 0.7263 |
| 0.1417 | 84.0 | 49560 | 0.2580 | 0.7459 |
| 0.1404 | 85.0 | 50150 | 0.2557 | 0.7486 |
| 0.1437 | 86.0 | 50740 | 0.2821 | 0.7498 |
| 0.1368 | 87.0 | 51330 | 0.2766 | 0.7508 |
| 0.14 | 88.0 | 51920 | 0.2664 | 0.7498 |
| 0.1351 | 89.0 | 52510 | 0.2592 | 0.7450 |
| 0.1338 | 90.0 | 53100 | 0.2895 | 0.7514 |
| 0.1361 | 91.0 | 53690 | 0.2638 | 0.7526 |
| 0.1356 | 92.0 | 54280 | 0.2470 | 0.7468 |
| 0.1356 | 93.0 | 54870 | 0.2694 | 0.7511 |
| 0.1349 | 94.0 | 55460 | 0.2833 | 0.7502 |
| 0.1331 | 95.0 | 56050 | 0.2940 | 0.7477 |
| 0.131 | 96.0 | 56640 | 0.2760 | 0.7492 |
| 0.1311 | 97.0 | 57230 | 0.2520 | 0.7465 |
| 0.1282 | 98.0 | 57820 | 0.2604 | 0.7489 |
| 0.1258 | 99.0 | 58410 | 0.2518 | 0.7459 |
| 0.1331 | 100.0 | 59000 | 0.2572 | 0.7505 |

Framework versions

  • Transformers 4.30.0
  • Pytorch 2.0.1+cu117
  • Datasets 2.14.4
  • Tokenizers 0.13.3