
1_9e-3_1_0.9

This model is a fine-tuned version of bert-large-uncased on the super_glue dataset. It achieves the following results on the evaluation set (a short inference sketch follows the list):

  • Loss: 0.2451
  • Accuracy: 0.7456
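The card does not say which SuperGLUE task the checkpoint targets, so the snippet below is only a minimal inference sketch under assumptions: the repository id Onutoa/1_9e-3_1_0.9 (taken from this page), a standard sequence-classification head, and a hypothetical sentence-pair input.

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Repository id as listed on this page; the SuperGLUE subtask is not stated in the card.
model_id = "Onutoa/1_9e-3_1_0.9"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)
model.eval()

# Hypothetical sentence pair; replace with inputs formatted for the actual task.
inputs = tokenizer(
    "Is the sky blue?",
    "The sky appears blue in daylight.",
    return_tensors="pt",
    truncation=True,
)

with torch.no_grad():
    logits = model(**inputs).logits

print(logits.softmax(dim=-1))  # per-class probabilities
```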

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a TrainingArguments sketch follows the list):

  • learning_rate: 0.009
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 11
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 100.0
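As referenced above, here is a minimal sketch of how these values map onto transformers.TrainingArguments (Transformers 4.30); output_dir is a placeholder, and evaluation_strategy is inferred from the once-per-epoch evaluation log below rather than stated in the card.

```python
from transformers import TrainingArguments

# Values copied from the hyperparameter list above.
training_args = TrainingArguments(
    output_dir="1_9e-3_1_0.9",    # placeholder, not taken from the card
    learning_rate=9e-3,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=8,
    seed=11,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=100.0,
    evaluation_strategy="epoch",  # inferred: the log shows one eval per epoch
)
```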

Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy |
|:-------------:|:-----:|:----:|:---------------:|:--------:|
| 1.4019 | 1.0 | 590 | 0.7035 | 0.3902 |
| 0.8201 | 2.0 | 1180 | 0.6123 | 0.6205 |
| 0.8196 | 3.0 | 1770 | 0.5319 | 0.4587 |
| 0.8006 | 4.0 | 2360 | 0.8985 | 0.4006 |
| 0.6419 | 5.0 | 2950 | 0.4274 | 0.6352 |
| 0.5396 | 6.0 | 3540 | 0.4366 | 0.6015 |
| 0.5199 | 7.0 | 4130 | 0.4763 | 0.5951 |
| 0.4784 | 8.0 | 4720 | 0.3445 | 0.6890 |
| 0.442 | 9.0 | 5310 | 0.3407 | 0.6979 |
| 0.4292 | 10.0 | 5900 | 0.5351 | 0.6725 |
| 0.408 | 11.0 | 6490 | 0.3183 | 0.7110 |
| 0.3802 | 12.0 | 7080 | 0.4297 | 0.6199 |
| 0.3826 | 13.0 | 7670 | 0.3199 | 0.6807 |
| 0.3608 | 14.0 | 8260 | 0.2910 | 0.7214 |
| 0.3485 | 15.0 | 8850 | 0.3491 | 0.7220 |
| 0.3481 | 16.0 | 9440 | 0.3000 | 0.7223 |
| 0.3326 | 17.0 | 10030 | 0.2859 | 0.7147 |
| 0.328 | 18.0 | 10620 | 0.3148 | 0.6859 |
| 0.315 | 19.0 | 11210 | 0.2922 | 0.7352 |
| 0.3123 | 20.0 | 11800 | 0.3455 | 0.7358 |
| 0.2956 | 21.0 | 12390 | 0.2757 | 0.7211 |
| 0.2983 | 22.0 | 12980 | 0.2832 | 0.7278 |
| 0.2798 | 23.0 | 13570 | 0.2676 | 0.7260 |
| 0.2765 | 24.0 | 14160 | 0.2925 | 0.7021 |
| 0.2761 | 25.0 | 14750 | 0.3859 | 0.7336 |
| 0.2646 | 26.0 | 15340 | 0.3827 | 0.6517 |
| 0.2608 | 27.0 | 15930 | 0.2672 | 0.7284 |
| 0.2474 | 28.0 | 16520 | 0.2737 | 0.7456 |
| 0.2487 | 29.0 | 17110 | 0.2770 | 0.7468 |
| 0.2445 | 30.0 | 17700 | 0.2732 | 0.7174 |
| 0.2455 | 31.0 | 18290 | 0.2902 | 0.7410 |
| 0.236 | 32.0 | 18880 | 0.3913 | 0.7352 |
| 0.228 | 33.0 | 19470 | 0.2819 | 0.7410 |
| 0.2182 | 34.0 | 20060 | 0.2863 | 0.7453 |
| 0.2203 | 35.0 | 20650 | 0.3294 | 0.6988 |
| 0.2189 | 36.0 | 21240 | 0.2809 | 0.7398 |
| 0.2132 | 37.0 | 21830 | 0.2631 | 0.7413 |
| 0.2082 | 38.0 | 22420 | 0.2600 | 0.7315 |
| 0.2144 | 39.0 | 23010 | 0.2841 | 0.7489 |
| 0.2128 | 40.0 | 23600 | 0.2650 | 0.7321 |
| 0.1978 | 41.0 | 24190 | 0.2795 | 0.7440 |
| 0.2044 | 42.0 | 24780 | 0.2650 | 0.7349 |
| 0.1936 | 43.0 | 25370 | 0.2666 | 0.7385 |
| 0.1967 | 44.0 | 25960 | 0.2861 | 0.7440 |
| 0.1823 | 45.0 | 26550 | 0.2658 | 0.7358 |
| 0.1868 | 46.0 | 27140 | 0.2834 | 0.7477 |
| 0.1914 | 47.0 | 27730 | 0.2882 | 0.7407 |
| 0.1962 | 48.0 | 28320 | 0.2547 | 0.7425 |
| 0.1916 | 49.0 | 28910 | 0.2586 | 0.7407 |
| 0.181 | 50.0 | 29500 | 0.2629 | 0.7413 |
| 0.1817 | 51.0 | 30090 | 0.2637 | 0.7373 |
| 0.1795 | 52.0 | 30680 | 0.2719 | 0.7422 |
| 0.1682 | 53.0 | 31270 | 0.2583 | 0.7483 |
| 0.1736 | 54.0 | 31860 | 0.2547 | 0.7327 |
| 0.17 | 55.0 | 32450 | 0.2580 | 0.7349 |
| 0.1775 | 56.0 | 33040 | 0.2583 | 0.7459 |
| 0.1678 | 57.0 | 33630 | 0.2681 | 0.7431 |
| 0.1648 | 58.0 | 34220 | 0.2652 | 0.7404 |
| 0.1679 | 59.0 | 34810 | 0.2602 | 0.7333 |
| 0.1637 | 60.0 | 35400 | 0.2563 | 0.7407 |
| 0.1627 | 61.0 | 35990 | 0.2611 | 0.7416 |
| 0.1678 | 62.0 | 36580 | 0.2558 | 0.7425 |
| 0.1588 | 63.0 | 37170 | 0.2578 | 0.7275 |
| 0.1626 | 64.0 | 37760 | 0.2602 | 0.7352 |
| 0.164 | 65.0 | 38350 | 0.2562 | 0.7446 |
| 0.1584 | 66.0 | 38940 | 0.2556 | 0.7367 |
| 0.1514 | 67.0 | 39530 | 0.2784 | 0.7437 |
| 0.1567 | 68.0 | 40120 | 0.2643 | 0.7446 |
| 0.1528 | 69.0 | 40710 | 0.2715 | 0.7456 |
| 0.1593 | 70.0 | 41300 | 0.2611 | 0.7505 |
| 0.1578 | 71.0 | 41890 | 0.2539 | 0.7388 |
| 0.1515 | 72.0 | 42480 | 0.2850 | 0.7526 |
| 0.1485 | 73.0 | 43070 | 0.2831 | 0.7505 |
| 0.1509 | 74.0 | 43660 | 0.2723 | 0.7486 |
| 0.1532 | 75.0 | 44250 | 0.3408 | 0.7446 |
| 0.1512 | 76.0 | 44840 | 0.2522 | 0.7419 |
| 0.1463 | 77.0 | 45430 | 0.2491 | 0.7422 |
| 0.1481 | 78.0 | 46020 | 0.2477 | 0.7443 |
| 0.1467 | 79.0 | 46610 | 0.2524 | 0.7401 |
| 0.1457 | 80.0 | 47200 | 0.2688 | 0.7505 |
| 0.1393 | 81.0 | 47790 | 0.2564 | 0.7422 |
| 0.1454 | 82.0 | 48380 | 0.2520 | 0.7465 |
| 0.1409 | 83.0 | 48970 | 0.2517 | 0.7425 |
| 0.1395 | 84.0 | 49560 | 0.2479 | 0.7453 |
| 0.1382 | 85.0 | 50150 | 0.2524 | 0.7520 |
| 0.1394 | 86.0 | 50740 | 0.2546 | 0.7520 |
| 0.1353 | 87.0 | 51330 | 0.2693 | 0.7526 |
| 0.1343 | 88.0 | 51920 | 0.2503 | 0.7483 |
| 0.1337 | 89.0 | 52510 | 0.2480 | 0.7486 |
| 0.1357 | 90.0 | 53100 | 0.2605 | 0.7517 |
| 0.1365 | 91.0 | 53690 | 0.2481 | 0.7477 |
| 0.1359 | 92.0 | 54280 | 0.2440 | 0.7446 |
| 0.1354 | 93.0 | 54870 | 0.2572 | 0.7535 |
| 0.1357 | 94.0 | 55460 | 0.2521 | 0.7471 |
| 0.1329 | 95.0 | 56050 | 0.2558 | 0.7514 |
| 0.1325 | 96.0 | 56640 | 0.2475 | 0.7450 |
| 0.131 | 97.0 | 57230 | 0.2446 | 0.7446 |
| 0.1275 | 98.0 | 57820 | 0.2446 | 0.7434 |
| 0.1244 | 99.0 | 58410 | 0.2437 | 0.7425 |
| 0.1357 | 100.0 | 59000 | 0.2451 | 0.7456 |
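Validation accuracy peaks at 0.7535 (epoch 93, step 54870); the headline figures above (loss 0.2451, accuracy 0.7456) correspond to the final checkpoint at epoch 100.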

Framework versions

  • Transformers 4.30.0
  • PyTorch 2.0.1+cu117
  • Datasets 2.14.4
  • Tokenizers 0.13.3