1_5e-3_1_0.9

This model is a fine-tuned version of bert-large-uncased on the super_glue dataset. It achieves the following results on the evaluation set after the final epoch (epoch 100, step 59000):

Loss: 0.2534
Accuracy: 0.7483

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.005
train_batch_size: 16
eval_batch_size: 8
seed: 11
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
num_epochs: 100.0
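
As a rough illustration of what these settings imply, the sketch below works out the total number of optimizer steps and the learning rate under a linear decay schedule. It assumes zero warmup steps (none are listed here), a single device, and no gradient accumulation; the constant and function names are hypothetical and are not taken from the original training script.

```python
# Values copied from the hyperparameter list and the training results table.
LEARNING_RATE = 0.005
TRAIN_BATCH_SIZE = 16
NUM_EPOCHS = 100
STEPS_PER_EPOCH = 590  # step count logged at epoch 1.0
WARMUP_STEPS = 0       # assumption: no warmup is listed on this card


def total_steps() -> int:
    """Total optimizer updates over the whole run."""
    return NUM_EPOCHS * STEPS_PER_EPOCH


def linear_lr(step: int) -> float:
    """Learning rate after `step` updates, decaying linearly to zero."""
    total = total_steps()
    if step < WARMUP_STEPS:
        # Linear ramp-up; unused while WARMUP_STEPS is 0.
        return LEARNING_RATE * (step / max(1, WARMUP_STEPS))
    remaining = max(0, total - step)
    return LEARNING_RATE * (remaining / max(1, total - WARMUP_STEPS))


def max_examples_per_epoch() -> int:
    """Upper bound on training examples seen per epoch.

    Assumption: one device and no gradient accumulation, so each
    step consumes at most TRAIN_BATCH_SIZE examples.
    """
    return STEPS_PER_EPOCH * TRAIN_BATCH_SIZE


if __name__ == "__main__":
    print("total steps:", total_steps())
    print("lr at the halfway point:", linear_lr(total_steps() // 2))
    print("max examples per epoch:", max_examples_per_epoch())
```

Under these assumptions the run covers 59000 steps, which matches the final step in the results table, and the learning rate reaches zero at the last step.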

Training results

Training Loss	Epoch	Step	Validation Loss	Accuracy
0.959	1.0	590	0.8539	0.6217
0.7269	2.0	1180	0.5087	0.6211
0.7153	3.0	1770	0.6070	0.4086
0.6861	4.0	2360	0.7139	0.6217
0.6453	5.0	2950	0.4409	0.6584
0.5962	6.0	3540	0.4496	0.5887
0.5361	7.0	4130	0.5527	0.5122
0.5468	8.0	4720	0.3969	0.6850
0.4902	9.0	5310	0.3556	0.6878
0.4794	10.0	5900	0.4762	0.6657
0.4719	11.0	6490	0.3936	0.6450
0.4317	12.0	7080	0.3662	0.7037
0.4179	13.0	7670	0.3144	0.6884
0.3817	14.0	8260	0.3086	0.7061
0.3867	15.0	8850	0.3868	0.7131
0.3573	16.0	9440	0.3145	0.7156
0.3413	17.0	10030	0.3493	0.6667
0.3458	18.0	10620	0.3274	0.6758
0.3212	19.0	11210	0.2809	0.7211
0.3182	20.0	11800	0.3024	0.7294
0.2971	21.0	12390	0.2963	0.6991
0.297	22.0	12980	0.2757	0.7089
0.276	23.0	13570	0.2705	0.7245
0.2741	24.0	14160	0.2971	0.6924
0.2651	25.0	14750	0.3400	0.7327
0.2635	26.0	15340	0.3080	0.6859
0.2578	27.0	15930	0.2861	0.7083
0.2479	28.0	16520	0.2751	0.7398
0.2466	29.0	17110	0.2798	0.7385
0.2461	30.0	17700	0.2627	0.7266
0.2355	31.0	18290	0.3146	0.7309
0.2315	32.0	18880	0.4802	0.7159
0.2258	33.0	19470	0.2626	0.7327
0.2192	34.0	20060	0.2806	0.7385
0.2217	35.0	20650	0.2837	0.7040
0.2126	36.0	21240	0.2950	0.7434
0.21	37.0	21830	0.3081	0.7419
0.2086	38.0	22420	0.2490	0.7343
0.2071	39.0	23010	0.2674	0.7437
0.2052	40.0	23600	0.3063	0.7413
0.2027	41.0	24190	0.2926	0.7410
0.2035	42.0	24780	0.2712	0.7398
0.1945	43.0	25370	0.2639	0.7367
0.1988	44.0	25960	0.2570	0.7370
0.1909	45.0	26550	0.2635	0.7361
0.1891	46.0	27140	0.2565	0.7358
0.1878	47.0	27730	0.2588	0.7367
0.1861	48.0	28320	0.2511	0.7294
0.1932	49.0	28910	0.2632	0.7422
0.1835	50.0	29500	0.2599	0.7398
0.1803	51.0	30090	0.2641	0.7379
0.1808	52.0	30680	0.2586	0.7355
0.174	53.0	31270	0.2502	0.7394
0.1774	54.0	31860	0.2650	0.7361
0.1804	55.0	32450	0.2486	0.7330
0.1814	56.0	33040	0.2919	0.7422
0.1679	57.0	33630	0.2837	0.7398
0.1665	58.0	34220	0.2751	0.7391
0.1732	59.0	34810	0.2575	0.7315
0.1666	60.0	35400	0.2518	0.7349
0.1667	61.0	35990	0.2582	0.7407
0.172	62.0	36580	0.2512	0.7373
0.1657	63.0	37170	0.2500	0.7364
0.1687	64.0	37760	0.2589	0.7419
0.1605	65.0	38350	0.2833	0.7434
0.1635	66.0	38940	0.2536	0.7343
0.1583	67.0	39530	0.2554	0.7416
0.1638	68.0	40120	0.2598	0.7462
0.1615	69.0	40710	0.3022	0.7407
0.16	70.0	41300	0.2653	0.7459
0.1601	71.0	41890	0.2593	0.7456
0.1567	72.0	42480	0.2564	0.7446
0.1503	73.0	43070	0.2788	0.7465
0.1531	74.0	43660	0.2518	0.7446
0.1536	75.0	44250	0.3032	0.7440
0.1549	76.0	44840	0.2513	0.7370
0.1543	77.0	45430	0.2647	0.7486
0.1516	78.0	46020	0.2511	0.7471
0.1512	79.0	46610	0.2562	0.7431
0.1493	80.0	47200	0.2568	0.7474
0.1443	81.0	47790	0.2650	0.7492
0.1487	82.0	48380	0.2488	0.7492
0.1453	83.0	48970	0.2444	0.7431
0.1465	84.0	49560	0.2665	0.7443
0.1444	85.0	50150	0.2531	0.7456
0.1487	86.0	50740	0.2475	0.7431
0.1425	87.0	51330	0.2774	0.7453
0.145	88.0	51920	0.2636	0.7465
0.1399	89.0	52510	0.2552	0.7459
0.1429	90.0	53100	0.2611	0.7443
0.1453	91.0	53690	0.2558	0.7468
0.1473	92.0	54280	0.2467	0.7413
0.1433	93.0	54870	0.2712	0.7474
0.1445	94.0	55460	0.2591	0.7465
0.1432	95.0	56050	0.2604	0.7486
0.1397	96.0	56640	0.2618	0.7492
0.1412	97.0	57230	0.2550	0.7483
0.1327	98.0	57820	0.2512	0.7471
0.136	99.0	58410	0.2525	0.7489
0.145	100.0	59000	0.2534	0.7483
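
The table above can be scanned programmatically to pick a checkpoint by validation loss or accuracy rather than defaulting to the final epoch. The following is a minimal sketch that parses whitespace-separated rows in the same column order; the names `EvalRow`, `parse_rows`, and `EXCERPT` are hypothetical, and the excerpt contains only five rows copied from the table, so the selections it makes apply to that excerpt alone.

```python
from typing import List, NamedTuple


class EvalRow(NamedTuple):
    train_loss: float
    epoch: float
    step: int
    val_loss: float
    accuracy: float


def parse_rows(text: str) -> List[EvalRow]:
    """Parse rows of: training loss, epoch, step, validation loss, accuracy."""
    rows = []
    for line in text.strip().splitlines():
        fields = line.split()
        if len(fields) != 5:
            continue  # skips the header line and blank lines
        try:
            rows.append(EvalRow(float(fields[0]), float(fields[1]),
                                int(fields[2]), float(fields[3]),
                                float(fields[4])))
        except ValueError:
            continue  # skips rows with non-numeric fields
    return rows


# Five rows copied verbatim from the results table (not the full table).
EXCERPT = """
Training Loss Epoch Step Validation Loss Accuracy
0.1443 81.0 47790 0.2650 0.7492
0.1487 82.0 48380 0.2488 0.7492
0.1453 83.0 48970 0.2444 0.7431
0.1397 96.0 56640 0.2618 0.7492
0.145 100.0 59000 0.2534 0.7483
"""

rows = parse_rows(EXCERPT)
# max() and min() return the first matching row when values tie.
best_accuracy = max(rows, key=lambda r: r.accuracy)
lowest_val_loss = min(rows, key=lambda r: r.val_loss)

if __name__ == "__main__":
    print("highest accuracy in excerpt:", best_accuracy)
    print("lowest validation loss in excerpt:", lowest_val_loss)
```

Pasting the full table into `EXCERPT` applies the same selection to all 100 epochs.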

Framework versions

Transformers 4.30.0
PyTorch 2.0.1+cu117
Datasets 2.14.4
Tokenizers 0.13.3
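
When trying to reproduce these results, it can help to compare a local environment against the versions listed above. The sketch below uses only the standard library; the mapping of framework names to pip distribution names (for example `torch` for PyTorch) is an assumption, and `compare_versions` is a hypothetical helper.

```python
from importlib import metadata

# Versions listed above, keyed by the assumed pip distribution name.
LISTED = {
    "transformers": "4.30.0",
    "torch": "2.0.1+cu117",
    "datasets": "2.14.4",
    "tokenizers": "0.13.3",
}


def compare_versions(listed=LISTED):
    """Return {package: (listed, installed or None, exact_match)}."""
    report = {}
    for package, wanted in listed.items():
        try:
            installed = metadata.version(package)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed locally
        report[package] = (wanted, installed, installed == wanted)
    return report


if __name__ == "__main__":
    for package, (wanted, installed, match) in compare_versions().items():
        status = "ok" if match else "differs"
        print(f"{package}: listed {wanted}, installed {installed} ({status})")
```

The comparison is an exact string match, so a local build tag such as `+cu118` is reported as a difference even when the base version is the same.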

Dataset used to train Onutoa/1_5e-3_1_0.9: super_glue