
bert-large-cased-sigir-LR100-1-cased-150

This model is a fine-tuned version of bert-large-cased on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 2.4908

Model description

More information needed

Intended uses & limitations

More information needed
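
The card does not state the downstream task. As a hedged usage sketch, the snippet below assumes the checkpoint was fine-tuned with a masked language modeling objective (the default for a BERT model without a task-specific head); the model id is taken from the title above and may need its owning namespace prefixed when loading from the Hub.

```python
# Minimal loading sketch (assumes a masked language modeling head).
from transformers import AutoModelForMaskedLM, AutoTokenizer, pipeline

model_id = "bert-large-cased-sigir-LR100-1-cased-150"  # prepend "<namespace>/" if loading from the Hub

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForMaskedLM.from_pretrained(model_id)

fill_mask = pipeline("fill-mask", model=model, tokenizer=tokenizer)
print(fill_mask("The capital of France is [MASK]."))
```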

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):

  • learning_rate: 0.00015
  • train_batch_size: 30
  • eval_batch_size: 30
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 150.0
  • mixed_precision_training: Native AMP
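
These values map onto a transformers `TrainingArguments` configuration roughly as sketched below. This is a reconstruction under assumptions: the dataset, data collator, and the rest of the Trainer setup are not documented in this card, so only the hyperparameters listed above are filled in.

```python
# Hedged sketch of the training configuration; only the listed hyperparameters are known.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="bert-large-cased-sigir-LR100-1-cased-150",
    learning_rate=0.00015,
    per_device_train_batch_size=30,
    per_device_eval_batch_size=30,
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-08,
    lr_scheduler_type="linear",
    num_train_epochs=150.0,
    fp16=True,                    # "Native AMP" mixed-precision training
    evaluation_strategy="epoch",  # assumption: the results table reports one evaluation per epoch
)
```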

Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| 7.0223 | 1.0 | 1 | 6.3939 |
| 7.139 | 2.0 | 2 | 7.0236 |
| 7.0156 | 3.0 | 3 | 7.0893 |
| 7.2089 | 4.0 | 4 | 5.3251 |
| 5.7375 | 5.0 | 5 | 3.5431 |
| 3.8594 | 6.0 | 6 | 3.2612 |
| 3.2379 | 7.0 | 7 | 2.8294 |
| 3.2598 | 8.0 | 8 | 3.1189 |
| 3.0582 | 9.0 | 9 | 2.3748 |
| 2.6474 | 10.0 | 10 | 3.0741 |
| 3.0331 | 11.0 | 11 | 1.8630 |
| 2.7435 | 12.0 | 12 | 2.5527 |
| 2.3269 | 13.0 | 13 | 2.1300 |
| 2.0739 | 14.0 | 14 | 2.6337 |
| 1.9651 | 15.0 | 15 | 1.8872 |
| 2.1553 | 16.0 | 16 | 2.2753 |
| 1.828 | 17.0 | 17 | 2.6648 |
| 1.6217 | 18.0 | 18 | 2.5176 |
| 1.681 | 19.0 | 19 | 1.0566 |
| 1.5415 | 20.0 | 20 | 1.7287 |
| 1.1874 | 21.0 | 21 | 1.8245 |
| 1.1892 | 22.0 | 22 | 3.3104 |
| 1.3973 | 23.0 | 23 | 1.7072 |
| 1.132 | 24.0 | 24 | 2.7579 |
| 1.2451 | 25.0 | 25 | 2.0380 |
| 1.2669 | 26.0 | 26 | 2.7239 |
| 1.1353 | 27.0 | 27 | 2.3039 |
| 1.319 | 28.0 | 28 | 1.9489 |
| 1.1246 | 29.0 | 29 | 2.1581 |
| 1.332 | 30.0 | 30 | 2.5226 |
| 1.2593 | 31.0 | 31 | 2.1640 |
| 1.2299 | 32.0 | 32 | 1.1443 |
| 1.0047 | 33.0 | 33 | 1.5572 |
| 0.7948 | 34.0 | 34 | 1.8577 |
| 1.151 | 35.0 | 35 | 1.9391 |
| 0.905 | 36.0 | 36 | 2.2505 |
| 1.0517 | 37.0 | 37 | 2.4417 |
| 1.1055 | 38.0 | 38 | 1.9173 |
| 0.837 | 39.0 | 39 | 2.2970 |
| 0.911 | 40.0 | 40 | 1.6180 |
| 0.7643 | 41.0 | 41 | 1.4066 |
| 0.8754 | 42.0 | 42 | 2.7884 |
| 1.2184 | 43.0 | 43 | 2.3925 |
| 0.8623 | 44.0 | 44 | 2.0238 |
| 0.7886 | 45.0 | 45 | 2.7872 |
| 0.7089 | 46.0 | 46 | 2.7488 |
| 0.5684 | 47.0 | 47 | 2.3370 |
| 0.7299 | 48.0 | 48 | 2.1650 |
| 0.6906 | 49.0 | 49 | 2.7732 |
| 0.6876 | 50.0 | 50 | 3.0168 |
| 0.3967 | 51.0 | 51 | 2.1091 |
| 0.6191 | 52.0 | 52 | 1.0602 |
| 0.5068 | 53.0 | 53 | 3.1522 |
| 0.7189 | 54.0 | 54 | 3.0988 |
| 0.6768 | 55.0 | 55 | 1.7392 |
| 0.6247 | 56.0 | 56 | 1.8365 |
| 0.5057 | 57.0 | 57 | 2.8095 |
| 0.65 | 58.0 | 58 | 2.8217 |
| 0.5398 | 59.0 | 59 | 2.3459 |
| 0.5703 | 60.0 | 60 | 1.6460 |
| 0.5543 | 61.0 | 61 | 1.7489 |
| 0.5031 | 62.0 | 62 | 2.4358 |
| 0.447 | 63.0 | 63 | 2.5432 |
| 0.5056 | 64.0 | 64 | 3.0374 |
| 0.5161 | 65.0 | 65 | 2.2071 |
| 0.4509 | 66.0 | 66 | 3.4950 |
| 0.3973 | 67.0 | 67 | 3.0061 |
| 0.4469 | 68.0 | 68 | 2.7547 |
| 0.4062 | 69.0 | 69 | 3.3072 |
| 0.2518 | 70.0 | 70 | 2.1072 |
| 0.2339 | 71.0 | 71 | 2.1484 |
| 0.5058 | 72.0 | 72 | 3.2433 |
| 0.2827 | 73.0 | 73 | 1.6271 |
| 0.3282 | 74.0 | 74 | 2.1436 |
| 0.3415 | 75.0 | 75 | 2.8307 |
| 0.437 | 76.0 | 76 | 2.5679 |
| 0.5487 | 77.0 | 77 | 1.1248 |
| 0.3464 | 78.0 | 78 | 3.0531 |
| 0.3801 | 79.0 | 79 | 2.9731 |
| 0.3805 | 80.0 | 80 | 1.8667 |
| 0.2179 | 81.0 | 81 | 3.1658 |
| 0.2429 | 82.0 | 82 | 1.8698 |
| 0.3341 | 83.0 | 83 | 2.6730 |
| 0.2662 | 84.0 | 84 | 2.1040 |
| 0.4266 | 85.0 | 85 | 3.5671 |
| 0.3154 | 86.0 | 86 | 2.4055 |
| 0.2319 | 87.0 | 87 | 1.4615 |
| 0.1723 | 88.0 | 88 | 2.7438 |
| 0.3301 | 89.0 | 89 | 2.8391 |
| 0.2512 | 90.0 | 90 | 2.3172 |
| 0.3017 | 91.0 | 91 | 2.9586 |
| 0.4352 | 92.0 | 92 | 2.3134 |
| 0.3637 | 93.0 | 93 | 2.2590 |
| 0.3656 | 94.0 | 94 | 3.0915 |
| 0.3036 | 95.0 | 95 | 2.0150 |
| 0.2135 | 96.0 | 96 | 3.3688 |
| 0.3807 | 97.0 | 97 | 2.3753 |
| 0.1097 | 98.0 | 98 | 2.7441 |
| 0.1848 | 99.0 | 99 | 2.2326 |
| 0.2767 | 100.0 | 100 | 2.3659 |
| 0.1802 | 101.0 | 101 | 1.9263 |
| 0.3092 | 102.0 | 102 | 2.4455 |
| 0.2286 | 103.0 | 103 | 3.0056 |
| 0.2489 | 104.0 | 104 | 2.1951 |
| 0.2924 | 105.0 | 105 | 2.5564 |
| 0.2349 | 106.0 | 106 | 2.9558 |
| 0.421 | 107.0 | 107 | 3.6553 |
| 0.3556 | 108.0 | 108 | 2.4322 |
| 0.256 | 109.0 | 109 | 2.4172 |
| 0.0799 | 110.0 | 110 | 2.0181 |
| 0.2025 | 111.0 | 111 | 3.1769 |
| 0.2607 | 112.0 | 112 | 2.7245 |
| 0.3622 | 113.0 | 113 | 3.2589 |
| 0.1399 | 114.0 | 114 | 3.1256 |
| 0.262 | 115.0 | 115 | 2.5244 |
| 0.1744 | 116.0 | 116 | 2.3620 |
| 0.1238 | 117.0 | 117 | 2.8665 |
| 0.1841 | 118.0 | 118 | 3.4656 |
| 0.1641 | 119.0 | 119 | 2.8976 |
| 0.1843 | 120.0 | 120 | 3.4858 |
| 0.3355 | 121.0 | 121 | 4.1324 |
| 0.1236 | 122.0 | 122 | 3.1793 |
| 0.3187 | 123.0 | 123 | 3.3354 |
| 0.2337 | 124.0 | 124 | 3.2495 |
| 0.1969 | 125.0 | 125 | 3.0704 |
| 0.0901 | 126.0 | 126 | 1.3137 |
| 0.27 | 127.0 | 127 | 3.2272 |
| 0.1994 | 128.0 | 128 | 3.1340 |
| 0.2378 | 129.0 | 129 | 2.9427 |
| 0.1278 | 130.0 | 130 | 1.7681 |
| 0.1155 | 131.0 | 131 | 2.3328 |
| 0.1916 | 132.0 | 132 | 2.3694 |
| 0.2023 | 133.0 | 133 | 2.7201 |
| 0.112 | 134.0 | 134 | 2.8713 |
| 0.0806 | 135.0 | 135 | 2.3181 |
| 0.167 | 136.0 | 136 | 2.2217 |
| 0.0989 | 137.0 | 137 | 2.0763 |
| 0.1576 | 138.0 | 138 | 1.7752 |
| 0.0724 | 139.0 | 139 | 3.3534 |
| 0.0856 | 140.0 | 140 | 1.3508 |
| 0.1947 | 141.0 | 141 | 1.7321 |
| 0.2182 | 142.0 | 142 | 3.1232 |
| 0.1011 | 143.0 | 143 | 1.8502 |
| 0.1927 | 144.0 | 144 | 1.8540 |
| 0.2783 | 145.0 | 145 | 3.4478 |
| 0.1224 | 146.0 | 146 | 2.1718 |
| 0.1484 | 147.0 | 147 | 3.3487 |
| 0.1906 | 148.0 | 148 | 2.1033 |
| 0.1279 | 149.0 | 149 | 2.6356 |
| 0.1071 | 150.0 | 150 | 2.6271 |

Framework versions

  • Transformers 4.26.0
  • Pytorch 1.13.1+cu116
  • Datasets 2.9.0
  • Tokenizers 0.13.2
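
A quick runtime check against these versions (a convenience snippet, not part of the original training code):

```python
# Verify the local environment matches the framework versions listed above.
import datasets
import tokenizers
import torch
import transformers

print(transformers.__version__)  # expected: 4.26.0
print(torch.__version__)         # expected: 1.13.1+cu116
print(datasets.__version__)      # expected: 2.9.0
print(tokenizers.__version__)    # expected: 0.13.2
```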