# 130000
This model is a fine-tuned version of an unspecified base model on an unknown dataset. It achieves the following results on the evaluation set (a hedged loading sketch follows below):
- Loss: 6.0491
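
The card does not state the task or base architecture. Purely as an illustration, the sketch below shows how such an evaluation loss could be recomputed, assuming the checkpoint is a causal language model stored in a local directory named `130000`; both the task and the path are assumptions, not facts from this card.

```python
# Hedged sketch: assumes a causal LM checkpoint in a local directory "130000".
# The task, tokenizer, and evaluation text are assumptions, not from the card.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

checkpoint = "130000"  # hypothetical local path; the card names no repo id
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(checkpoint)
model.eval()

inputs = tokenizer("Some held-out evaluation text.", return_tensors="pt")
with torch.no_grad():
    # For causal LMs, passing labels returns the mean cross-entropy loss,
    # the same quantity reported as "Loss" above.
    outputs = model(**inputs, labels=inputs["input_ids"])
print(f"eval loss: {outputs.loss.item():.4f}")
```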
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training (a sketch mapping them to `TrainingArguments` follows the list):
- learning_rate: 0.0005
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- gradient_accumulation_steps: 8
- total_train_batch_size: 64
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- num_epochs: 50
- mixed_precision_training: Native AMP
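
For reference, a minimal sketch of how the values above map onto Hugging Face `TrainingArguments`; the output directory and the evaluation schedule are assumptions not stated in the card, and everything else is taken from the list.

```python
# Minimal sketch mapping the hyperparameters above onto TrainingArguments
# (Transformers 4.38.x). output_dir and evaluation_strategy are assumptions.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="outputs",              # assumed; not stated in the card
    learning_rate=5e-4,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    gradient_accumulation_steps=8,     # 8 x 8 = 64 total train batch size
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    num_train_epochs=50,
    fp16=True,                         # "Native AMP"; fp16 (vs. bf16) assumed
    evaluation_strategy="epoch",       # assumed; the card gives no eval schedule
)
```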
### Training results
| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| No log        | 0.92  | 3    | 6.2222          |
| No log        | 1.85  | 6    | 6.2146          |
| No log        | 2.77  | 9    | 6.2032          |
| 5.9665        | 4.0   | 13   | 6.1877          |
| 5.9665        | 4.92  | 16   | 6.1734          |
| 5.9665        | 5.85  | 19   | 6.1620          |
| 5.8921        | 6.77  | 22   | 6.1539          |
| 5.8921        | 8.0   | 26   | 6.1426          |
| 5.8921        | 8.92  | 29   | 6.1335          |
| 5.8324        | 9.85  | 32   | 6.1277          |
| 5.8324        | 10.77 | 35   | 6.1178          |
| 5.8324        | 12.0  | 39   | 6.1105          |
| 5.8012        | 12.92 | 42   | 6.1059          |
| 5.8012        | 13.85 | 45   | 6.0992          |
| 5.8012        | 14.77 | 48   | 6.0959          |
| 5.7449        | 16.0  | 52   | 6.0910          |
| 5.7449        | 16.92 | 55   | 6.0859          |
| 5.7449        | 17.85 | 58   | 6.0819          |
| 5.7303        | 18.77 | 61   | 6.0767          |
| 5.7303        | 20.0  | 65   | 6.0734          |
| 5.7303        | 20.92 | 68   | 6.0721          |
| 5.6687        | 21.85 | 71   | 6.0694          |
| 5.6687        | 22.77 | 74   | 6.0658          |
| 5.6687        | 24.0  | 78   | 6.0628          |
| 5.6839        | 24.92 | 81   | 6.0627          |
| 5.6839        | 25.85 | 84   | 6.0600          |
| 5.6839        | 26.77 | 87   | 6.0586          |
| 5.6499        | 28.0  | 91   | 6.0572          |
| 5.6499        | 28.92 | 94   | 6.0558          |
| 5.6499        | 29.85 | 97   | 6.0555          |
| 5.6703        | 30.77 | 100  | 6.0545          |
| 5.6703        | 32.0  | 104  | 6.0533          |
| 5.6703        | 32.92 | 107  | 6.0520          |
| 5.6404        | 33.85 | 110  | 6.0518          |
| 5.6404        | 34.77 | 113  | 6.0511          |
| 5.6404        | 36.0  | 117  | 6.0509          |
| 5.6414        | 36.92 | 120  | 6.0504          |
| 5.6414        | 37.85 | 123  | 6.0498          |
| 5.6414        | 38.77 | 126  | 6.0498          |
| 5.6347        | 40.0  | 130  | 6.0496          |
| 5.6347        | 40.92 | 133  | 6.0493          |
| 5.6347        | 41.85 | 136  | 6.0491          |
| 5.6347        | 42.77 | 139  | 6.0491          |
| 5.638         | 44.0  | 143  | 6.0491          |
| 5.638         | 44.92 | 146  | 6.0491          |
| 5.638         | 45.85 | 149  | 6.0491          |
| 5.6249        | 46.15 | 150  | 6.0491          |
### Framework versions
- Transformers 4.38.2
- Pytorch 2.1.0+cu121
- Datasets 2.18.0
- Tokenizers 0.15.2