bert-base-Maradona

This model is a fine-tuned version of google-bert/bert-base-uncased on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 0.8919
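
The card does not state the training objective, but when the only reported metric is a cross-entropy loss, converting it to perplexity gives a more interpretable number. The sketch below assumes the 0.8919 figure is a mean per-token cross-entropy (an assumption, since the dataset and task are not specified on the card):

```python
import math

# Reported evaluation loss from the card.
eval_loss = 0.8919

# If this is a mean per-token cross-entropy (assumption: a language-modeling
# objective; the card does not say), perplexity is exp(loss).
perplexity = math.exp(eval_loss)
print(f"perplexity ~ {perplexity:.2f}")  # ~ 2.44
```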

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0002
  • train_batch_size: 32
  • eval_batch_size: 32
  • seed: 42
  • optimizer: AdamW (torch implementation) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 2
  • mixed_precision_training: Native AMP
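
The cosine schedule with 10% linear warmup implied by the settings above can be sketched in pure Python. The total step count is not stated on the card; ~508 optimizer steps is inferred from the results table (step 500 lands near epoch 1.97), so treat it as an assumption. The actual run would have used the Transformers scheduler (get_cosine_schedule_with_warmup), which this mirrors:

```python
import math

PEAK_LR = 2e-4                          # learning_rate from the card
TOTAL_STEPS = 508                       # inferred from the table (assumption)
WARMUP_STEPS = int(0.1 * TOTAL_STEPS)   # lr_scheduler_warmup_ratio = 0.1

def lr_at(step: int) -> float:
    """Learning rate at a given step: linear warmup, then half-cosine decay to 0."""
    if step < WARMUP_STEPS:
        return PEAK_LR * step / WARMUP_STEPS  # linear ramp from 0 to peak
    progress = (step - WARMUP_STEPS) / (TOTAL_STEPS - WARMUP_STEPS)
    return PEAK_LR * 0.5 * (1.0 + math.cos(math.pi * progress))

# 0 at step 0, peak at the end of warmup, ~0 at the final step.
print(lr_at(0), lr_at(WARMUP_STEPS), lr_at(TOTAL_STEPS))
```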

Training results

| Training Loss | Epoch  | Step | Validation Loss |
|:-------------:|:------:|:----:|:---------------:|
| 1.4175        | 0.0394 | 10   | 1.2472          |
| 1.2095        | 0.0787 | 20   | 1.1857          |
| 1.133         | 0.1181 | 30   | 1.1394          |
| 1.1078        | 0.1575 | 40   | 1.1903          |
| 1.1261        | 0.1969 | 50   | 1.1080          |
| 1.1278        | 0.2362 | 60   | 1.1327          |
| 1.0665        | 0.2756 | 70   | 1.0953          |
| 1.0581        | 0.3150 | 80   | 1.1101          |
| 1.0518        | 0.3543 | 90   | 1.1255          |
| 1.0643        | 0.3937 | 100  | 1.0626          |
| 1.0804        | 0.4331 | 110  | 1.0686          |
| 1.1146        | 0.4724 | 120  | 1.0215          |
| 1.1015        | 0.5118 | 130  | 1.0475          |
| 1.0134        | 0.5512 | 140  | 1.0388          |
| 0.9956        | 0.5906 | 150  | 1.0563          |
| 1.0683        | 0.6299 | 160  | 1.0259          |
| 0.9713        | 0.6693 | 170  | 0.9933          |
| 1.0103        | 0.7087 | 180  | 1.0096          |
| 1.0062        | 0.7480 | 190  | 0.9940          |
| 0.9612        | 0.7874 | 200  | 1.0548          |
| 1.1625        | 0.8268 | 210  | 1.0181          |
| 1.0502        | 0.8661 | 220  | 0.9747          |
| 0.9971        | 0.9055 | 230  | 0.9787          |
| 0.9128        | 0.9449 | 240  | 0.9965          |
| 1.0445        | 0.9843 | 250  | 0.9716          |
| 0.9842        | 1.0236 | 260  | 0.9758          |
| 0.8422        | 1.0630 | 270  | 1.0168          |
| 0.8901        | 1.1024 | 280  | 0.9682          |
| 0.9104        | 1.1417 | 290  | 0.9458          |
| 0.7868        | 1.1811 | 300  | 0.9196          |
| 0.8731        | 1.2205 | 310  | 0.9240          |
| 0.7612        | 1.2598 | 320  | 0.9240          |
| 0.9062        | 1.2992 | 330  | 0.9240          |
| 0.7988        | 1.3386 | 340  | 0.9268          |
| 0.7868        | 1.3780 | 350  | 0.9156          |
| 0.7878        | 1.4173 | 360  | 0.9161          |
| 0.7913        | 1.4567 | 370  | 0.9154          |
| 0.8082        | 1.4961 | 380  | 0.9064          |
| 0.7385        | 1.5354 | 390  | 0.9012          |
| 0.6725        | 1.5748 | 400  | 0.9090          |
| 0.7143        | 1.6142 | 410  | 0.9113          |
| 0.791         | 1.6535 | 420  | 0.9122          |
| 0.7273        | 1.6929 | 430  | 0.9085          |
| 0.7976        | 1.7323 | 440  | 0.9028          |
| 0.6353        | 1.7717 | 450  | 0.9047          |
| 0.8573        | 1.8110 | 460  | 0.8993          |
| 0.754         | 1.8504 | 470  | 0.8951          |
| 0.7464        | 1.8898 | 480  | 0.8930          |
| 0.7193        | 1.9291 | 490  | 0.8924          |
| 0.8594        | 1.9685 | 500  | 0.8920          |
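
The table also lets us back out the approximate training-set size, which the card omits: evaluation runs every 10 steps, and step 10 corresponds to epoch 0.0394, so one epoch is about 254 optimizer steps. At train_batch_size=32 that suggests roughly 8,100 training examples (an estimate, assuming no gradient accumulation, which the card does not mention):

```python
# Steps per epoch from the first table row: step 10 at epoch 0.0394.
steps_per_epoch = round(10 / 0.0394)  # ~ 254
# With train_batch_size=32 and no gradient accumulation (assumption),
# the training set holds roughly steps_per_epoch * 32 examples.
approx_examples = steps_per_epoch * 32
print(steps_per_epoch, approx_examples)  # 254 8128
```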

Framework versions

  • Transformers 4.49.0
  • Pytorch 2.3.0+cu121
  • Datasets 3.3.2
  • Tokenizers 0.21.0
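
To recreate this environment, pinning the listed versions in a requirements file should suffice (a sketch; the CUDA 12.1 PyTorch build noted above typically comes from the PyTorch wheel index rather than plain PyPI):

```text
transformers==4.49.0
torch==2.3.0
datasets==3.3.2
tokenizers==0.21.0
```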
Model size: 109M params, F32 tensors (Safetensors)