resultsGPT

This model is a fine-tuned version of IAMRS23/resultsGPT (the base-model reference points back to this same repository, so the original base checkpoint is not recorded) trained on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 0.6943
  • Perplexity: 9937.2253
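For reference, perplexity is conventionally the exponential of the cross-entropy loss, which for a loss of 0.6943 would give roughly 2.0; the much larger perplexity figures reported in this card evidently come from a different computation that the card does not specify. A minimal sketch of the conventional relation:

```python
import math

# Conventional relation: perplexity = exp(cross-entropy loss).
eval_loss = 0.6943
perplexity = math.exp(eval_loss)
print(f"{perplexity:.4f}")  # ~2.0023, unlike the 9937.2253 reported above
```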

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 3e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • optimizer: AdamW (adamw_torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 500
  • num_epochs: 10
  • mixed_precision_training: Native AMP
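
For reproducibility, here is a minimal sketch of Hugging Face `TrainingArguments` matching the values above. The output directory and the 100-step evaluation interval (inferred from the results table below) are assumptions, not taken from the original run:

```python
from transformers import TrainingArguments

# Sketch of the hyperparameters listed above; not the author's actual script.
training_args = TrainingArguments(
    output_dir="resultsGPT",        # assumed output directory
    learning_rate=3e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    optim="adamw_torch",            # AdamW (torch)
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_steps=500,
    num_train_epochs=10,
    fp16=True,                      # native AMP mixed-precision training
    eval_strategy="steps",          # assumption: evaluate every 100 steps,
    eval_steps=100,                 # matching the step column in the results table
)
```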

Training results

| Training Loss | Epoch  | Step | Validation Loss | Perplexity |
|--------------:|-------:|-----:|----------------:|-----------:|
| 1.5021        | 1.1765 | 100  | 1.1648          | 8877.3461  |
| 1.0244        | 2.3529 | 200  | 0.9070          | 9195.3503  |
| 0.7910        | 3.5294 | 300  | 0.7870          | 9478.8516  |
| 0.6351        | 4.7059 | 400  | 0.7228          | 9616.5365  |
| 0.5269        | 5.8824 | 500  | 0.7003          | 9649.1851  |
| 0.4301        | 7.0588 | 600  | 0.6906          | 9848.1443  |
| 0.3519        | 8.2353 | 700  | 0.6941          | 9829.4828  |
| 0.3066        | 9.4118 | 800  | 0.6943          | 9911.2424  |

Framework versions

  • Transformers 4.48.3
  • PyTorch 2.5.1+cu124
  • Tokenizers 0.21.0
Model size

  • 124M parameters
  • Tensor type: F32 (Safetensors)
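
Since the checkpoint is a 124M-parameter F32 model on the Hub, it can presumably be loaded as a standard causal language model. The following is a sketch assuming `AutoModelForCausalLM` applies and that a tokenizer is bundled with the repository (neither is confirmed by this card):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "IAMRS23/resultsGPT"

# Assumption: the repository contains a causal LM checkpoint and its tokenizer.
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Sanity check against the reported size: should print roughly 124M.
print(sum(p.numel() for p in model.parameters()))

inputs = tokenizer("Example prompt:", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=50)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```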