
baseline

This model is a fine-tuned version of an unspecified base model on an unknown dataset. It achieves the following results on the evaluation set (a sketch of the exact-match metric follows the list):

  • Loss: 0.9254
  • Exact Match: 0.702
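
Exact match is not defined in this card; the usual definition is the fraction of predictions that are string-identical to their references. A minimal sketch of that metric (an assumption about how the score above was computed, not the authors' evaluation code):

```python
def exact_match(predictions, references):
    """Fraction of predictions that are string-identical to their references
    (after stripping surrounding whitespace)."""
    assert len(predictions) == len(references)
    hits = sum(p.strip() == r.strip() for p, r in zip(predictions, references))
    return hits / len(references)

# 2 of 3 predictions match exactly -> 0.667
print(round(exact_match(["a b", "c", "d"], ["a b", "c", "x"]), 3))
```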

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a TrainingArguments sketch follows the list):

  • learning_rate: 0.001
  • train_batch_size: 400
  • eval_batch_size: 8
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: inverse_sqrt
  • lr_scheduler_warmup_steps: 4000
  • training_steps: 20000
  • label_smoothing_factor: 0.1
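
The same settings can be written down with the Transformers TrainingArguments API, as sketched below. Only the values mirrored from the list above come from this card; the output directory, the use of the Trainer's default AdamW implementation of Adam, and the 400-step evaluation cadence (inferred from the results table) are assumptions.

```python
from transformers import TrainingArguments

# Hedged reconstruction of the listed hyperparameters; not the authors' script.
training_args = TrainingArguments(
    output_dir="baseline",             # assumed output directory
    learning_rate=1e-3,
    per_device_train_batch_size=400,   # "train_batch_size" in the card
    per_device_eval_batch_size=8,
    seed=42,
    adam_beta1=0.9,                    # Adam betas/epsilon as listed above
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="inverse_sqrt",
    warmup_steps=4000,
    max_steps=20000,                   # training_steps
    label_smoothing_factor=0.1,
    evaluation_strategy="steps",       # inferred from the 400-step eval intervals
    eval_steps=400,
)
```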

Training results

| Training Loss | Epoch | Step  | Validation Loss | Exact Match |
|:-------------:|:-----:|:-----:|:---------------:|:-----------:|
| 2.8524        | 16.0  | 400   | 1.7375          | 0.059       |
| 1.422         | 32.0  | 800   | 1.6708          | 0.11        |
| 1.0862        | 48.0  | 1200  | 1.7149          | 0.094       |
| 0.9374        | 64.0  | 1600  | 1.6508          | 0.159       |
| 0.8704        | 80.0  | 2000  | 1.6920          | 0.112       |
| 0.8356        | 96.0  | 2400  | 1.5605          | 0.16        |
| 0.8157        | 112.0 | 2800  | 1.5249          | 0.188       |
| 0.8029        | 128.0 | 3200  | 1.3993          | 0.25        |
| 0.7917        | 144.0 | 3600  | 1.2768          | 0.312       |
| 0.7821        | 160.0 | 4000  | 1.2213          | 0.397       |
| 0.7719        | 176.0 | 4400  | 1.1216          | 0.432       |
| 0.7635        | 192.0 | 4800  | 1.1076          | 0.458       |
| 0.7584        | 208.0 | 5200  | 1.0275          | 0.567       |
| 0.7556        | 224.0 | 5600  | 1.0464          | 0.552       |
| 0.7525        | 240.0 | 6000  | 1.0442          | 0.56        |
| 0.7496        | 256.0 | 6400  | 1.0108          | 0.581       |
| 0.7487        | 272.0 | 6800  | 0.9721          | 0.61        |
| 0.7467        | 288.0 | 7200  | 1.0326          | 0.567       |
| 0.7466        | 304.0 | 7600  | 0.9900          | 0.572       |
| 0.7449        | 320.0 | 8000  | 1.0150          | 0.604       |
| 0.7445        | 336.0 | 8400  | 0.9755          | 0.603       |
| 0.7433        | 352.0 | 8800  | 0.9705          | 0.645       |
| 0.7432        | 368.0 | 9200  | 0.9567          | 0.663       |
| 0.7432        | 384.0 | 9600  | 0.9733          | 0.68        |
| 0.7425        | 400.0 | 10000 | 0.9262          | 0.67        |
| 0.7417        | 416.0 | 10400 | 0.9216          | 0.673       |
| 0.7409        | 432.0 | 10800 | 0.9411          | 0.681       |
| 0.7404        | 448.0 | 11200 | 0.9312          | 0.674       |
| 0.7405        | 464.0 | 11600 | 0.9777          | 0.585       |
| 0.7406        | 480.0 | 12000 | 0.9191          | 0.683       |
| 0.7395        | 496.0 | 12400 | 0.9216          | 0.643       |
| 0.7396        | 512.0 | 12800 | 0.9764          | 0.645       |
| 0.7394        | 528.0 | 13200 | 0.9361          | 0.644       |
| 0.7392        | 544.0 | 13600 | 0.9210          | 0.67        |
| 0.739         | 560.0 | 14000 | 0.9387          | 0.688       |
| 0.7389        | 576.0 | 14400 | 0.9385          | 0.67        |
| 0.7383        | 592.0 | 14800 | 0.9500          | 0.655       |
| 0.7386        | 608.0 | 15200 | 0.9405          | 0.67        |
| 0.7383        | 624.0 | 15600 | 0.9335          | 0.691       |
| 0.738         | 640.0 | 16000 | 0.9079          | 0.708       |
| 0.7379        | 656.0 | 16400 | 0.9027          | 0.714       |
| 0.7376        | 672.0 | 16800 | 0.8969          | 0.703       |
| 0.7372        | 688.0 | 17200 | 0.9169          | 0.685       |
| 0.7375        | 704.0 | 17600 | 0.8895          | 0.738       |
| 0.7376        | 720.0 | 18000 | 0.8951          | 0.734       |
| 0.7371        | 736.0 | 18400 | 0.9408          | 0.673       |
| 0.737         | 752.0 | 18800 | 0.9270          | 0.693       |
| 0.7371        | 768.0 | 19200 | 0.9063          | 0.71        |
| 0.7369        | 784.0 | 19600 | 0.9253          | 0.678       |
| 0.7367        | 800.0 | 20000 | 0.9254          | 0.702       |

Framework versions

  • Transformers 4.35.2
  • Pytorch 2.1.0+cu118
  • Datasets 2.15.0
  • Tokenizers 0.15.0
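
Assuming the checkpoint is published on the Hugging Face Hub, it can be loaded with the pinned versions above. The repository id below is a placeholder, and the sequence-to-sequence Auto class is an assumption based on the exact-match metric; swap in the appropriate class for the actual architecture.

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

repo_id = "your-username/baseline"  # hypothetical repo id; replace with the real one

tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForSeq2SeqLM.from_pretrained(repo_id)  # assumes a seq2seq architecture

inputs = tokenizer("example input", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```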