bert-squad2

This model is a fine-tuned version of SpanBERT/spanbert-large-cased on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 5.9506

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 5
  • num_epochs: 3.0
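
The linear schedule with 5 warmup steps can be sketched in plain Python to show what learning rate was in effect at a given step. This is a minimal sketch, not the training code: the names `linear_lr`, `BASE_LR`, `WARMUP_STEPS`, and `TOTAL_STEPS` are hypothetical, and the total step count is an assumption derived from the training log, where 100 steps correspond to about 0.0394 epochs (roughly 2538 steps per epoch, so about 7614 steps over 3 epochs).

```python
# Hypothetical reconstruction of a linear warmup/decay schedule from the
# hyperparameters listed above. TOTAL_STEPS is an assumption, not a value
# stated in the model card.
BASE_LR = 5e-05        # learning_rate
WARMUP_STEPS = 5       # lr_scheduler_warmup_steps
TOTAL_STEPS = 7614     # assumed: ~2538 steps per epoch x 3 epochs


def linear_lr(step, base_lr=BASE_LR, warmup=WARMUP_STEPS, total=TOTAL_STEPS):
    """Return the learning rate at `step` for linear warmup then linear decay."""
    if step < warmup:
        # Ramp linearly from 0 up to base_lr over the warmup steps.
        return base_lr * step / max(1, warmup)
    # Decay linearly from base_lr at the end of warmup down to 0 at `total`.
    return base_lr * max(0.0, (total - step) / max(1, total - warmup))


for s in (0, 5, 600, 7614):
    print(s, linear_lr(s))
```

Under these assumptions the rate reaches its 5e-05 peak after only 5 steps and is still above 4.6e-05 at step 600, so the model trains at close to the full learning rate for most of the first epoch.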

Training results

| Training Loss | Epoch  | Step | Validation Loss |
|---------------|--------|------|-----------------|
| 1.5815 | 0.0394 | 100  | 1.1717 |
| 1.1525 | 0.0788 | 200  | 1.1207 |
| 1.1214 | 0.1182 | 300  | 1.1106 |
| 1.1132 | 0.1576 | 400  | 1.1067 |
| 1.1205 | 0.1970 | 500  | 1.1044 |
| 3.7006 | 0.2364 | 600  | 5.9506 |
| 5.9537 | 0.2758 | 700  | 5.9506 |
| 5.9517 | 0.3152 | 800  | 5.9506 |
| 5.954  | 0.3546 | 900  | 5.9506 |
| 5.9531 | 0.3940 | 1000 | 5.9506 |
| 5.9527 | 0.4334 | 1100 | 5.9506 |
| 5.9505 | 0.4728 | 1200 | 5.9506 |
| 5.9528 | 0.5122 | 1300 | 5.9506 |
| 5.9491 | 0.5516 | 1400 | 5.9506 |
| 5.9523 | 0.5910 | 1500 | 5.9506 |
| 5.951  | 0.6304 | 1600 | 5.9506 |
| 5.9526 | 0.6698 | 1700 | 5.9506 |
| 5.9499 | 0.7092 | 1800 | 5.9506 |
| 5.9513 | 0.7486 | 1900 | 5.9506 |
| 5.9496 | 0.7880 | 2000 | 5.9506 |
| 5.9528 | 0.8274 | 2100 | 5.9506 |
| 5.9538 | 0.8668 | 2200 | 5.9506 |
| 5.9535 | 0.9062 | 2300 | 5.9506 |
| 5.9535 | 0.9456 | 2400 | 5.9506 |
| 5.9521 | 0.9850 | 2500 | 5.9506 |
| 5.95   | 1.0244 | 2600 | 5.9506 |
| 5.9501 | 1.0638 | 2700 | 5.9506 |
| 5.9507 | 1.1032 | 2800 | 5.9506 |
| 5.9512 | 1.1426 | 2900 | 5.9506 |
| 5.9522 | 1.1820 | 3000 | 5.9506 |
| 5.9524 | 1.2214 | 3100 | 5.9506 |
| 5.9494 | 1.2608 | 3200 | 5.9506 |
| 5.9526 | 1.3002 | 3300 | 5.9506 |
| 5.953  | 1.3396 | 3400 | 5.9506 |
| 5.9512 | 1.3790 | 3500 | 5.9506 |
| 5.9533 | 1.4184 | 3600 | 5.9506 |
| 5.9544 | 1.4578 | 3700 | 5.9506 |
| 5.9514 | 1.4972 | 3800 | 5.9506 |
| 5.9504 | 1.5366 | 3900 | 5.9506 |
| 5.9527 | 1.5760 | 4000 | 5.9506 |
| 5.9516 | 1.6154 | 4100 | 5.9506 |
| 5.9492 | 1.6548 | 4200 | 5.9506 |
| 5.9531 | 1.6942 | 4300 | 5.9506 |
| 5.951  | 1.7336 | 4400 | 5.9506 |
| 5.9526 | 1.7730 | 4500 | 5.9506 |
| 5.9517 | 1.8125 | 4600 | 5.9506 |
| 5.9518 | 1.8519 | 4700 | 5.9506 |
| 5.951  | 1.8913 | 4800 | 5.9506 |
| 5.9521 | 1.9307 | 4900 | 5.9506 |
| 5.9529 | 1.9701 | 5000 | 5.9506 |
| 5.9502 | 2.0095 | 5100 | 5.9506 |
| 5.9496 | 2.0489 | 5200 | 5.9506 |
| 5.9505 | 2.0883 | 5300 | 5.9506 |
| 5.9527 | 2.1277 | 5400 | 5.9506 |
| 5.9523 | 2.1671 | 5500 | 5.9506 |
| 5.951  | 2.2065 | 5600 | 5.9506 |
| 5.9515 | 2.2459 | 5700 | 5.9506 |
| 5.9503 | 2.2853 | 5800 | 5.9506 |
| 5.9502 | 2.3247 | 5900 | 5.9506 |
| 5.9498 | 2.3641 | 6000 | 5.9506 |
| 5.9494 | 2.4035 | 6100 | 5.9506 |
| 5.9526 | 2.4429 | 6200 | 5.9506 |
| 5.9496 | 2.4823 | 6300 | 5.9506 |
| 5.9532 | 2.5217 | 6400 | 5.9506 |
| 5.9523 | 2.5611 | 6500 | 5.9506 |
| 5.9482 | 2.6005 | 6600 | 5.9506 |
| 5.9522 | 2.6399 | 6700 | 5.9506 |
| 5.9505 | 2.6793 | 6800 | 5.9506 |
| 5.9512 | 2.7187 | 6900 | 5.9506 |
| 5.9529 | 2.7581 | 7000 | 5.9506 |
| 5.9505 | 2.7975 | 7100 | 5.9506 |
| 5.9496 | 2.8369 | 7200 | 5.9506 |
| 5.9525 | 2.8763 | 7300 | 5.9506 |
| 5.9518 | 2.9157 | 7400 | 5.9506 |
| 5.9519 | 2.9551 | 7500 | 5.9506 |
| 5.9516 | 2.9945 | 7600 | 5.9506 |
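
The log above shows validation loss improving to 1.1044 at step 500, then jumping to 5.9506 at step 600 and never changing again. The sketch below locates that jump and checks one possible reading of the plateau value: 5.9506 matches ln(384), the cross-entropy of a uniform guess over 384 token positions. The sequence length of 384 is an assumption (the card does not state it; it is a common default in question-answering fine-tuning scripts), and `first_divergence` and `eval_log` are hypothetical names used only for illustration.

```python
import math

# (step, validation_loss) pairs copied from the first rows of the results table.
eval_log = [
    (100, 1.1717), (200, 1.1207), (300, 1.1106), (400, 1.1067),
    (500, 1.1044), (600, 5.9506), (700, 5.9506),
]


def first_divergence(log, factor=2.0):
    """Return the first step whose loss exceeds `factor` times the best loss so far."""
    best = float("inf")
    for step, loss in log:
        if loss > factor * best:
            return step
        best = min(best, loss)
    return None


diverged_at = first_divergence(eval_log)

# Assumption: inputs padded to 384 tokens. If the start and end logits are
# constant across positions, each cross-entropy term equals ln(384).
ASSUMED_MAX_SEQ_LEN = 384
uniform_loss = math.log(ASSUMED_MAX_SEQ_LEN)

print(diverged_at, round(uniform_loss, 4))  # prints: 600 5.9506
```

If the assumption holds, the final checkpoint is probably predicting the same distribution for every input rather than extracting answer spans, so the step-500 checkpoint (validation loss 1.1044) would be the better candidate to evaluate.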

Framework versions

  • Transformers 4.47.0
  • PyTorch 2.5.1+cu121
  • Datasets 3.3.1
  • Tokenizers 0.21.0

Model size

  • 333M params
  • Tensor type: F32 (Safetensors)
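
As a rough guide to memory requirements, the size of the weights follows from the parameter count and tensor type. This is a back-of-the-envelope sketch: it uses the rounded figure of 333M parameters, counts only the weights (no activations, optimizer state, or framework overhead), and the variable names are illustrative.

```python
# Rough weight-memory estimate from the reported model size.
PARAMS = 333_000_000   # "333M params" as reported (rounded)
BYTES_PER_PARAM = 4    # F32 stores each parameter in 4 bytes

weight_bytes = PARAMS * BYTES_PER_PARAM
weight_gb = weight_bytes / 1e9  # decimal gigabytes

print(f"{weight_gb:.2f} GB")  # prints: 1.33 GB
```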