
electra-distilled-qa

This model is a fine-tuned version of google/electra-small-discriminator on an unspecified dataset; the evaluation totals (11873 examples: 5928 answerable, 5945 unanswerable) match the SQuAD 2.0 validation set. It achieves the following results on the evaluation set:

  • Exact: 68.1799
  • F1: 71.7591
  • Total: 11873
  • HasAns Exact: 70.3441
  • HasAns F1: 77.5129
  • HasAns Total: 5928
  • NoAns Exact: 66.0219
  • NoAns F1: 66.0219
  • NoAns Total: 5945
  • Best Exact: 68.1799
  • Best Exact Thresh: 0.0
  • Best F1: 71.7591
  • Best F1 Thresh: 0.0
  • Loss: not logged
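
The HasAns/NoAns split and the threshold metrics follow the SQuAD 2.0 evaluation scheme, where a threshold on the null-answer score decides when the model should predict "no answer". As a minimal inference sketch (the model id below is a placeholder; substitute the actual Hub path of this checkpoint):

```python
from transformers import pipeline

# Placeholder model id; replace with the actual Hub path of this checkpoint.
qa = pipeline("question-answering", model="electra-distilled-qa")

result = qa(
    question="What architecture is this model based on?",
    context="The model is a fine-tuned version of google/electra-small-discriminator.",
    handle_impossible_answer=True,  # SQuAD2-style: allow an empty "no answer" prediction
)
print(result)  # e.g. {'score': ..., 'start': ..., 'end': ..., 'answer': ...}
```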

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 4.244429373516175e-05
  • train_batch_size: 128
  • eval_batch_size: 128
  • seed: 33
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 12
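
These values map directly onto transformers.TrainingArguments. A minimal sketch of the reported configuration (output_dir is an assumption, and the Adam settings shown are the library defaults, which match the optimizer line above):

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="electra-distilled-qa",  # assumed; not stated in the card
    learning_rate=4.244429373516175e-05,
    per_device_train_batch_size=128,
    per_device_eval_batch_size=128,
    seed=33,
    num_train_epochs=12,
    lr_scheduler_type="linear",
    adam_beta1=0.9,    # Transformers defaults, matching the
    adam_beta2=0.999,  # optimizer line above
    adam_epsilon=1e-8,
)
```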

Training results

| Training Loss | Epoch | Step  | Total | Exact   | F1      | NoAns Total | Exact Thresh | F1 Thresh | Validation Loss |
|:-------------:|:-----:|:-----:|:-----:|:-------:|:-------:|:-----------:|:------------:|:---------:|:---------------:|
| 1.9086        | 1.0   | 1030  | 11873 | 57.9719 | 62.0421 | 5945        | 0.0          | 0.0       | No log          |
| 1.2919        | 2.0   | 2060  | 11873 | 66.8155 | 70.0115 | 5945        | 0.0          | 0.0       | No log          |
| 1.1194        | 3.0   | 3090  | 11873 | 66.8070 | 70.1755 | 5945        | 0.0          | 0.0       | No log          |
| 1.0051        | 4.0   | 4120  | 11873 | 68.9632 | 72.4292 | 5945        | 0.0          | 0.0       | No log          |
| 0.9191        | 5.0   | 5150  | 11873 | 67.9609 | 71.3639 | 5945        | 0.0          | 0.0       | No log          |
| 0.8562        | 6.0   | 6180  | 11873 | 69.5949 | 72.9986 | 5945        | 0.0          | 0.0       | No log          |
| 0.8017        | 7.0   | 7210  | 11873 | 68.6095 | 72.2303 | 5945        | 0.0          | 0.0       | No log          |
| 0.7554        | 8.0   | 8240  | 11873 | 67.4556 | 71.0028 | 5945        | 0.0          | 0.0       | No log          |
| 0.7196        | 9.0   | 9270  | 11873 | 68.0788 | 71.6887 | 5945        | 0.0          | 0.0       | No log          |
| 0.6914        | 10.0  | 10300 | 11873 | 68.6431 | 72.1849 | 5945        | 0.0          | 0.0       | No log          |
| 0.6687        | 11.0  | 11330 | 11873 | 68.2473 | 71.7832 | 5945        | 0.0          | 0.0       | No log          |
| 0.6517        | 12.0  | 12360 | 11873 | 68.1799 | 71.7591 | 5945        | 0.0          | 0.0       | No log          |
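
The per-epoch metrics above use the same names as the squad_v2 metric from the evaluate library, which is presumably how they were computed (evaluate is not listed in the framework versions below, so this is an assumption). A minimal sketch of computing them, with placeholder example ids:

```python
import evaluate

squad_v2 = evaluate.load("squad_v2")

# Each prediction carries a no_answer_probability so the metric can sweep
# the null threshold (the source of the Exact Thresh / F1 Thresh columns).
predictions = [
    {"id": "q1", "prediction_text": "Paris", "no_answer_probability": 0.0},
    {"id": "q2", "prediction_text": "", "no_answer_probability": 1.0},
]
references = [
    {"id": "q1", "answers": {"text": ["Paris"], "answer_start": [0]}},
    {"id": "q2", "answers": {"text": [], "answer_start": []}},  # unanswerable
]

results = squad_v2.compute(predictions=predictions, references=references)
print(results)  # keys: exact, f1, total, HasAns_*, NoAns_*, best_exact, best_exact_thresh, ...
```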

Framework versions

  • Transformers 4.28.1
  • Pytorch 2.2.1+cu121
  • Datasets 2.19.0
  • Tokenizers 0.13.3