
electra-distilled-qa

This model is a fine-tuned version of google/electra-small-discriminator on an unspecified dataset; the evaluation totals (11873 examples: 5928 answerable, 5945 unanswerable) match the SQuAD 2.0 validation set. It achieves the following results on the evaluation set:

  • Exact: 68.1799
  • F1: 71.7591
  • Total: 11873
  • HasAns Exact: 70.3441
  • HasAns F1: 77.5129
  • HasAns Total: 5928
  • NoAns Exact: 66.0219
  • NoAns F1: 66.0219
  • NoAns Total: 5945
  • Best Exact: 68.1799
  • Best Exact Thresh: 0.0
  • Best F1: 71.7591
  • Best F1 Thresh: 0.0
  • Loss: not logged
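
The HasAns/NoAns split and the threshold metrics follow the SQuAD 2.0 evaluation scheme, where a threshold on the null-answer score decides when the model should predict "no answer". As a minimal inference sketch (the model id below is a placeholder; substitute the actual Hub path of this checkpoint):

```python
from transformers import pipeline

# Placeholder model id; replace with the actual Hub path of this checkpoint.
qa = pipeline("question-answering", model="electra-distilled-qa")

result = qa(
    question="What architecture is this model based on?",
    context="The model is a fine-tuned version of google/electra-small-discriminator.",
    handle_impossible_answer=True,  # SQuAD2-style: allow an empty "no answer" prediction
)
print(result)  # e.g. {'score': ..., 'start': ..., 'end': ..., 'answer': ...}
```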

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 4.244429373516175e-05
  • train_batch_size: 128
  • eval_batch_size: 128
  • seed: 33
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 12
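
These values map directly onto transformers.TrainingArguments. A minimal sketch of the reported configuration (output_dir is an assumption, and the Adam settings shown are the library defaults, which match the optimizer line above):

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="electra-distilled-qa",  # assumed; not stated in the card
    learning_rate=4.244429373516175e-05,
    per_device_train_batch_size=128,
    per_device_eval_batch_size=128,
    seed=33,
    num_train_epochs=12,
    lr_scheduler_type="linear",
    adam_beta1=0.9,    # Transformers defaults, matching the
    adam_beta2=0.999,  # optimizer line above
    adam_epsilon=1e-8,
)
```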

Training results

| Training Loss | Epoch | Step  | Total | Exact   | F1      | NoAns Total | Exact Thresh | F1 Thresh | Validation Loss |
|:-------------:|:-----:|:-----:|:-----:|:-------:|:-------:|:-----------:|:------------:|:---------:|:---------------:|
| 1.9086        | 1.0   | 1030  | 11873 | 57.9719 | 62.0421 | 5945        | 0.0          | 0.0       | No log          |
| 1.2919        | 2.0   | 2060  | 11873 | 66.8155 | 70.0115 | 5945        | 0.0          | 0.0       | No log          |
| 1.1194        | 3.0   | 3090  | 11873 | 66.8070 | 70.1755 | 5945        | 0.0          | 0.0       | No log          |
| 1.0051        | 4.0   | 4120  | 11873 | 68.9632 | 72.4292 | 5945        | 0.0          | 0.0       | No log          |
| 0.9191        | 5.0   | 5150  | 11873 | 67.9609 | 71.3639 | 5945        | 0.0          | 0.0       | No log          |
| 0.8562        | 6.0   | 6180  | 11873 | 69.5949 | 72.9986 | 5945        | 0.0          | 0.0       | No log          |
| 0.8017        | 7.0   | 7210  | 11873 | 68.6095 | 72.2303 | 5945        | 0.0          | 0.0       | No log          |
| 0.7554        | 8.0   | 8240  | 11873 | 67.4556 | 71.0028 | 5945        | 0.0          | 0.0       | No log          |
| 0.7196        | 9.0   | 9270  | 11873 | 68.0788 | 71.6887 | 5945        | 0.0          | 0.0       | No log          |
| 0.6914        | 10.0  | 10300 | 11873 | 68.6431 | 72.1849 | 5945        | 0.0          | 0.0       | No log          |
| 0.6687        | 11.0  | 11330 | 11873 | 68.2473 | 71.7832 | 5945        | 0.0          | 0.0       | No log          |
| 0.6517        | 12.0  | 12360 | 11873 | 68.1799 | 71.7591 | 5945        | 0.0          | 0.0       | No log          |
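
The per-epoch metrics above use the same names as the squad_v2 metric from the evaluate library, which is presumably how they were computed (evaluate is not listed in the framework versions below, so this is an assumption). A minimal sketch of computing them, with placeholder example ids:

```python
import evaluate

squad_v2 = evaluate.load("squad_v2")

# Each prediction carries a no_answer_probability so the metric can sweep
# the null threshold (the source of the Exact Thresh / F1 Thresh columns).
predictions = [
    {"id": "q1", "prediction_text": "Paris", "no_answer_probability": 0.0},
    {"id": "q2", "prediction_text": "", "no_answer_probability": 1.0},
]
references = [
    {"id": "q1", "answers": {"text": ["Paris"], "answer_start": [0]}},
    {"id": "q2", "answers": {"text": [], "answer_start": []}},  # unanswerable
]

results = squad_v2.compute(predictions=predictions, references=references)
print(results)  # keys: exact, f1, total, HasAns_*, NoAns_*, best_exact, best_exact_thresh, ...
```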

Framework versions

  • Transformers 4.28.1
  • Pytorch 2.2.1+cu121
  • Datasets 2.19.0
  • Tokenizers 0.13.3