Contributed by

valhalla Suraj Patil


This is the electra-base-discriminator model fine-tuned on the SQuAD v1 dataset for the question answering task.

Model details

As mentioned in the original paper: ELECTRA is a new method for self-supervised language representation learning. It can be used to pre-train transformer networks using relatively little compute. ELECTRA models are trained to distinguish "real" input tokens vs "fake" input tokens generated by another neural network, similar to the discriminator of a GAN. At small scale, ELECTRA achieves strong results even when trained on a single GPU. At large scale, ELECTRA achieves state-of-the-art results on the SQuAD 2.0 dataset.
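The "real" vs "fake" objective described above can be illustrated with a toy sketch. This is not the ELECTRA training code: no neural networks are involved, the `replaced_token_labels` helper is a hypothetical name, and the corrupted sentence is hand-written to stand in for generator output. It only shows the kind of per-token labels a discriminator would be trained to predict.

```python
from typing import List


def replaced_token_labels(original: List[str], corrupted: List[str]) -> List[int]:
    """Return 1 where the corrupted token differs from the original ("fake"), else 0 ("real")."""
    if len(original) != len(corrupted):
        raise ValueError("sequences must have the same length")
    return [int(o != c) for o, c in zip(original, corrupted)]


# Hand-written example; in ELECTRA the replacements come from a small generator network.
original = ["the", "chef", "cooked", "the", "meal"]
corrupted = ["the", "chef", "ate", "the", "meal"]

labels = replaced_token_labels(original, corrupted)
print(list(zip(corrupted, labels)))
```

Every position gets a label, so the discriminator receives a training signal from all input tokens rather than only from a masked subset.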

Param                Value
layers               12
hidden size          768
num attention heads  12
on disk size         436MB
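As a rough sanity check, the on-disk size can be related to the layer count and hidden size in the table above. The sketch below is a back-of-the-envelope estimate only: the vocabulary size, maximum position count, feed-forward width, and float32 storage are assumptions typical of BERT-style base models and are not stated in this card. The number of attention heads does not change the parameter count, so it does not appear.

```python
# Values taken from the table above.
LAYERS = 12
HIDDEN = 768

# Assumed values (NOT stated in this card): BERT-style base configuration, float32 weights.
VOCAB_SIZE = 30522
MAX_POSITIONS = 512
TYPE_VOCAB = 2
INTERMEDIATE = 3072
BYTES_PER_PARAM = 4


def linear(n_in: int, n_out: int) -> int:
    """Parameter count of a dense layer with bias."""
    return n_in * n_out + n_out


layer_norm = 2 * HIDDEN  # scale and bias
embeddings = (VOCAB_SIZE + MAX_POSITIONS + TYPE_VOCAB) * HIDDEN + layer_norm
attention = 4 * linear(HIDDEN, HIDDEN) + layer_norm  # query, key, value, output
feed_forward = linear(HIDDEN, INTERMEDIATE) + linear(INTERMEDIATE, HIDDEN) + layer_norm
qa_head = linear(HIDDEN, 2)  # start and end logits

total_params = embeddings + LAYERS * (attention + feed_forward) + qa_head
size_mb = total_params * BYTES_PER_PARAM / 1e6

print(f"{total_params / 1e6:.1f}M parameters, about {size_mb:.0f} MB as float32")
```

Under these assumptions the estimate lands close to the listed on-disk size, which suggests the checkpoint stores weights in 32-bit floats.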

Model training

This model was trained on a Google Colab V100 GPU. The fine-tuning notebook is available on Colab.


The results are slightly better than those reported in the paper, where the authors state that electra-base achieves 84.5 EM and 90.8 F1.

Metric  Value
EM      85.0520
F1      91.6050
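For readers unfamiliar with the two metrics, the following is a simplified sketch of how EM (exact match) and F1 are usually computed for a single prediction in SQuAD v1 style evaluation. It follows the common normalization steps (lowercasing, stripping punctuation and English articles) but is not the evaluation script used to produce the numbers above; the real evaluation also takes the best score over several reference answers and averages over the dataset.

```python
import re
import string
from collections import Counter


def normalize(text: str) -> str:
    """Lowercase, drop punctuation and articles, and collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, truth: str) -> float:
    """1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize(prediction) == normalize(truth))


def f1_score(prediction: str, truth: str) -> float:
    """Token-level F1 between the normalized prediction and reference."""
    pred_tokens = normalize(prediction).split()
    truth_tokens = normalize(truth).split()
    overlap = sum((Counter(pred_tokens) & Counter(truth_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(truth_tokens)
    return 2 * precision * recall / (precision + recall)


print(exact_match("42", "42."))
print(f1_score("the answer 42", "42"))
```

EM only rewards answers that match after normalization, while F1 gives partial credit for overlapping tokens, which is why F1 is the higher of the two figures.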

Model in Action 🚀

from transformers import pipeline

nlp = pipeline('question-answering', model='valhalla/electra-base-discriminator-finetuned_squadv1')
# Pass the question and its context as a single dict
nlp({
    'question': 'What is the answer to everything ?',
    'context': '42 is the answer to life the universe and everything'
})
# => {'answer': '42', 'end': 2, 'score': 0.981274963050339, 'start': 0}
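The `start` and `end` fields in the output are character offsets into the context string, so the answer can be recovered by slicing. The short sketch below reuses the result shown above as a literal dict (no model is loaded) to illustrate this.

```python
context = '42 is the answer to life the universe and everything'

# Output copied from the pipeline call above.
result = {'answer': '42', 'end': 2, 'score': 0.981274963050339, 'start': 0}

# Slice the context with the returned character offsets.
span = context[result['start']:result['end']]
print(span == result['answer'])
```

This is handy when the answer needs to be highlighted in the original text rather than just displayed on its own.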

Created with ❤️ by Suraj Patil