
mrm8488/electra-small-finetuned-squadv2



Contributed by

mrm8488 Manuel Romero
118 models

How to use this model directly from the 🤗/transformers library:

from transformers import AutoTokenizer, AutoModelForQuestionAnswering

tokenizer = AutoTokenizer.from_pretrained("mrm8488/electra-small-finetuned-squadv2")
model = AutoModelForQuestionAnswering.from_pretrained("mrm8488/electra-small-finetuned-squadv2")

Electra small ⚡ + SQuAD v2 ❓

Electra-small-discriminator fine-tuned on the SQuAD v2.0 dataset for the question-answering (Q&A) downstream task.

Details of the downstream task (Q&A) - Model 🧠

ELECTRA is a new method for self-supervised language representation learning. It can be used to pre-train transformer networks using relatively little compute. ELECTRA models are trained to distinguish "real" input tokens vs "fake" input tokens generated by another neural network, similar to the discriminator of a GAN. At small scale, ELECTRA achieves strong results even when trained on a single GPU. At large scale, ELECTRA achieves state-of-the-art results on the SQuAD 2.0 dataset.
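
As a rough illustration of the discriminator objective described above, the sketch below builds the per-token "real vs. fake" labels for one corrupted sentence. The helper name `replaced_token_labels` and the example sentence are assumptions made for illustration; they are not taken from the ELECTRA code base, and in practice the corruption comes from a small generator network rather than a hand-edited list.

```python
from typing import List


def replaced_token_labels(original: List[str], corrupted: List[str]) -> List[int]:
    """Return 1 where a token was replaced ("fake") and 0 where it is unchanged ("real")."""
    if len(original) != len(corrupted):
        raise ValueError("original and corrupted sequences must have the same length")
    return [int(o != c) for o, c in zip(original, corrupted)]


original = ["the", "chef", "cooked", "the", "meal"]
# Hypothetical generator output: one position was swapped for a plausible alternative.
corrupted = ["the", "chef", "ate", "the", "meal"]

labels = replaced_token_labels(original, corrupted)
print(labels)  # [0, 0, 1, 0, 0]
```

The discriminator is trained to predict these labels for every position, which is why it gets a learning signal from all tokens rather than only from the masked ones.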

Details of the downstream task (Q&A) - Dataset 📚

SQuAD2.0 combines the 100,000 questions in SQuAD1.1 with over 50,000 unanswerable questions written adversarially by crowdworkers to look similar to answerable ones. To do well on SQuAD2.0, systems must not only answer questions when possible, but also determine when no answer is supported by the paragraph and abstain from answering.
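
One common way to implement the "abstain" decision is to compare a no-answer score against the best span score and abstain when the gap exceeds a threshold. The sketch below is a simplified illustration under that assumption; the function name `choose_answer`, the score values, and the default threshold are hypothetical and are not taken from this model's evaluation code.

```python
from typing import List, Tuple


def choose_answer(
    span_candidates: List[Tuple[str, float]],
    null_score: float,
    threshold: float = 0.0,
) -> str:
    """Return the best span text, or "" (abstain) when the null score wins by more than `threshold`."""
    if not span_candidates:
        return ""
    best_text, best_score = max(span_candidates, key=lambda item: item[1])
    # A positive difference means the model prefers "no answer" over its best span.
    if null_score - best_score > threshold:
        return ""
    return best_text


# Made-up scores for illustration only.
answerable = choose_answer([("Berlin", 7.2), ("in Berlin", 5.1)], null_score=1.3)
unanswerable = choose_answer([("Berlin", 0.4)], null_score=6.0)
print(repr(answerable), repr(unanswerable))  # 'Berlin' ''
```

Raising the threshold makes the system answer more often; lowering it makes the system abstain more often.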

Model training 🏋️‍

The model was trained on a Tesla P100 GPU with 25 GB of RAM using the following command:

python transformers/examples/question-answering/ \
  --model_type electra \
  --model_name_or_path 'google/electra-small-discriminator' \
  --do_eval \
  --do_train \
  --do_lower_case \
  --train_file '/content/dataset/train-v2.0.json' \
  --predict_file '/content/dataset/dev-v2.0.json' \
  --per_gpu_train_batch_size 16 \
  --learning_rate 3e-5 \
  --num_train_epochs 10 \
  --max_seq_length 384 \
  --doc_stride 128 \
  --output_dir '/content/output' \
  --overwrite_output_dir \
  --save_steps 1000 \
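
The `--max_seq_length 384` and `--doc_stride 128` flags control how contexts longer than the model input are split into overlapping windows. The sketch below is a simplified illustration of that sliding-window idea on plain token counts; `split_into_windows` is a hypothetical helper, and the real preprocessing also reserves room for the question and special tokens, which is only approximated here by the assumed `reserved` parameter.

```python
from typing import List, Tuple


def split_into_windows(
    num_doc_tokens: int,
    max_seq_length: int = 384,
    doc_stride: int = 128,
    reserved: int = 0,
) -> List[Tuple[int, int]]:
    """Return (start, end) token index pairs, end exclusive, for overlapping windows."""
    max_doc_tokens = max_seq_length - reserved
    if max_doc_tokens <= 0 or doc_stride <= 0:
        raise ValueError("window size and stride must be positive")
    windows = []
    start = 0
    while start < num_doc_tokens:
        end = min(start + max_doc_tokens, num_doc_tokens)
        windows.append((start, end))
        if end == num_doc_tokens:
            break
        # Advance by at most `doc_stride` so consecutive windows overlap.
        start += min(end - start, doc_stride)
    return windows


# A 600-token context, with 20 positions assumed reserved for question and special tokens.
windows = split_into_windows(600, reserved=20)
print(windows)  # [(0, 364), (128, 492), (256, 600)]
```

Because each window starts 128 tokens after the previous one, an answer near a window boundary still appears with surrounding context in a neighbouring window.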

Dev set Results 🧾

Metric    Value
EM        69.71
F1        73.44
Size      50 MB

Raw evaluation output:

{
  'exact': 69.71279373368147,
  'f1': 73.4439546123672,
  'total': 11873,
  'HasAns_exact': 69.92240215924427,
  'HasAns_f1': 77.39542393937836,
  'HasAns_total': 5928,
  'NoAns_exact': 69.50378469301934,
  'NoAns_f1': 69.50378469301934,
  'NoAns_total': 5945,
  'best_exact': 69.71279373368147,
  'best_exact_thresh': 0.0,
  'best_f1': 73.44395461236732,
  'best_f1_thresh': 0.0
}
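
For readers unfamiliar with the two metrics, the sketch below shows how exact match (EM) and token-level F1 are typically computed for a single prediction. It is modeled on the answer normalization used by SQuAD-style evaluation (lowercasing, stripping punctuation and English articles), but it is an illustration written for this card, not the code that produced the numbers above. The example strings are made up.

```python
import collections
import re
import string
from typing import List

PUNCTUATION = set(string.punctuation)


def normalize_answer(text: str) -> str:
    """Lowercase, strip punctuation and English articles, and collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in PUNCTUATION)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def get_tokens(text: str) -> List[str]:
    return normalize_answer(text).split() if text else []


def exact_match(prediction: str, gold: str) -> int:
    return int(normalize_answer(prediction) == normalize_answer(gold))


def f1_score(prediction: str, gold: str) -> float:
    pred_tokens = get_tokens(prediction)
    gold_tokens = get_tokens(gold)
    if not pred_tokens or not gold_tokens:
        # For unanswerable questions, both sides must be empty to score 1.
        return float(pred_tokens == gold_tokens)
    common = collections.Counter(pred_tokens) & collections.Counter(gold_tokens)
    num_same = sum(common.values())
    if num_same == 0:
        return 0.0
    precision = num_same / len(pred_tokens)
    recall = num_same / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)


em = exact_match("A new strain of flu in China", "new strain of flu")
f1 = f1_score("A new strain of flu in China", "new strain of flu")
print(em, round(f1, 2))  # 0 0.8
```

This also explains why `NoAns_exact` and `NoAns_f1` are identical in the output above: for an unanswerable question, a prediction scores either 1 (it abstained) or 0 (it did not) under both metrics.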

Model in action 🚀

Fast usage with pipelines:

from transformers import pipeline

QnA_pipeline = pipeline('question-answering', model='mrm8488/electra-base-finetuned-squadv2')

QnA_pipeline({
    'context': 'A new strain of flu that has the potential to become a pandemic has been identified in China by scientists.',
    'question': 'What has been discovered by scientists from China ?'
})

# Output:
{'answer': 'A new strain of flu', 'end': 19, 'score': 0.8650811568752914, 'start': 0}
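
The `start` and `end` fields in the output are character offsets into the context string, so the answer can be recovered by slicing. The short sketch below checks this against the output shown above without loading the model; the `result` dictionary is copied from that output rather than produced by a fresh run.

```python
context = (
    "A new strain of flu that has the potential to become a pandemic "
    "has been identified in China by scientists."
)
# Copied from the pipeline output above, not regenerated here.
result = {"answer": "A new strain of flu", "end": 19, "score": 0.8650811568752914, "start": 0}

# Slice the context with the reported character offsets (end is exclusive).
span = context[result["start"]:result["end"]]
print(span)  # A new strain of flu
```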

Created by Manuel Romero/@mrm8488 | LinkedIn

Made with ♥ in Spain