Contributed by Manuel Romero (mrm8488)

How to use this model directly from the 🤗/transformers library:

from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("mrm8488/spanbert-large-finetuned-squadv2")
model = AutoModel.from_pretrained("mrm8488/spanbert-large-finetuned-squadv2")

SpanBERT large fine-tuned on SQuAD v2

SpanBERT, created by Facebook Research, fine-tuned on SQuAD 2.0 for the question-answering (Q&A) downstream task. The fine-tuning was also done by Facebook Research.

Details of SpanBERT

SpanBERT: Improving Pre-training by Representing and Predicting Spans

Details of the downstream task (Q&A) - Dataset 📚 🧐 ❓

SQuAD2.0 combines the 100,000 questions in SQuAD1.1 with over 50,000 unanswerable questions written adversarially by crowdworkers to look similar to answerable ones. To do well on SQuAD2.0, systems must not only answer questions when possible, but also determine when no answer is supported by the paragraph and abstain from answering.
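The abstention requirement described above is commonly handled by comparing the score of a "no answer" prediction against the score of the best answer span. The following is a minimal sketch of that decision rule, not the exact logic of this model or its fine-tuning script. The function name `choose_answer`, the `null_threshold` parameter, and the example scores are all hypothetical and chosen only for illustration.

```python
def choose_answer(best_span_text, best_span_score, null_score, null_threshold=0.0):
    """Return the predicted span, or an empty string to abstain.

    Hypothetical helper: the scores stand in for whatever the model emits
    (for example, summed start/end logits).
    """
    # Abstain when the "no answer" score beats the best span by more than the threshold.
    if null_score - best_span_score > null_threshold:
        return ""
    return best_span_text


# Illustrative, made-up scores (not produced by the model).
print(choose_answer("very hard", best_span_score=7.2, null_score=1.5))
print(repr(choose_answer("in Spain", best_span_score=0.8, null_score=4.1)))
```

When using the transformers question-answering pipeline, the `handle_impossible_answer` argument controls whether an empty answer may be returned.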

Dataset     Split   # samples
SQuAD2.0    train   130k
SQuAD2.0    eval    12.3k

Model fine-tuning 🏋️

You can get the fine-tuning script here

python code/run_squad.py \
  --do_train \
  --do_eval \
  --model spanbert-large-cased \
  --train_file train-v2.0.json \
  --dev_file dev-v2.0.json \
  --train_batch_size 32 \
  --eval_batch_size 32  \
  --learning_rate 2e-5 \
  --num_train_epochs 4 \
  --max_seq_length 512 \
  --doc_stride 128 \
  --eval_metric best_f1 \
  --output_dir squad2_output \
  --version_2_with_negative \
  --fp16
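
The `--max_seq_length 512` and `--doc_stride 128` flags matter when a paragraph is too long to fit in a single input. A rough sketch of the sliding-window split follows, assuming the script uses the same approach as the original BERT `run_squad.py` (this is an assumption, not something the command above confirms). The function name `doc_spans` and the token budget of 500 are hypothetical; the real budget is the maximum sequence length minus the question tokens and special tokens.

```python
def doc_spans(num_doc_tokens, max_tokens_for_doc, doc_stride):
    """Split a document of num_doc_tokens into overlapping (start, length) windows."""
    spans = []
    start = 0
    while start < num_doc_tokens:
        # Each window takes as many tokens as fit in the budget.
        length = min(num_doc_tokens - start, max_tokens_for_doc)
        spans.append((start, length))
        # Stop once a window reaches the end of the document.
        if start + length == num_doc_tokens:
            break
        # Advance by the stride so that consecutive windows overlap.
        start += min(length, doc_stride)
    return spans


# A 1000-token paragraph with a hypothetical 500-token budget and a stride of 128.
print(doc_spans(1000, 500, 128))
# [(0, 500), (128, 500), (256, 500), (384, 500), (512, 488)]
```

A smaller stride produces more overlapping windows, which makes it less likely that an answer is cut at a window boundary, at the cost of more examples per paragraph.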

Results Comparison 📝

                  SQuAD 1.1   SQuAD 2.0     Coref     TACRED
                  F1          F1            avg. F1   F1
BERT (base)       88.5*       76.5*         73.1      67.7
SpanBERT (base)   92.4*       83.6*         77.4      68.2
BERT (large)      91.3        83.3          77.1      66.4
SpanBERT (large)  94.6        88.7 (this)   79.6      70.8

Note: The numbers marked with * were evaluated on the development sets because those models were not submitted to the official SQuAD leaderboard. All other numbers are test-set results.
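
For reference, the SQuAD F1 scores in the table measure token overlap between a predicted answer and a gold answer. Below is a simplified sketch of that per-answer computation, written from the commonly described normalization steps (lowercasing, removing punctuation and English articles). It is not the official evaluation script, which also takes the maximum over several gold answers and averages over all questions. The function names `normalize` and `token_f1` are illustrative.

```python
import re
import string
from collections import Counter


def normalize(text):
    """Lowercase, drop punctuation and English articles, then split on whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return text.split()


def token_f1(prediction, gold):
    """Token-level F1 between one prediction and one gold answer."""
    pred_tokens = normalize(prediction)
    gold_tokens = normalize(gold)
    if not pred_tokens or not gold_tokens:
        # For unanswerable questions, an empty prediction matches an empty gold answer.
        return float(pred_tokens == gold_tokens)
    # Multiset intersection counts each shared token at most as often as it appears in both.
    overlap = sum((Counter(pred_tokens) & Counter(gold_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)


print(token_f1("very hard", "very hard"))                     # exact match
print(round(token_f1("working very hard", "very hard"), 2))   # partial overlap
```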

Model in action

Fast usage with pipelines:

from transformers import pipeline

qa_pipeline = pipeline(
    "question-answering",
    model="mrm8488/spanbert-large-finetuned-squadv2",
    tokenizer="SpanBERT/spanbert-large-cased"
)

qa_pipeline({
    'context': "Manuel Romero has been working very hard in the repository hugginface/transformers lately",
    'question': "How has been working Manuel Romero lately?"

})
# Output: {'answer': 'very hard', 'end': 40, 'score': 0.9052708846768347, 'start': 31}
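
The `start` and `end` values in the output are character offsets into the context string, with `end` being exclusive. A short check using the output shown above (copied as a literal, so no model download is needed):

```python
context = "Manuel Romero has been working very hard in the repository hugginface/transformers lately"

# Output copied from the pipeline call above.
result = {'answer': 'very hard', 'end': 40, 'score': 0.9052708846768347, 'start': 31}

# Slicing the context with the returned offsets recovers the answer text.
span = context[result['start']:result['end']]
print(span)  # very hard
```

These offsets are useful for highlighting the answer in the original text.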

Created by Manuel Romero/@mrm8488

Made with ♥ in Spain