
Contributed by csarron (Qingqing Cao)

How to use this model directly from the 🤗/transformers library:

from transformers import AutoTokenizer, AutoModelForQuestionAnswering

tokenizer = AutoTokenizer.from_pretrained("csarron/roberta-base-squad-v1")
model = AutoModelForQuestionAnswering.from_pretrained("csarron/roberta-base-squad-v1")

RoBERTa-base fine-tuned on SQuAD v1

This model was fine-tuned from the HuggingFace RoBERTa base checkpoint on SQuAD1.1. The model is case-sensitive: it distinguishes between "english" and "English".


Dataset    Split   # samples
SQuAD1.1   train   96.8K
SQuAD1.1   eval    11.8K


  • Python: 3.7.5

  • Machine specs:

    CPU: Intel(R) Core(TM) i7-6800K CPU @ 3.40GHz

    Memory: 32 GiB

    GPUs: 2 GeForce GTX 1070, each with 8GiB memory

    GPU driver: 418.87.01, CUDA: 10.1

  • script:

    # after install
    cd examples/question-answering
    mkdir -p data
    wget -O data/train-v1.1.json
    wget -O data/dev-v1.1.json
    python \
      --model_type roberta \
      --model_name_or_path roberta-base \
      --do_train \
      --do_eval \
      --train_file train-v1.1.json \
      --predict_file dev-v1.1.json \
      --per_gpu_train_batch_size 12 \
      --per_gpu_eval_batch_size 16 \
      --learning_rate 3e-5 \
      --num_train_epochs 2.0 \
      --max_seq_length 320 \
      --doc_stride 128 \
      --data_dir data \
      --output_dir data/roberta-base-squad-v1 2>&1 | tee train-roberta-base-squad-v1.log

It took about 2 hours to finish.
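The --max_seq_length 320 and --doc_stride 128 settings in the script above determine how a context that is too long for one input is split into overlapping windows. Here is a minimal sketch of that windowing, assuming the stride is the step between consecutive window starts and that a RoBERTa question/context pair spends 4 positions on special tokens. The function name `doc_windows` and the token counts in the example are hypothetical, not taken from the training script.

```python
def doc_windows(n_doc_tokens, n_query_tokens,
                max_seq_length=320, doc_stride=128, n_special=4):
    """Return (start, length) pairs covering the context tokens.

    Each window holds as many context tokens as fit next to the
    question and the special tokens (n_special is an assumption).
    """
    max_doc = max_seq_length - n_query_tokens - n_special
    if max_doc <= 0:
        raise ValueError("question leaves no room for context tokens")

    windows = []
    start = 0
    while start < n_doc_tokens:
        length = min(n_doc_tokens - start, max_doc)
        windows.append((start, length))
        if start + length == n_doc_tokens:
            break  # the last window reaches the end of the context
        start += min(length, doc_stride)
    return windows


# A hypothetical 500-token context with a 10-token question.
print(doc_windows(500, 10))
```

With these numbers each window can hold 306 context tokens, so consecutive windows overlap by 178 tokens and an answer near a window boundary is seen in full by at least one window.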


Model size: 477M

Metric   Value
EM       83.0
F1       90.4

Note that the above results didn't involve any hyperparameter search.
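For readers unfamiliar with the two metrics, EM (exact match) checks whether a normalized prediction equals a normalized reference answer, and F1 measures token overlap between them. The sketch below follows the style of the SQuAD v1.1 evaluation (lowercasing, then stripping punctuation, articles, and extra whitespace). It is an illustration rather than the script that produced the numbers above, and the example strings are made up.

```python
import re
import string
from collections import Counter


def normalize_answer(text):
    """Lowercase, drop punctuation and articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction, truth):
    """1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize_answer(prediction) == normalize_answer(truth))


def f1_score(prediction, truth):
    """Token-level F1 between the normalized prediction and reference."""
    pred_tokens = normalize_answer(prediction).split()
    true_tokens = normalize_answer(truth).split()
    overlap = sum((Counter(pred_tokens) & Counter(true_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(true_tokens)
    return 2 * precision * recall / (precision + recall)


print(exact_match("February 7, 2016", "february 7 2016"))
print(f1_score("on February 7", "February 7, 2016"))
```

Reported scores are these per-question values averaged over the evaluation set and scaled to 0-100; when a question has several reference answers, the best-matching one is used.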

Example Usage

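from transformers import pipeline

# Build a question-answering pipeline backed by this model and its tokenizer.
qa_pipeline = pipeline(
    "question-answering",
    model="csarron/roberta-base-squad-v1",
    tokenizer="csarron/roberta-base-squad-v1"
)

predictions = qa_pipeline({
    'context': "The game was played on February 7, 2016 at Levi's Stadium in the San Francisco Bay Area at Santa Clara, California.",
    'question': "What day was the game played on?"
})

print(predictions)
# output:
# {'score': 0.8625259399414062, 'start': 23, 'end': 39, 'answer': 'February 7, 2016'}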
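The start and end values in the output are character offsets into the context string, with end exclusive, so slicing the context recovers the answer text. A short check using the context and output shown above, with the prediction dict copied by hand so that no model download is needed:

```python
context = ("The game was played on February 7, 2016 at Levi's Stadium "
           "in the San Francisco Bay Area at Santa Clara, California.")

# Copied from the example output above rather than produced by a live call.
prediction = {
    "score": 0.8625259399414062,
    "start": 23,
    "end": 39,
    "answer": "February 7, 2016",
}

# Slice the context with the returned character offsets.
answer_from_offsets = context[prediction["start"]:prediction["end"]]
print(answer_from_offsets)
```

This is useful when the answer needs to be highlighted in the original text rather than just displayed on its own.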

Created by Qingqing Cao

Made with ❤️ in New York.