elgeish/cs224n-squad2.0-distilbert-base-uncased
Query this model via the Inference API (requires an API token):
curl -X POST \
    -H "Authorization: Bearer YOUR_ORG_OR_USER_API_TOKEN" \
    -H "Content-Type: application/json" \
    -d '{"question": "Where does she live?", "context": "She lives in Berlin."}' \
    https://api-inference.huggingface.co/models/elgeish/cs224n-squad2.0-distilbert-base-uncased
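
The same request from Python, using the requests library; the token placeholder and the example inputs mirror the curl call above:

import requests

API_URL = "https://api-inference.huggingface.co/models/elgeish/cs224n-squad2.0-distilbert-base-uncased"
headers = {"Authorization": "Bearer YOUR_ORG_OR_USER_API_TOKEN"}

payload = {"question": "Where does she live?", "context": "She lives in Berlin."}
response = requests.post(API_URL, headers=headers, json=payload)
# A question-answering response typically carries score, start, end, and answer fields.
print(response.json())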


Available frameworks: PyTorch, TensorFlow

Contributed by Mohamed El-Geish (elgeish)

How to use this model directly from the 🤗/transformers library:

			
from transformers import AutoTokenizer, AutoModelForQuestionAnswering

tokenizer = AutoTokenizer.from_pretrained("elgeish/cs224n-squad2.0-distilbert-base-uncased")
model = AutoModelForQuestionAnswering.from_pretrained("elgeish/cs224n-squad2.0-distilbert-base-uncased")
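
Beyond loading the model, here is a minimal end-to-end sketch of extractive QA (assuming PyTorch and a recent transformers release; the question/context pair is illustrative):

import torch
from transformers import AutoTokenizer, AutoModelForQuestionAnswering

name = "elgeish/cs224n-squad2.0-distilbert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForQuestionAnswering.from_pretrained(name)

# Encode a (question, context) pair and score candidate answer spans.
inputs = tokenizer("Where does she live?", "She lives in Berlin.", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# Greedy decoding: take the most likely start and end token positions.
start = int(torch.argmax(outputs.start_logits))
end = int(torch.argmax(outputs.end_logits)) + 1
print(tokenizer.decode(inputs["input_ids"][0][start:end]))  # e.g. "berlin"

Since this is a SQuAD2.0 model, a predicted span that collapses onto the [CLS] token signals that the question is unanswerable.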

CS224n SQuAD2.0 Project Dataset

The goal of this model is to save CS224n students GPU time when establishing baselines to beat for the Default Final Project. The training set used to fine-tune this model is the same as the official one; however, evaluation and model selection were performed using roughly half of the official dev set: 6078 examples picked at random. The data files can be found at https://github.com/elgeish/squad/tree/master/data (the Winter 2020 version). Because the official SQuAD2.0 dev set contains the project's test set, students must make sure not to use the official SQuAD2.0 dev set in any way, including the use of models fine-tuned on the official SQuAD2.0 data, since those models were selected using the official dev set.
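
For intuition only, a sketch of such a random half-split of the official dev set follows; the seed and file name here are illustrative, not the ones behind the published split, which students should use for comparable numbers:

import json
import random

random.seed(0)  # illustrative seed; not the one behind the published split
with open("dev-v2.0.json") as f:  # official SQuAD2.0 dev file (hypothetical local path)
    squad = json.load(f)

# Gather every question id across articles and paragraphs, then shuffle.
qa_ids = [qa["id"]
          for article in squad["data"]
          for paragraph in article["paragraphs"]
          for qa in paragraph["qas"]]
random.shuffle(qa_ids)

# Keep roughly half for model selection
# (the published split has 6078 of the 11873 dev questions).
dev_half_ids = set(qa_ids[:len(qa_ids) // 2])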

Results

{
  "exact": 65.16946363935504,
  "f1": 67.87348075352251,
  "total": 6078,
  "HasAns_exact": 69.51890034364261,
  "HasAns_f1": 75.16667217179045,
  "HasAns_total": 2910,
  "NoAns_exact": 61.17424242424242,
  "NoAns_f1": 61.17424242424242,
  "NoAns_total": 3168,
  "best_exact": 65.16946363935504,
  "best_exact_thresh": 0.0,
  "best_f1": 67.87348075352243,
  "best_f1_thresh": 0.0
}
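
As a consistency check, HasAns_total + NoAns_total = 2910 + 3168 = 6078, and the overall exact-match score is the support-weighted average of the two subsets:

\text{exact} = \frac{2910 \times 69.519 + 3168 \times 61.174}{6078} \approx 65.169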

Notable Arguments

{
  "do_lower_case": true,
  "doc_stride": 128,
  "fp16": false,
  "fp16_opt_level": "O1",
  "gradient_accumulation_steps": 24,
  "learning_rate": 3e-05,
  "max_answer_length": 30,
  "max_grad_norm": 1,
  "max_query_length": 64,
  "max_seq_length": 384,
  "model_name_or_path": "distilbert-base-uncased-distilled-squad",
  "model_type": "distilbert",
  "num_train_epochs": 4,
  "per_gpu_train_batch_size": 32,
  "save_steps": 5000,
  "seed": 42,
  "train_batch_size": 32,
  "version_2_with_negative": true,
  "warmup_steps": 0,
  "weight_decay": 0
}
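
These names line up with the command-line flags of the run_squad.py example script that shipped with transformers 2.5.1. As a hedged reconstruction (not the author's exact command; the file paths and the do_train/do_eval flags are assumptions), the invocation would look roughly like:

python run_squad.py \
    --model_type distilbert \
    --model_name_or_path distilbert-base-uncased-distilled-squad \
    --do_train \
    --do_eval \
    --do_lower_case \
    --version_2_with_negative \
    --train_file train-v2.0.json \
    --predict_file dev-half.json \
    --learning_rate 3e-5 \
    --num_train_epochs 4 \
    --max_seq_length 384 \
    --doc_stride 128 \
    --max_query_length 64 \
    --max_answer_length 30 \
    --per_gpu_train_batch_size 32 \
    --gradient_accumulation_steps 24 \
    --save_steps 5000 \
    --seed 42 \
    --output_dir output

Note that with per_gpu_train_batch_size 32 and gradient_accumulation_steps 24, the effective batch size is 32 × 24 = 768 examples per update.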

Environment Setup

{
  "transformers": "2.5.1",
  "pytorch": "1.4.0=py3.6_cuda10.1.243_cudnn7.6.3_0",
  "python": "3.6.5=hc3d631a_2",
  "os": "Linux 4.15.0-1060-aws #62-Ubuntu SMP Tue Feb 11 21:23:22 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux",
  "gpu": "Tesla V100-SXM2-16GB"
}

How to Cite

@misc{elgeish2020gestalt,
  title={Gestalt: a Stacking Ensemble for SQuAD2.0},
  author={Mohamed El-Geish},
  journal={arXiv e-prints},
  archivePrefix={arXiv},
  eprint={2004.07067},
  year={2020},
}