Back to all models
question-answering mask_token: [MASK]
Query this model
πŸ”₯ This model is currently loaded and running on the Inference API. ⚠️ This model could not be loaded by the inference API. ⚠️ This model can be loaded on the Inference API on-demand.
JSON Output
API endpoint  

⚑️ Upgrade your account to access the Inference API

							curl -X POST \
-H "Authorization: Bearer YOUR_ORG_OR_USER_API_TOKEN" \
-H "Content-Type: application/json" \
-d '{"question": "Where does she live?", "context": "She lives in Berlin."}' \
Share Copied link to clipboard

Monthly model downloads

mrm8488/distilbert-multi-finetuned-for-xqua-on-tydiqa mrm8488/distilbert-multi-finetuned-for-xqua-on-tydiqa
last 30 days



Contributed by

mrm8488 Manuel Romero
146 models

How to use this model directly from the πŸ€—/transformers library:

Copy to clipboard
from transformers import AutoTokenizer, AutoModelForQuestionAnswering tokenizer = AutoTokenizer.from_pretrained("mrm8488/distilbert-multi-finetuned-for-xqua-on-tydiqa") model = AutoModelForQuestionAnswering.from_pretrained("mrm8488/distilbert-multi-finetuned-for-xqua-on-tydiqa")

DistilBERT multilingual fine-tuned on TydiQA (GoldP task) dataset for multilingual Q&A πŸ˜›πŸŒβ“

Details of the language model


Details of the Tydi QA dataset

TyDi QA contains 200k human-annotated question-answer pairs in 11 Typologically Diverse languages, written without seeing the answer and without the use of translation, and is designed for the training and evaluation of automatic question answering systems. This repository provides evaluation code and a baseline system for the dataset.

Details of the downstream task (Gold Passage or GoldP aka the secondary task)

Given a passage that is guaranteed to contain the answer, predict the single contiguous span of characters that answers the question. the gold passage task differs from the primary task in several ways:

  • only the gold answer passage is provided rather than the entire Wikipedia article;
  • unanswerable questions have been discarded, similar to MLQA and XQuAD;
  • we evaluate with the SQuAD 1.1 metrics like XQuAD; and
  • Thai and Japanese are removed since the lack of whitespace breaks some tools.

Model training πŸ’ͺπŸ‹οΈβ€

The model was fine-tuned on a Tesla P100 GPU and 25GB of RAM. The script is the following:

python transformers/examples/question-answering/ \
  --model_type distilbert \
  --model_name_or_path distilbert-base-multilingual-cased \
  --do_train \
  --do_eval \
  --train_file /path/to/dataset/train.json \
  --predict_file /path/to/dataset/dev.json \
  --per_gpu_train_batch_size 24 \
  --per_gpu_eval_batch_size 24 \
  --learning_rate 3e-5 \
  --num_train_epochs 5 \
  --max_seq_length 384 \
  --doc_stride 128 \
  --output_dir /content/model_output \
  --overwrite_output_dir \
  --save_steps 1000 \
  --threads 400

Global Results (dev set) πŸ“

Metric # Value
EM 63.85
F1 75.70

Specific Results (per language) πŸŒπŸ“

Language # Samples # EM # F1
Arabic 1314 66.66 80.02
Bengali 180 53.09 63.50
English 654 62.42 73.12
Finnish 1031 64.57 75.15
Indonesian 773 67.89 79.70
Korean 414 51.29 61.73
Russian 1079 55.42 70.08
Swahili 596 74.51 81.15
Telegu 874 66.21 79.85

Similar models

You can also try bert-multi-cased-finedtuned-xquad-tydiqa-goldp that achieves F1 = 82.16 and EM = 71.06 (And of course better marks per language).

Created by Manuel Romero/@mrm8488

Made with in Spain