julien-c (HF staff) committed cac4cc8 • Parent(s): 3ac6d2c

Migrate model card from transformers-repo


Read announcement at https://discuss.huggingface.co/t/announcement-all-model-cards-will-be-migrated-to-hf-co-model-repos/2755
Original file history: https://github.com/huggingface/transformers/commits/master/model_cards/mrm8488/distilbert-multi-finetuned-for-xqua-on-tydiqa/README.md

Files changed (1): README.md (+82 −0)
README.md ADDED
---
language: multilingual
thumbnail:
---

# DistilBERT multilingual fine-tuned on TydiQA (GoldP task) dataset for multilingual Q&A 😛🌍❓

## Details of the language model

[distilbert-base-multilingual-cased](https://huggingface.co/distilbert-base-multilingual-cased)

## Details of the TyDi QA dataset

TyDi QA contains 200k human-annotated question-answer pairs in 11 typologically diverse languages, written without seeing the answer and without the use of translation, and is designed for the **training and evaluation** of automatic question answering systems. Evaluation code and a baseline system are provided in the official repository: https://ai.google.com/research/tydiqa
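
The GoldP (secondary task) data can also be pulled from the Hugging Face Hub for experimentation. A minimal sketch, assuming the Hub hosts the dataset under the id `tydiqa` with a `secondary_task` configuration (neither id nor config name is stated in this card):

```python
from datasets import load_dataset

# Load the Gold Passage (secondary task) portion of TyDi QA.
# Assumption: the Hub dataset id is "tydiqa" and the GoldP config is "secondary_task".
goldp = load_dataset("tydiqa", "secondary_task")

print(goldp)                     # available splits and their sizes
print(goldp["train"][0].keys())  # SQuAD-like fields: question, context, answers, ...
```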

## Details of the downstream task (Gold Passage or GoldP, aka the secondary task)

Given a passage that is guaranteed to contain the answer, predict the single contiguous span of characters that answers the question. The Gold Passage task differs from the [primary task](https://github.com/google-research-datasets/tydiqa/blob/master/README.md#the-tasks) in several ways (a sketch of the resulting record format follows the list):
* only the gold answer passage is provided rather than the entire Wikipedia article;
* unanswerable questions have been discarded, similar to MLQA and XQuAD;
* evaluation uses the SQuAD 1.1 metrics, like XQuAD; and
* Thai and Japanese are removed, since the lack of whitespace breaks some tools.
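
For reference, `run_squad.py` consumes SQuAD 1.1-formatted JSON for `--train_file`/`--predict_file`, so each GoldP example ends up as a record roughly like the sketch below. The title, context, question, and offsets are made-up illustrative values, not taken from TyDi QA:

```python
# One entry of the top-level "data" list in a SQuAD 1.1 file (illustrative values only).
squad_style_entry = {
    "title": "Coffee",
    "paragraphs": [
        {
            "context": "Coffee is a brewed drink prepared from roasted coffee beans.",
            "qas": [
                {
                    "id": "example-0001",
                    "question": "What is coffee prepared from?",
                    "answers": [
                        # answer_start is the character offset of the answer span in `context`
                        {"text": "roasted coffee beans", "answer_start": 39},
                    ],
                },
            ],
        },
    ],
}
```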

## Model training 💪🏋️

The model was fine-tuned on a Tesla P100 GPU with 25 GB of RAM using the following script:

```bash
python transformers/examples/question-answering/run_squad.py \
  --model_type distilbert \
  --model_name_or_path distilbert-base-multilingual-cased \
  --do_train \
  --do_eval \
  --train_file /path/to/dataset/train.json \
  --predict_file /path/to/dataset/dev.json \
  --per_gpu_train_batch_size 24 \
  --per_gpu_eval_batch_size 24 \
  --learning_rate 3e-5 \
  --num_train_epochs 5 \
  --max_seq_length 384 \
  --doc_stride 128 \
  --output_dir /content/model_output \
  --overwrite_output_dir \
  --save_steps 1000 \
  --threads 400
```
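
After fine-tuning, the checkpoint can be queried with the standard `transformers` question-answering pipeline. A minimal usage sketch, assuming the model is published on the Hub as `mrm8488/distilbert-multi-finetuned-for-xqua-on-tydiqa` (repo id inferred from this card's path), with the card's own dataset description reused as an example context:

```python
from transformers import pipeline

# Assumed Hub repo id for this fine-tuned checkpoint.
qa = pipeline(
    "question-answering",
    model="mrm8488/distilbert-multi-finetuned-for-xqua-on-tydiqa",
)

result = qa(
    question="What is TyDi QA designed for?",
    context=(
        "TyDi QA contains 200k human-annotated question-answer pairs in 11 "
        "typologically diverse languages and is designed for the training and "
        "evaluation of automatic question answering systems."
    ),
)
print(result)  # {'score': ..., 'start': ..., 'end': ..., 'answer': ...}
```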

## Global Results (dev set) 📝

| Metric | Value     |
| ------ | --------- |
| **EM** | **63.85** |
| **F1** | **75.70** |

## Specific Results (per language) 🌍📝

| Language   | Samples | EM    | F1    |
| ---------- | ------- | ----- | ----- |
| Arabic     | 1314    | 66.66 | 80.02 |
| Bengali    | 180     | 53.09 | 63.50 |
| English    | 654     | 62.42 | 73.12 |
| Finnish    | 1031    | 64.57 | 75.15 |
| Indonesian | 773     | 67.89 | 79.70 |
| Korean     | 414     | 51.29 | 61.73 |
| Russian    | 1079    | 55.42 | 70.08 |
| Swahili    | 596     | 74.51 | 81.15 |
| Telugu     | 874     | 66.21 | 79.85 |

## Similar models

You can also try [bert-multi-cased-finedtuned-xquad-tydiqa-goldp](https://huggingface.co/mrm8488/bert-multi-cased-finedtuned-xquad-tydiqa-goldp), which achieves **F1 = 82.16** and **EM = 71.06** (and, of course, better scores per language).

> Created by [Manuel Romero/@mrm8488](https://twitter.com/mrm8488)

> Made with <span style="color: #e25555;">&hearts;</span> in Spain