julien-c HF staff committed on
Commit
0122f46
1 Parent(s): 9f443b0

Migrate model card from transformers-repo

Read announcement at https://discuss.huggingface.co/t/announcement-all-model-cards-will-be-migrated-to-hf-co-model-repos/2755
Original file history: https://github.com/huggingface/transformers/commits/master/model_cards/mrm8488/distill-bert-base-spanish-wwm-cased-finetuned-spa-squad2-es/README.md

Files changed (1)
  1. README.md +141 -0
README.md ADDED
@@ -0,0 +1,141 @@
+ ---
+ language: es
+ thumbnail: https://i.imgur.com/jgBdimh.png
+ ---
+
+ # BETO (Spanish BERT) + Spanish SQuAD2.0 + distillation using 'bert-base-multilingual-cased' as teacher
+
+ This model is a version of [BETO](https://github.com/dccuchile/beto) that was fine-tuned on [SQuAD-es-v2.0](https://github.com/ccasimiro88/TranslateAlignRetrieve) and **distilled** for **Q&A**.
+
+ Distillation makes the model **smaller, faster, cheaper and lighter** than [bert-base-spanish-wwm-cased-finetuned-spa-squad2-es](https://github.com/huggingface/transformers/blob/master/model_cards/mrm8488/bert-base-spanish-wwm-cased-finetuned-spa-squad2-es/README.md).
+
+ This model was fine-tuned on the same dataset, but with **distillation** applied during training, as mentioned above, and with one additional training epoch.
+
+ The **teacher model** for the distillation was `bert-base-multilingual-cased`. It is the same teacher used for `distilbert-base-multilingual-cased`, also known as [**DistilmBERT**](https://github.com/huggingface/transformers/tree/master/examples/distillation), which is on average twice as fast as **mBERT-base**.
+
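To make the teacher/student idea concrete, here is a minimal sketch of a soft-target distillation loss in plain Python. It is an illustration only, not the code of `run_squad_w_distillation.py`: the function names, the temperature of `2.0`, and the `0.5`/`0.5` loss weights are assumptions chosen for the example.

```python
import math


def softmax(logits, temperature=1.0):
    """Turn raw logits into probabilities, softened by a temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL divergence from the teacher's softened distribution to the student's.

    The temperature value is an assumption for this sketch. Multiplying by
    temperature ** 2 is a common convention that keeps the loss on a scale
    comparable to the hard-label loss.
    """
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    kl = sum(t * math.log(t / s) for t, s in zip(p_teacher, p_student) if t > 0)
    return kl * temperature ** 2


def combined_loss(hard_loss, soft_loss, alpha_hard=0.5, alpha_soft=0.5):
    """Weighted sum of the usual supervised loss and the distillation loss.

    The weights are hypothetical, not the settings used for this model.
    """
    return alpha_hard * hard_loss + alpha_soft * soft_loss


# Toy example: logits over four candidate answer-start positions.
teacher = [0.5, 3.0, 1.0, -1.0]
student = [0.2, 1.5, 1.4, -0.5]
soft = distillation_loss(student, teacher)
print(combined_loss(hard_loss=1.2, soft_loss=soft))
```

In an extractive Q&A setting, a loss of this kind would typically be computed once for the answer-start logits and once for the answer-end logits, then averaged.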
+ ## Details of the downstream task (Q&A) - Dataset
+
+ <details>
+
+ [SQuAD-es-v2.0](https://github.com/ccasimiro88/TranslateAlignRetrieve)
+
+ | Dataset                 | # Q&A |
+ | ----------------------- | ----- |
+ | SQuAD2.0 Train          | 130 K |
+ | SQuAD2.0-es-v2.0        | 111 K |
+ | SQuAD2.0 Dev            | 12 K  |
+ | SQuAD-es-v2.0-small Dev | 69 K  |
+
+ </details>
+
+ ## Model training
+
+ The model was trained on a Tesla P100 GPU with 25 GB of RAM, using the following command:
+
+ ```bash
+ export SQUAD_DIR=/path/to/squad-v2_spanish \
+ && python transformers/examples/distillation/run_squad_w_distillation.py \
+   --model_type bert \
+   --model_name_or_path dccuchile/bert-base-spanish-wwm-cased \
+   --teacher_type bert \
+   --teacher_name_or_path bert-base-multilingual-cased \
+   --do_train \
+   --do_eval \
+   --do_lower_case \
+   --train_file $SQUAD_DIR/train-v2.json \
+   --predict_file $SQUAD_DIR/dev-v2.json \
+   --per_gpu_train_batch_size 12 \
+   --learning_rate 3e-5 \
+   --num_train_epochs 5.0 \
+   --max_seq_length 384 \
+   --doc_stride 128 \
+   --output_dir /content/model_output \
+   --save_steps 5000 \
+   --threads 4 \
+   --version_2_with_negative
+ ```
+
+ ## Results:
+
+ | Metric    | Value       |
+ | --------- | ----------- |
+ | **Exact** | **90.77**48 |
+ | **F1**    | **94.94**71 |
+
+ ```json
+ {
+   "exact": 90.77483309730933,
+   "f1": 94.94714391266254,
+   "total": 69202,
+   "HasAns_exact": 86.60850599781898,
+   "HasAns_f1": 92.90582885592328,
+   "HasAns_total": 45850,
+   "NoAns_exact": 98.95512161699212,
+   "NoAns_f1": 98.95512161699212,
+   "NoAns_total": 23352,
+   "best_exact": 90.77483309730933,
+   "best_exact_thresh": 0.0,
+   "best_f1": 94.94714391266305,
+   "best_f1_thresh": 0.0
+ }
+ ```
+
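The overall `exact` and `f1` values can be read as averages of the `HasAns` and `NoAns` scores, weighted by how many questions fall into each group. The short sketch below recomputes them from the numbers reported above; the helper name `weighted_average` is made up for this illustration and is not part of any evaluation script.

```python
def weighted_average(has_ans_score, has_ans_total, no_ans_score, no_ans_total):
    """Combine per-group scores into one score, weighted by group size."""
    total = has_ans_total + no_ans_total
    return (has_ans_score * has_ans_total + no_ans_score * no_ans_total) / total


# Group sizes taken from the results above (45850 + 23352 = 69202 questions).
has_ans_total, no_ans_total = 45850, 23352

overall_exact = weighted_average(
    86.60850599781898, has_ans_total, 98.95512161699212, no_ans_total
)
overall_f1 = weighted_average(
    92.90582885592328, has_ans_total, 98.95512161699212, no_ans_total
)

# These should agree with the reported "exact" and "f1" up to floating-point rounding.
print(overall_exact, overall_f1)
```

For unanswerable questions the exact-match and F1 scores coincide, which is why `NoAns_exact` and `NoAns_f1` are identical.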
+ ## Comparison:
+
+ |                              Model                              | F1 score  |
+ | :-------------------------------------------------------------: | :-------: |
+ |       bert-base-spanish-wwm-cased-finetuned-spa-squad2-es       |   86.07   |
+ | **distill**-bert-base-spanish-wwm-cased-finetuned-spa-squad2-es | **94.94** |
+
+ So, yes, this version scores even higher on F1.
+
+ ### Model in action
+
+ Fast usage with **pipelines**:
+
+ ```python
+ from transformers import pipeline
+
+ # Important: at the time of writing, the QA pipeline is not compatible with fast tokenizers
+ # (support is in progress), so pass {"use_fast": False} to the tokenizer as in the following example:
+
+ nlp = pipeline(
+     'question-answering',
+     model='mrm8488/distill-bert-base-spanish-wwm-cased-finetuned-spa-squad2-es',
+     tokenizer=(
+         'mrm8488/distill-bert-base-spanish-wwm-cased-finetuned-spa-squad2-es',
+         {"use_fast": False}
+     )
+ )
+
+ nlp(
+     {
+         'question': '¿Para qué lenguaje está trabajando?',
+         'context': 'Manuel Romero está colaborando activamente con huggingface/transformers ' +
+                    'para traer el poder de las últimas técnicas de procesamiento de lenguaje natural al idioma español'
+     }
+ )
+ # Output: {'answer': 'español', 'end': 169, 'score': 0.67530957344621, 'start': 163}
+ ```
+
+ Play with this model and `pipelines` in a Colab:
+
+ <a href="https://colab.research.google.com/github/mrm8488/shared_colab_notebooks/blob/master/Using_Spanish_BERT_fine_tuned_for_Q%26A_pipelines.ipynb" target="_parent"><img src="https://camo.githubusercontent.com/52feade06f2fecbf006889a904d221e6a730c194/68747470733a2f2f636f6c61622e72657365617263682e676f6f676c652e636f6d2f6173736574732f636f6c61622d62616467652e737667" alt="Open In Colab" data-canonical-src="https://colab.research.google.com/assets/colab-badge.svg"></a>
+
+ <details>
+
+ 1. Set the context and ask some questions:
+
+ ![Set context and questions](https://media.giphy.com/media/mCIaBpfN0LQcuzkA2F/giphy.gif)
+
+ 2. Run predictions:
+
+ ![Run the model](https://media.giphy.com/media/WT453aptcbCP7hxWTZ/giphy.gif)
+ </details>
+
+ Want to know more about `Huggingface pipelines`? Check out this Colab:
+
+ <a href="https://colab.research.google.com/github/mrm8488/shared_colab_notebooks/blob/master/Huggingface_pipelines_demo.ipynb" target="_parent"><img src="https://camo.githubusercontent.com/52feade06f2fecbf006889a904d221e6a730c194/68747470733a2f2f636f6c61622e72657365617263682e676f6f676c652e636f6d2f6173736574732f636f6c61622d62616467652e737667" alt="Open In Colab" data-canonical-src="https://colab.research.google.com/assets/colab-badge.svg"></a>
+
+ > Created by [Manuel Romero/@mrm8488](https://twitter.com/mrm8488)
+
+ > Made with <span style="color: #e25555;">&hearts;</span> in Spain