weakit-v committed on
Commit
c221986
1 Parent(s): 65cfc3c

Initial Dump

Files changed (7)
  1. README.md +252 -0
  2. config.json +29 -0
  3. merges.txt +0 -0
  4. special_tokens_map.json +15 -0
  5. tokenizer.json +0 -0
  6. tokenizer_config.json +57 -0
  7. vocab.json +0 -0
README.md CHANGED
@@ -1,3 +1,255 @@
---
language: en
license: cc-by-4.0
datasets:
- squad_v2
model-index:
- name: deepset/tinyroberta-squad2
  results:
  - task:
      type: question-answering
      name: Question Answering
    dataset:
      name: squad_v2
      type: squad_v2
      config: squad_v2
      split: validation
    metrics:
    - type: exact_match
      value: 78.8627
      name: Exact Match
      verified: true
      verifyToken: eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiNDNlZDU4ODAxMzY5NGFiMTMyZmQ1M2ZhZjMyODA1NmFlOGMxNzYxNTA4OGE5YTBkZWViZjBkNGQ2ZmMxZjVlMCIsInZlcnNpb24iOjF9.Wgu599r6TvgMLTrHlLMVAbUtKD_3b70iJ5QSeDQ-bRfUsVk6Sz9OsJCp47riHJVlmSYzcDj_z_3jTcUjCFFXBg
    - type: f1
      value: 82.0355
      name: F1
      verified: true
      verifyToken: eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiOTFkMzEzMWNiZDRhMGZlODhkYzcwZTZiMDFjZDg2YjllZmUzYWM5NTgwNGQ2NGYyMDk2ZGQwN2JmMTE5NTc3YiIsInZlcnNpb24iOjF9.ChgaYpuRHd5WeDFjtiAHUyczxtoOD_M5WR8834jtbf7wXhdGOnZKdZ1KclmhoI5NuAGc1NptX-G0zQ5FTHEcBA
  - task:
      type: question-answering
      name: Question Answering
    dataset:
      name: squad
      type: squad
      config: plain_text
      split: validation
    metrics:
    - type: exact_match
      value: 83.860
      name: Exact Match
    - type: f1
      value: 90.752
      name: F1
  - task:
      type: question-answering
      name: Question Answering
    dataset:
      name: adversarial_qa
      type: adversarial_qa
      config: adversarialQA
      split: validation
    metrics:
    - type: exact_match
      value: 25.967
      name: Exact Match
    - type: f1
      value: 37.006
      name: F1
  - task:
      type: question-answering
      name: Question Answering
    dataset:
      name: squad_adversarial
      type: squad_adversarial
      config: AddOneSent
      split: validation
    metrics:
    - type: exact_match
      value: 76.329
      name: Exact Match
    - type: f1
      value: 83.292
      name: F1
  - task:
      type: question-answering
      name: Question Answering
    dataset:
      name: squadshifts amazon
      type: squadshifts
      config: amazon
      split: test
    metrics:
    - type: exact_match
      value: 63.915
      name: Exact Match
    - type: f1
      value: 78.395
      name: F1
  - task:
      type: question-answering
      name: Question Answering
    dataset:
      name: squadshifts new_wiki
      type: squadshifts
      config: new_wiki
      split: test
    metrics:
    - type: exact_match
      value: 80.297
      name: Exact Match
    - type: f1
      value: 89.808
      name: F1
  - task:
      type: question-answering
      name: Question Answering
    dataset:
      name: squadshifts nyt
      type: squadshifts
      config: nyt
      split: test
    metrics:
    - type: exact_match
      value: 80.149
      name: Exact Match
    - type: f1
      value: 88.321
      name: F1
  - task:
      type: question-answering
      name: Question Answering
    dataset:
      name: squadshifts reddit
      type: squadshifts
      config: reddit
      split: test
    metrics:
    - type: exact_match
      value: 66.959
      name: Exact Match
    - type: f1
      value: 79.300
      name: F1
---
**This repo contains the model exported to ONNX weights.**
**Everything is provided as-is.**
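Since this repo ships ONNX weights rather than PyTorch checkpoints, one way to run them is through Hugging Face Optimum's ONNX Runtime backend. This is a minimal sketch, assuming the exported `model.onnx` sits (or will sit) next to the config and tokenizer files from this commit; the local directory name is illustrative, not part of this repo:

```python
from optimum.onnxruntime import ORTModelForQuestionAnswering
from transformers import AutoTokenizer, pipeline

# Local clone of this repo; adjust the path to wherever the files live.
model_dir = "./tinyroberta-squad2-onnx"

model = ORTModelForQuestionAnswering.from_pretrained(model_dir)  # loads model.onnx
tokenizer = AutoTokenizer.from_pretrained(model_dir)

qa = pipeline("question-answering", model=model, tokenizer=tokenizer)
print(qa(question="Why is model conversion important?",
         context="It gives freedom to the user and lets people easily switch between frameworks."))
```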
---

# tinyroberta-squad2

This is the *distilled* version of the [deepset/roberta-base-squad2](https://huggingface.co/deepset/roberta-base-squad2) model. It has comparable prediction quality and runs at twice the speed of the base model.
## Overview
**Language model:** tinyroberta-squad2
**Language:** English
**Downstream-task:** Extractive QA
**Training data:** SQuAD 2.0
**Eval data:** SQuAD 2.0
**Code:** See [an example QA pipeline on Haystack](https://haystack.deepset.ai/tutorials/first-qa-system)
**Infrastructure:** 4x Tesla V100
## Hyperparameters

```
batch_size = 96
n_epochs = 4
base_LM_model = "deepset/tinyroberta-squad2-step1"
max_seq_len = 384
learning_rate = 3e-5
lr_schedule = LinearWarmup
warmup_proportion = 0.2
doc_stride = 128
max_query_length = 64
distillation_loss_weight = 0.75
temperature = 1.5
teacher = "deepset/roberta-large-squad2"
```
## Distillation
This model was distilled using the TinyBERT approach described in [this paper](https://arxiv.org/pdf/1909.10351.pdf) and implemented in [haystack](https://github.com/deepset-ai/haystack).
First, we performed intermediate layer distillation with roberta-base as the teacher, which resulted in [deepset/tinyroberta-6l-768d](https://huggingface.co/deepset/tinyroberta-6l-768d).
Second, we performed task-specific distillation: further intermediate layer distillation on an augmented version of SQuADv2 with [deepset/roberta-base-squad2](https://huggingface.co/deepset/roberta-base-squad2) as the teacher, and then prediction layer distillation with [deepset/roberta-large-squad2](https://huggingface.co/deepset/roberta-large-squad2) as the teacher (a sketch of this loss is shown below).
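
For illustration, here is a minimal sketch of what the prediction-layer distillation step amounts to, using the `distillation_loss_weight` and `temperature` hyperparameters listed above. This is not the Haystack implementation; the KL-based soft loss and all names are assumptions made for the sketch.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, gold_positions,
                      distillation_loss_weight=0.75, temperature=1.5):
    """Blend the hard span loss with a temperature-softened teacher loss.
    Applied separately to start and end logits; all names are illustrative."""
    # Hard loss: cross-entropy against the gold start (or end) token positions.
    hard_loss = F.cross_entropy(student_logits, gold_positions)
    # Soft loss: KL divergence between softened student and teacher distributions.
    soft_loss = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * temperature ** 2
    return distillation_loss_weight * soft_loss + (1 - distillation_loss_weight) * hard_loss

# Toy usage with random logits over a 384-token window:
student = torch.randn(8, 384)
teacher = torch.randn(8, 384)
gold = torch.randint(0, 384, (8,))
print(distillation_loss(student, teacher, gold))
```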
## Usage

### In Haystack
Haystack is an NLP framework by deepset. You can use this model in a Haystack pipeline to do question answering at scale, over many documents; see the pipeline sketch after the snippet below. To load the model in [Haystack](https://github.com/deepset-ai/haystack/):

```python
from haystack.nodes import FARMReader, TransformersReader  # Haystack v1.x imports

reader = FARMReader(model_name_or_path="deepset/tinyroberta-squad2")
# or
reader = TransformersReader(model_name_or_path="deepset/tinyroberta-squad2")
```
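
And a minimal end-to-end sketch of the "at scale" case, wiring the reader into a retriever-reader pipeline (Haystack v1.x API; the document store contents and parameters are illustrative):

```python
from haystack.document_stores import InMemoryDocumentStore
from haystack.nodes import TfidfRetriever, FARMReader
from haystack.pipelines import ExtractiveQAPipeline

# Index a few documents to search over (normally: your own corpus).
document_store = InMemoryDocumentStore()
document_store.write_documents([
    {"content": "Model conversion lets users switch between frameworks such as FARM and Transformers."},
])

retriever = TfidfRetriever(document_store=document_store)
reader = FARMReader(model_name_or_path="deepset/tinyroberta-squad2")
pipe = ExtractiveQAPipeline(reader=reader, retriever=retriever)

prediction = pipe.run(
    query="What does model conversion enable?",
    params={"Retriever": {"top_k": 5}, "Reader": {"top_k": 3}},
)
print(prediction["answers"][0].answer)
```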
### In Transformers
```python
from transformers import AutoModelForQuestionAnswering, AutoTokenizer, pipeline

model_name = "deepset/tinyroberta-squad2"

# a) Get predictions
nlp = pipeline('question-answering', model=model_name, tokenizer=model_name)
QA_input = {
    'question': 'Why is model conversion important?',
    'context': 'The option to convert models between FARM and transformers gives freedom to the user and lets people easily switch between frameworks.'
}
res = nlp(QA_input)

# b) Load model & tokenizer
model = AutoModelForQuestionAnswering.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)
```
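
If you go with option b) and load the model and tokenizer directly, a bare-bones prediction could look like the following sketch (the pipeline above does this for you, plus answer scoring and no-answer handling):

```python
import torch

question = "Why is model conversion important?"
context = "The option to convert models between FARM and transformers gives freedom to the user."

# Encode the question/context pair the same way the pipeline does.
inputs = tokenizer(question, context, return_tensors="pt", truncation=True, max_length=384)
with torch.no_grad():
    outputs = model(**inputs)

# Greedy span selection: most likely start and end token positions.
start = int(outputs.start_logits.argmax())
end = int(outputs.end_logits.argmax())
answer = tokenizer.decode(inputs["input_ids"][0, start:end + 1], skip_special_tokens=True)
print(answer)
```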
## Performance
Evaluated on the SQuAD 2.0 dev set with the [official eval script](https://worksheets.codalab.org/rest/bundles/0x6b567e1cf2e041ec80d7098f031c5c9e/contents/blob/).

```
"exact": 78.69114798281817,
"f1": 81.9198998536977,

"total": 11873,
"HasAns_exact": 76.19770580296895,
"HasAns_f1": 82.66446878592329,
"HasAns_total": 5928,
"NoAns_exact": 81.17746005046257,
"NoAns_f1": 81.17746005046257,
"NoAns_total": 5945
```
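
As a quick consistency check, the overall exact-match score is the size-weighted average of the answerable and unanswerable subsets: (5928 × 76.198 + 5945 × 81.177) / 11873 ≈ 78.69, which matches the reported "exact" value above.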
## Authors
**Branden Chan:** branden.chan@deepset.ai
**Timo Möller:** timo.moeller@deepset.ai
**Malte Pietsch:** malte.pietsch@deepset.ai
**Tanay Soni:** tanay.soni@deepset.ai
**Michel Bartels:** michel.bartels@deepset.ai
## About us

<div class="grid lg:grid-cols-2 gap-x-4 gap-y-3">
<div class="w-full h-40 object-cover mb-2 rounded-lg flex items-center justify-center">
<img alt="" src="https://raw.githubusercontent.com/deepset-ai/.github/main/deepset-logo-colored.png" class="w-40"/>
</div>
<div class="w-full h-40 object-cover mb-2 rounded-lg flex items-center justify-center">
<img alt="" src="https://raw.githubusercontent.com/deepset-ai/.github/main/haystack-logo-colored.png" class="w-40"/>
</div>
</div>

[deepset](http://deepset.ai/) is the company behind the open-source NLP framework [Haystack](https://haystack.deepset.ai/), which is designed to help you build production-ready NLP systems for question answering, summarization, ranking, and more.

Some of our other work:
- [roberta-base-squad2](https://huggingface.co/deepset/roberta-base-squad2)
- [German BERT (aka "bert-base-german-cased")](https://deepset.ai/german-bert)
- [GermanQuAD and GermanDPR datasets and models (aka "gelectra-base-germanquad", "gbert-base-germandpr")](https://deepset.ai/germanquad)

## Get in touch and join the Haystack community

<p>For more info on Haystack, visit our <strong><a href="https://github.com/deepset-ai/haystack">GitHub</a></strong> repo and <strong><a href="https://docs.haystack.deepset.ai">Documentation</a></strong>.

We also have a <strong><a class="h-7" href="https://haystack.deepset.ai/community/join">Discord community open to everyone!</a></strong></p>

[Twitter](https://twitter.com/deepset_ai) | [LinkedIn](https://www.linkedin.com/company/deepset-ai/) | [Discord](https://haystack.deepset.ai/community) | [GitHub Discussions](https://github.com/deepset-ai/haystack/discussions) | [Website](https://deepset.ai)

By the way: [we're hiring!](http://www.deepset.ai/jobs)
config.json ADDED
@@ -0,0 +1,29 @@
{
  "_name_or_path": "deepset/tinyroberta-squad2",
  "architectures": [
    "RobertaForQuestionAnswering"
  ],
  "attention_probs_dropout_prob": 0.1,
  "bos_token_id": 0,
  "classifier_dropout": null,
  "eos_token_id": 2,
  "gradient_checkpointing": false,
  "hidden_act": "gelu",
  "hidden_dropout_prob": 0.1,
  "hidden_size": 768,
  "initializer_range": 0.02,
  "intermediate_size": 3072,
  "language": "english",
  "layer_norm_eps": 1e-05,
  "max_position_embeddings": 514,
  "model_type": "roberta",
  "name": "Roberta",
  "num_attention_heads": 12,
  "num_hidden_layers": 6,
  "pad_token_id": 1,
  "position_embedding_type": "absolute",
  "transformers_version": "4.35.2",
  "type_vocab_size": 1,
  "use_cache": true,
  "vocab_size": 50265
}
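
For reference, a small sketch of inspecting this config with Transformers to confirm the distilled architecture (the local directory name is illustrative; it just needs to contain the files from this commit):

```python
from transformers import AutoConfig

# Assumes this repo has been downloaded/cloned to the directory below.
config = AutoConfig.from_pretrained("./tinyroberta-squad2-onnx")
print(config.model_type, config.num_hidden_layers, config.hidden_size)
# -> roberta 6 768 (half the layers of roberta-base, same hidden size)
```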
merges.txt ADDED
The diff for this file is too large to render. See raw diff
 
special_tokens_map.json ADDED
@@ -0,0 +1,15 @@
{
  "bos_token": "<s>",
  "cls_token": "<s>",
  "eos_token": "</s>",
  "mask_token": {
    "content": "<mask>",
    "lstrip": true,
    "normalized": false,
    "rstrip": false,
    "single_word": false
  },
  "pad_token": "<pad>",
  "sep_token": "</s>",
  "unk_token": "<unk>"
}
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer_config.json ADDED
@@ -0,0 +1,57 @@
{
  "add_prefix_space": false,
  "added_tokens_decoder": {
    "0": {
      "content": "<s>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "1": {
      "content": "<pad>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "2": {
      "content": "</s>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "3": {
      "content": "<unk>",
      "lstrip": false,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    },
    "50264": {
      "content": "<mask>",
      "lstrip": true,
      "normalized": false,
      "rstrip": false,
      "single_word": false,
      "special": true
    }
  },
  "bos_token": "<s>",
  "clean_up_tokenization_spaces": true,
  "cls_token": "<s>",
  "eos_token": "</s>",
  "errors": "replace",
  "mask_token": "<mask>",
  "model_max_length": 512,
  "pad_token": "<pad>",
  "sep_token": "</s>",
  "tokenizer_class": "RobertaTokenizer",
  "trim_offsets": true,
  "unk_token": "<unk>"
}
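
A similar sketch for the tokenizer files above, showing how a question/context pair is wrapped in the special tokens defined here (again, the local path is illustrative):

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("./tinyroberta-squad2-onnx")
ids = tokenizer("Why?", "Because.")["input_ids"]
print(tokenizer.convert_ids_to_tokens(ids))
# Expect RoBERTa-style pair encoding, roughly:
# ['<s>', 'Why', '?', '</s>', '</s>', 'Because', '.', '</s>']
```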
vocab.json ADDED
The diff for this file is too large to render. See raw diff