RichardErkhov committed on
Commit
8a9403e
1 Parent(s): b07aee9

uploaded readme

  1. README.md +216 -0
README.md ADDED
@@ -0,0 +1,216 @@
+ Quantization made by Richard Erkhov.
+
+ [Github](https://github.com/RichardErkhov)
+
+ [Discord](https://discord.gg/pvy7H8DZMG)
+
+ [Request more models](https://github.com/RichardErkhov/quant_request)
+
+
+ roberta-base-squad2-distilled - bnb 4bits
+ - Model creator: https://huggingface.co/deepset/
+ - Original model: https://huggingface.co/deepset/roberta-base-squad2-distilled/
+
+
+
+
+ Original model description:
+ ---
+ language: en
+ license: mit
+ tags:
+ - exbert
+ datasets:
+ - squad_v2
+ thumbnail: https://thumb.tildacdn.com/tild3433-3637-4830-a533-353833613061/-/resize/720x/-/format/webp/germanquad.jpg
+ model-index:
+ - name: deepset/roberta-base-squad2-distilled
+   results:
+   - task:
+       type: question-answering
+       name: Question Answering
+     dataset:
+       name: squad_v2
+       type: squad_v2
+       config: squad_v2
+       split: validation
+     metrics:
+     - type: exact_match
+       value: 80.8593
+       name: Exact Match
+       verified: true
+       verifyToken: eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiMzVjNzkxNmNiNDkzNzdiYjJjZGM3ZTViMGJhOGM2ZjFmYjg1MjYxMDM2YzM5NWMwNDIyYzNlN2QwNGYyNDMzZSIsInZlcnNpb24iOjF9.Rgww8tf8D7nF2dh2U_DMrFzmp87k8s7RFibrDXSvQyA66PGWXwjlsd1552lzjHnNV5hvHUM1-h3PTuY_5p64BA
+     - type: f1
+       value: 84.0104
+       name: F1
+       verified: true
+       verifyToken: eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiNTAyZDViNWYzNjA4OWQ5MzgyYmQ2ZDlhNWRhMTIzYTYxYzViMmI4NWE4ZGU5MzVhZTAwNTRlZmRlNWUwMjI0ZSIsInZlcnNpb24iOjF9.Er21BNgJ3jJXLuZtpubTYq9wCwO1i_VLQFwS5ET0e4eAYVVj0aOA40I5FvP5pZac3LjkCnVacxzsFWGCYVmnDA
+   - task:
+       type: question-answering
+       name: Question Answering
+     dataset:
+       name: squad
+       type: squad
+       config: plain_text
+       split: validation
+     metrics:
+     - type: exact_match
+       value: 86.225
+       name: Exact Match
+     - type: f1
+       value: 92.483
+       name: F1
+   - task:
+       type: question-answering
+       name: Question Answering
+     dataset:
+       name: adversarial_qa
+       type: adversarial_qa
+       config: adversarialQA
+       split: validation
+     metrics:
+     - type: exact_match
+       value: 29.900
+       name: Exact Match
+     - type: f1
+       value: 41.183
+       name: F1
+   - task:
+       type: question-answering
+       name: Question Answering
+     dataset:
+       name: squad_adversarial
+       type: squad_adversarial
+       config: AddOneSent
+       split: validation
+     metrics:
+     - type: exact_match
+       value: 79.071
+       name: Exact Match
+     - type: f1
+       value: 84.472
+       name: F1
+   - task:
+       type: question-answering
+       name: Question Answering
+     dataset:
+       name: squadshifts amazon
+       type: squadshifts
+       config: amazon
+       split: test
+     metrics:
+     - type: exact_match
+       value: 70.733
+       name: Exact Match
+     - type: f1
+       value: 83.958
+       name: F1
+   - task:
+       type: question-answering
+       name: Question Answering
+     dataset:
+       name: squadshifts new_wiki
+       type: squadshifts
+       config: new_wiki
+       split: test
+     metrics:
+     - type: exact_match
+       value: 82.011
+       name: Exact Match
+     - type: f1
+       value: 91.092
+       name: F1
+   - task:
+       type: question-answering
+       name: Question Answering
+     dataset:
+       name: squadshifts nyt
+       type: squadshifts
+       config: nyt
+       split: test
+     metrics:
+     - type: exact_match
+       value: 84.203
+       name: Exact Match
+     - type: f1
+       value: 91.521
+       name: F1
+   - task:
+       type: question-answering
+       name: Question Answering
+     dataset:
+       name: squadshifts reddit
+       type: squadshifts
+       config: reddit
+       split: test
+     metrics:
+     - type: exact_match
+       value: 72.029
+       name: Exact Match
+     - type: f1
+       value: 83.454
+       name: F1
+ ---
+
+ ## Overview
+ **Language model:** deepset/roberta-base-squad2-distilled
+ **Language:** English
+ **Training data:** SQuAD 2.0 training set
+ **Eval data:** SQuAD 2.0 dev set
+ **Infrastructure**: 4x V100 GPU
+ **Published**: Dec 8th, 2021
+
+ ## Details
+ - Haystack's distillation feature was used for training. deepset/roberta-large-squad2 was used as the teacher model.
+
+ ## Hyperparameters
+ ```
+ batch_size = 80
+ n_epochs = 4
+ max_seq_len = 384
+ learning_rate = 3e-5
+ lr_schedule = LinearWarmup
+ embeds_dropout_prob = 0.1
+ temperature = 1.5
+ distillation_loss_weight = 0.75
+ ```
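
To illustrate how `temperature` and `distillation_loss_weight` typically interact, here is a minimal sketch of a common knowledge-distillation loss for a single start (or end) position. It is an assumption about the general technique, not a copy of Haystack's implementation: the function names, the KL-divergence soft loss, and the `temperature ** 2` scaling are illustrative choices, and only the two default values come from the hyperparameters above.

```python
import math


def softmax(logits, temperature=1.0):
    """Convert logits to probabilities, softened by the given temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def cross_entropy(logits, target_index):
    """Standard cross-entropy of the logits against the gold index."""
    return -math.log(softmax(logits)[target_index])


def distillation_loss(student_logits, teacher_logits, target_index,
                      temperature=1.5, distillation_loss_weight=0.75):
    """Weighted sum of a soft (teacher) loss and a hard (gold label) loss.

    Assumed formulation: the soft loss is scaled by temperature ** 2 so its
    gradient magnitude stays comparable across temperatures.
    """
    soft_teacher = softmax(teacher_logits, temperature)
    soft_student = softmax(student_logits, temperature)
    soft_loss = kl_divergence(soft_teacher, soft_student) * temperature ** 2
    hard_loss = cross_entropy(student_logits, target_index)
    return (distillation_loss_weight * soft_loss
            + (1 - distillation_loss_weight) * hard_loss)
```

With a weight of 0.75, three quarters of the training signal in this sketch comes from matching the teacher's softened distribution and one quarter from the gold labels.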
+ ## Performance
+ ```
+ "exact": 79.8366040596311
+ "f1": 83.916407079888
+ ```
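
For readers unfamiliar with the two metrics, the sketch below shows how exact match and token-level F1 are conventionally computed for one prediction, following the normalization used by the SQuAD evaluation convention (lowercasing, stripping punctuation and English articles). The function names are illustrative, and the figures above were produced by the authors' own evaluation run, not by this code.

```python
import re
import string
from collections import Counter


def normalize_answer(text):
    """Lowercase, drop punctuation and articles, and collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction, truth):
    """1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize_answer(prediction) == normalize_answer(truth))


def f1_score(prediction, truth):
    """Harmonic mean of token precision and recall after normalization."""
    pred_tokens = normalize_answer(prediction).split()
    truth_tokens = normalize_answer(truth).split()
    # An empty gold answer marks an unanswerable SQuAD 2.0 question:
    # the score is 1.0 only if the prediction is empty as well.
    if not pred_tokens or not truth_tokens:
        return float(pred_tokens == truth_tokens)
    overlap = sum((Counter(pred_tokens) & Counter(truth_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(truth_tokens)
    return 2 * precision * recall / (precision + recall)
```

The dataset-level scores are the averages of these per-question values, taking the best score over all gold answers for each question and expressing the result as a percentage.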
+
+ ## Authors
+ **Timo Möller:** timo.moeller@deepset.ai
+ **Julian Risch:** julian.risch@deepset.ai
+ **Malte Pietsch:** malte.pietsch@deepset.ai
+ **Michel Bartels:** michel.bartels@deepset.ai
+
+ ## About us
+ <div class="grid lg:grid-cols-2 gap-x-4 gap-y-3">
+ <div class="w-full h-40 object-cover mb-2 rounded-lg flex items-center justify-center">
+ <img alt="" src="https://raw.githubusercontent.com/deepset-ai/.github/main/deepset-logo-colored.png" class="w-40"/>
+ </div>
+ <div class="w-full h-40 object-cover mb-2 rounded-lg flex items-center justify-center">
+ <img alt="" src="https://raw.githubusercontent.com/deepset-ai/.github/main/haystack-logo-colored.png" class="w-40"/>
+ </div>
+ </div>
+
+ [deepset](http://deepset.ai/) is the company behind the open-source NLP framework [Haystack](https://haystack.deepset.ai/), which is designed to help you build production-ready NLP systems for question answering, summarization, ranking, and more.
+
+
+ Some of our other work:
+ - [Distilled roberta-base-squad2 (aka "tinyroberta-squad2")](https://huggingface.co/deepset/tinyroberta-squad2)
+ - [German BERT (aka "bert-base-german-cased")](https://deepset.ai/german-bert)
+ - [GermanQuAD and GermanDPR datasets and models (aka "gelectra-base-germanquad", "gbert-base-germandpr")](https://deepset.ai/germanquad)
+
+ ## Get in touch and join the Haystack community
+
+ <p>For more info on Haystack, visit our <strong><a href="https://github.com/deepset-ai/haystack">GitHub</a></strong> repo and <strong><a href="https://docs.haystack.deepset.ai">Documentation</a></strong>.
+
+ We also have a <strong><a class="h-7" href="https://haystack.deepset.ai/community">Discord community open to everyone!</a></strong></p>
+
+ [Twitter](https://twitter.com/deepset_ai) | [LinkedIn](https://www.linkedin.com/company/deepset-ai/) | [Discord](https://haystack.deepset.ai/community) | [GitHub Discussions](https://github.com/deepset-ai/haystack/discussions) | [Website](https://deepset.ai)
+
+ By the way: [we're hiring!](http://www.deepset.ai/jobs)
+