Quantization made by Richard Erkhov.

[Github](https://github.com/RichardErkhov)

[Discord](https://discord.gg/pvy7H8DZMG)

[Request more models](https://github.com/RichardErkhov/quant_request)


roberta-base-squad2-distilled - bnb 4bits
- Model creator: https://huggingface.co/deepset/
- Original model: https://huggingface.co/deepset/roberta-base-squad2-distilled/


Original model description:
---
language: en
license: mit
tags:
- exbert
datasets:
- squad_v2
thumbnail: https://thumb.tildacdn.com/tild3433-3637-4830-a533-353833613061/-/resize/720x/-/format/webp/germanquad.jpg
model-index:
- name: deepset/roberta-base-squad2-distilled
  results:
  - task:
      type: question-answering
      name: Question Answering
    dataset:
      name: squad_v2
      type: squad_v2
      config: squad_v2
      split: validation
    metrics:
    - type: exact_match
      value: 80.8593
      name: Exact Match
      verified: true
      verifyToken: eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiMzVjNzkxNmNiNDkzNzdiYjJjZGM3ZTViMGJhOGM2ZjFmYjg1MjYxMDM2YzM5NWMwNDIyYzNlN2QwNGYyNDMzZSIsInZlcnNpb24iOjF9.Rgww8tf8D7nF2dh2U_DMrFzmp87k8s7RFibrDXSvQyA66PGWXwjlsd1552lzjHnNV5hvHUM1-h3PTuY_5p64BA
    - type: f1
      value: 84.0104
      name: F1
      verified: true
      verifyToken: eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiNTAyZDViNWYzNjA4OWQ5MzgyYmQ2ZDlhNWRhMTIzYTYxYzViMmI4NWE4ZGU5MzVhZTAwNTRlZmRlNWUwMjI0ZSIsInZlcnNpb24iOjF9.Er21BNgJ3jJXLuZtpubTYq9wCwO1i_VLQFwS5ET0e4eAYVVj0aOA40I5FvP5pZac3LjkCnVacxzsFWGCYVmnDA
  - task:
      type: question-answering
      name: Question Answering
    dataset:
      name: squad
      type: squad
      config: plain_text
      split: validation
    metrics:
    - type: exact_match
      value: 86.225
      name: Exact Match
    - type: f1
      value: 92.483
      name: F1
  - task:
      type: question-answering
      name: Question Answering
    dataset:
      name: adversarial_qa
      type: adversarial_qa
      config: adversarialQA
      split: validation
    metrics:
    - type: exact_match
      value: 29.900
      name: Exact Match
    - type: f1
      value: 41.183
      name: F1
  - task:
      type: question-answering
      name: Question Answering
    dataset:
      name: squad_adversarial
      type: squad_adversarial
      config: AddOneSent
      split: validation
    metrics:
    - type: exact_match
      value: 79.071
      name: Exact Match
    - type: f1
      value: 84.472
      name: F1
  - task:
      type: question-answering
      name: Question Answering
    dataset:
      name: squadshifts amazon
      type: squadshifts
      config: amazon
      split: test
    metrics:
    - type: exact_match
      value: 70.733
      name: Exact Match
    - type: f1
      value: 83.958
      name: F1
  - task:
      type: question-answering
      name: Question Answering
    dataset:
      name: squadshifts new_wiki
      type: squadshifts
      config: new_wiki
      split: test
    metrics:
    - type: exact_match
      value: 82.011
      name: Exact Match
    - type: f1
      value: 91.092
      name: F1
  - task:
      type: question-answering
      name: Question Answering
    dataset:
      name: squadshifts nyt
      type: squadshifts
      config: nyt
      split: test
    metrics:
    - type: exact_match
      value: 84.203
      name: Exact Match
    - type: f1
      value: 91.521
      name: F1
  - task:
      type: question-answering
      name: Question Answering
    dataset:
      name: squadshifts reddit
      type: squadshifts
      config: reddit
      split: test
    metrics:
    - type: exact_match
      value: 72.029
      name: Exact Match
    - type: f1
      value: 83.454
      name: F1
---

## Overview
**Language model:** deepset/roberta-base-squad2-distilled
**Language:** English
**Training data:** SQuAD 2.0 training set
**Eval data:** SQuAD 2.0 dev set
**Infrastructure:** 4x V100 GPU
**Published:** Dec 8th, 2021

## Details
- Haystack's distillation feature was used for training. deepset/roberta-large-squad2 was used as the teacher model.
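A minimal usage sketch (not part of the original card) for running extractive QA with the Hugging Face `transformers` pipeline; the model id below is the original deepset checkpoint, so substitute the quantized weights from this repository as appropriate:

```python
from transformers import pipeline

# Extractive question-answering pipeline; the checkpoint is
# downloaded from the Hugging Face Hub on first use.
qa = pipeline(
    "question-answering",
    model="deepset/roberta-base-squad2-distilled",
)

context = (
    "deepset/roberta-base-squad2-distilled was trained on SQuAD 2.0 with "
    "Haystack's model distillation, using deepset/roberta-large-squad2 "
    "as the teacher model."
)
result = qa(question="Which model was the teacher?", context=context)
print(result["answer"], result["score"])
```

The pipeline returns a dict with the answer span, its character offsets in the context, and a confidence score; for SQuAD 2.0-style models a low score can indicate the question is unanswerable from the context.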

## Hyperparameters
```
batch_size = 80
n_epochs = 4
max_seq_len = 384
learning_rate = 3e-5
lr_schedule = LinearWarmup
embeds_dropout_prob = 0.1
temperature = 1.5
distillation_loss_weight = 0.75
```
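The `temperature` and `distillation_loss_weight` values above control how the student's training loss mixes hard-label cross-entropy with the teacher's softened predictions. A rough NumPy sketch of that combined loss (illustrative only, not Haystack's actual implementation):

```python
import numpy as np

def softmax(logits, temperature=1.0):
    # Temperature-scaled softmax; higher T flattens the distribution.
    z = logits / temperature
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def distillation_loss(student_logits, teacher_logits, label_idx,
                      temperature=1.5, distillation_loss_weight=0.75):
    # Soft targets: KL divergence between the temperature-scaled teacher
    # and student distributions, scaled by T^2 so gradient magnitudes
    # stay comparable to the hard-label term.
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    kl = np.sum(p_teacher * (np.log(p_teacher) - np.log(p_student)))
    soft_loss = temperature ** 2 * kl

    # Hard targets: standard cross-entropy against the gold label index
    # (for extractive QA, the gold start or end position).
    hard_loss = -np.log(softmax(student_logits)[label_idx])

    w = distillation_loss_weight
    return w * soft_loss + (1.0 - w) * hard_loss
```

With `distillation_loss_weight = 0.75`, three quarters of the loss comes from matching the teacher's soft labels and one quarter from the gold SQuAD annotations.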
## Performance
Evaluated on the SQuAD 2.0 dev set:
```
"exact": 79.8366040596311
"f1": 83.916407079888
```
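For reference, the exact-match and F1 numbers above are SQuAD-style string metrics over predicted versus gold answer spans. A simplified sketch of both (whitespace tokenization only; the official SQuAD script additionally strips articles and punctuation before comparing):

```python
from collections import Counter

def exact_match(prediction: str, gold: str) -> float:
    # 1.0 if the lowercased, stripped strings are identical, else 0.0.
    return float(prediction.strip().lower() == gold.strip().lower())

def token_f1(prediction: str, gold: str) -> float:
    # Token-level F1: harmonic mean of precision and recall over
    # whitespace-delimited tokens shared between prediction and gold.
    pred_tokens = prediction.lower().split()
    gold_tokens = gold.lower().split()
    common = Counter(pred_tokens) & Counter(gold_tokens)
    overlap = sum(common.values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)
```

F1 gives partial credit for overlapping spans, which is why it is consistently higher than exact match in the tables above.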
## Authors
**Timo Möller:** timo.moeller@deepset.ai
**Julian Risch:** julian.risch@deepset.ai
**Malte Pietsch:** malte.pietsch@deepset.ai
**Michel Bartels:** michel.bartels@deepset.ai

## About us
<div class="grid lg:grid-cols-2 gap-x-4 gap-y-3">
<div class="w-full h-40 object-cover mb-2 rounded-lg flex items-center justify-center">
     <img alt="" src="https://raw.githubusercontent.com/deepset-ai/.github/main/deepset-logo-colored.png" class="w-40"/>
</div>
<div class="w-full h-40 object-cover mb-2 rounded-lg flex items-center justify-center">
     <img alt="" src="https://raw.githubusercontent.com/deepset-ai/.github/main/haystack-logo-colored.png" class="w-40"/>
</div>
</div>

[deepset](http://deepset.ai/) is the company behind the open-source NLP framework [Haystack](https://haystack.deepset.ai/), which is designed to help you build production-ready NLP systems for question answering, summarization, ranking, and more.


Some of our other work:
- [Distilled roberta-base-squad2 (aka "tinyroberta-squad2")](https://huggingface.co/deepset/tinyroberta-squad2)
- [German BERT (aka "bert-base-german-cased")](https://deepset.ai/german-bert)
- [GermanQuAD and GermanDPR datasets and models (aka "gelectra-base-germanquad", "gbert-base-germandpr")](https://deepset.ai/germanquad)

## Get in touch and join the Haystack community

<p>For more info on Haystack, visit our <strong><a href="https://github.com/deepset-ai/haystack">GitHub</a></strong> repo and <strong><a href="https://docs.haystack.deepset.ai">Documentation</a></strong>.

We also have a <strong><a class="h-7" href="https://haystack.deepset.ai/community">Discord community open to everyone!</a></strong></p>

[Twitter](https://twitter.com/deepset_ai) | [LinkedIn](https://www.linkedin.com/company/deepset-ai/) | [Discord](https://haystack.deepset.ai/community) | [GitHub Discussions](https://github.com/deepset-ai/haystack/discussions) | [Website](https://deepset.ai)

By the way: [we're hiring!](http://www.deepset.ai/jobs)