julianrisch committed on
Commit
3a28cce
verified
1 Parent(s): d773488

Update README.md

  1. README.md +50 -44
README.md CHANGED
@@ -132,7 +132,7 @@ model-index:
  name: F1
  ---

- # electra-base for QA

  ## Overview
  **Language model:** electra-base
@@ -140,7 +140,7 @@ model-index:
  **Downstream-task:** Extractive QA
  **Training data:** SQuAD 2.0
  **Eval data:** SQuAD 2.0
- **Code:** See [example](https://github.com/deepset-ai/FARM/blob/master/examples/question_answering.py) in [FARM](https://github.com/deepset-ai/FARM/blob/master/examples/question_answering.py)
  **Infrastructure**: 1x Tesla v100

  ## Hyperparameters
@@ -172,19 +172,43 @@ Evaluated on the SQuAD 2.0 dev set with the [official eval script](https://works
  "NoAns_total": 5945
  ```

  ## Usage

  ### In Transformers
  ```python
  from transformers import AutoModelForQuestionAnswering, AutoTokenizer, pipeline

- model_name = "deepset/electra-base-squad2"

  # a) Get predictions
  nlp = pipeline('question-answering', model=model_name, tokenizer=model_name)
  QA_input = {
  'question': 'Why is model conversion important?',
- 'context': 'The option to convert models between FARM and transformers gives freedom to the user and lets people easily switch between frameworks.'
  }
  res = nlp(QA_input)

@@ -193,35 +217,6 @@ model = AutoModelForQuestionAnswering.from_pretrained(model_name)
  tokenizer = AutoTokenizer.from_pretrained(model_name)
  ```

- ### In FARM
-
- ```python
- from farm.modeling.adaptive_model import AdaptiveModel
- from farm.modeling.tokenization import Tokenizer
- from farm.infer import Inferencer
-
- model_name = "deepset/electra-base-squad2"
-
- # a) Get predictions
- nlp = Inferencer.load(model_name, task_type="question_answering")
- QA_input = [{"questions": ["Why is model conversion important?"],
- "text": "The option to convert models between FARM and transformers gives freedom to the user and lets people easily switch between frameworks."}]
- res = nlp.inference_from_dicts(dicts=QA_input)
-
- # b) Load model & tokenizer
- model = AdaptiveModel.convert_from_transformers(model_name, device="cpu", task_type="question_answering")
- tokenizer = Tokenizer.load(model_name)
- ```
-
- ### In haystack
- For doing QA at scale (i.e. many docs instead of a single paragraph), you can load the model also in [haystack](https://github.com/deepset-ai/haystack/):
- ```python
- reader = FARMReader(model_name_or_path="deepset/electra-base-squad2")
- # or
- reader = TransformersReader(model="deepset/electra-base-squad2",tokenizer="deepset/electra-base-squad2")
- ```
-
-
  ## Authors
  Vaishali Pal `vaishali.pal [at] deepset.ai`
  Branden Chan: `branden.chan [at] deepset.ai`
@@ -230,18 +225,29 @@ Malte Pietsch: `malte.pietsch [at] deepset.ai`
  Tanay Soni: `tanay.soni [at] deepset.ai`

  ## About us
- ![deepset logo](https://workablehr.s3.amazonaws.com/uploads/account/logo/476306/logo)

- We bring NLP to the industry via open source!
- Our focus: Industry specific language models & large scale QA systems.

- Some of our work:
- - [German BERT (aka "bert-base-german-cased")](https://deepset.ai/german-bert)
- - [GermanQuAD and GermanDPR datasets and models (aka "gelectra-base-germanquad", "gbert-base-germandpr")](https://deepset.ai/germanquad)
- - [FARM](https://github.com/deepset-ai/FARM)
- - [Haystack](https://github.com/deepset-ai/haystack/)

- Get in touch:
- [Twitter](https://twitter.com/deepset_ai) | [LinkedIn](https://www.linkedin.com/company/deepset-ai/) | [Discord](https://haystack.deepset.ai/community/join) | [GitHub Discussions](https://github.com/deepset-ai/haystack/discussions) | [Website](https://deepset.ai)

- By the way: [we're hiring!](http://www.deepset.ai/jobs)
 
  name: F1
  ---

+ # electra-base for Extractive QA

  ## Overview
  **Language model:** electra-base
 
  **Downstream-task:** Extractive QA
  **Training data:** SQuAD 2.0
  **Eval data:** SQuAD 2.0
+ **Code:** See [an example extractive QA pipeline built with Haystack](https://haystack.deepset.ai/tutorials/34_extractive_qa_pipeline)
  **Infrastructure**: 1x Tesla v100

  ## Hyperparameters
 
  "NoAns_total": 5945
  ```

+
  ## Usage

+ ### In Haystack
+ Haystack is an AI orchestration framework to build customizable, production-ready LLM applications. You can use this model in Haystack to do extractive question answering on documents.
+ To load and run the model with [Haystack](https://github.com/deepset-ai/haystack/):
+ ```python
+ # After running pip install haystack-ai "transformers[torch,sentencepiece]"
+
+ from haystack import Document
+ from haystack.components.readers import ExtractiveReader
+
+ docs = [
+ Document(content="Python is a popular programming language"),
+ Document(content="python ist eine beliebte Programmiersprache"),
+ ]
+
+ reader = ExtractiveReader(model="deepset/roberta-base-squad2")
+ reader.warm_up()
+
+ question = "What is a popular programming language?"
+ result = reader.run(query=question, documents=docs)
+ # {'answers': [ExtractedAnswer(query='What is a popular programming language?', score=0.5740374326705933, data='python', document=Document(id=..., content: '...'), context=None, document_offset=ExtractedAnswer.Span(start=0, end=6),...)]}
+ ```
+ For a complete example with an extractive question answering pipeline that scales over many documents, check out the [corresponding Haystack tutorial](https://haystack.deepset.ai/tutorials/34_extractive_qa_pipeline).
+
  ### In Transformers
  ```python
  from transformers import AutoModelForQuestionAnswering, AutoTokenizer, pipeline

+ model_name = "deepset/roberta-base-squad2"

  # a) Get predictions
  nlp = pipeline('question-answering', model=model_name, tokenizer=model_name)
  QA_input = {
  'question': 'Why is model conversion important?',
+ 'context': 'The option to convert models between FARM and transformers gives freedom to the user and let people easily switch between frameworks.'
  }
  res = nlp(QA_input)

 
  tokenizer = AutoTokenizer.from_pretrained(model_name)
  ```

  ## Authors
  Vaishali Pal `vaishali.pal [at] deepset.ai`
  Branden Chan: `branden.chan [at] deepset.ai`
 
  Tanay Soni: `tanay.soni [at] deepset.ai`

  ## About us

+ <div class="grid lg:grid-cols-2 gap-x-4 gap-y-3">
+ <div class="w-full h-40 object-cover mb-2 rounded-lg flex items-center justify-center">
+ <img alt="" src="https://raw.githubusercontent.com/deepset-ai/.github/main/deepset-logo-colored.png" class="w-40"/>
+ </div>
+ <div class="w-full h-40 object-cover mb-2 rounded-lg flex items-center justify-center">
+ <img alt="" src="https://raw.githubusercontent.com/deepset-ai/.github/main/haystack-logo-colored.png" class="w-40"/>
+ </div>
+ </div>
+
+ [deepset](http://deepset.ai/) is the company behind the production-ready open-source AI framework [Haystack](https://haystack.deepset.ai/).
+
+ Some of our other work:
+ - [Distilled roberta-base-squad2 (aka "tinyroberta-squad2")](https://huggingface.co/deepset/tinyroberta-squad2)
+ - [German BERT](https://deepset.ai/german-bert), [GermanQuAD and GermanDPR](https://deepset.ai/germanquad), [German embedding model](https://huggingface.co/mixedbread-ai/deepset-mxbai-embed-de-large-v1)
+ - [deepset Cloud](https://www.deepset.ai/deepset-cloud-product), [deepset Studio](https://www.deepset.ai/deepset-studio)
+
+ ## Get in touch and join the Haystack community
+
+ <p>For more info on Haystack, visit our <strong><a href="https://github.com/deepset-ai/haystack">GitHub</a></strong> repo and <strong><a href="https://docs.haystack.deepset.ai">Documentation</a></strong>.

+ We also have a <strong><a class="h-7" href="https://haystack.deepset.ai/community">Discord community open to everyone!</a></strong></p>

+ [Twitter](https://twitter.com/Haystack_AI) | [LinkedIn](https://www.linkedin.com/company/deepset-ai/) | [Discord](https://haystack.deepset.ai/community) | [GitHub Discussions](https://github.com/deepset-ai/haystack/discussions) | [Website](https://haystack.deepset.ai/) | [YouTube](https://www.youtube.com/@deepset_ai)

+ By the way: [we're hiring!](http://www.deepset.ai/jobs)