---
language: en
datasets:
- squad_v2
license: cc-by-4.0
---

# tinyroberta-squad2

## Overview
**Language model:** tinyroberta-squad2
**Language:** English
**Training data:** The PILE
**Code:**
**Infrastructure:** 4x Tesla V100

## Hyperparameters

```
batch_size = 96
n_epochs = 4
base_LM_model = "deepset/tinyroberta-squad2-step1"
max_seq_len = 384
learning_rate = 1e-4
lr_schedule = LinearWarmup
warmup_proportion = 0.2
teacher = "deepset/roberta-base"
```

## Distillation
This model was distilled using the TinyBERT approach described in [this paper](https://arxiv.org/pdf/1909.10351.pdf) and implemented in [haystack](https://github.com/deepset-ai/haystack).
We performed intermediate layer distillation with roberta-base as the teacher, which resulted in [deepset/tinyroberta-6l-768d](https://huggingface.co/deepset/tinyroberta-6l-768d).
This model has not been distilled for any specific task. If you are interested in using distillation to improve its performance on a downstream task, you can take advantage of haystack's new [distillation functionality](https://haystack.deepset.ai/guides/model-distillation). You can also check out [deepset/tinyroberta-squad2](https://huggingface.co/deepset/tinyroberta-squad2) for a model that has already been distilled on an extractive QA downstream task.
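
If you want to try task-specific distillation yourself, the sketch below shows roughly how it could look with haystack's distillation API. The method name, teacher choice, data paths, and hyperparameter values are assumptions based on the linked guide, not the exact setup used to train this model.

```python
# Rough sketch of task-specific (prediction layer) distillation with Haystack (v1.x assumed).
# Teacher model, paths, and hyperparameters are illustrative, not this model's actual training config.
from haystack.nodes import FARMReader

student = FARMReader(model_name_or_path="deepset/tinyroberta-6l-768d", use_gpu=True)
teacher = FARMReader(model_name_or_path="deepset/roberta-base-squad2", use_gpu=True)

student.distil_prediction_layer_from(
    teacher,
    data_dir="data/squad20",            # assumed local copy of SQuAD v2
    train_filename="train-v2.0.json",
    use_gpu=True,
    batch_size=16,
    n_epochs=2,
)
student.save(directory="tinyroberta-distilled")
```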

## Usage

### In Transformers
```python
from transformers import AutoModelForQuestionAnswering, AutoTokenizer, pipeline

model_name = "deepset/tinyroberta-squad2"

# Load model & tokenizer
model = AutoModelForQuestionAnswering.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)
```
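
You can also get predictions directly with the `pipeline` API; the question and context below are only an illustrative example.

```python
from transformers import pipeline

model_name = "deepset/tinyroberta-squad2"
nlp = pipeline("question-answering", model=model_name, tokenizer=model_name)

# Illustrative input only
QA_input = {
    "question": "Why is model conversion important?",
    "context": "The option to convert models between FARM and transformers gives freedom to the user and lets people easily switch between frameworks."
}
result = nlp(QA_input)
print(result["answer"], result["score"])
```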

### In FARM

```python
from farm.modeling.adaptive_model import AdaptiveModel
from farm.modeling.tokenization import Tokenizer
from farm.infer import Inferencer

model_name = "deepset/tinyroberta-squad2"
model = AdaptiveModel.convert_from_transformers(model_name, device="cpu", task_type="question_answering")
tokenizer = Tokenizer.load(model_name)
```
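
The `Inferencer` imported above can run extractive QA end to end; here is a minimal sketch (the question and passage are illustrative):

```python
# Minimal FARM inference sketch; the question and text are illustrative only.
from farm.infer import Inferencer

nlp = Inferencer.load("deepset/tinyroberta-squad2", task_type="question_answering")
QA_input = [{
    "questions": ["Why is model conversion important?"],
    "text": "The option to convert models between FARM and transformers gives freedom to the user."
}]
result = nlp.inference_from_dicts(dicts=QA_input)
print(result)
```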

### In haystack
For doing QA at scale (i.e. over many documents instead of a single paragraph), you can also load the model in [haystack](https://github.com/deepset-ai/haystack/):
```python
reader = FARMReader(model_name_or_path="deepset/tinyroberta-squad2")
# or
reader = TransformersReader(model_name_or_path="deepset/tinyroberta-squad2", tokenizer="deepset/tinyroberta-squad2")
```
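
To go from a bare reader to document search plus QA, you typically combine the reader with a retriever in an `ExtractiveQAPipeline`. The following is only a rough sketch assuming Haystack v1.x; the document store, retriever choice, and example document/query are illustrative.

```python
# Rough end-to-end sketch (assumes Haystack v1.x); the document and query are illustrative.
from haystack.document_stores import InMemoryDocumentStore
from haystack.nodes import FARMReader, TfidfRetriever
from haystack.pipelines import ExtractiveQAPipeline

document_store = InMemoryDocumentStore()
document_store.write_documents([
    {"content": "Haystack lets you scale extractive question answering over large document collections."}
])

retriever = TfidfRetriever(document_store=document_store)
reader = FARMReader(model_name_or_path="deepset/tinyroberta-squad2")

pipe = ExtractiveQAPipeline(reader=reader, retriever=retriever)
prediction = pipe.run(
    query="What does Haystack let you scale?",
    params={"Retriever": {"top_k": 3}, "Reader": {"top_k": 1}}
)
print(prediction["answers"])
```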

## Authors
Branden Chan: `branden.chan [at] deepset.ai`
Timo Möller: `timo.moeller [at] deepset.ai`
Malte Pietsch: `malte.pietsch [at] deepset.ai`
Tanay Soni: `tanay.soni [at] deepset.ai`
Michel Bartels: `michel.bartels [at] deepset.ai`

## About us
![deepset logo](https://workablehr.s3.amazonaws.com/uploads/account/logo/476306/logo)
We bring NLP to the industry via open source!
Our focus: Industry-specific language models & large-scale QA systems.

Some of our work:
- [German BERT (aka "bert-base-german-cased")](https://deepset.ai/german-bert)
- [GermanQuAD and GermanDPR datasets and models (aka "gelectra-base-germanquad", "gbert-base-germandpr")](https://deepset.ai/germanquad)
- [FARM](https://github.com/deepset-ai/FARM)
- [Haystack](https://github.com/deepset-ai/haystack/)

Get in touch:
[Twitter](https://twitter.com/deepset_ai) | [LinkedIn](https://www.linkedin.com/company/deepset-ai/) | [Slack](https://haystack.deepset.ai/community/join) | [GitHub Discussions](https://github.com/deepset-ai/haystack/discussions) | [Website](https://deepset.ai)

By the way: [we're hiring!](http://www.deepset.ai/jobs)