---
language: en
license: cc-by-4.0
datasets:
- squad_v2
model-index:
- name: weijiang2009/AlgmonQuestingAnsweringModel
  results:
  - task:
      type: question-answering
      name: Question Answering
    dataset:
      name: squad_v2
      type: squad_v2
      config: squad_v2
      split: validation
    metrics:
    - type: exact_match
      value: 79.9309
      name: Exact Match
      verified: true
      verifyToken: eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiMDhhNjg5YzNiZGQ1YTIyYTAwZGUwOWEzZTRiYzdjM2QzYjA3ZTUxNDM1NjE1MTUyMjE1MGY1YzEzMjRjYzVjYiIsInZlcnNpb24iOjF9.EH5JJo8EEFwU7osPz3s7qanw_tigeCFhCXjSfyN0Y1nWVnSfulSxIk_DbAEI5iE80V4EKLyp5-mYFodWvL2KDA
    - type: f1
      value: 82.9501
      name: F1
      verified: true
      verifyToken: eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiMjk5ZDYwOGQyNjNkMWI0OTE4YzRmOTlkY2JjNjQ0YTZkNTMzMzNkYTA0MDFmNmI3NjA3NjNlMjhiMDQ2ZjJjNSIsInZlcnNpb24iOjF9.DDm0LNTkdLbGsue58bg1aH_s67KfbcmkvL-6ZiI2s8IoxhHJMSf29H_uV2YLyevwx900t-MwTVOW3qfFnMMEAQ
    - type: total
      value: 11869
      name: total
      verified: true
      verifyToken: eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiMGFkMmI2ODM0NmY5NGNkNmUxYWViOWYxZDNkY2EzYWFmOWI4N2VhYzY5MGEzMTVhOTU4Zjc4YWViOGNjOWJjMCIsInZlcnNpb24iOjF9.fexrU1icJK5_MiifBtZWkeUvpmFISqBLDXSQJ8E6UnrRof-7cU0s4tX_dIsauHWtUpIHMPZCf5dlMWQKXZuAAA
---

# roberta-base for QA

This is the [roberta-base](https://huggingface.co/roberta-base) model, fine-tuned using the [SQuAD2.0](https://huggingface.co/datasets/squad_v2) dataset. It has been trained on question-answer pairs, including unanswerable questions, for the task of extractive Question Answering.

## Overview
**Language model:** roberta-base  
**Language:** English  
**Downstream task:** Extractive QA  
**Training data:** SQuAD 2.0  
**Eval data:** SQuAD 2.0  
**Infrastructure:** 4x Tesla V100

## Hyperparameters

```
batch_size = 96
n_epochs = 2
base_LM_model = "roberta-base"
max_seq_len = 386
learning_rate = 3e-5
lr_schedule = LinearWarmup
warmup_proportion = 0.2
doc_stride = 128
max_query_length = 64
```

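The sketch below shows one way these hyperparameters could map onto Hugging Face `TrainingArguments` for a Trainer-based fine-tuning run. It is only an illustration: the output directory and per-device batch split are assumptions, and the data preprocessing and `Trainer` wiring are omitted.

```python
from transformers import TrainingArguments

# Illustrative mapping of the hyperparameters above onto TrainingArguments.
# Assumption: batch_size = 96 is the effective batch size across the 4 GPUs,
# i.e. 24 examples per device. max_seq_len and doc_stride are applied on the
# tokenizer side when preparing SQuAD-style features, not here.
training_args = TrainingArguments(
    output_dir="roberta-base-squad2",  # hypothetical output path
    per_device_train_batch_size=24,
    num_train_epochs=2,
    learning_rate=3e-5,
    lr_scheduler_type="linear",
    warmup_ratio=0.2,
)
```
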
## Usage

### In Haystack
Haystack is an NLP framework by deepset. You can use this model in a Haystack pipeline to do question answering at scale (over many documents). To load the model in [Haystack](https://github.com/deepset-ai/haystack/):
```python
reader = FARMReader(model_name_or_path="deepset/roberta-base-squad2")
# or
reader = TransformersReader(model_name_or_path="deepset/roberta-base-squad2", tokenizer="deepset/roberta-base-squad2")
```
For a complete example of ``roberta-base-squad2`` being used for Question Answering, check out the [Tutorials in Haystack Documentation](https://haystack.deepset.ai/tutorials/first-qa-system).

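To see how the reader slots into a full retrieval-plus-reading setup, here is a minimal sketch, assuming a Haystack 1.x release whose `InMemoryDocumentStore` supports BM25; the indexed document and the query are placeholder data, not part of this model card.

```python
from haystack.document_stores import InMemoryDocumentStore
from haystack.nodes import BM25Retriever, FARMReader
from haystack.pipelines import ExtractiveQAPipeline

# Placeholder corpus: in a real setup you would index your own documents.
document_store = InMemoryDocumentStore(use_bm25=True)
document_store.write_documents([
    {"content": "Haystack is an open source NLP framework built by deepset."},
])

retriever = BM25Retriever(document_store=document_store)
reader = FARMReader(model_name_or_path="deepset/roberta-base-squad2")

# The retriever narrows down candidate documents, the reader extracts the answer span.
pipe = ExtractiveQAPipeline(reader=reader, retriever=retriever)
prediction = pipe.run(
    query="Who built Haystack?",
    params={"Retriever": {"top_k": 3}, "Reader": {"top_k": 1}},
)
print(prediction["answers"][0].answer)
```
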
### In Transformers
```python
from transformers import AutoModelForQuestionAnswering, AutoTokenizer, pipeline

model_name = "deepset/roberta-base-squad2"

# a) Get predictions
nlp = pipeline('question-answering', model=model_name, tokenizer=model_name)
QA_input = {
    'question': 'Why is model conversion important?',
    'context': 'The option to convert models between FARM and transformers gives freedom to the user and lets people easily switch between frameworks.'
}
res = nlp(QA_input)

# b) Load model & tokenizer
model = AutoModelForQuestionAnswering.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)
```

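If you want to see what the pipeline does under the hood, the sketch below runs the same model manually and decodes the most likely answer span from the start/end logits. The question and context strings here are purely illustrative.

```python
import torch
from transformers import AutoModelForQuestionAnswering, AutoTokenizer

model_name = "deepset/roberta-base-squad2"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForQuestionAnswering.from_pretrained(model_name)

question = "What gives users freedom?"
context = "The option to convert models between FARM and transformers gives freedom to the user."

inputs = tokenizer(question, context, return_tensors="pt", truncation=True, max_length=386)
with torch.no_grad():
    outputs = model(**inputs)

# The model scores every token as a potential answer start and end;
# take the argmax of each and decode the span in between.
start_idx = int(outputs.start_logits.argmax())
end_idx = int(outputs.end_logits.argmax())
answer = tokenizer.decode(inputs["input_ids"][0][start_idx : end_idx + 1], skip_special_tokens=True)

# For unanswerable questions the model tends to point at the special start token,
# which decodes to an empty string here.
print(answer or "<no answer>")
```
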
## Performance
Evaluated on the SQuAD 2.0 dev set with the [official eval script](https://worksheets.codalab.org/rest/bundles/0x6b567e1cf2e041ec80d7098f031c5c9e/contents/blob/).

```
"exact": 79.87029394424324,
"f1": 82.91251169582613,

"total": 11873,
"HasAns_exact": 77.93522267206478,
"HasAns_f1": 84.02838248389763,
"HasAns_total": 5928,
"NoAns_exact": 81.79983179142137,
"NoAns_f1": 81.79983179142137,
"NoAns_total": 5945
```
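
If you want to score your own predictions in this format without the CodaLab bundle, one option (not what produced the numbers above, just an equivalent check) is the `squad_v2` metric from the Hugging Face `evaluate` library, which reports the same keys. The IDs and answers below are placeholder examples.

```python
import evaluate

# SQuAD v2 metric: expects predictions with a no-answer probability so that
# unanswerable questions are scored correctly.
squad_v2_metric = evaluate.load("squad_v2")

predictions = [
    {"id": "q1", "prediction_text": "1863", "no_answer_probability": 0.0},
]
references = [
    {"id": "q1", "answers": {"text": ["1863"], "answer_start": [94]}},
]

results = squad_v2_metric.compute(predictions=predictions, references=references)
print(results)  # keys include "exact", "f1", "total", "HasAns_exact", "NoAns_f1", ...
```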