nbroad (HF staff) and autoevaluator (HF staff) committed
Commit 4b1d92d (1 parent: b2d15c1)

Add evaluation results on the squad_v2 config of squad_v2 (#1)

- Add evaluation results on the squad_v2 config of squad_v2 (a0f1cf129c7b9df44f09b8bc8ef7f2f3526d8449)


Co-authored-by: Evaluation Bot <autoevaluator@users.noreply.huggingface.co>

Files changed (1)
  1. README.md +77 -41
README.md CHANGED
@@ -1,41 +1,77 @@
- ---
- license: cc-by-4.0
- widget:
- - context: "DeBERTa improves the BERT and RoBERTa models using disentangled attention and enhanced mask decoder. With those two improvements, DeBERTa out perform RoBERTa on a majority of NLU tasks with 80GB training data. In DeBERTa V3, we further improved the efficiency of DeBERTa using ELECTRA-Style pre-training with Gradient Disentangled Embedding Sharing. Compared to DeBERTa, our V3 version significantly improves the model performance on downstream tasks. You can find more technique details about the new model from our paper. Please check the official repository for more implementation details and updates."
-   example_title: "DeBERTa v3 Q1"
-   text: "How is DeBERTa version 3 different than previous ones?"
- - context: "DeBERTa improves the BERT and RoBERTa models using disentangled attention and enhanced mask decoder. With those two improvements, DeBERTa out perform RoBERTa on a majority of NLU tasks with 80GB training data. In DeBERTa V3, we further improved the efficiency of DeBERTa using ELECTRA-Style pre-training with Gradient Disentangled Embedding Sharing. Compared to DeBERTa, our V3 version significantly improves the model performance on downstream tasks. You can find more technique details about the new model from our paper. Please check the official repository for more implementation details and updates."
-   example_title: "DeBERTa v3 Q2"
-   text: "Where do I go to see new info about DeBERTa?"
- datasets:
- - squad_v2
- metrics:
- - f1
- - exact
- tags:
- - question-answering
- language: en
- model-index:
- - name: DeBERTa v3 xsmall squad2
-   results:
-   - task:
-       name: Question Answering
-       type: question-answering
-     dataset:
-       name: SQuAD2.0
-       type: question-answering
-     metrics:
-     - name: f1
-       type: f1
-       value: 81.5
-     - name: exact
-       type: exact
-       value: 78.3
- ---
-
-
- # DeBERTa v3 xsmall SQuAD 2.0
-
- [Microsoft reports that this model can get 84.8/82.0](https://huggingface.co/microsoft/deberta-v3-xsmall#fine-tuning-on-nlu-tasks) on f1/em on the dev set.
-
- I got 81.5/78.3 but I only did one run and I didn't use the official squad2 evaluation script. I will do some more runs and show the results on the official script soon.
+ ---
+ license: cc-by-4.0
+ widget:
+ - context: DeBERTa improves the BERT and RoBERTa models using disentangled attention
+     and enhanced mask decoder. With those two improvements, DeBERTa out perform RoBERTa
+     on a majority of NLU tasks with 80GB training data. In DeBERTa V3, we further
+     improved the efficiency of DeBERTa using ELECTRA-Style pre-training with Gradient
+     Disentangled Embedding Sharing. Compared to DeBERTa, our V3 version significantly
+     improves the model performance on downstream tasks. You can find more technique
+     details about the new model from our paper. Please check the official repository
+     for more implementation details and updates.
+   example_title: DeBERTa v3 Q1
+   text: How is DeBERTa version 3 different than previous ones?
+ - context: DeBERTa improves the BERT and RoBERTa models using disentangled attention
+     and enhanced mask decoder. With those two improvements, DeBERTa out perform RoBERTa
+     on a majority of NLU tasks with 80GB training data. In DeBERTa V3, we further
+     improved the efficiency of DeBERTa using ELECTRA-Style pre-training with Gradient
+     Disentangled Embedding Sharing. Compared to DeBERTa, our V3 version significantly
+     improves the model performance on downstream tasks. You can find more technique
+     details about the new model from our paper. Please check the official repository
+     for more implementation details and updates.
+   example_title: DeBERTa v3 Q2
+   text: Where do I go to see new info about DeBERTa?
+ datasets:
+ - squad_v2
+ metrics:
+ - f1
+ - exact
+ tags:
+ - question-answering
+ language: en
+ model-index:
+ - name: DeBERTa v3 xsmall squad2
+   results:
+   - task:
+       name: Question Answering
+       type: question-answering
+     dataset:
+       name: SQuAD2.0
+       type: question-answering
+     metrics:
+     - name: f1
+       type: f1
+       value: 81.5
+     - name: exact
+       type: exact
+       value: 78.3
+   - task:
+       type: question-answering
+       name: Question Answering
+     dataset:
+       name: squad_v2
+       type: squad_v2
+       config: squad_v2
+       split: validation
+     metrics:
+     - name: Exact Match
+       type: exact_match
+       value: 78.5341
+       verified: true
+     - name: F1
+       type: f1
+       value: 81.6408
+       verified: true
+     - name: total
+       type: total
+       value: 11870
+       verified: true
+ ---
+
+
+
+ # DeBERTa v3 xsmall SQuAD 2.0
+
+ [Microsoft reports that this model can get 84.8/82.0](https://huggingface.co/microsoft/deberta-v3-xsmall#fine-tuning-on-nlu-tasks) on f1/em on the dev set.
+
+ I got 81.5/78.3 but I only did one run and I didn't use the official squad2 evaluation script. I will do some more runs and show the results on the official script soon.
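
The verified scores added by this commit can be compared with the self-reported values and with the figures Microsoft reports. The following is a minimal sketch that uses only the numbers shown in the diff; the dictionary names and the `gap` helper are illustrative assumptions and are not part of any Hugging Face API.

```python
# Scores copied from the README diff; all names here are illustrative only.
SELF_REPORTED = {"f1": 81.5, "exact": 78.3}    # single run, unofficial script
VERIFIED = {"f1": 81.6408, "exact": 78.5341}   # added by the evaluation bot
MICROSOFT = {"f1": 84.8, "exact": 82.0}        # reported on the dev set


def gap(a, b):
    """Return the per-metric difference a - b, rounded to 4 decimal places."""
    return {name: round(a[name] - b[name], 4) for name in a}


if __name__ == "__main__":
    print("verified - self-reported:", gap(VERIFIED, SELF_REPORTED))
    print("Microsoft - verified:", gap(MICROSOFT, VERIFIED))
```

On these figures the verified scores sit within about 0.25 points of the self-reported ones, and roughly 3.2 to 3.5 points below the numbers Microsoft reports.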