abhishek (HF staff) committed
Commit da0dca3
1 Parent(s): 55f272a

Add evaluation results on the squad_v2 config of squad_v2

Beep boop, I am a bot from Hugging Face's automatic model evaluator 👋!\
Your model has been evaluated on the squad_v2 config of the [squad_v2](https://huggingface.co/datasets/squad_v2) dataset by @lewtun, using the predictions stored [here](https://huggingface.co/datasets/autoevaluate/autoeval-staging-eval-project-e81e3618-f3e1-472b-97e0-2794cda0adb2-409).\
Accept this pull request to see the results displayed on the [Hub leaderboard](https://huggingface.co/spaces/autoevaluate/leaderboards?dataset=squad_v2).\
Evaluate your model on more datasets [here](https://huggingface.co/spaces/autoevaluate/model-evaluator?dataset=squad_v2).

Files changed (1)
  1. README.md +52 -0
README.md CHANGED
@@ -3,6 +3,58 @@ language: en
  datasets:
  - squad_v2
  license: cc-by-4.0
+ model-index:
+ - name: autoevaluate/roberta-base-squad2
+   results:
+   - task:
+       type: question-answering
+       name: Question Answering
+     dataset:
+       name: squad_v2
+       type: squad_v2
+       config: squad_v2
+       split: validation
+     metrics:
+     - name: Exact Match
+       type: exact_match
+       value: 79.9309
+       verified: true
+     - name: F1
+       type: f1
+       value: 82.9433
+       verified: true
+     - name: total
+       type: total
+       value: 11869
+       verified: true
+     - name: HasAns_exact
+       type: HasAns_exact
+       value: 79.9309
+       verified: true
+     - name: HasAns_f1
+       type: HasAns_f1
+       value: 82.9433
+       verified: true
+     - name: HasAns_total
+       type: HasAns_total
+       value: 11869
+       verified: true
+     - name: best_exact
+       type: best_exact
+       value: 79.9309
+       verified: true
+     - name: best_exact_thresh
+       type: best_exact_thresh
+       value: 0.0
+       verified: true
+     - name: best_f1
+       type: best_f1
+       value: 82.9433
+       verified: true
+     - name: best_f1_thresh
+       type: best_f1_thresh
+       value: 0.0
+       verified: true
  ---

  # roberta-base for QA
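
For readers unfamiliar with the `exact_match` and `f1` entries in the metrics list above, here is a minimal sketch of how SQuAD-style scoring is commonly computed for a single prediction. It is an illustration only, not the evaluator's actual code: the function names are hypothetical, it assumes one reference answer per question (SQuAD-style scoring usually takes the maximum over several references), and it assumes the reported values are per-example scores averaged over the evaluated examples and expressed as percentages.

```python
import collections
import re
import string


def normalize_answer(text: str) -> str:
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    punctuation = set(string.punctuation)
    text = "".join(ch for ch in text if ch not in punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, reference: str) -> float:
    """Return 1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize_answer(prediction) == normalize_answer(reference))


def f1_score(prediction: str, reference: str) -> float:
    """Token-level F1 between the normalized prediction and reference."""
    pred_tokens = normalize_answer(prediction).split()
    ref_tokens = normalize_answer(reference).split()
    if not pred_tokens or not ref_tokens:
        # Empty (no-answer) case: full credit only when both sides are empty.
        return float(pred_tokens == ref_tokens)
    # Multiset intersection counts each shared token at most min(count) times.
    common = collections.Counter(pred_tokens) & collections.Counter(ref_tokens)
    overlap = sum(common.values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)
```

Under these assumptions, a prediction of "in Paris, France" against the reference "Paris" scores 0.0 on exact match but 0.5 on F1, which is why the F1 figure in a results block is typically at least as high as the exact-match figure.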