Add evaluation results on scientific_papers dataset

#1
by autoevaluator HF staff - opened
Files changed (1)
  1. README.md +139 -1
README.md CHANGED
@@ -1,11 +1,149 @@
  ---
-
  language: en
  license: apache-2.0
  datasets:
  - scientific_papers
  tags:
  - summarization
+ model-index:
+ - name: google/bigbird-pegasus-large-arxiv
+   results:
+   - task:
+       type: summarization
+       name: Summarization
+     dataset:
+       name: scientific_papers
+       type: scientific_papers
+       config: pubmed
+       split: test
+     metrics:
+     - name: ROUGE-1
+       type: rouge
+       value: 36.0276
+       verified: true
+     - name: ROUGE-2
+       type: rouge
+       value: 13.4166
+       verified: true
+     - name: ROUGE-L
+       type: rouge
+       value: 21.9612
+       verified: true
+     - name: ROUGE-LSUM
+       type: rouge
+       value: 29.648
+       verified: true
+     - name: loss
+       type: loss
+       value: 2.774355173110962
+       verified: true
+     - name: meteor
+       type: meteor
+       value: 0.2824
+       verified: true
+     - name: gen_len
+       type: gen_len
+       value: 209.2537
+       verified: true
+   - task:
+       type: summarization
+       name: Summarization
+     dataset:
+       name: cnn_dailymail
+       type: cnn_dailymail
+       config: 3.0.0
+       split: test
+     metrics:
+     - name: ROUGE-1
+       type: rouge
+       value: 9.0885
+       verified: true
+     - name: ROUGE-2
+       type: rouge
+       value: 1.0325
+       verified: true
+     - name: ROUGE-L
+       type: rouge
+       value: 7.3182
+       verified: true
+     - name: ROUGE-LSUM
+       type: rouge
+       value: 8.1455
+       verified: true
+     - name: loss
+       type: loss
+       value: .nan
+       verified: true
+     - name: gen_len
+       type: gen_len
+       value: 210.4762
+       verified: true
+   - task:
+       type: summarization
+       name: Summarization
+     dataset:
+       name: xsum
+       type: xsum
+       config: default
+       split: test
+     metrics:
+     - name: ROUGE-1
+       type: rouge
+       value: 4.9787
+       verified: true
+     - name: ROUGE-2
+       type: rouge
+       value: 0.3527
+       verified: true
+     - name: ROUGE-L
+       type: rouge
+       value: 4.3679
+       verified: true
+     - name: ROUGE-LSUM
+       type: rouge
+       value: 4.1723
+       verified: true
+     - name: loss
+       type: loss
+       value: .nan
+       verified: true
+     - name: gen_len
+       type: gen_len
+       value: 230.4886
+       verified: true
+   - task:
+       type: summarization
+       name: Summarization
+     dataset:
+       name: scientific_papers
+       type: scientific_papers
+       config: arxiv
+       split: test
+     metrics:
+     - name: ROUGE-1
+       type: rouge
+       value: 43.4702
+       verified: true
+     - name: ROUGE-2
+       type: rouge
+       value: 17.4297
+       verified: true
+     - name: ROUGE-L
+       type: rouge
+       value: 26.2587
+       verified: true
+     - name: ROUGE-LSUM
+       type: rouge
+       value: 35.5587
+       verified: true
+     - name: loss
+       type: loss
+       value: 2.1113228797912598
+       verified: true
+     - name: gen_len
+       type: gen_len
+       value: 183.3702
+       verified: true
  ---

  # BigBirdPegasus model (large)
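
To make the four result groups in this diff easier to compare, here is a minimal sketch that ranks them by ROUGE-1 and flags the entries whose loss is `.nan`. The `RESULTS` list and the `summarize` helper are hypothetical names introduced for illustration; the values are copied by hand from the `model-index` block above, and only ROUGE-1 and loss are included. This is not how the autoevaluator itself reports results.

```python
import math

# Hand-copied subset of the model-index entries added by this PR.
# RESULTS and its flat layout are assumptions made for this sketch;
# the real metadata is nested YAML in the README front matter.
RESULTS = [
    {"dataset": "scientific_papers", "config": "pubmed",
     "rouge1": 36.0276, "loss": 2.774355173110962},
    {"dataset": "cnn_dailymail", "config": "3.0.0",
     "rouge1": 9.0885, "loss": float("nan")},
    {"dataset": "xsum", "config": "default",
     "rouge1": 4.9787, "loss": float("nan")},
    {"dataset": "scientific_papers", "config": "arxiv",
     "rouge1": 43.4702, "loss": 2.1113228797912598},
]


def summarize(results):
    """Return one formatted row per result, highest ROUGE-1 first."""
    rows = []
    for r in sorted(results, key=lambda r: r["rouge1"], reverse=True):
        # YAML `.nan` loads as a float NaN, so it needs an explicit check.
        loss = "n/a" if math.isnan(r["loss"]) else f"{r['loss']:.4f}"
        rows.append(
            f"{r['dataset']}/{r['config']}: "
            f"ROUGE-1={r['rouge1']:.2f}, loss={loss}"
        )
    return rows


if __name__ == "__main__":
    for row in summarize(RESULTS):
        print(row)
```

The ordering puts the `arxiv` config first and `xsum` last, which matches what one would expect for a checkpoint named `bigbird-pegasus-large-arxiv`: the news datasets are out of domain, and those are also the two entries with a `.nan` loss.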