Add verifyToken field to verify evaluation results are produced by Hugging Face's automatic model evaluator

#4
by autoevaluator HF staff - opened
Files changed (1)
  1. README.md +59 -31
README.md CHANGED
@@ -1,46 +1,68 @@
  ---
  language: en
  tags:
  - bart
  - seq2seq
  - summarization
- license: apache-2.0
  datasets:
  - samsum
  widget:
- - text: "Hannah: Hey, do you have Betty's number?\nAmanda: Lemme check\nAmanda: Sorry,\
- \ can't find it.\nAmanda: Ask Larry\nAmanda: He called her last time we were at\
- \ the park together\nHannah: I don't know him well\nAmanda: Don't be shy, he's\
- \ very nice\nHannah: If you say so..\nHannah: I'd rather you texted him\nAmanda:\
- \ Just text him \U0001F642\nHannah: Urgh.. Alright\nHannah: Bye\nAmanda: Bye bye\n"
  model-index:
  - name: bart-large-xsum-samsum
  results:
  - task:
- name: Abstractive Text Summarization
  type: abstractive-text-summarization
  dataset:
  name: 'SAMSum Corpus: A Human-annotated Dialogue Dataset for Abstractive Summarization'
  type: samsum
  metrics:
- - name: Validation ROUGE-1
- type: rouge-1
  value: 54.3921
- - name: Validation ROUGE-2
- type: rouge-2
  value: 29.8078
- - name: Validation ROUGE-L
- type: rouge-l
  value: 45.1543
- - name: Test ROUGE-1
- type: rouge-1
  value: 53.3059
- - name: Test ROUGE-2
- type: rouge-2
  value: 28.355
- - name: Test ROUGE-L
- type: rouge-l
  value: 44.0953
  - task:
  type: summarization
  name: Summarization
@@ -50,30 +72,36 @@ model-index:
  config: samsum
  split: train
  metrics:
- - name: ROUGE-1
- type: rouge
  value: 46.2492
  verified: true
- - name: ROUGE-2
- type: rouge
  value: 21.346
  verified: true
- - name: ROUGE-L
- type: rouge
  value: 37.2787
  verified: true
- - name: ROUGE-LSUM
- type: rouge
  value: 42.1317
  verified: true
- - name: loss
- type: loss
  value: 1.6859958171844482
  verified: true
- - name: gen_len
- type: gen_len
  value: 23.7103
  verified: true
  ---
  ## `bart-large-xsum-samsum`
  This model was obtained by fine-tuning `facebook/bart-large-xsum` on [Samsum](https://huggingface.co/datasets/samsum) dataset.
 
  ---
  language: en
+ license: apache-2.0
  tags:
  - bart
  - seq2seq
  - summarization
  datasets:
  - samsum
  widget:
+ - text: 'Hannah: Hey, do you have Betty''s number?
+
+ Amanda: Lemme check
+
+ Amanda: Sorry, can''t find it.
+
+ Amanda: Ask Larry
+
+ Amanda: He called her last time we were at the park together
+
+ Hannah: I don''t know him well
+
+ Amanda: Don''t be shy, he''s very nice
+
+ Hannah: If you say so..
+
+ Hannah: I''d rather you texted him
+
+ Amanda: Just text him 🙂
+
+ Hannah: Urgh.. Alright
+
+ Hannah: Bye
+
+ Amanda: Bye bye
+
+ '
  model-index:
  - name: bart-large-xsum-samsum
  results:
  - task:
  type: abstractive-text-summarization
+ name: Abstractive Text Summarization
  dataset:
  name: 'SAMSum Corpus: A Human-annotated Dialogue Dataset for Abstractive Summarization'
  type: samsum
  metrics:
+ - type: rouge-1
  value: 54.3921
+ name: Validation ROUGE-1
+ - type: rouge-2
  value: 29.8078
+ name: Validation ROUGE-2
+ - type: rouge-l
  value: 45.1543
+ name: Validation ROUGE-L
+ - type: rouge-1
  value: 53.3059
+ name: Test ROUGE-1
+ - type: rouge-2
  value: 28.355
+ name: Test ROUGE-2
+ - type: rouge-l
  value: 44.0953
+ name: Test ROUGE-L
  - task:
  type: summarization
  name: Summarization
 
  config: samsum
  split: train
  metrics:
+ - type: rouge
  value: 46.2492
+ name: ROUGE-1
  verified: true
+ verifyToken: eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiZjFlZjk0MTQxMDk5ODVlNzA4MjYyNjJiMzlkOGI5MjU0MzM1ZDgxMWFlY2YyODk1Y2QxNDk2ZGZkMmU0YmYyNyIsInZlcnNpb24iOjF9.-ZraFEyEy1BY0h3frazROc1W6DmPtkb0Rvvs_A7KeWUQQlwd4felknl2dLGS3N6K-SZ89yGd6V9QJhAGeUCNDg
+ - type: rouge
  value: 21.346
+ name: ROUGE-2
  verified: true
+ verifyToken: eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiODhjYjUzMGIzNTBmNzg3NGIwNzliYmUwZjM3ZTdlNjMyYzg4MjU5NDE1NjUwM2Q1MGQ0N2NiZWFkOGUwN2ExMCIsInZlcnNpb24iOjF9.SsmyHQ3u9ATihMR3lyNPGaB6bpe5xLG0pDeWJRyXtda4KUefVE3B2SpvluGTjOcF7ikKHPwNMs65IcRh9PTuDg
+ - type: rouge
  value: 37.2787
+ name: ROUGE-L
  verified: true
+ verifyToken: eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiN2VhNDllNGI1ODQ2OGM1MWFmOGMwY2JhOTc0NzYxMWI4N2VhM2JkZGY3OGNhMTU2YmQ0MmMzMTk4NWM4NWJmYSIsInZlcnNpb24iOjF9.vLIlnNYOv8ObzVma5-tXPhKgB0ClcSBzRzn2qKep_YWMYkWCLk-AbPZKLTimmuJvzfv7naVXLtJomZlVAzQ_Ag
+ - type: rouge
  value: 42.1317
+ name: ROUGE-LSUM
  verified: true
+ verifyToken: eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiMTRiNTEzZTFjYTljZmY4N2VhYWQ1MDNhNGViZDg3OTVhOTg5NDFlMmY1NzNmNWMwNTk3MTEwODY5NjQwYjVlMSIsInZlcnNpb24iOjF9.25YRNHi8K1JXnSUwmNs0VNcsFmjhFTMK9_FglOWYcs-_UeW44SURYyQlvdTSYJl0f4fBdf6TYe2nTEWJH0_oBA
+ - type: loss
  value: 1.6859958171844482
+ name: loss
  verified: true
+ verifyToken: eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiMjlkMDQ0NjYyMjBiYTVmMmUzMDQ4NzZmMjczZTdhNzY4NWI5Mzk0ZTc2NTM4NjE2ZjAzMDI4MWJkYTUyYjVjMyIsInZlcnNpb24iOjF9.ks0DwFnTsSu05CEJ6Wlm-41yVFyWdXzzAJlURdxjExPziPCWGXGEMVdZ07Nc4ANsKjlUD508Qyb3c_a-fIjEBA
+ - type: gen_len
  value: 23.7103
+ name: gen_len
  verified: true
+ verifyToken: eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9.eyJoYXNoIjoiMDg3YTNhNWJkZDZmN2Q2ZmY4MjhmZjNjYWViNDY5ODZiNTVhNjY1NTk2YzI1NjQ4ZDFjYzZkZDYxZmMwMGU5MCIsInZlcnNpb24iOjF9._unExkTq29yj3ZBx1XzLs38T-k294vGaq5bsTnpPDTx3mR6h1JN-hBepuRJUBdIr5jIsTsPfsMh_xlrQ3JzuAA
  ---
  ## `bart-large-xsum-samsum`
  This model was obtained by fine-tuning `facebook/bart-large-xsum` on [Samsum](https://huggingface.co/datasets/samsum) dataset.
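
Note on the `verifyToken` values added in this diff: each one has the three dot-separated base64url segments of a compact JWT. The sketch below, which assumes that reading is correct, decodes the header and payload of the ROUGE-1 token using only the Python standard library. It does not check the signature, since that would need the evaluator's public key, which is not part of this PR. The helper names `b64url_decode` and `decode_verify_token` are hypothetical, chosen for illustration only.

```python
import base64
import json
from typing import Tuple

# ROUGE-1 verifyToken copied verbatim from the diff above.
ROUGE1_TOKEN = (
    "eyJhbGciOiJFZERTQSIsInR5cCI6IkpXVCJ9."
    "eyJoYXNoIjoiZjFlZjk0MTQxMDk5ODVlNzA4MjYyNjJiMzlkOGI5MjU0MzM1ZDgxMWFlY2YyODk1Y2QxNDk2ZGZkMmU0YmYyNyIsInZlcnNpb24iOjF9."
    "-ZraFEyEy1BY0h3frazROc1W6DmPtkb0Rvvs_A7KeWUQQlwd4felknl2dLGS3N6K-SZ89yGd6V9QJhAGeUCNDg"
)


def b64url_decode(segment: str) -> bytes:
    """Decode a base64url segment, restoring the '=' padding that JWTs omit."""
    padding = "=" * (-len(segment) % 4)
    return base64.urlsafe_b64decode(segment + padding)


def decode_verify_token(token: str) -> Tuple[dict, dict, bytes]:
    """Split a compact JWT-style token and decode it WITHOUT verifying it.

    Returns (header, payload, raw_signature). The signature is returned
    as opaque bytes; checking it is out of scope for this sketch.
    """
    header_b64, payload_b64, signature_b64 = token.split(".")
    header = json.loads(b64url_decode(header_b64))
    payload = json.loads(b64url_decode(payload_b64))
    signature = b64url_decode(signature_b64)
    return header, payload, signature


if __name__ == "__main__":
    header, payload, _signature = decode_verify_token(ROUGE1_TOKEN)
    print(header)   # {'alg': 'EdDSA', 'typ': 'JWT'}
    print(payload)  # a dict with a 'hash' string and 'version': 1
```

The decoded payload carries only a hash and a version number, so the token appears to attest to the metric entry rather than restate its value. How that hash is computed is not stated in this PR.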