Commit a0f6d39 by haining (1 parent: bcdb246)

Update README.md

Files changed (1)
  1. README.md +15 -13
README.md CHANGED
@@ -4,7 +4,7 @@ inference:
  do_sample: true
  max_length: 512
  top_p: 0.9
- repetition_penalty: 1.2
+ repetition_penalty: 1.0
  language:
  - en
  license: mit
@@ -143,19 +143,21 @@ Implementations of SacreBLEU, BERT Score, ROUGE, METEOR, and SARI are from Hugg
 
  ## Results
 
- TBA.
+ We tested our model on the SAS test set (200 samples). We generated 10 lay summaries from each sample's abstract, using top-p sampling with $p=0.9$.
+ The mean performance is reported below.
+
 
- <!-- | Metrics | SAS-baseline |
+ | Metrics | SAS |
  |----------------|-------------------|
- | SacreBLEU↑ | 20.97 |
- | BERT Score F1↑ | 0.89 |
- | ROUGE-1↑ | 0.48 |
- | ROUGE-2↑ | 0.23 |
- | ROUGE-L↑ | 0.32 |
- | METEOR↑ | 0.39 |
- | SARI↑ | 46.83 |
- | ARI↓* | 17.12 (std. 1.97) |
- * Note: Half of the generated texts are too short (less than 100 words) to calculate meaningful ARI. We therefore concatenated adjacent two texts and compute ARI for the 100 texts (instead of original 200 texts). -->
+ | SacreBLEU↑ | 25.60 |
+ | BERT Score F1↑ | 90.14 |
+ | ROUGE-1↑ | 52.28 |
+ | ROUGE-2↑ | 29.61 |
+ | ROUGE-L↑ | 38.02 |
+ | METEOR↑ | 43.75 |
+ | SARI↑ | 51.96 |
+ | ARI | 17.04 |
+ Note: 1. Some generated texts are too short (fewer than 100 words) to calculate a meaningful ARI. We therefore concatenated every five adjacent texts and computed ARI for the resulting 400 longer texts (instead of the original 2,000 texts). 2. BERT Score, ROUGE, and METEOR are multiplied by 100.
 
 
  # Contact
@@ -164,7 +166,7 @@ Please [contact us](mailto:hw56@indiana.edu) for any questions or suggestions.
 
  # Disclaimer
 
- The model (scientific_abstract_simplification) is created for making scientific abstracts more accessible. Its outputs should not be used or trusted outside of its scope. There is no guarantee that the generated text is perfectly aligned with the research. Resort to human experts or original papers when a decision is critical.
+ This model is created for making scientific abstracts more accessible. Its outputs should not be used or trusted outside of its scope. There is no guarantee that the generated text is perfectly aligned with the research. Resort to human experts or original papers when a decision is critical.
 
 
  # Acknowledgement
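
The note added under Results describes concatenating five adjacent generated texts before computing ARI, so that 2,000 short outputs become 400 longer ones. The commit does not show the code used for this, so the following is only a minimal sketch under stated assumptions: the standard Automated Readability Index formula, a naive regex tokenizer, and the hypothetical helper names `ari`, `concat_groups`, and `mean_grouped_ari`. The authors' actual implementation may differ.

```python
import re
from typing import List


def ari(text: str) -> float:
    """Automated Readability Index (standard formula):
    4.71 * (characters / words) + 0.5 * (words / sentences) - 21.43.

    Assumption: words and sentences are found with simple regexes,
    which is a simplification (e.g. decimals like "0.9" split a sentence).
    """
    words = re.findall(r"[A-Za-z0-9]+(?:['-][A-Za-z0-9]+)*", text)
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    if not words or not sentences:
        raise ValueError("text must contain at least one word and one sentence")
    # Count only letters and digits as characters.
    chars = sum(len(re.sub(r"[^A-Za-z0-9]", "", w)) for w in words)
    return 4.71 * chars / len(words) + 0.5 * len(words) / len(sentences) - 21.43


def concat_groups(texts: List[str], size: int = 5) -> List[str]:
    """Join every `size` adjacent texts into one longer text."""
    return [" ".join(texts[i:i + size]) for i in range(0, len(texts), size)]


def mean_grouped_ari(texts: List[str], size: int = 5) -> float:
    """Mean ARI over the concatenated groups (hypothetical aggregation)."""
    groups = concat_groups(texts, size)
    return sum(ari(g) for g in groups) / len(groups)
```

With 2,000 generated texts and `size=5`, `concat_groups` yields 400 groups, matching the count stated in the note.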