haining/scientific_abstract_simplification


haining committed on Dec 14, 2022

Commit a0f6d39 (1 parent: bcdb246)

Update README.md

Files changed (1)

README.md +15 -13


@@ -4,7 +4,7 @@ inference:
     do_sample: true
     max_length: 512
     top_p: 0.9
-    repetition_penalty: 1.2
 language:
   - en
 license: mit
@@ -143,19 +143,21 @@ Implementations of SacreBLEU, BERT Score, ROUGE, METEOR, and SARI are from Hugg
 ## Results
-TBA.
-<!-- | Metrics        | SAS-baseline      |
 |----------------|-------------------|
-| SacreBLEU↑     | 20.97             |
-| BERT Score F1↑ | 0.89              |
-| ROUGE-1↑       | 0.48              |
-| ROUGE-2↑       | 0.23              |
-| ROUGE-L↑       | 0.32              |
-| METEOR↑        | 0.39              |
-| SARI↑          | 46.83             |
-| ARI↓*          | 17.12 (std. 1.97) |
-* Note: Half of the generated texts are too short (fewer than 100 words) to calculate a meaningful ARI. We therefore concatenated each pair of adjacent texts and computed ARI on the resulting 100 texts (instead of the original 200 texts). -->
 # Contact
@@ -164,7 +166,7 @@ Please [contact us](mailto:hw56@indiana.edu) for any questions or suggestions.
 # Disclaimer
-The model (scientific_abstract_simplification) is created for making scientific abstracts more accessible. Its outputs should not be used or trusted outside of its scope. There is no guarantee that the generated text is perfectly aligned with the research. Resort to human experts or original papers when a decision is critical.
 # Acknowledgement

     do_sample: true
     max_length: 512
     top_p: 0.9
+    repetition_penalty: 1.0
 language:
   - en
 license: mit
 ## Results
+We tested our model on the SAS test set (200 samples). We generated 10 lay summaries from each sample's abstract, using top-p sampling with $p=0.9$.
+The mean performance is reported below.
+| Metrics        | SAS               |
 |----------------|-------------------|
+| SacreBLEU↑     | 25.60             |
+| BERT Score F1↑ | 90.14             |
+| ROUGE-1↑       | 52.28             |
+| ROUGE-2↑       | 29.61             |
+| ROUGE-L↑       | 38.02             |
+| METEOR↑        | 43.75             |
+| SARI↑          | 51.96             |
+| ARI↓           | 17.04             |
+Note: 1. Some generated texts are too short (fewer than 100 words) to calculate a meaningful ARI. We therefore concatenated every five adjacent texts and computed ARI on the resulting 400 longer texts (instead of the original 2,000 texts). 2. BERT Score, ROUGE, and METEOR scores are multiplied by 100.
 # Contact
 # Disclaimer
+This model is created for making scientific abstracts more accessible. Its outputs should not be used or trusted outside of its scope. There is no guarantee that the generated text is perfectly aligned with the research. Resort to human experts or original papers when a decision is critical.
 # Acknowledgement
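
The ARI note in the updated Results section describes concatenating every five adjacent generated texts before scoring, which turns 2,000 short outputs into 400 longer ones. A minimal sketch of that grouping step is below. It assumes the standard Automated Readability Index formula and a naive regex tokenizer; the commit does not show the authors' actual ARI implementation, and the function names here are hypothetical.

```python
import re


def automated_readability_index(text: str) -> float:
    """Standard ARI: 4.71 * chars/words + 0.5 * words/sentences - 21.43.

    Tokenization is a simplifying assumption: words are alphanumeric runs
    and sentences are split on '.', '!' or '?', so decimals such as "0.9"
    would be counted as a sentence break.
    """
    words = re.findall(r"[A-Za-z0-9]+", text)
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    if not words or not sentences:
        raise ValueError("text must contain at least one word and one sentence")
    chars = sum(len(w) for w in words)
    return 4.71 * chars / len(words) + 0.5 * len(words) / len(sentences) - 21.43


def concatenate_adjacent(texts: list[str], group_size: int = 5) -> list[str]:
    """Join every `group_size` adjacent texts into one longer text."""
    return [
        " ".join(texts[i:i + group_size])
        for i in range(0, len(texts), group_size)
    ]


def mean_ari(texts: list[str], group_size: int = 5) -> float:
    """Average ARI over the concatenated groups rather than the raw texts."""
    groups = concatenate_adjacent(texts, group_size)
    return sum(automated_readability_index(g) for g in groups) / len(groups)
```

With `group_size=5`, a list of 2,000 generated summaries yields 400 groups, matching the count stated in the note. Grouping adjacent outputs keeps the 10 summaries generated from the same abstract together, so each group holds summaries of a single abstract, provided the list is ordered by source abstract (an assumption about the ordering, not something the commit states).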