haining committed on
Commit fdda21e
1 Parent(s): de3dcbc

Update README.md

Files changed (1)
  1. README.md +16 -6
README.md CHANGED
@@ -101,14 +101,14 @@ For SAS-baseline, we finetuned the Flan-T5 model with the Scientific Abstract-Significance
 
  | Scientific Abstract-Significance | # Training/Dev/Test Samples | # Training Tokens | # Validation Tokens | # Test Tokens | Automated Readability Index (std.) |
  |----------------------------------|-----------------------------|-------------------|---------------------|---------------|------------------------------------|
- | Abstract | 3030/200/200 | 707071 | 45697 | 46985 | 18.68 (2.85) |
- | Significance | 3030/200/200 | 375433 | 24901 | 24426 | 17.89 (3.05) |
 
 ## Setup
 
- We finetuned the base model with a standard language modeling objective: the abstracts are sources and the significance statements are targets. We prepended a task-specific prefix ("summarize, simplify, and contextualize: ") to each source during training. Training took roughly 9 hours on two Nvidia A5000 GPUs (24 GB memory each). We saved the checkpoint with the lowest validation loss for inference. We used the AdamW optimizer and a learning rate of 3e-5 with a fully sharded data parallel strategy. The model (\~780M parameters) was trained on Nov. 20, 2022.
  Notice that the Automated Readability Index of the significance statements is generally lower than that of the abstracts (i.e., they are somewhat easier to read), but not by a large margin. Our upcoming SAS-full model will leverage more corpora for scientific (re)contextualization, summarization, and simplification.
 
@@ -130,17 +130,27 @@ Implementations of sacreBLEU, BERT Score, ROUGE, METEOR, and SARI are from Hugging Face
 
 ## Results
 
- TODO.
 
 
 # Contact
 
- The project is under active maintenance. Please [contact us](mailto:hw56@indiana.edu) for any questions or suggestions.
 
 
 # Disclaimer
 
- The model (SAS-baseline) is created for and focused on making scientific abstracts more accessible. It should not be used or trusted outside of its scope. There is **NO** guarantee that the generated text is perfectly aligned with the research. Resort to human experts or original papers when a decision is critical.
 
 
 # Acknowledgement
 
  | Scientific Abstract-Significance | # Training/Dev/Test Samples | # Training Tokens | # Validation Tokens | # Test Tokens | Automated Readability Index (std.) |
  |----------------------------------|-----------------------------|-------------------|---------------------|---------------|------------------------------------|
+ | Abstract | 3030/200/200 | 707,071 | 45,697 | 46,985 | 18.68 (2.85) |
+ | Significance | 3030/200/200 | 375,433 | 24,901 | 24,426 | 17.89 (3.05) |
 
 ## Setup
 
+ We finetuned the base model with a standard language modeling objective: the abstracts are sources and the significance statements are targets. We prepended a task-specific prefix ("summarize, simplify, and contextualize: ") to each source during training. Training took roughly 9 hours on two NVIDIA RTX A5000 GPUs (24 GB memory each). We saved the checkpoint with the lowest validation loss for inference. We used the AdamW optimizer and a learning rate of 3e-5 with a fully sharded data parallel strategy. The model (\~780M parameters) was trained on Nov. 20, 2022.
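As a concrete illustration, here is a minimal sketch of the prefix-based finetuning and inference flow using the `transformers` API. It is illustrative only: `google/flan-t5-large` (\~780M parameters) is our assumption for the unnamed base model, the texts are placeholders, and the fully sharded data parallel setup and best-checkpoint selection are omitted.

```python
# Minimal sketch, not the exact training script: "google/flan-t5-large"
# (~780M parameters) is assumed for the base model; texts are placeholders;
# FSDP and best-checkpoint selection are omitted.
import torch
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

PREFIX = "summarize, simplify, and contextualize: "

tokenizer = AutoTokenizer.from_pretrained("google/flan-t5-large")
model = AutoModelForSeq2SeqLM.from_pretrained("google/flan-t5-large")
optimizer = torch.optim.AdamW(model.parameters(), lr=3e-5)

abstract = "We study ..."             # placeholder source (abstract)
significance = "This work shows ..."  # placeholder target (significance statement)

# Standard seq2seq language-modeling objective: prefixed abstract -> significance.
inputs = tokenizer(PREFIX + abstract, return_tensors="pt", truncation=True)
labels = tokenizer(text_target=significance, return_tensors="pt",
                   truncation=True).input_ids

loss = model(**inputs, labels=labels).loss  # cross-entropy over target tokens
loss.backward()
optimizer.step()

# At inference time, the same prefix is prepended before generation.
generated = model.generate(**tokenizer(PREFIX + abstract, return_tensors="pt"),
                           max_new_tokens=256)
print(tokenizer.decode(generated[0], skip_special_tokens=True))
```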
  Notice that the Automated Readability Index of the significance statements is generally lower than that of the abstracts (i.e., they are somewhat easier to read), but not by a large margin. Our upcoming SAS-full model will leverage more corpora for scientific (re)contextualization, summarization, and simplification.
 
 
 ## Results
 
+ | Metrics | SAS-baseline |
+ |----------------|--------------|
+ | sacreBLEU↑ | 20.97 |
+ | BERT Score F1↑ | 0.89 |
+ | ROUGE-1↑ | 0.48 |
+ | ROUGE-2↑ | 0.23 |
+ | ROUGE-L↑ | 0.32 |
+ | METEOR↑ | 0.39 |
+ | SARI↑ | 46.83 |
+ | ARI↓* | 17.12 (1.97) |
 
+ *Note: Half of the generated texts are too short (fewer than 100 words) to calculate a meaningful ARI. We therefore concatenated adjacent pairs of texts and computed ARI over the resulting 100 texts (instead of the original 200).
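For reference, below is a minimal sketch of how these metrics and the pairwise-concatenation ARI can be computed with the Hugging Face `evaluate` package. The prediction/reference/source lists are placeholders, and the punctuation-based sentence splitter in `ari()` is an assumption of the sketch, not the exact implementation used.

```python
# Minimal sketch; preds/refs/srcs are placeholder lists of strings.
import evaluate

preds = ["..."]  # generated significance statements (200 texts)
refs = ["..."]   # gold significance statements, aligned with preds
srcs = ["..."]   # source abstracts (SARI compares against them)

bleu = evaluate.load("sacrebleu").compute(predictions=preds, references=[[r] for r in refs])
rouge = evaluate.load("rouge").compute(predictions=preds, references=refs)
meteor = evaluate.load("meteor").compute(predictions=preds, references=refs)
bert = evaluate.load("bertscore").compute(predictions=preds, references=refs, lang="en")
sari = evaluate.load("sari").compute(sources=srcs, predictions=preds, references=[[r] for r in refs])

# Automated Readability Index via the standard formula; the naive sentence
# count below is an assumption of this sketch.
def ari(text: str) -> float:
    words = text.split()
    chars = sum(len(w) for w in words)
    sentences = max(1, sum(text.count(p) for p in ".!?"))
    return 4.71 * chars / len(words) + 0.5 * len(words) / sentences - 21.43

# Concatenate adjacent pairs (200 -> 100 texts) before scoring, as noted above.
paired = [a + " " + b for a, b in zip(preds[::2], preds[1::2])]
ari_scores = [ari(t) for t in paired]
```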
 
 
 # Contact
 
+ Please [contact us](mailto:hw56@indiana.edu) for any questions or suggestions.
 
 
 # Disclaimer
 
+ The model (SAS-baseline) is created for making scientific abstracts more accessible. Its outputs should not be used or trusted outside of this scope. There is **NO** guarantee that the generated text is perfectly aligned with the research. Consult human experts or the original papers when a decision is critical.
 
 
 # Acknowledgement