haining committed
Commit ca14f54
1 Parent(s): 5c8f6fa

Update README.md

Files changed (1)
  1. README.md +2 -2
README.md CHANGED
@@ -66,7 +66,7 @@ As an ongoing effort, we are working on re-contextualizating abstracts for bette
  - **Model type:** Language model
  - **Developed by:**
    - PIs: Jason Clark and Hannah McKelvey
-   - Fellows: Haining Wang and Deanna Zarrillo
+   - Fellow: Haining Wang
    - [LEADING](https://cci.drexel.edu/mrc/leading/) Montana State University Library, Project "TL;DR it": Automating Article Synopses for Search Engine Optimization and Citizen Science
  - **Language(s) (NLP):** English
  - **License:** MIT
@@ -120,7 +120,7 @@ For SAS-baseline, we finetuned Flan-T5 model with the Scientific Abstract-Signif
  ## Setup
 
  We finetuned the base model with a standard language modeling objective: the abstracts are sources and the significance statements are targets. We inform the model with a task-spcific prefix ("summarize, simplify, and contextualize: ") during training. The training took roughly 9 hours on two NVIDIA RTX A5000 (24GB memory each) GPUs. We saved the checkpoint with the lowest validation loss for inference. We used the AdamW optimizer and a learning rate of 3e-5 with fully sharded data parallel strategy. The model (\~780M parameter) was trained on Nov. 20, 2022.
- Notice, the readability of the signifiance statements is generally lower than the abstracts', but not by a large margin. Our incoming SAS-full model will leverage more corpora for scientific (re)contexutualization, summarization, and simplification.
+ Notice, the readability of the significance statements is generally lower than the abstracts', but not by a large margin. Our incoming SAS-full model will leverage more corpora for scientific (re)contextualization, summarization, and simplification.
 
 
  # Evaluation
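
For reference, the Setup paragraph quoted in the diff above states that the model was trained with the task prefix "summarize, simplify, and contextualize: ", so that same prefix should be prepended at inference time. Below is a minimal usage sketch with the Hugging Face `transformers` library; the checkpoint identifier and the generation settings are placeholders for illustration, not values confirmed by this commit.

```python
# Minimal inference sketch. Assumptions: the checkpoint id below is a
# placeholder, and the generation settings are illustrative; only the task
# prefix comes from the README text quoted in the diff above.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

MODEL_ID = "haining/sas_baseline"  # placeholder; substitute the actual checkpoint
PREFIX = "summarize, simplify, and contextualize: "  # prefix used during training

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_ID)

abstract = "Replace this string with a scientific abstract."
inputs = tokenizer(PREFIX + abstract, return_tensors="pt", truncation=True)

# Beam search width and output length here are example values.
output_ids = model.generate(**inputs, max_new_tokens=256, num_beams=4)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```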