tingyuansen committed
Commit: ef0f324 · Parent(s): 13eec42
Update README.md
README.md CHANGED
@@ -35,7 +35,7 @@ AstroLLaMA-2-70B-Chat_AIC is a specialized chat model for astronomy, developed b
 - Warmup ratio: 0.03
 - Cosine decay schedule for learning rate reduction
 - **Primary Use**: Instruction-following and chat-based interactions for astronomy-related queries
-- **Reference**: Pan et al. 2024
+- **Reference**: [Pan et al. 2024](https://arxiv.org/abs/2409.19750)

 ## Using the model for chat

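For context on the two scheduler-related bullets in the hunk above (warmup ratio 0.03, cosine decay), here is a minimal sketch of how that combination is commonly expressed with the Hugging Face `transformers` `TrainingArguments` API. This is illustrative only and not the authors' training script; every value other than `warmup_ratio=0.03` and `lr_scheduler_type="cosine"` is a placeholder assumption.

```python
# Minimal sketch: 3% linear warmup followed by cosine decay of the learning
# rate, as typically configured through transformers' TrainingArguments.
# NOT the authors' setup; all values except warmup_ratio and lr_scheduler_type
# are placeholder assumptions.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="./astrollama-2-70b-chat-sft",  # placeholder path
    learning_rate=2e-5,                        # placeholder peak LR (not stated in this hunk)
    warmup_ratio=0.03,                         # warm up over the first 3% of training steps
    lr_scheduler_type="cosine",                # cosine decay after warmup
    num_train_epochs=1,                        # placeholder
    per_device_train_batch_size=1,             # placeholder
    bf16=True,                                 # placeholder precision choice
)
```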
@@ -84,6 +84,7 @@ While the AstroLLaMA-2-70B-Base_AIC model demonstrated significant improvements

 | Model | Score (%) |
 |-------|-----------|
+| **AstroSage-LLaMA-3.1-8B (AstroMLab)** | **80.9** |
 | **<span style="color:green">AstroLLaMA-2-70B-Base (AstroMLab)</span>** | **<span style="color:green">76.0</span>** |
 | LLaMA-3.1-8B | 73.7 |
 | LLaMA-2-70B | 70.7 |
@@ -105,7 +106,7 @@ These limitations underscore the challenges in developing specialized chat model

 This model is released primarily for reproducibility purposes, allowing researchers to track the development process and compare different iterations of AstroLLaMA models.

-For optimal performance and the most up-to-date capabilities in astronomy-related tasks, we recommend using AstroSage-8B, where these limitations have been addressed through expanded training data and refined fine-tuning processes.
+For optimal performance and the most up-to-date capabilities in astronomy-related tasks, we recommend using AstroSage-LLaMA-3.1-8B, where these limitations have been addressed through expanded training data and refined fine-tuning processes.

 ## Ethical Considerations

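The "Using the model for chat" section referenced in the first hunk is unchanged by this commit. For readers landing on this diff, a rough sketch of what chat-style inference with a causal LLM like this typically looks like via `transformers` follows; the repository id, prompt, and generation settings are assumptions for illustration, so follow the model card's own instructions for the exact id and prompt format.

```python
# Hypothetical sketch of chat-style inference with Hugging Face transformers.
# The repository id, prompt, and generation settings below are assumptions for
# illustration only; see the model card's "Using the model for chat" section
# for the exact repo id and expected prompt format.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "AstroMLab/astrollama-2-70b-chat_aic"  # assumed id; verify on the Hub

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,   # a 70B model generally needs multiple GPUs
    device_map="auto",
)

prompt = "What is the Tully-Fisher relation and why is it useful?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256, do_sample=True, temperature=0.7)

# Print only the newly generated tokens, not the echoed prompt.
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```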