tingyuansen committed
Commit: ef0f324 · Parent(s): 13eec42
Update README.md
README.md CHANGED
@@ -35,7 +35,7 @@ AstroLLaMA-2-70B-Chat_AIC is a specialized chat model for astronomy, developed b
 - Warmup ratio: 0.03
 - Cosine decay schedule for learning rate reduction
 - **Primary Use**: Instruction-following and chat-based interactions for astronomy-related queries
-- **Reference**: Pan et al. 2024
+- **Reference**: [Pan et al. 2024](https://arxiv.org/abs/2409.19750)

 ## Using the model for chat

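For context on the two scheduler-related bullets in the hunk above (warmup ratio 0.03, cosine decay), here is a minimal sketch of how that combination is commonly expressed with the Hugging Face `transformers` `TrainingArguments` API. This is illustrative only and not the authors' training script; every value other than `warmup_ratio=0.03` and `lr_scheduler_type="cosine"` is a placeholder assumption.

```python
# Minimal sketch: 3% linear warmup followed by cosine decay of the learning
# rate, as typically configured through transformers' TrainingArguments.
# NOT the authors' setup; all values except warmup_ratio and lr_scheduler_type
# are placeholder assumptions.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="./astrollama-2-70b-chat-sft",  # placeholder path
    learning_rate=2e-5,                        # placeholder peak LR (not stated in this hunk)
    warmup_ratio=0.03,                         # warm up over the first 3% of training steps
    lr_scheduler_type="cosine",                # cosine decay after warmup
    num_train_epochs=1,                        # placeholder
    per_device_train_batch_size=1,             # placeholder
    bf16=True,                                 # placeholder precision choice
)
```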
@@ -84,6 +84,7 @@ While the AstroLLaMA-2-70B-Base_AIC model demonstrated significant improvements

 | Model | Score (%) |
 |-------|-----------|
+| **AstroSage-LLaMA-3.1-8B (AstroMLab)** | **80.9** |
 | **<span style="color:green">AstroLLaMA-2-70B-Base (AstroMLab)</span>** | **<span style="color:green">76.0</span>** |
 | LLaMA-3.1-8B | 73.7 |
 | LLaMA-2-70B | 70.7 |
@@ -105,7 +106,7 @@ These limitations underscore the challenges in developing specialized chat model

 This model is released primarily for reproducibility purposes, allowing researchers to track the development process and compare different iterations of AstroLLaMA models.

-For optimal performance and the most up-to-date capabilities in astronomy-related tasks, we recommend using AstroSage-8B, where these limitations have been addressed through expanded training data and refined fine-tuning processes.
+For optimal performance and the most up-to-date capabilities in astronomy-related tasks, we recommend using AstroSage-LLaMA-3.1-8B, where these limitations have been addressed through expanded training data and refined fine-tuning processes.

 ## Ethical Considerations

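The "Using the model for chat" section referenced in the first hunk is unchanged by this commit. For readers landing on this diff, a rough sketch of what chat-style inference with a causal LLM like this typically looks like via `transformers` follows; the repository id, prompt, and generation settings are assumptions for illustration, so follow the model card's own instructions for the exact id and prompt format.

```python
# Hypothetical sketch of chat-style inference with Hugging Face transformers.
# The repository id, prompt, and generation settings below are assumptions for
# illustration only; see the model card's "Using the model for chat" section
# for the exact repo id and expected prompt format.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "AstroMLab/astrollama-2-70b-chat_aic"  # assumed id; verify on the Hub

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,   # a 70B model generally needs multiple GPUs
    device_map="auto",
)

prompt = "What is the Tully-Fisher relation and why is it useful?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256, do_sample=True, temperature=0.7)

# Print only the newly generated tokens, not the echoed prompt.
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```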