Update README.md
README.md
@@ -15,6 +15,8 @@ base_model:
 
 # AstroSage-Llama-3.1-8B
 
+https://arxiv.org/abs/2411.09012
+
 AstroSage-Llama-3.1-8B is a domain-specialized natural-language AI assistant tailored for research in astronomy, astrophysics, and cosmology. Trained on the complete collection of astronomy-related arXiv papers from 2007-2024 along with millions of synthetically-generated question-answer pairs and other astronomical literature, AstroSage-Llama-3.1-8B demonstrates excellent proficiency on a wide range of questions. This achievement demonstrates the potential of domain specialization in AI, suggesting that focused training can yield capabilities exceeding those of much larger, general-purpose models.
 
 ## Model Details
@@ -133,4 +135,15 @@ While this model is designed for scientific use:
 
 - Corresponding author: Tijmen de Haan (tijmen dot dehaan at gmail dot com)
 - AstroMLab: astromachinelearninglab at gmail dot com
-- Please cite the AstroMLab 3 paper when referencing this model
+- Please cite the AstroMLab 3 paper when referencing this model
+
+```
+@preprint{dehaan2024astromlab3,
+ title={AstroMLab 3: Achieving GPT-4o Level Performance in Astronomy with a Specialized 8B-Parameter Large Language Model},
+ author={Tijmen de Haan and Yuan-Sen Ting and Tirthankar Ghosal and Tuan Dung Nguyen and Alberto Accomazzi and Azton Wells and Nesar Ramachandra and Rui Pan and Zechang Sun},
+ year={2024},
+ eprint={2411.09012},
+ archivePrefix={arXiv},
+ primaryClass={astro-ph.IM},
+ url={https://arxiv.org/abs/2411.09012},
+}