Tijmen2 committed on
Commit 2773778
1 Parent(s): 6311dac

Update README.md

Files changed (1)
  1. README.md +14 -1
README.md CHANGED
@@ -15,6 +15,8 @@ base_model:

  # AstroSage-Llama-3.1-8B

+ https://arxiv.org/abs/2411.09012
+
  AstroSage-Llama-3.1-8B is a domain-specialized natural-language AI assistant tailored for research in astronomy, astrophysics, and cosmology. Trained on the complete collection of astronomy-related arXiv papers from 2007-2024 along with millions of synthetically-generated question-answer pairs and other astronomical literature, AstroSage-Llama-3.1-8B demonstrates excellent proficiency on a wide range of questions. This achievement demonstrates the potential of domain specialization in AI, suggesting that focused training can yield capabilities exceeding those of much larger, general-purpose models.

  ## Model Details
@@ -133,4 +135,15 @@ While this model is designed for scientific use:

  - Corresponding author: Tijmen de Haan (tijmen dot dehaan at gmail dot com)
  - AstroMLab: astromachinelearninglab at gmail dot com
- - Please cite the AstroMLab 3 paper when referencing this model
+ - Please cite the AstroMLab 3 paper when referencing this model
+
+ ```
+ @preprint{dehaan2024astromlab3,
+ title={AstroMLab 3: Achieving GPT-4o Level Performance in Astronomy with a Specialized 8B-Parameter Large Language Model},
+ author={Tijmen de Haan and Yuan-Sen Ting and Tirthankar Ghosal and Tuan Dung Nguyen and Alberto Accomazzi and Azton Wells and Nesar Ramachandra and Rui Pan and Zechang Sun},
+ year={2024},
+ eprint={2411.09012},
+ archivePrefix={arXiv},
+ primaryClass={astro-ph.IM},
+ url={https://arxiv.org/abs/2411.09012},
+ }