AstroMLab/AstroSage-8B


Tijmen2 committed 8 days ago

Commit 2773778 • 1 Parent(s): 6311dac

Update README.md

Files changed (1)

README.md +14 -1

@@ -15,6 +15,8 @@ base_model:
 # AstroSage-Llama-3.1-8B
+https://arxiv.org/abs/2411.09012
 AstroSage-Llama-3.1-8B is a domain-specialized natural-language AI assistant tailored for research in astronomy, astrophysics, and cosmology. Trained on the complete collection of astronomy-related arXiv papers from 2007-2024 along with millions of synthetically-generated question-answer pairs and other astronomical literature, AstroSage-Llama-3.1-8B demonstrates excellent proficiency on a wide range of questions. This achievement demonstrates the potential of domain specialization in AI, suggesting that focused training can yield capabilities exceeding those of much larger, general-purpose models.
 ## Model Details
@@ -133,4 +135,15 @@ While this model is designed for scientific use:
 - Corresponding author: Tijmen de Haan (tijmen dot dehaan at gmail dot com)
 - AstroMLab: astromachinelearninglab at gmail dot com
-- Please cite the AstroMLab 3 paper when referencing this model
+- Please cite the AstroMLab 3 paper when referencing this model
+```
+@preprint{dehaan2024astromlab3,
+      title={AstroMLab 3: Achieving GPT-4o Level Performance in Astronomy with a Specialized 8B-Parameter Large Language Model},
+      author={Tijmen de Haan and Yuan-Sen Ting and Tirthankar Ghosal and Tuan Dung Nguyen and Alberto Accomazzi and Azton Wells and Nesar Ramachandra and Rui Pan and Zechang Sun},
+      year={2024},
+      eprint={2411.09012},
+      archivePrefix={arXiv},
+      primaryClass={astro-ph.IM},
+      url={https://arxiv.org/abs/2411.09012},
+}