cosmosage

Cosmosage is a natural-language assistant that answers questions about cosmology.

cosmosage_v0.2 is a fine-tune of Mistral-7B-v0.1 on cosmology-related datasets, including open-access textbooks and scientific publications. It is intended for Q&A use, where the model gives a single answer in response to a single question.

See https://github.com/tijmen/cosmosage for more details.

Usage

One convenient way to use this model is with text-generation-webui:

https://github.com/oobabooga/text-generation-webui

  • Under the Model tab, download a custom model by entering "Tijmen2/cosmosage_v0.2".
  • When the download is complete, click the refresh icon next to Model.
  • In the Model dropdown, choose cosmosage_v0.2 and load it.
  • Head over to the Notebook tab and ask away!
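
For scripted use outside the web UI, the model can presumably also be loaded with the Hugging Face transformers library. The following is a minimal sketch under assumptions that are not part of the original card: `transformers`, `torch`, and `accelerate` are installed, the repository loads through `AutoModelForCausalLM` (as expected for a Mistral-7B fine-tune), and there is enough memory for roughly 14 GB of FP16 weights. The helper names `truncate_at_next_turn` and `generate_answer`, the stop markers, and the token limit are illustrative choices.

```python
def truncate_at_next_turn(text, stop_markers=("\nUSER:", "\n##")):
    """Keep only the first answer from a raw completion.

    A completion model may keep writing further "USER:" turns or "##"
    headings after its answer; cutting at the first such marker is an
    assumption of this sketch, not documented model behaviour.
    """
    cut = len(text)
    for marker in stop_markers:
        idx = text.find(marker)
        if idx != -1:
            cut = min(cut, idx)
    return text[:cut].strip()


def generate_answer(prompt, model_id="Tijmen2/cosmosage_v0.2", max_new_tokens=256):
    """Load the model and return its completion of `prompt`."""
    # Imported lazily so truncate_at_next_turn works without these packages.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        torch_dtype=torch.float16,  # the published weights are FP16
        device_map="auto",          # needs the `accelerate` package
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    with torch.no_grad():
        output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Decode only the newly generated tokens, not the echoed prompt.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    completion = tokenizer.decode(new_tokens, skip_special_tokens=True)
    return truncate_at_next_turn(completion)
```

Calling `generate_answer("USER: What is the ISW effect?\nASSISTANT:")` would download the weights on first use and return the model's answer as a string.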

Prompt Style

Feel free to experiment with different prompts. I've had decent luck with the two-shot prompt below, which leans on any instruction-following capabilities the model may have picked up and includes two example responses I wrote by hand.

SYSTEM: You are an expert cosmology assistant, able to answer questions on the cosmic microwave background, galaxy formation, large scale structure, theoretical cosmology, inflation, big bang nucleosynthesis, cosmology instrumentation, and other related topics. Please assume the user is fluent in scientific terminology and do not insult his or her intelligence. Elaborate where possible to give a complete answer. If you do not know, say you do not know. You will get two examples, then be expected to answer one question to the best of your ability.

## EXAMPLE 1

USER: What is the ISW effect?
ASSISTANT: The Integrated Sachs-Wolfe effect describes the fact that cosmic microwave background photons gain energy as they fall into potential wells and lose it again as they climb back out. The origin of the effect is gravitational redshift. For static potential wells, there is no net effect. However, if the potentials grow or decay, there is a net effect, known as the Integrated Sachs-Wolfe effect. In a matter-dominated universe, linear potentials stay constant. The late-time ISW effect refers to the fact that at late times, dark energy causes a deviation from matter domination and linear potentials decay. This causes an effect in the cosmic microwave background which has been measured in cross-correlation with large-scale structure. The early-time ISW refers to the fact that right after recombination, radiation still contributed a non-negligible amount of the energy density of the universe, causing potentials to grow. This causes the photons to gain energy. The early-time ISW has not been detected experimentally.

## EXAMPLE 2
USER: What are the definitions of galaxy cluster mass (M200m, M500c, etc.)?
ASSISTANT: Calculating a galaxy cluster's mass is like calculating the population of a sprawling city; as you get further out from city center, the suburbs never end. Eventually you have to stop counting and just say that from there on the people belong to a different town.

In cluster masses, we integrate from the cluster center out to some radius rΔ where Δ might be 200, 500, or some other number. The definition of this radius is that the density of the enclosed mass is Δ times the average density of the universe at that redshift. The associated mass is MΔ.

Note that 'average density' can be defined in two ways. You can either include the energy density of dark energy - in which case you are comparing to the critical density at that redshift - or you can use the matter density only. This choice results in the c or m subscript, which stands for critical or mean. Note that the critical density is higher than the mean density, so it gives an earlier cutoff in the integral. Therefore 'MΔc' masses are smaller than 'MΔm' ones.

## QUESTION
USER: <<<INSERT YOUR QUESTION>>>
ASSISTANT:
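
To ask many questions with this layout, the prompt can be assembled programmatically. The sketch below is one possible way to do it, with a hypothetical helper `build_prompt` that is not part of the cosmosage repository. It reproduces the SYSTEM / EXAMPLE / QUESTION structure shown above; the exact blank-line placement is an approximation of the hand-written prompt.

```python
from typing import List, Tuple


def build_prompt(system: str, examples: List[Tuple[str, str]], question: str) -> str:
    """Assemble a few-shot prompt in the SYSTEM / EXAMPLE / QUESTION layout.

    `examples` is a list of (question, answer) pairs written by hand.
    The prompt ends with a bare "ASSISTANT:" so the model continues
    with its own answer.
    """
    parts = [f"SYSTEM: {system}", ""]
    for i, (example_q, example_a) in enumerate(examples, start=1):
        parts += [f"## EXAMPLE {i}", f"USER: {example_q}", f"ASSISTANT: {example_a}", ""]
    parts += ["## QUESTION", f"USER: {question}", "ASSISTANT:"]
    return "\n".join(parts)


# Placeholder strings only; substitute the full system text and the two
# hand-written example answers from the prompt above.
prompt = build_prompt(
    system="You are an expert cosmology assistant.",
    examples=[
        ("What is the ISW effect?", "<example answer 1>"),
        ("What are the definitions of galaxy cluster mass?", "<example answer 2>"),
    ],
    question="What sets the angular scale of the first acoustic peak?",
)
```

The resulting string can be pasted into the Notebook tab or passed to any text-completion interface.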

Qualitative evaluation

cosmosage_v0.2 performs much better than cosmosage_v0.1. While v0.1 did not seem to have picked up much knowledge from the arXiv papers it was trained on, v0.2 can give surprisingly good answers to highly technical questions about cosmology. Some of its answers contain information it could not have known without having read these recent papers, which leads me to conclude that it has picked up some knowledge from them.

I've also been impressed by cosmosage's knowledge about astronomy, as well as other branches of physics. However, in these areas it is less clear how much the performance is due to the pretraining of the Mistral model versus the fine-tuning I did.

Currently, the main weakness of cosmosage_v0.2 is its reliability. In many of its answers it confidently makes incorrect statements that appear to be completely made up. Its outputs should therefore not be trusted, but rather treated as potential explanations, food for thought, or inspiration.

Model size: 7.24B params (Safetensors). Tensor type: FP16.