Joseph717171/Hermes-2-Pro-Mistral-10.7B

Joseph717171 committed on Mar 31

Commit 57965de • 1 Parent(s): 31c6e64

Update README.md

Files changed (1)

README.md +3 -1

@@ -35,10 +35,12 @@ widget:
 # Credit for the model card's description goes to ddh0, mergekit, and NousResearch
 # Hermes-2-Pro-Mistral-10.7B
-This is Mistral-12.25B-Instruct-v0.2, a depth-upscaled version of [mistralai/Mistral-7B-Instruct-v0.2](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2).
+This is Mistral-12.25B-Instruct-v0.2, a depth-upscaled version of [NousResearch/Hermes-2-Pro-Mistral-7B](https://huggingface.co/NousResearch/Hermes-2-Pro-Mistral-7B).
 This model is intended to be used as a basis for further fine-tuning, or as a drop-in upgrade from the original 7 billion parameter model.
+Paper detailing how Depth-Up Scaling works:  [SOLAR 10.7B: Scaling Large Language Models with Simple yet Effective Depth Up-Scaling](https://arxiv.org/abs/2312.15166)
 This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).
 ## Merge Details
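
For readers unfamiliar with the depth up-scaling that this change links to, the layer arithmetic described in the SOLAR paper can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses the paper's configuration (a 32-layer base model with 8 layers trimmed from each copy at the seam), the function name `depth_upscale_layers` is hypothetical, and the layer ranges actually used for this merge are not shown in this diff.

```python
def depth_upscale_layers(n_layers: int = 32, m_drop: int = 8) -> list[int]:
    """Return base-model layer indices in the order they appear after up-scaling.

    Assumption: follows the scheme from the SOLAR paper, where the base model
    is duplicated, the last `m_drop` layers are removed from the first copy,
    the first `m_drop` layers are removed from the second copy, and the two
    trimmed copies are concatenated.
    """
    if not 0 <= m_drop < n_layers:
        raise ValueError("m_drop must satisfy 0 <= m_drop < n_layers")
    first_copy = list(range(0, n_layers - m_drop))   # keep layers 0 .. n-m-1
    second_copy = list(range(m_drop, n_layers))      # keep layers m .. n-1
    return first_copy + second_copy


layers = depth_upscale_layers()
print(len(layers))  # total layer count of the up-scaled stack: 2 * (n - m)
```

With the default values the result has 2 * (32 - 8) = 48 layers, and the middle layers of the base model (8 through 23) appear twice in the up-scaled stack.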