Joseph717171 committed on
Commit 57965de
1 Parent(s): 31c6e64

Update README.md

Files changed (1)
  1. README.md +3 -1
README.md CHANGED
@@ -35,10 +35,12 @@ widget:
  # Credit for the model card's description goes to ddh0, mergekit, and NousResearch
  # Hermes-2-Pro-Mistral-10.7B

- This is Mistral-12.25B-Instruct-v0.2, a depth-upscaled version of [mistralai/Mistral-7B-Instruct-v0.2](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.2).
+ This is Mistral-12.25B-Instruct-v0.2, a depth-upscaled version of [NousResearch/Hermes-2-Pro-Mistral-7B](https://huggingface.co/NousResearch/Hermes-2-Pro-Mistral-7B).

  This model is intended to be used as a basis for further fine-tuning, or as a drop-in upgrade from the original 7 billion parameter model.

+ Paper detailing how Depth-Up Scaling works: [SOLAR 10.7B: Scaling Large Language Models with Simple yet Effective Depth Up-Scaling](https://arxiv.org/abs/2312.15166)
+
  This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).

  ## Merge Details