ashishdatta committed on
Commit 38e8060
1 Parent(s): f16df1b
Files changed (1): README.md (+120, -2)
README.md CHANGED
@@ -1,5 +1,123 @@
  ---
  license: other
- license_name: stablelm-nc-community
- license_link: LICENSE
  ---
  ---
+ language:
+ - en
  license: other
+ tags:
+ - causal-lm
+ datasets:
+ - HuggingFaceH4/ultrachat_200k
+ - allenai/ultrafeedback_binarized_cleaned
+ - meta-math/MetaMathQA
+ - WizardLM/WizardLM_evol_instruct_V2_196k
+ - openchat/openchat_sharegpt4_dataset
+ - LDJnr/Capybara
+ - Intel/orca_dpo_pairs
+ - hkust-nlp/deita-10k-v0
+ - Anthropic/hh-rlhf
+ - glaiveai/glaive-function-calling-v2
+ extra_gated_fields:
+   Name: text
+   Email: text
+   Country: text
+   Organization or Affiliation: text
+   I ALLOW Stability AI to email me about new model releases: checkbox
  ---
+ # `StableLM 2 12B Chat`
+
+ ## Model Description
+
+ `StableLM 2 12B Chat` is a 12 billion parameter instruction-tuned language model trained on a mix of publicly available datasets and synthetic datasets, utilizing [Direct Preference Optimization (DPO)](https://arxiv.org/abs/2305.18290).
+ The GGUF files in this repository were generated with the llama.cpp [b2684](https://github.com/ggerganov/llama.cpp/releases/tag/b2684) release.
+
+ ## Usage
+
+ `StableLM 2 12B Chat` uses the ChatML instruction format shown below. With [llama.cpp](https://github.com/ggerganov/llama.cpp) you can prompt the model directly:
+
+ ```bash
+ ./main -m stablelm-2-12b-q4_k_m.gguf -p "Implement snake game using pygame"
+ ```
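+
+ For reference, this is the ChatML layout the chat tuning expects; a minimal sketch, where the system message is illustrative rather than mandated by the model card:
+
+ ```
+ <|im_start|>system
+ You are a helpful assistant.<|im_end|>
+ <|im_start|>user
+ Implement snake game using pygame<|im_end|>
+ <|im_start|>assistant
+ ```
+
+ The same GGUF files also work with the [llama-cpp-python](https://github.com/abetlen/llama-cpp-python) bindings, which can apply the ChatML template for you. A minimal sketch, assuming the quantized file above sits in the working directory:
+
+ ```python
+ from llama_cpp import Llama
+
+ # chat_format="chatml" makes the bindings wrap messages in <|im_start|>/<|im_end|>.
+ llm = Llama(model_path="stablelm-2-12b-q4_k_m.gguf", chat_format="chatml")
+ out = llm.create_chat_completion(
+     messages=[{"role": "user", "content": "Implement snake game using pygame"}]
+ )
+ print(out["choices"][0]["message"]["content"])
+ ```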
+
+ ## Model Details
+
+ * **Developed by**: [Stability AI](https://stability.ai/)
+ * **Model type**: `StableLM 2 12B Chat` is an auto-regressive language model based on the transformer decoder architecture.
+ * **Language(s)**: English
+ * **Paper**: [Stable LM 2 Chat Technical Report](https://arxiv.org/abs/2402.17834)
+ * **Library**: [Alignment Handbook](https://github.com/huggingface/alignment-handbook.git)
+ * **Finetuned from model**: [stabilityai/stablelm-2-12b](https://huggingface.co/stabilityai/stablelm-2-12b)
+ * **License**: [StabilityAI Non-Commercial Research Community License](https://huggingface.co/stabilityai/stablelm-2-zephyr-1_6b/blob/main/LICENSE). If you want to use this model for your commercial products or purposes, please contact us [here](https://stability.ai/contact) to learn more.
+ * **Contact**: For questions and comments about the model, please email `lm@stability.ai`.
+
+ ### Training Dataset
+
+ The training data comprises a mixture of large-scale open datasets available on the [HuggingFace Hub](https://huggingface.co/datasets) as well as an internal safety dataset:
+ 1. SFT Datasets:
+ - HuggingFaceH4/ultrachat_200k
+ - meta-math/MetaMathQA
+ - WizardLM/WizardLM_evol_instruct_V2_196k
+ - Open-Orca/SlimOrca
+ - openchat/openchat_sharegpt4_dataset
+ - LDJnr/Capybara
+ - hkust-nlp/deita-10k-v0
+ - teknium/OpenHermes-2.5
+ - glaiveai/glaive-function-calling-v2
+
+ 2. Safety Datasets:
+ - Anthropic/hh-rlhf
+ - Internal Safety Dataset
+
+ 3. Preference Datasets:
+ - argilla/dpo-mix-7k
+
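+ The public mixes above can be pulled with the 🤗 `datasets` library if you want to inspect the tuning data. A minimal sketch; the `train_sft` split and `messages` field names are assumptions based on the ultrachat_200k dataset card:
+
+ ```python
+ from datasets import load_dataset
+
+ # Load one of the SFT mixes listed above and peek at the first conversation.
+ ds = load_dataset("HuggingFaceH4/ultrachat_200k", split="train_sft")
+ print(ds[0]["messages"][:2])
+ ```
+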
+ ## Performance
+
+ ### MT-Bench
+
+ | Model | Parameters | MT Bench (Inflection-corrected) |
+ |---------------------------------------|------------|---------------------------------|
+ | mistralai/Mixtral-8x7B-Instruct-v0.1 | 13B/47B | 8.48 ± 0.06 |
+ | stabilityai/stablelm-2-12b-chat | 12B | 8.15 ± 0.08 |
+ | Qwen/Qwen1.5-14B-Chat | 14B | 7.95 ± 0.10 |
+ | HuggingFaceH4/zephyr-7b-gemma-v0.1 | 8.5B | 7.82 ± 0.03 |
+ | mistralai/Mistral-7B-Instruct-v0.2 | 7B | 7.48 ± 0.02 |
+ | meta-llama/Llama-2-70b-chat-hf | 70B | 7.29 ± 0.05 |
+
+ ### OpenLLM Leaderboard
+
+ | Model | Parameters | Average | ARC Challenge (25-shot) | HellaSwag (10-shot) | MMLU (5-shot) | TruthfulQA (0-shot) | Winogrande (5-shot) | GSM8K (5-shot) |
+ | -------------------------------------- | ---------- | ------- | ----------------------- | ------------------- | ------------- | ------------------- | ------------------- | -------------- |
+ | mistralai/Mixtral-8x7B-Instruct-v0.1 | 13B/47B | 72.71 | 70.14 | 87.55 | 71.40 | 64.98 | 81.06 | 61.11 |
+ | stabilityai/stablelm-2-12b-chat | 12B | 68.45 | 65.02 | 86.06 | 61.14 | 62.00 | 78.77 | 57.70 |
+ | Qwen/Qwen1.5-14B | 14B | 66.70 | 56.57 | 81.08 | 69.36 | 52.06 | 73.48 | 67.63 |
+ | mistralai/Mistral-7B-Instruct-v0.2 | 7B | 65.71 | 63.14 | 84.88 | 60.78 | 60.26 | 77.19 | 40.03 |
+ | google/gemma-7b | 8.5B | 63.75 | 61.09 | 82.20 | 64.56 | 44.79 | 79.01 | 50.87 |
+ | stabilityai/stablelm-2-12b | 12B | 63.53 | 58.45 | 84.33 | 62.09 | 48.16 | 78.10 | 56.03 |
+ | HuggingFaceH4/zephyr-7b-gemma-v0.1 | 8.5B | 62.41 | 58.45 | 83.48 | 60.68 | 52.07 | 74.19 | 45.56 |
+ | Qwen/Qwen1.5-14B-Chat | 14B | 62.37 | 58.79 | 82.33 | 68.52 | 60.38 | 73.32 | 30.86 |
+ | mistralai/Mistral-7B-v0.1 | 7B | 60.97 | 59.98 | 83.31 | 64.16 | 42.15 | 78.37 | 37.83 |
+ | meta-llama/Llama-2-13b-hf | 13B | 55.69 | 59.39 | 82.13 | 55.77 | 37.38 | 76.64 | 22.82 |
+ | meta-llama/Llama-2-13b-chat-hf | 13B | 54.92 | 59.04 | 81.94 | 54.64 | 41.12 | 74.51 | 15.24 |
+
+ ## Use and Limitations
+
+ ### Intended Use
+
+ The model is intended to be used in chat-like applications. Developers must evaluate the model for safety performance in their specific use case. Read more about [safety and limitations](#limitations-and-bias) below.
+
+ ### Limitations and Bias
+
+ We strongly recommend pairing this model with an input and output classifier to prevent harmful responses (see the sketch below).
+ Using this model will require guardrails around your inputs and outputs, since any output it returns may be a hallucination.
+ Additionally, as each use case is unique, we recommend running your own suite of tests to ensure proper performance of this model.
+ Finally, do not use the models if they are unsuitable for your application, or for any applications that may cause deliberate or unintentional harm to others.
+
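+ A minimal sketch of that classifier pattern, reusing the llama-cpp-python setup from the Usage section; `is_safe` is a hypothetical placeholder for whatever moderation model you actually deploy:
+
+ ```python
+ from llama_cpp import Llama
+
+ llm = Llama(model_path="stablelm-2-12b-q4_k_m.gguf", chat_format="chatml")
+
+ def is_safe(text: str) -> bool:
+     # Hypothetical hook: call your own input/output safety classifier here.
+     return True
+
+ def guarded_chat(user_message: str) -> str:
+     # Screen the input before it reaches the model.
+     if not is_safe(user_message):
+         return "Sorry, I can't help with that."
+     out = llm.create_chat_completion(
+         messages=[{"role": "user", "content": user_message}]
+     )
+     reply = out["choices"][0]["message"]["content"]
+     # Screen the output before it reaches the user.
+     return reply if is_safe(reply) else "Sorry, I can't help with that."
+ ```
+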
+ ## How to Cite
+
+ ```bibtex
+ @article{bellagente2024stable,
+   title={Stable LM 2 1.6B Technical Report},
+   author={Bellagente, Marco and Tow, Jonathan and Mahan, Dakota and Phung, Duy and Zhuravinskyi, Maksym and Adithyan, Reshinth and Baicoianu, James and Brooks, Ben and Cooper, Nathan and Datta, Ashish and others},
+   journal={arXiv preprint arXiv:2402.17834},
+   year={2024}
+ }
+ ```