Deci
/

Text Generation
Transformers
Safetensors
English
deci
Deci AI
DeciLM
custom_code
Eval Results
OferB commited on
Commit
255e99e
1 Parent(s): 2365f8d

Update README.md

Files changed (1)
  1. README.md +195 -0
README.md CHANGED
@@ -1,3 +1,198 @@
1
  ---
2
  license: llama2
3
  ---
1
  ---
2
  license: llama2
3
+ datasets:
4
+ - cerebras/SlimPajama-627B
5
+ language:
6
+ - en
7
+ pipeline_tag: text-generation
8
+ tags:
9
+ - Deci AI
10
+ - DeciLM
11
+ model-index:
12
+ - name: DeciLM 6B
13
+   results:
14
+   - task:
15
+       type: text-generation
16
+     dataset:
17
+       type: ai2/arc
18
+       name: ai2_arc
19
+     metrics:
20
+     - name: ARC Challenge
21
+       type: ARC Challenge
22
+       value: 42.06
23
+       verified: false
24
+   - task:
25
+       type: text-generation
26
+     dataset:
27
+       type: ai2/arc
28
+       name: ai2_arc
29
+     metrics:
30
+     - name: ARC Easy
31
+       type: ARC Easy
32
+       value: 70.02
33
+       verified: false
34
+   - task:
35
+       type: text-generation
36
+     dataset:
37
+       type: boolq
38
+       name: boolq
39
+     metrics:
40
+     - name: BoolQ
41
+       type: BoolQ
42
+       value: 71.01
43
+       verified: false
44
+   - task:
45
+       type: text-generation
46
+     dataset:
47
+       type: hellaswag
48
+       name: hellaswag
49
+     metrics:
50
+     - name: HellaSwag
51
+       type: HellaSwag
52
+       value: 74.58
53
+       verified: false
54
+   - task:
55
+       type: text-generation
56
+     dataset:
57
+       type: LAMBADA
58
+       name: OpenAI LAMBADA
59
+     metrics:
60
+     - name: LAMBADA
61
+       type: LAMBADA
62
+       value: 69.78
63
+       verified: false
64
+   - task:
65
+       type: text-generation
66
+     dataset:
67
+       type: OpenBookQA
68
+       name: openbookqa
69
+     metrics:
70
+     - name: OpenBookQA
71
+       type: OpenBookQA
72
+       value: 34
73
+       verified: false
74
+   - task:
75
+       type: text-generation
76
+     dataset:
77
+       type: PIQA
78
+       name: piqa
79
+     metrics:
80
+     - name: PIQA
81
+       type: PIQA
82
+       value: 77.09
83
+       verified: false
84
+   - task:
85
+       type: text-generation
86
+     dataset:
87
+       type: truthful_qa
88
+       name: truthful_qa
89
+     metrics:
90
+     - name: TruthfulQA
91
+       type: TruthfulQA
92
+       value: 36.19
93
+       verified: false
94
+   - task:
95
+       type: text-generation
96
+     dataset:
97
+       type: winogrande
98
+       name: winogrande
99
+     metrics:
100
+     - name: Winogrande
101
+       type: Winogrande
102
+       value: 68.03
103
+       verified: false
104
  ---
105
+ # DeciLM 6B
106
+
107
+ DeciLM 6B is a 5.7 billion parameter decoder-only text generation model. With a context window of 4096 tokens, the model uses variable Grouped-Query Attention (GQA) to balance performance and computational efficiency. The model's architecture was generated using Deci's proprietary Neural Architecture Search-based technology, AutoNAC. DeciLM 6B was trained on the SlimPajama dataset using proprietary methodologies that allow for fast training.
108
+
109
+ ## Model Details
110
+
111
+ ### Model Description
112
+
113
+ Deci developed and publicly released the DeciLM 6B large language model (LLM), a pretrained, high-efficiency generative text model with 5.7 billion parameters. DeciLM 6B outpaces pretrained models in its class, with a throughput up to 15 times that of Llama 2 7B. DeciLM 6B was further LoRA fine-tuned for instruction following on a subset of the OpenOrca dataset, creating DeciLM 6B Instruct.
114
+
115
+ - **Developed by:** Deci
116
+ - **Model type:** DeciLM is an auto-regressive language model using an optimized transformer decoder architecture that includes variable Grouped-Query Attention.
117
+ - **Language(s) (NLP):** English
118
+ - **License:** [Llama 2 Community License Agreement](https://huggingface.co/Deci/DeciLM-6b/blob/main/LICENSE.md)
119
+
120
+ ## Model Architecture
121
+
122
+ | Parameters | Layers | Heads | Sequence Length | GQA num_key_value_heads* | Hidden Size |
123
+ |:----------|:----------|:----------|:----------|:----------|:----------|
124
+ | 5.7B | 32 | 32 | 4096 | Variable | 4096 |
125
+
126
+ *AutoNAC was employed to optimize the selection of the GQA num_key_value_heads for each layer of the model.
127
+
128
+ - **Decoder layer:** Variable Grouped-Query Attention. Grouped-Query Attention (GQA) was introduced in [Ainslie et al., 2023](https://arxiv.org/abs/2305.13245).
129
+ - **Position Embeddings:** Dynamic NTK Scaling Rotary Position Embeddings [Su et al., 2021](https://arxiv.org/abs/2104.09864)
130
+
131
+
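To make the idea of variable Grouped-Query Attention more concrete, here is a minimal sketch of how query heads share key/value heads and how the per-layer `num_key_value_heads` affects KV-cache size. The head count, hidden size, and sequence length come from the table above. The per-layer values in `hypothetical_kv_heads` are invented for illustration (the table only says "Variable"), the helper names are not part of any DeciLM API, and 2 bytes per value assumes bfloat16 storage.

```python
# Figures from the architecture table: 32 query heads, hidden size 4096.
NUM_QUERY_HEADS = 32
HIDDEN_SIZE = 4096
HEAD_DIM = HIDDEN_SIZE // NUM_QUERY_HEADS  # 128 dimensions per head


def kv_head_for_query_head(query_head: int, num_kv_heads: int) -> int:
    """Return the index of the key/value head that a query head attends with."""
    if NUM_QUERY_HEADS % num_kv_heads != 0:
        raise ValueError("num_kv_heads must divide the number of query heads")
    group_size = NUM_QUERY_HEADS // num_kv_heads  # query heads per KV head
    return query_head // group_size


def kv_cache_bytes(kv_heads_per_layer, seq_len, bytes_per_value=2):
    """KV-cache size for one sequence; the factor 2 covers keys and values."""
    return sum(2 * seq_len * n * HEAD_DIM * bytes_per_value for n in kv_heads_per_layer)


# HYPOTHETICAL per-layer values, for illustration only; not DeciLM's real ones.
hypothetical_kv_heads = [4] * 16 + [2] * 8 + [1] * 8  # 32 layers
full_mha_kv_heads = [NUM_QUERY_HEADS] * 32  # one KV head per query head

gqa_bytes = kv_cache_bytes(hypothetical_kv_heads, seq_len=4096)
mha_bytes = kv_cache_bytes(full_mha_kv_heads, seq_len=4096)
print(f"variable GQA: {gqa_bytes / 2**20:.0f} MiB, full MHA: {mha_bytes / 2**20:.0f} MiB")
```

Fewer key/value heads in a layer mean a smaller cache to read on every decoding step, which is one reason GQA tends to raise throughput. How AutoNAC actually chose the per-layer values is not described here.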
132
+ ### Model Sources
133
+
134
+ - **Paper:** [DeciLM Technical Blog](https://deci.ai/blog/decilm-15-times-faster-than-llama2-nas-generated-llm-with-variable-gqa/)
135
+ - **Demo:** [DeciLM 6B Instruct Demo](https://huggingface.co/spaces/Deci/DeciLM-6b-instruct-Demo/)
136
+ - **Notebook:** [DeciLM 6B Notebook](https://huggingface.co/Deci/DeciLM-6b-instruct)
137
+
138
+ ## Uses
139
+
140
+ The model is intended for commercial and research use in English and can be fine-tuned for use in other languages.
141
+
142
+ ## How to Get Started with the Model
143
+
144
+ Use the code below to get started with the model.
145
+
146
+ ```python
147
+ # pip install -q transformers
148
+
149
+ import torch
150
+ from transformers import AutoModelForCausalLM, AutoTokenizer
151
+
152
+ checkpoint = "Deci/DeciLM-6b"
153
+ device = "cuda" # for GPU usage or "cpu" for CPU usage
154
+
155
+ tokenizer = AutoTokenizer.from_pretrained(checkpoint)
156
+ model = AutoModelForCausalLM.from_pretrained(checkpoint, torch_dtype=torch.bfloat16, trust_remote_code=True).to(device)
157
+
158
+ inputs = tokenizer.encode("In a shocking finding, scientists discovered a herd of unicorns living in", return_tensors="pt").to(device)
159
+ outputs = model.generate(inputs, max_new_tokens=100, do_sample=True, top_p=0.95)
160
+ print(tokenizer.decode(outputs[0]))
161
+ ```
162
+
163
+ ## Training Details
164
+
165
+ DeciLM 6B was trained on a subset of the SlimPajama dataset using proprietary methodologies that allow for fast training.
166
+
167
+ ## Evaluation
168
+
169
+ Below are DeciLM 6B's evaluation results.
170
+
171
+ | Average | ARC Challenge* | ARC Easy* | BoolQ | HellaSwag* | LAMBADA (OpenAI) | OpenBookQA | PIQA | TruthfulQA | Winogrande |
172
+ |:----------|:----------|:----------|:----------|:----------|:----------|:----------|:----------|:----------|:----------|
173
+ | 60.33 | 42.06 | 70.02 | 71.01 | 74.58 | 69.78 | 34 | 77.09 | 36.19 | 68.03 |
174
+ *Accuracy-norm score
175
+
176
+
177
+ ### Runtime Benchmarks
178
+
179
+ | Inference Tool/Hardware | A10 (tokens/sec) |
180
+ |:----------|:----------|
181
+ | HF Inference Endpoints | 652.49 |
182
+ | Infery LLM | 2,029.6 |
183
+
184
+ - Throughput (tokens/sec) was measured at each tool's optimal batch size (BS): 64 for HF Inference Endpoints and 128 for Infery LLM.
185
+
186
+
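As a quick reading aid for the runtime table, the relative throughput of the two inference tools on the A10 can be worked out from the reported figures. The variable names below are illustrative. The two measurements use different batch sizes, so this is a ratio of best-case throughputs, not a like-for-like comparison.

```python
# Throughput figures (tokens/sec on an A10) copied from the runtime table.
hf_endpoints_tps = 652.49   # HF Inference Endpoints, batch size 64
infery_llm_tps = 2029.6     # Infery LLM, batch size 128

# Ratio of the two best-case throughputs.
speedup = infery_llm_tps / hf_endpoints_tps
print(f"Infery LLM vs HF Inference Endpoints: {speedup:.2f}x")
```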
187
+ ## How to Cite
188
+
189
+ Please cite this model using this format.
190
+
191
+ ```bibtex
192
+ @misc{DeciFoundationModels,
193
+ title = {DeciLM},
194
+ author = {DeciAI Research Team},
195
+ year = {2023},
196
+ url = {https://huggingface.co/Deci/DeciLM-6b},
197
+ }
198
+ ```