---
license: [llama2, other]
datasets:
- cerebras/SlimPajama-627B
language:
- en
pipeline_tag: text-generation
tags:
- Deci AI
- DeciLM
model-index:
- name: DeciLM 6B
  results:
  - task:
      type: text-generation
    dataset:
      type: ai2/arc
      name: ai2_arc
    metrics:
    - name: ARC Challenge
      type: ARC Challenge
      value: 42.06
      verified: false
  - task:
      type: text-generation
    dataset:
      type: ai2/arc
      name: ai2_arc
    metrics:
    - name: ARC Easy
      type: ARC Easy
      value: 70.02
      verified: false
  - task:
      type: text-generation
    dataset:
      type: boolq
      name: boolq
    metrics:
    - name: BoolQ
      type: BoolQ
      value: 71.01
      verified: false
  - task:
      type: text-generation
    dataset:
      type: hellaswag
      name: hellaswag
    metrics:
    - name: HellaSwag
      type: HellaSwag
      value: 74.58
      verified: false
  - task:
      type: text-generation
    dataset:
      type: LAMBADA
      name: OpenAI LAMBADA
    metrics:
    - name: LAMBADA
      type: LAMBADA
      value: 69.78
      verified: false
  - task:
      type: text-generation
    dataset:
      type: OpenBookQA
      name: openbookqa
    metrics:
    - name: OpenBookQA
      type: OpenBookQA
      value: 34
      verified: false
  - task:
      type: text-generation
    dataset:
      type: PIQA
      name: piqa
    metrics:
    - name: PIQA
      type: PIQA
      value: 77.09
      verified: false
  - task:
      type: text-generation
    dataset:
      type: truthful_qa
      name: truthful_qa
    metrics:
    - name: TruthfulQA
      type: TruthfulQA
      value: 36.19
      verified: false
  - task:
      type: text-generation
    dataset:
      type: winogrande
      name: winogrande
    metrics:
    - name: Winogrande
      type: Winogrande
      value: 68.03
      verified: false
---
# DeciLM 6B

DeciLM 6B is a 5.7-billion-parameter decoder-only text generation model. With a context window of 4096 tokens, the model uses variable Grouped-Query Attention (GQA) to balance performance and computational efficiency. The model's architecture was generated using Deci's proprietary Neural Architecture Search-based technology, AutoNAC.

## Model Details

### Model Description

Deci developed and publicly released the DeciLM 6B large language model, a pretrained, high-efficiency generative text model with 5.7 billion parameters. DeciLM 6B outpaces pretrained models in its class, with a throughput up to 15 times that of Llama 2 7B. DeciLM 6B was further fine-tuned using [LoRA](https://arxiv.org/pdf/2106.09685.pdf) for instruction following on a subset of the OpenOrca dataset, creating [DeciLM 6B-Instruct](https://huggingface.co/Deci/DeciLM-6b-instruct).

- **Developed by:** Deci
- **Model type:** DeciLM is an auto-regressive language model using an optimized transformer decoder architecture that includes variable Grouped-Query Attention.
- **Language(s) (NLP):** English
- **License:** [Llama 2 Community License Agreement](https://huggingface.co/Deci/DeciLM-6b/blob/main/LICENSE.md) with an extension from Deci regarding hosting service providers.

## Model Architecture

| Parameters | Layers | Heads | Sequence Length | GQA num_key_value_heads* | Hidden Size |
|:----------|:----------|:----------|:----------|:----------|:----------|
| 5.7B | 32 | 32 | 4096 | Variable | 4096 |

*AutoNAC was employed to optimize the selection of the GQA num_key_value_heads for each layer of the model.

- **Decoder layer:** Variable Grouped-Query Attention. Grouped-Query Attention (GQA) was introduced in [Ainslie et al., 2023](https://arxiv.org/abs/2305.13245).
- **Position Embeddings:** Dynamic NTK Scaling Rotary Position Embeddings ([Su et al., 2021](https://arxiv.org/abs/2104.09864)).
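
To make the effect of variable GQA concrete, here is a minimal sketch in plain Python. It shows how query heads are grouped onto shared key/value heads, and how the per-layer `num_key_value_heads` drives KV-cache size. The layer count, head count, hidden size, and sequence length come from the table above. Everything else is an assumption for illustration: the head dimension is taken as hidden size divided by heads, values are assumed to be stored in 2 bytes (bfloat16), and the per-layer KV-head list is hypothetical, since the actual values chosen by AutoNAC are not listed here.

```python
# Values taken from the architecture table above.
NUM_LAYERS = 32
NUM_HEADS = 32
HIDDEN_SIZE = 4096
SEQ_LEN = 4096

# Assumption: each head has hidden_size / num_heads dimensions.
HEAD_DIM = HIDDEN_SIZE // NUM_HEADS  # 128


def kv_head_for_query_head(q_head: int, num_kv_heads: int,
                           num_heads: int = NUM_HEADS) -> int:
    """Return the index of the KV head shared by a given query head under GQA."""
    group_size = num_heads // num_kv_heads  # query heads per KV head
    return q_head // group_size


def kv_cache_bytes(kv_heads_per_layer, seq_len: int = SEQ_LEN,
                   head_dim: int = HEAD_DIM, bytes_per_value: int = 2) -> int:
    """KV-cache size for one sequence: K and V tensors summed over all layers."""
    return sum(2 * seq_len * n * head_dim * bytes_per_value
               for n in kv_heads_per_layer)


# HYPOTHETICAL per-layer KV-head counts, NOT the real DeciLM 6B configuration.
hypothetical_kv_heads = [4] * 8 + [2] * 16 + [1] * 8
# Baseline: standard multi-head attention, one KV head per query head.
full_mha_kv_heads = [NUM_HEADS] * NUM_LAYERS

if __name__ == "__main__":
    mib = 1024 ** 2
    print("MHA cache (MiB):", kv_cache_bytes(full_mha_kv_heads) // mib)
    print("Variable GQA cache (MiB):", kv_cache_bytes(hypothetical_kv_heads) // mib)
    print("Query head 8 with 4 KV heads uses KV head",
          kv_head_for_query_head(8, 4))
```

Fewer KV heads in a layer means more query heads share each key/value projection, which shrinks that layer's cache. Letting the count vary per layer allows the saving to be applied unevenly across the network.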


### Model Sources

- **Paper:** [DeciLM Technical Blog](https://deci.ai/blog/decilm-15-times-faster-than-llama2-nas-generated-llm-with-variable-gqa/?utm_campaign=repos&utm_source=hugging-face&utm_medium=model-card&utm_content=decilm-6b)
- **Demo:** [DeciLM 6B Instruct Demo](https://huggingface.co/spaces/Deci/DeciLM-6b-instruct)
- **Notebook:** [DeciLM 6B Notebook](https://colab.research.google.com/drive/1LugJCifOv0L426ukRHjOblBRWwUImAit)

## Uses

The model is intended for commercial and research use in English and can be fine-tuned for use in other languages.

## How to Get Started with the Model

Use the code below to get started with the model.

```python
# pip install -q transformers

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

checkpoint = "Deci/DeciLM-6b"
device = "cuda"  # use "cuda" for GPU or "cpu" for CPU

tokenizer = AutoTokenizer.from_pretrained(checkpoint)
# trust_remote_code=True lets transformers load the custom DeciLM modeling code
model = AutoModelForCausalLM.from_pretrained(checkpoint, torch_dtype=torch.bfloat16, trust_remote_code=True).to(device)

inputs = tokenizer.encode("In a shocking finding, scientists discovered a herd of unicorns living in", return_tensors="pt").to(device)
# Sample up to 100 new tokens with nucleus (top-p) sampling
outputs = model.generate(inputs, max_new_tokens=100, do_sample=True, top_p=0.95)
print(tokenizer.decode(outputs[0]))
```

## Training Details

DeciLM 6B was trained on a subset of the SlimPajama dataset, using proprietary methodologies that allow for fast training.

## Evaluation

Below are DeciLM 6B's evaluation results.

| Average | ARC Challenge* | ARC Easy* | BoolQ | HellaSwag* | LAMBADA OpenAI | OpenBookQA | PIQA | TruthfulQA | Winogrande |
|:----------|:----------|:----------|:----------|:----------|:----------|:----------|:----------|:----------|:----------|
| 60.33 | 42.06 | 70.02 | 71.01 | 74.58 | 69.78 | 34 | 77.09 | 36.19 | 68.03 |

*Accuracy-norm score


### Runtime Benchmarks

| Inference Tool | A10 (tokens/sec) |
|:----------|:----------|
| PyTorch | 652.49 |
| Infery LLM | 2,029.6 |

- Throughput (tokens/sec) was measured with the optimal batch size for each tool: 64 for PyTorch and 128 for Infery LLM.
- To replicate the results of the PyTorch benchmark, use this [code example](https://huggingface.co/Deci/DeciLM-6b/blob/main/hf_benchmark_example.py).
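
As a quick reading aid for the table above, the following sketch computes the relative speedup between the two inference tools and a rough time to generate a given number of tokens. The throughput figures are copied from the table. The helper names `speedup` and `seconds_to_generate` are hypothetical, and the time estimate assumes the throughput stays constant at the benchmarked batch size, which may not hold for other workloads.

```python
# Throughput figures copied from the table above (tokens/sec on an A10 GPU).
THROUGHPUT = {"PyTorch": 652.49, "Infery LLM": 2029.6}


def speedup(baseline: str, candidate: str, table=THROUGHPUT) -> float:
    """Ratio of candidate throughput to baseline throughput."""
    return table[candidate] / table[baseline]


def seconds_to_generate(num_tokens: int, tool: str, table=THROUGHPUT) -> float:
    """Estimated wall-clock seconds, assuming constant benchmarked throughput."""
    return num_tokens / table[tool]


if __name__ == "__main__":
    print(f"Infery LLM vs PyTorch: {speedup('PyTorch', 'Infery LLM'):.2f}x")
    for tool in THROUGHPUT:
        print(f"{tool}: {seconds_to_generate(1_000_000, tool):.0f} s per 1M tokens")
```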


## How to Cite

Please cite this model using this format.

```bibtex
@misc{DeciFoundationModels,
title = {DeciLM 6B},
author = {DeciAI Research Team},
year = {2023},
url = {https://huggingface.co/Deci/DeciLM-6b},
}
```