NajeebDeci committed "Model Card" in commit 41064f3 (1 parent: 5198be0)

Files changed (1): README.md ADDED (+166 -0)
@@ -0,0 +1,166 @@
---
pipeline_tag: text-generation
license: apache-2.0
tags:
- text generation
- Deci AI
- DeciCoder
programming_language:
- Java
- JavaScript
- Python
- Rust
- Go
- C++
- C
- C#
metrics:
- code_eval
inference: true
widget:
- text: 'def print_hello_world():'
  example_title: Hello world
  group: Python
model-index:
- name: DeciCoder-6b
  results:
  - task:
      type: text-generation
    dataset:
      type: nuprl/MultiPL-E
      name: MultiPL-HumanEval (Python)
    metrics:
    - name: pass@1
      type: pass@1
      value: 0.34
      verified: false
  - task:
      type: text-generation
    dataset:
      type: nuprl/MultiPL-E
      name: MultiPL-HumanEval (JavaScript)
    metrics:
    - name: pass@1
      type: pass@1
      value: 0.29
      verified: false
  - task:
      type: text-generation
    dataset:
      type: nuprl/MultiPL-E
      name: MultiPL-HumanEval (Java)
    metrics:
    - name: pass@1
      type: pass@1
      value: 0.30
      verified: false
datasets:
- bigcode/starcoderdata
---

# Model Card for DeciCoder 6B

DeciCoder 6B is a 6-billion-parameter, decoder-only code completion model
trained on the Python, Java, JavaScript, Go, Rust, C++, C, and C# subset of the
[StarCoder Training Dataset](https://huggingface.co/datasets/bigcode/starcoderdata).
The model uses variable Grouped Query Attention and has a context window of 4096
tokens. It was trained with a Fill-in-the-Middle objective, and its
architecture was generated by AutoNAC, Deci's proprietary Neural Architecture
Search-based technology.

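Fill-in-the-Middle means the model can complete a span *between* a given prefix and suffix, not just continue a prefix. As a hedged sketch of how such a prompt is typically assembled, the sentinel token names below are an assumption (StarCoder-style); verify the tokens this checkpoint actually uses against its `tokenizer.special_tokens_map` before relying on them:

```python
# Sketch: assemble a Fill-in-the-Middle prompt.
# The sentinel tokens below are an ASSUMPTION (StarCoder-style FIM markers);
# check tokenizer.special_tokens_map for the tokens this model actually uses.
def build_fim_prompt(prefix: str, suffix: str) -> str:
    """Ask the model to generate the text that belongs between prefix and suffix."""
    return f"<fim_prefix>{prefix}<fim_suffix>{suffix}<fim_middle>"

prompt = build_fim_prompt(
    prefix="def add(a, b):\n    ",
    suffix="\n    return result\n",
)
```

The model's continuation after `<fim_middle>` is the infilled span; generation is stopped at the model's end-of-middle marker.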
## Model Details

- **Developed by:** Deci
- **Model type:** DeciCoder is an auto-regressive language model based on the transformer decoder architecture, using variable Grouped Query Attention.
- **Language(s):** Python, Java, JavaScript, Go, Rust, C++, C, C#
- **License:** Model checkpoints are licensed under the [Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0) license.

## Model Architecture

| Parameters | Layers | Heads | Sequence Length | GQA num_key_value_heads | Hidden Size |
|:-----------|:-------|:------|:----------------|:------------------------|:------------|
| 6B         | 32     | 32    | 4096            | Variable                | 4096        |

- **Decoder layer:** Variable Grouped Query Attention. Grouped Query Attention was introduced in [Ainslie et al., 2023](https://arxiv.org/abs/2305.13245).
- **Position Embeddings:** Rotary Position Embeddings ([Su et al., 2021](https://arxiv.org/abs/2104.09864))

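In Grouped Query Attention, several query heads share one key/value head, which shrinks the KV cache; "variable" here means the number of key/value heads differs per layer. A minimal single-layer NumPy sketch of the mechanism (the head counts are illustrative, not the model's actual per-layer configuration):

```python
import numpy as np

def grouped_query_attention(q, k, v, num_q_heads, num_kv_heads):
    """q: (num_q_heads, seq, d); k, v: (num_kv_heads, seq, d).
    Each group of num_q_heads // num_kv_heads query heads attends
    to one shared key/value head."""
    group = num_q_heads // num_kv_heads
    # Broadcast each KV head across its group of query heads.
    k = np.repeat(k, group, axis=0)                   # (num_q_heads, seq, d)
    v = np.repeat(v, group, axis=0)
    d = q.shape[-1]
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d)    # (num_q_heads, seq, seq)
    weights = np.exp(scores - scores.max(-1, keepdims=True))
    weights /= weights.sum(-1, keepdims=True)         # softmax over keys
    return weights @ v                                # (num_q_heads, seq, d)

# Illustrative sizes: 8 query heads sharing 2 KV heads (group size 4).
rng = np.random.default_rng(0)
seq, d = 5, 16
out = grouped_query_attention(rng.normal(size=(8, seq, d)),
                              rng.normal(size=(2, seq, d)),
                              rng.normal(size=(2, seq, d)), 8, 2)
```

With `num_kv_heads == num_q_heads` this reduces to standard multi-head attention; with `num_kv_heads == 1` it is multi-query attention.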
## Uses

The model is intended for single- and multi-line code completion from a
context window of up to 4096 tokens. It is *not* an instruction-tuned model,
and commands like "Write a function that computes the absolute value of
an integer" won't yield the desired results. A more effective approach
is to frame instructions in the style of source code comments (e.g. `#
this function calculates the absolute value of an integer`) or to present
a function signature and docstring, letting the model complete the
function's body.

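For example, the contrast above looks like this in practice (hypothetical snippets, for illustration only):

```python
# Instruction-style prompts like this tend NOT to work with a completion model:
bad_prompt = "Write a function that computes the absolute value of an integer."

# Better: a source-code comment followed by the start of the code...
comment_prompt = "# this function calculates the absolute value of an integer\ndef "

# ...or a signature plus docstring for the model to complete:
signature_prompt = '''def absolute_value(n: int) -> int:
    """Return the absolute value of n."""
'''
```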
### How to Use

```python
# pip install -q transformers
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

checkpoint = "Deci/DeciCoder-6b"
device = "cuda"  # for GPU usage, or "cpu" for CPU usage

tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(checkpoint, torch_dtype=torch.bfloat16, trust_remote_code=True).to(device)

inputs = tokenizer.encode("def print_hello_world():", return_tensors="pt").to(device)
outputs = model.generate(inputs, max_new_tokens=100)
print(tokenizer.decode(outputs[0]))
```

### Attribution

DeciCoder was trained on the StarCoder Training Dataset, filtered for
Python, Java, JavaScript, Rust, Go, C++, C, and C#. For additional information, please
refer to [https://huggingface.co/datasets/bigcode/starcoderdata](https://huggingface.co/datasets/bigcode/starcoderdata).

### Limitations

The model was trained on source code in Python, Java, JavaScript, Go,
Rust, C++, C, and C#. While the primary natural language in the source is
English, it does contain other languages. The model can therefore produce
code snippets given some context, but there is no assurance that the
resulting code will function as expected; it might be suboptimal, contain
bugs, or even exploits.

## Evaluation

Below are DeciCoder 6B's pass@1 scores on MultiPL-E HumanEval:

| Python | JavaScript | Java  | C++    | C#     | Rust  | Go     | C   |
|:-------|:-----------|:------|:-------|:-------|:------|:-------|:----|
| 33.5%  | 29.3%      | 30.3% | 29.93% | 20.31% | 20.5% | 77.47% | xx% |

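pass@1 is the fraction of HumanEval problems for which a sampled completion passes the unit tests. With n samples per problem, of which c pass, the standard unbiased estimator (Chen et al., 2021) is 1 - C(n-c, k)/C(n, k); a minimal sketch:

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator: probability that at least one of k
    samples, drawn without replacement from n generations of which c
    are correct, passes the tests."""
    if n - c < k:
        return 1.0  # not enough failing samples to fill a draw of k
    return 1.0 - comb(n - c, k) / comb(n, k)

# For k=1 this reduces to c / n, the fraction of passing samples.
score = pass_at_k(n=10, c=3, k=1)
```

The per-problem estimates are then averaged over the benchmark to give the reported score.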
### Runtime Benchmarks

| Inference Tool / Hardware | Qualcomm AI 100 (tokens/sec) |
|:--------------------------|:-----------------------------|
| Infery LLM                | xxx                          |

- Throughput (tokens/sec) measured with an optimal batch size of 96.

## Documentation

- [Notebook](https://colab.research.google.com/drive/1JCxvBsWCZKHfIcHSMVf7GZCs3ClMQPjs)
- Blog post: [Introducing DeciCoder: The New Gold Standard in Efficient and Accurate Code Generation](https://deci.ai/blog/decicoder-efficient-and-accurate-code-generation-llm/)
- Questions: feel free to contact us via our [Discord Community!](https://discord.com/invite/p9ecgRhDR8/)

## How to Cite

Please cite this model using this format:

```bibtex
@misc{DeciFoundationModels,
  title = {DeciCoder},
  author = {DeciAI Research Team},
  year = {2023},
  url = {https://huggingface.co/deci/decicoder-6b},
}
```