Deci / DeciCoder-6B

Text Generation · Transformers · Safetensors · deci · custom_code

danaevan committed
Commit e87c364
1 Parent(s): e084f01

Update README.md

Files changed (1):
  1. README.md +30 -29
README.md CHANGED
@@ -10,7 +10,7 @@ programming_language:
   - JavaScript
   - Python
   - Rust
- - Go
+ - Ruby
   - C++
   - C
   - C#
@@ -58,10 +58,10 @@ datasets:
   - bigcode/starcoderdata
  ---

- # Model Card for DeciCoder 6B
+ # Model Card for DeciCoder-6B

- DeciCoder 6B is a 6 billion parameter decoder-only code completion model
- trained on the Python, Java, Javascript, Go, Rust, C++, C, and C# subset of [Starcoder Training Dataset](https://huggingface.co/datasets/bigcode/starcoderdata)..
+ DeciCoder-6B is a 6 billion parameter decoder-only code completion model
+ trained on the Python, Java, JavaScript, Rust, C++, C, and C# subset of the [Starcoder Training Dataset](https://huggingface.co/datasets/bigcode/starcoderdata).
  The model uses variable Grouped Query Attention and has a context window of 4096
  tokens. It was trained using a Fill-in-the-Middle training objective. The model's
  architecture was generated by Deci's proprietary Neural Architecture
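The hunk above describes a decoder-only completion model with a 4096-token context window, trained with a Fill-in-the-Middle objective. As a hedged sketch (not taken from the card itself): the snippet below tries the StarCoder-style FIM special tokens used in the StarCoder data pipeline and falls back to plain prefix completion if the tokenizer does not define them. The `Deci/DeciCoder-6B` repo id and `trust_remote_code=True` are assumptions based on the page's `custom_code` tag.

```python
# Hedged sketch: fill-in-the-middle prompting for a code completion model.
# Assumes the Deci/DeciCoder-6B checkpoint and StarCoder-style FIM tokens;
# if those tokens are absent, degrades to plain left-to-right completion.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

checkpoint = "Deci/DeciCoder-6B"  # assumed repo id
tokenizer = AutoTokenizer.from_pretrained(checkpoint, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(checkpoint, trust_remote_code=True).eval()

prefix = "def average(numbers):\n    total = "
suffix = "\n    return total / len(numbers)\n"

vocab = tokenizer.get_vocab()
fim_tokens = ("<fim_prefix>", "<fim_suffix>", "<fim_middle>")
if all(tok in vocab for tok in fim_tokens):
    # Infill between prefix and suffix (StarCoder-style FIM prompt layout).
    prompt = f"<fim_prefix>{prefix}<fim_suffix>{suffix}<fim_middle>"
else:
    # No FIM tokens in the tokenizer: just complete the prefix left to right.
    prompt = prefix

inputs = tokenizer(prompt, return_tensors="pt")
with torch.no_grad():
    outputs = model.generate(**inputs, max_new_tokens=32, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```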
@@ -70,10 +70,17 @@ Search-based technology, AutoNAC.
  ## Model Details

  - **Developed by:** Deci
- - **Model type:** DeciCoder is an auto-regressive language model based on the transformer decoder architecture, using variable Grouped Query Attention.
- - **Language(s):** Python, Java, JavaScript, Go, Rust, C++, C, C#
+ - **Model type:** DeciCoder-6B is an auto-regressive language model based on the transformer decoder architecture, using variable Grouped Query Attention.
+ - **Language(s):** Python, Java, JavaScript, Ruby, Rust, C++, C, C#
  - **License:** Model checkpoints are licensed under the [Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0)

+ ## Documentation
+
+ - Google Colab [Notebook](https://colab.research.google.com/drive/1ZxG9qMlom9vn4lSGlD8PrjwHBvag94ei?usp=sharing)
+ - Blog Post: [Introducing DeciCoder-6B: The Best Multi-Language Code Generation LLM in Its Class](https://deci.ai/blog/decicoder-6b-the-best-multi-language-code-generation-llm-in-its-class/)
+ - Tutorial: [How to Run DeciCoder-6B on Qualcomm AI 100](https://github.com/quic/cloud-ai-sdk/tree/1.12/models/language_processing/decoder)
+ - Questions: Feel free to contact us via our [Discord Community!](https://discord.com/invite/p9ecgRhDR8/)
+
  ## Model Architecture

  | Parameters | Layers | Heads | Sequence Length | GQA num_key_value_heads | Hidden Size |
@@ -81,12 +88,12 @@ Search-based technology, AutoNAC.
  | 6B | 32 | 32 | 4096 | Variable | 4096 |


- - **Decoder layer:** Variable Grouped Query Attention. Grouped Query Attention was introduced in [Ainslie et al., 2023](https://arxiv.org/abs/2305.13245)
+ - **Decoder layer:** Variable Grouped Query Attention
  - **Position Embeddings:** Rotary Position Embeddings [Su et al., 2021](https://arxiv.org/abs/2104.09864)

  ## Uses

- The model is intended to do single/multiline code completion from a
+ The model is intended to perform single/multiline code completion from a
  context window of up to 4096 tokens. It is *not* an instruction model
  and commands like "Write a function that computes the absolute value of
  an integer," won't yield the desired results. A more effective approach
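The Uses paragraph above recommends prompting with code context rather than a natural-language command; the card's own snippet (ending in `print(tokenizer.decode(outputs[0]))`, quoted in the next hunk header) is not shown in this diff. A minimal sketch of that prompting style, under the same assumed `Deci/DeciCoder-6B` checkpoint name:

```python
# Hedged sketch: give the model a signature and docstring (code context)
# instead of an instruction, and let it complete the function body.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

checkpoint = "Deci/DeciCoder-6B"  # assumed repo id
tokenizer = AutoTokenizer.from_pretrained(checkpoint, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(checkpoint, trust_remote_code=True).eval()

prompt = 'def abs_value(x: int) -> int:\n    """Return the absolute value of x."""\n'

inputs = tokenizer(prompt, return_tensors="pt")
with torch.no_grad():
    outputs = model.generate(**inputs, max_new_tokens=48, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```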
@@ -114,8 +121,8 @@ print(tokenizer.decode(outputs[0]))

  ### Attribution

- DeciCoder was trained on StarCoder Training Dataset, filtered for
- Python, Java, JavaScript, Rust, Go, C++, C, and C#. For additional information, please
+ DeciCoder-6B was trained on the StarCoder Training Dataset, filtered for
+ Python, Java, JavaScript, Ruby, Rust, C++, C, and C#. For additional information, please
  refer to [https://huggingface.co/datasets/bigcode/starcoderdata](https://huggingface.co/datasets/bigcode/starcoderdata).

  ```
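On the "Variable" entry under `GQA num_key_value_heads` in the Model Architecture table above: grouped-query attention shares each key/value head across a group of query heads, and "variable" means the key/value head count may differ from layer to layer. The toy sketch below reuses the table's 32 query heads and 4096 hidden size but invents the per-layer key/value head counts; it omits causal masking and rotary embeddings and is not the model's actual implementation.

```python
# Toy grouped-query attention with a per-layer key/value head count.
# The kv_heads_per_layer values are made up, not DeciCoder-6B's real config.
import torch

def grouped_query_attention(x, num_heads, num_kv_heads, wq, wk, wv):
    """x: (batch, seq, hidden) -> (batch, seq, hidden); no mask, no RoPE."""
    b, s, h = x.shape
    d = h // num_heads                                         # per-head dim
    q = (x @ wq).view(b, s, num_heads, d).transpose(1, 2)      # (b, H, s, d)
    k = (x @ wk).view(b, s, num_kv_heads, d).transpose(1, 2)   # (b, Hkv, s, d)
    v = (x @ wv).view(b, s, num_kv_heads, d).transpose(1, 2)
    group = num_heads // num_kv_heads       # query heads served per K/V head
    k = k.repeat_interleave(group, dim=1)                      # (b, H, s, d)
    v = v.repeat_interleave(group, dim=1)
    attn = torch.softmax(q @ k.transpose(-2, -1) / d**0.5, dim=-1)
    return (attn @ v).transpose(1, 2).reshape(b, s, h)

hidden, num_heads = 4096, 32
kv_heads_per_layer = [4, 4, 8, 8]  # "variable" GQA: differs by layer (made up)
d = hidden // num_heads
x = torch.randn(1, 16, hidden)
for num_kv in kv_heads_per_layer:
    wq = torch.randn(hidden, hidden) / hidden**0.5
    wk = torch.randn(hidden, num_kv * d) / hidden**0.5
    wv = torch.randn(hidden, num_kv * d) / hidden**0.5
    x = grouped_query_attention(x, num_heads, num_kv, wq, wk, wv)
print(x.shape)  # torch.Size([1, 16, 4096])
```

Fewer key/value heads shrink the KV cache, which is the main reason GQA helps inference throughput at long context lengths.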
@@ -123,34 +130,28 @@ refer to [https://huggingface.co/datasets/bigcode/starcoderdata](https://huggingface.co/datasets/bigcode/starcoderdata).
  ### Limitations

  The model has undergone training with source code from Python, Java,
- JavaScript, Go, Rust, C++, C, and C#. While the primary language in the source is English, it does
+ JavaScript, Ruby, Rust, C++, C, and C#. While the primary language in the source is English, it does
  contain other languages. Therefore, the model can produce code snippets
- given some context. However, there's no assurance that the resulting
+ given some context. However, there is no assurance that the resulting
  code will function as expected. It might be suboptimal, contain bugs, or
  even exploits.

  ## Evaluation

- Below are DeciCoder's pass@1 on MultiPL HumanEval scores
+ Below are DeciCoder-6B's pass@1 scores on MultiPL HumanEval:

- | Python | JavaScript | Java | C++ | C# | Rust | Go | C |
- |:----------|:----------|:----------|:----------|:----------|:----------|:----------|:----------|
- | 33.5% | 29.3% | 30.3% |29.93% |20.31% |20.5% |77.47% |xx% |
+ | Python | JavaScript | Java | C++ | C# | Rust | Go |
+ |:----------|:----------|:----------|:----------|:----------|:----------|:----------|
+ | 33.3% | 29.3% | 30.3% | 29.93% | 20.31% | 20.5% | 77.47% |


  ### Runtime Benchmarks

- |Inference Tool/Hardware | Qualcomm AI 100 (tokens/sec) |
- |:----------|:----------|
- | Infery LLM | xxx |
+ | Inference Tool | Hardware | Prompt Length | Generation Length | Throughput (tokens/sec) |
+ |:----------|:----------|:----------|:----------|:----------|
+ | Qualcomm SDK | Qualcomm AI 100 | 1024 | 1024 | 531.3 |

- - Throughput (tokens/sec) - Measured with an optimal batch size of 96
-
- ## Documentation
-
- - [Notebook](https://colab.research.google.com/drive/1JCxvBsWCZKHfIcHSMVf7GZCs3ClMQPjs) CHANGE
- - Blog post: [Introducing DeciCoder: The New Gold Standard in Efficient and Accurate Code Generation](https://deci.ai/blog/decicoder-efficient-and-accurate-code-generation-llm/)CHANGE
- - Questions: Feel free to contact us via our [Discord Community!](https://discord.com/invite/p9ecgRhDR8/)CHANGE
+ - Measured for maximal batch size on the device

  ## How to Cite

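The Evaluation hunk above reports pass@1 on MultiPL HumanEval. For reference, pass@k is normally computed with the unbiased estimator from the HumanEval paper (Chen et al., 2021), which reduces to the per-problem success rate when k = 1; a small sketch with made-up sample counts:

```python
# Unbiased pass@k estimator (Chen et al., 2021):
# pass@k = mean over problems of 1 - C(n - c, k) / C(n, k); equals c / n at k = 1.
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """n: generated samples for a problem, c: samples that pass the unit tests."""
    if n - c < k:
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)

# Illustrative (made-up) per-problem results as (samples, correct) pairs.
results = [(20, 7), (20, 0), (20, 13), (20, 4)]
score = sum(pass_at_k(n, c, k=1) for n, c in results) / len(results)
print(f"pass@1 = {score:.1%}")
```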
 
@@ -158,9 +159,9 @@ Please cite this model using this format.

  ```bibtex
  @misc{DeciFoundationModels,
- title = {DeciCoder},
+ title = {DeciCoder-6B},
  author = {DeciAI Research Team},
  year = {2023},
- url={[https://huggingface.co/deci/decicoder-6b](https://huggingface.co/deci/decicoder-6b)},
+ url={[https://huggingface.co/deci/decicoder-6B](https://huggingface.co/deci/decicoder-6B)},
  }
- ```
+ ```
 