Update README.md
README.md
CHANGED
@@ -2,7 +2,7 @@
 
 DeciCoder-6B is a 6 billion parameter decoder-only code completion model
 trained on the Python, Java, Javascript, Rust, C++, C, and C# subset of [Starcoder Training Dataset](https://huggingface.co/datasets/bigcode/starcoderdata).
-The model uses variable Grouped Query Attention and has a context window of
+The model uses variable Grouped Query Attention and has a context window of 2k
 tokens. It was trained using a Fill-in-the-Middle training objective. The model's
 architecture was generated by Deci's proprietary Neural Architecture
 Search-based technology, AutoNAC.
@@ -25,7 +25,7 @@ Search-based technology, AutoNAC.
 
 | Parameters | Layers | Heads | Sequence Length | GQA num_key_value_heads |
 |:----------|:----------|:----------|:----------|:----------|
-| 6B | 32 | 32 |
+| 6B | 32 | 32 | 2k | Variable |
 
 
 - **Decoder layer:** Variable Grouped Query Attention
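Note on "Variable" in the GQA num_key_value_heads column: the README text in this diff says the model uses variable Grouped Query Attention, which suggests the number of key/value heads differs from layer to layer. The sketch below is a minimal illustration of that idea, not the model's actual configuration. The layer and head counts (32 and 32) come from the table; the per-layer schedule `kv_heads_per_layer` and the helper `query_to_kv_head` are hypothetical and exist only for this example.

```python
# Sketch only: the per-layer KV-head counts are hypothetical and are NOT
# DeciCoder-6B's real configuration.
NUM_LAYERS = 32  # "Layers" column of the table
NUM_HEADS = 32   # "Heads" column of the table (query heads per layer)


def query_to_kv_head(num_heads: int, num_kv_heads: int) -> list[int]:
    """For each query head, return the index of the KV head it shares."""
    if num_heads % num_kv_heads != 0:
        raise ValueError("num_heads must be divisible by num_kv_heads")
    group_size = num_heads // num_kv_heads
    return [q // group_size for q in range(num_heads)]


# Hypothetical schedule: 4 KV heads in the first 8 layers, 2 in the next 16,
# and 1 in the last 8.
kv_heads_per_layer = [4] * 8 + [2] * 16 + [1] * 8
assert len(kv_heads_per_layer) == NUM_LAYERS

# One query-to-KV mapping per layer; the group size changes with the layer.
mappings = [query_to_kv_head(NUM_HEADS, kv) for kv in kv_heads_per_layer]

# KV-cache size relative to standard multi-head attention, where every layer
# would keep NUM_HEADS key/value heads.
kv_cache_ratio = sum(kv_heads_per_layer) / (NUM_HEADS * NUM_LAYERS)
```

With this made-up schedule, a layer with 4 KV heads shares each one across 8 query heads, while a layer with 1 KV head shares it across all 32. The practical effect of such a layout is a smaller key/value cache than full multi-head attention would need.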