Commit f7203be by danaevan
1 parent: 7dcc809

Update README.md

Files changed (1): README.md (+2 -2)
@@ -2,7 +2,7 @@
 
 DeciCoder-6B is a 6 billion parameter decoder-only code completion model
 trained on the Python, Java, Javascript, Rust, C++, C, and C# subset of [Starcoder Training Dataset](https://huggingface.co/datasets/bigcode/starcoderdata).
-The model uses variable Grouped Query Attention and has a context window of 4096
+The model uses variable Grouped Query Attention and has a context window of 2k
 tokens. It was trained using a Fill-in-the-Middle training objective. The model's
 architecture was generated by Deci's proprietary Neural Architecture
 Search-based technology, AutoNAC.
@@ -25,7 +25,7 @@ Search-based technology, AutoNAC.
 
 | Parameters | Layers | Heads | Sequence Length | GQA num_key_value_heads |
 |:----------|:----------|:----------|:----------|:----------|
-| 6B | 32 | 32 | 4096 | Variable |
+| 6B | 32 | 32 | 2k | Variable |
 
 
 - **Decoder layer:** Variable Grouped Query Attention
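
The "Variable" entry under GQA num_key_value_heads means the number of key/value heads can differ from layer to layer, while the number of query heads stays at 32. The sketch below is only an illustration of that idea, not the model's actual code: it assumes the common convention in which consecutive query heads share one key/value head, and the per-layer values in `example_kv_heads_per_layer` are hypothetical, since the README does not list the real ones.

```python
NUM_HEADS = 32  # query heads per layer, from the table in the README

# Hypothetical per-layer num_key_value_heads values, for illustration only.
# The real per-layer configuration is not given in the README.
example_kv_heads_per_layer = [4, 2, 1, 4]


def kv_head_for_query_head(query_head: int, num_heads: int, num_kv_heads: int) -> int:
    """Return the key/value head index shared by the given query head.

    Assumes contiguous grouping: query heads are split into num_kv_heads
    equal groups, and each group reads from one key/value head.
    """
    if num_heads % num_kv_heads != 0:
        raise ValueError("num_heads must be divisible by num_kv_heads")
    group_size = num_heads // num_kv_heads
    return query_head // group_size


def kv_cache_ratio(num_heads: int, kv_heads_per_layer: list[int]) -> float:
    """KV-cache size relative to full multi-head attention over the same layers."""
    return sum(kv_heads_per_layer) / (num_heads * len(kv_heads_per_layer))


if __name__ == "__main__":
    for layer, num_kv_heads in enumerate(example_kv_heads_per_layer):
        last = kv_head_for_query_head(NUM_HEADS - 1, NUM_HEADS, num_kv_heads)
        print(f"layer {layer}: {num_kv_heads} KV heads, last query head uses KV head {last}")
    print("KV cache vs. full MHA:", kv_cache_ratio(NUM_HEADS, example_kv_heads_per_layer))
```

Fewer key/value heads in a layer shrink that layer's KV cache proportionally, which is the usual motivation for letting the count vary per layer.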