Deci/DeciCoder-6B

danaevan committed on Jan 17

Commit f7203be • 1 Parent(s): 7dcc809

Update README.md

Files changed (1)

README.md

@@ -2,7 +2,7 @@
 DeciCoder-6B is a 6 billion parameter decoder-only code completion model
 trained on the Python, Java, Javascript, Rust, C++, C, and C# subset of [Starcoder Training Dataset](https://huggingface.co/datasets/bigcode/starcoderdata).
-The model uses variable Grouped Query Attention and has a context window of 4096
+The model uses variable Grouped Query Attention and has a context window of 2k
 tokens. It was trained using a Fill-in-the-Middle training objective. The model's
 architecture was generated by Deci's proprietary Neural Architecture
 Search-based technology, AutoNAC.
@@ -25,7 +25,7 @@ Search-based technology, AutoNAC.
 | Parameters | Layers | Heads  | Sequence Length  | GQA num_key_value_heads  |
 |:----------|:----------|:----------|:----------|:----------|
-| 6B    | 32    | 32    | 4096   | Variable  |
+| 6B    | 32    | 32    | 2k  | Variable  |
 - **Decoder layer:** Variable Grouped Query Attention
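
This commit lowers the documented context window from 4096 to 2k tokens, so callers may want to check that a prompt plus the tokens to be generated still fits. The following is a minimal sketch and not part of the README: it assumes "2k" means 2048 tokens, and `CONTEXT_WINDOW` and `fit_to_context` are hypothetical names that work on a plain list of token ids, independent of any tokenizer library.

```python
# Assumption: "2k" in the updated README means 2048 tokens.
CONTEXT_WINDOW = 2048


def fit_to_context(token_ids, max_new_tokens, context_window=CONTEXT_WINDOW):
    """Trim a prompt so that prompt + generated tokens fit in the window.

    Hypothetical helper: keeps the most recent tokens, since for code
    completion the text nearest the cursor usually matters most.
    """
    if not 0 <= max_new_tokens < context_window:
        raise ValueError("max_new_tokens must be in [0, context_window)")
    budget = context_window - max_new_tokens  # tokens left for the prompt
    return token_ids[-budget:]


if __name__ == "__main__":
    prompt = list(range(5000))            # stand-in for real token ids
    trimmed = fit_to_context(prompt, max_new_tokens=256)
    print(len(trimmed))                   # prompt length after trimming
```

Trimming from the left is one possible choice. A Fill-in-the-Middle prompt would need the prefix and suffix trimmed separately so that the sentinel tokens survive.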