lvwerra HF staff committed on
Commit
db9afcd
1 Parent(s): 0115eb7

Update README.md

Files changed (1)
  1. README.md +2 -2
README.md CHANGED
@@ -33,7 +33,7 @@ tags:
  StarCoder2-15B model is a 15B parameter model trained on 600+ programming languages from [The Stack v2](https://huggingface.co/datasets/bigcode/the-stack-v2-train), with opt-out requests excluded. The model uses [Grouped Query Attention](https://arxiv.org/abs/2305.13245), [a context window of 16,384 tokens](https://arxiv.org/abs/2205.14135) with [a sliding window attention of 4,096 tokens](https://arxiv.org/abs/2004.05150v2), and was trained using the [Fill-in-the-Middle objective](https://arxiv.org/abs/2207.14255) on 4+ trillion tokens.
 
  - **Project Website:** [bigcode-project.org](https://www.bigcode-project.org)
- - **Paper:** TODO
+ - **Paper:** [Link](https://huggingface.co/datasets/bigcode/the-stack-v2/)
  - **Point of Contact:** [contact@bigcode-project.org](mailto:contact@bigcode-project.org)
  - **Languages:** 600+ Programming languages
 
@@ -148,4 +148,4 @@ The model is licensed under the BigCode OpenRAIL-M v1 license agreement. You can
 
  # Citation
 
- TODO
+ _Coming soon_