nikhilpinnaparaju committed
Commit 102d1e9
1 Parent(s): 35a71f9

Update README.md

Files changed (1):
1. README.md (+20 -32)
README.md CHANGED
@@ -16,11 +16,11 @@ metrics:
 - code_eval
 library_name: transformers
 ---
-# `stable-code-completion-1.0-3b`
+# `stable-code-3b`

 ## Model Description

-`stable-code-completion-1.0-3b` is a 2.7B billion parameter decoder-only language model pre-trained on 1.3 trillion tokens of diverse textual and code datasets. `stable-code-completion-1.0-3b` is trained on nearly 20 programming languages (selected based on the 2023 StackOverflow Developer Survey) and demonstrates state-of-the-art performance (compared to models of similar size) on the MultiPL-E metrics across multiple programming languages tested using [BigCode's Evaluation Harness](https://github.com/bigcode-project/bigcode-evaluation-harness/tree/main).
+`stable-code-3b` is a 2.7 billion parameter decoder-only language model pre-trained on 1.3 trillion tokens of diverse textual and code datasets. `stable-code-3b` is trained on nearly 20 programming languages (selected based on the 2023 StackOverflow Developer Survey) and demonstrates state-of-the-art performance (compared to models of similar size) on the MultiPL-E metrics across multiple programming languages tested using [BigCode's Evaluation Harness](https://github.com/bigcode-project/bigcode-evaluation-harness/tree/main).

 **Key Features**
 * Fill in Middle Capability (FIM)
@@ -28,23 +28,19 @@ library_name: transformers

 ## Usage

-Get started generating text with `stable-code-completion-1.0-3b` by using the following code snippet:
+Get started generating text with `stable-code-3b` by using the following code snippet:

 ```python
 import torch
 from transformers import AutoModelForCausalLM, AutoTokenizer
-tokenizer = AutoTokenizer.from_pretrained("stabilityai/stable-code-completion-1.0-3b", trust_remote_code=True)
+tokenizer = AutoTokenizer.from_pretrained("stabilityai/stable-code-3b", trust_remote_code=True)
 model = AutoModelForCausalLM.from_pretrained(
-  "stabilityai/stable-code-completion-1.0-3b",
+  "stabilityai/stable-code-3b",
   trust_remote_code=True,
   torch_dtype="auto",
 )
-
-device = "cpu"
-if torch.cuda.is_available():
-  device = "cuda"
-
-inputs = tokenizer("import torch\nimport torch.nn as nn", return_tensors="pt").to(device)
+model.cuda()
+inputs = tokenizer("import torch\nimport torch.nn as nn", return_tensors="pt").to(model.device)
 tokens = model.generate(
   **inputs,
   max_new_tokens=48,
@@ -61,19 +57,15 @@ print(tokenizer.decode(tokens[0], skip_special_tokens=True))

 ```python
 from transformers import AutoModelForCausalLM, AutoTokenizer
-tokenizer = AutoTokenizer.from_pretrained("stabilityai/stable-code-completion-1.0-3b", trust_remote_code=True)
+tokenizer = AutoTokenizer.from_pretrained("stabilityai/stable-code-3b", trust_remote_code=True)
 model = AutoModelForCausalLM.from_pretrained(
-  "stabilityai/stable-code-completion-1.0-3b",
+  "stabilityai/stable-code-3b",
   trust_remote_code=True,
   torch_dtype="auto",
+  attn_implementation="flash_attention_2",
 )
-
-device = "cpu"
-if torch.cuda.is_available():
-  device = "cuda"
-
-inputs = tokenizer("<fim_prefix>def fib(n):<fim_suffix>    else:\n        return fib(n - 2) + fib(n - 1)<fim_middle>", return_tensors="pt").to("cuda")
+model.cuda()
+inputs = tokenizer("<fim_prefix>def fib(n):<fim_suffix>    else:\n        return fib(n - 2) + fib(n - 1)<fim_middle>", return_tensors="pt").to(model.device)
 tokens = model.generate(
   **inputs,
   max_new_tokens=48,
@@ -92,19 +84,15 @@ print(tokenizer.decode(tokens[0], skip_special_tokens=True))

 ```python
 from transformers import AutoModelForCausalLM, AutoTokenizer
-tokenizer = AutoTokenizer.from_pretrained("stabilityai/stable-code-completion-1.0-3b", trust_remote_code=True)
+tokenizer = AutoTokenizer.from_pretrained("stabilityai/stable-code-3b", trust_remote_code=True)
 model = AutoModelForCausalLM.from_pretrained(
-  "stabilityai/stable-code-completion-1.0-3b",
+  "stabilityai/stable-code-3b",
   trust_remote_code=True,
   torch_dtype="auto",
+  attn_implementation="flash_attention_2",
 )
-
-device = "cpu"
-if torch.cuda.is_available():
-  device = "cuda"
-
-inputs = tokenizer("import torch\nimport torch.nn as nn", return_tensors="pt").to("cuda")
+model.cuda()
+inputs = tokenizer("import torch\nimport torch.nn as nn", return_tensors="pt").to(model.device)
 tokens = model.generate(
   **inputs,
   max_new_tokens=48,
@@ -120,7 +108,7 @@ print(tokenizer.decode(tokens[0], skip_special_tokens=True))
 ## Model Details

 * **Developed by**: [Stability AI](https://stability.ai/)
-* **Model type**: `stable-code-completion-1.0-3b` models are auto-regressive language models based on the transformer decoder architecture.
+* **Model type**: `stable-code-3b` models are auto-regressive language models based on the transformer decoder architecture.
 * **Language(s)**: English, Code
 * **Library**: [GPT-NeoX](https://github.com/EleutherAI/gpt-neox)
 * **License**: Other
@@ -149,7 +137,7 @@ The model is pre-trained on the aforementioned datasets in `bfloat16` precision,

 ### Training Infrastructure

-* **Hardware**: `stable-code-completion-1.0-3b` was trained on the Stability AI cluster across 256 NVIDIA A100 40GB GPUs (AWS P4d instances).
+* **Hardware**: `stable-code-3b` was trained on the Stability AI cluster across 256 NVIDIA A100 40GB GPUs (AWS P4d instances).

 * **Software**: We use a fork of `gpt-neox` ([EleutherAI, 2021](https://github.com/EleutherAI/gpt-neox)), train under 2D parallelism (Data and Tensor Parallel) with ZeRO-1 ([Rajbhandari et al., 2019](https://arxiv.org/abs/1910.02054v3)), and rely on flash-attention as well as SwiGLU and Rotary Embedding kernels from FlashAttention-2 ([Dao et al., 2023](https://tridao.me/publications/flash2/flash2.pdf))

@@ -166,9 +154,9 @@ As a base model, this model may exhibit unreliable, unsafe, or other undesirable
 ## How to Cite

 ```bibtex
-@misc{stable-code-completion-1.0-3b,
-url={[https://huggingface.co/stabilityai/stable-code-completion-1.0-3b](https://huggingface.co/stabilityai/stable-code-completion-1.0-3b)},
+@misc{stable-code-3b,
+url={[https://huggingface.co/stabilityai/stable-code-3b](https://huggingface.co/stabilityai/stable-code-3b)},
 title={Stable Code 3B},
 author={Pinnaparaju, Nikhil and Adithyan, Reshinth and Phung, Duy and Tow, Jonathan and Baicoianu, James and Cooper, Nathan}
 }
-```
+```
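For reference, the first usage snippet reads as follows once this commit is applied. The hunks above cut off inside the `generate(...)` call, so the sampling arguments and the closing decode line below are assumptions reconstructed from the `print(tokenizer.decode(tokens[0], skip_special_tokens=True))` fragment in the hunk headers, not lines shown in this diff:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("stabilityai/stable-code-3b", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    "stabilityai/stable-code-3b",
    trust_remote_code=True,
    torch_dtype="auto",
)
model.cuda()  # the commit replaces the device-string bookkeeping with a direct move to GPU

inputs = tokenizer("import torch\nimport torch.nn as nn", return_tensors="pt").to(model.device)
tokens = model.generate(
    **inputs,
    max_new_tokens=48,
    temperature=0.2,  # assumed sampling setting; outside the diff context
    do_sample=True,   # assumed; outside the diff context
)
print(tokenizer.decode(tokens[0], skip_special_tokens=True))
```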
 
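The second snippet exercises the model's Fill in Middle (FIM) capability via the `<fim_prefix>`, `<fim_suffix>`, and `<fim_middle>` sentinel tokens. A minimal sketch of the prompt format, assuming only the token strings visible in the diff; the helper names below are illustrative and not part of the model card:

```python
FIM_PREFIX = "<fim_prefix>"
FIM_SUFFIX = "<fim_suffix>"
FIM_MIDDLE = "<fim_middle>"

def build_fim_prompt(prefix: str, suffix: str) -> str:
    # Prefix and suffix surround the hole; generation starts after
    # <fim_middle> and should produce the code that belongs in between.
    return f"{FIM_PREFIX}{prefix}{FIM_SUFFIX}{suffix}{FIM_MIDDLE}"

def splice(prefix: str, middle: str, suffix: str) -> str:
    # Reassemble the completed source once the middle has been generated.
    return prefix + middle + suffix

prompt = build_fim_prompt(
    "def fib(n):\n",
    "    else:\n        return fib(n - 2) + fib(n - 1)",
)
# A plausible generated middle: "    if n <= 1:\n        return n\n"
```

Note that the two snippets passing `attn_implementation="flash_attention_2"` additionally require the `flash-attn` package and a CUDA GPU; dropping that argument falls back to the default attention implementation.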