---
base_model: BEE-spoke-data/smol_llama-101M-GQA-python
datasets:
- BEE-spoke-data/pypi_clean-deduped
inference: false
language:
- en
license: apache-2.0
metrics:
- accuracy
model_creator: BEE-spoke-data
model_name: smol_llama-101M-GQA-python
pipeline_tag: text-generation
quantized_by: afrideva
source_model: BEE-spoke-data/smol_llama-101M-GQA
tags:
- python
- codegen
- markdown
- smol_llama
- gguf
- ggml
- quantized
- q2_k
- q3_k_m
- q4_k_m
- q5_k_m
- q6_k
- q8_0
widget:
- example_title: Add Numbers Function
  text: "def add_numbers(a, b):\n    return\n"
- example_title: Car Class
  text: "class Car:\n    def __init__(self, make, model):\n        self.make = make\n        self.model = model\n\n    def display_car(self):\n"
- example_title: Pandas DataFrame
  text: 'import pandas as pd

    data = {''Name'': [''Tom'', ''Nick'', ''John''], ''Age'': [20, 21, 19]}

    df = pd.DataFrame(data).convert_dtypes()

    # eda

    '
- example_title: Factorial Function
  text: "def factorial(n):\n    if n == 0:\n        return 1\n    else:\n"
- example_title: Fibonacci Function
  text: "def fibonacci(n):\n    if n <= 0:\n        raise ValueError(\"Incorrect input\")\n    elif n == 1:\n        return 0\n    elif n == 2:\n        return 1\n    else:\n"
- example_title: Matplotlib Plot
  text: 'import matplotlib.pyplot as plt

    import numpy as np

    x = np.linspace(0, 10, 100)

    # simple plot

    '
- example_title: Reverse String Function
  text: "def reverse_string(s:str) -> str:\n    return\n"
- example_title: Palindrome Function
  text: "def is_palindrome(word:str) -> bool:\n    return\n"
- example_title: Bubble Sort Function
  text: "def bubble_sort(lst: list):\n    n = len(lst)\n    for i in range(n):\n        for j in range(0, n-i-1):\n"
- example_title: Binary Search Function
  text: "def binary_search(arr, low, high, x):\n    if high >= low:\n        mid = (high + low) // 2\n        if arr[mid] == x:\n            return mid\n        elif arr[mid] > x:\n"
---

# BEE-spoke-data/smol_llama-101M-GQA-python-GGUF

Quantized GGUF model files for [smol_llama-101M-GQA-python](https://huggingface.co/BEE-spoke-data/smol_llama-101M-GQA-python) from [BEE-spoke-data](https://huggingface.co/BEE-spoke-data).

| Name | Quant method | Size |
| ---- | ---- | ---- |
| [smol_llama-101m-gqa-python.fp16.gguf](https://huggingface.co/afrideva/smol_llama-101M-GQA-python-GGUF/resolve/main/smol_llama-101m-gqa-python.fp16.gguf) | fp16 | None |
| [smol_llama-101m-gqa-python.q2_k.gguf](https://huggingface.co/afrideva/smol_llama-101M-GQA-python-GGUF/resolve/main/smol_llama-101m-gqa-python.q2_k.gguf) | q2_k | None |
| [smol_llama-101m-gqa-python.q3_k_m.gguf](https://huggingface.co/afrideva/smol_llama-101M-GQA-python-GGUF/resolve/main/smol_llama-101m-gqa-python.q3_k_m.gguf) | q3_k_m | None |
| [smol_llama-101m-gqa-python.q4_k_m.gguf](https://huggingface.co/afrideva/smol_llama-101M-GQA-python-GGUF/resolve/main/smol_llama-101m-gqa-python.q4_k_m.gguf) | q4_k_m | None |
| [smol_llama-101m-gqa-python.q5_k_m.gguf](https://huggingface.co/afrideva/smol_llama-101M-GQA-python-GGUF/resolve/main/smol_llama-101m-gqa-python.q5_k_m.gguf) | q5_k_m | None |
| [smol_llama-101m-gqa-python.q6_k.gguf](https://huggingface.co/afrideva/smol_llama-101M-GQA-python-GGUF/resolve/main/smol_llama-101m-gqa-python.q6_k.gguf) | q6_k | None |
| [smol_llama-101m-gqa-python.q8_0.gguf](https://huggingface.co/afrideva/smol_llama-101M-GQA-python-GGUF/resolve/main/smol_llama-101m-gqa-python.q8_0.gguf) | q8_0 | None |
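One way to try a file from the table locally is to download it with `huggingface_hub` and run it with `llama-cpp-python`; the sketch below is illustrative only (neither package nor the chosen settings come from this card):

```python
# Illustrative sketch (assumption): run a GGUF quant from the table above
# with llama-cpp-python. Not part of the original card.
# pip install huggingface_hub llama-cpp-python

from huggingface_hub import hf_hub_download
from llama_cpp import Llama

# Download the q8_0 file listed above
gguf_path = hf_hub_download(
    repo_id="afrideva/smol_llama-101M-GQA-python-GGUF",
    filename="smol_llama-101m-gqa-python.q8_0.gguf",
)

# n_ctx and max_tokens are arbitrary example values
llm = Llama(model_path=gguf_path, n_ctx=1024)
out = llm("def add_numbers(a, b):\n    return", max_tokens=64)
print(out["choices"][0]["text"])
```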

## Original Model Card:

# smol_llama-101M-GQA: python

> 400MB of buzz: pure Python programming nectar! 🍯

This model is the general pre-trained checkpoint `BEE-spoke-data/smol_llama-101M-GQA` trained on a deduped version of `pypi` for +1 epoch. Play with the model in [this demo space](https://huggingface.co/spaces/BEE-spoke-data/beecoder-playground).

- Its architecture is the same as the base model, with some new Python-related tokens added to the vocabulary prior to training.
- It can generate basic Python code and README-style markdown, but will struggle with harder planning/reasoning tasks.
- This is an experiment to test the abilities of smol-sized models in code generation, meaning **both** their capabilities and limitations.

Use with care & understand that there may still be some bugs 🐛 to be worked out.

## Usage

Be sure to note:

1. The model uses the "slow" llama2 tokenizer. Set `use_fast=False` when loading the tokenizer.
2. Use `transformers` version 4.33.3, due to a known issue in version 4.34.1 (_at the time of writing_).

> Which llama2 tokenizer the API widget uses is an age-old mystery and may cause minor whitespace issues (widget only).

To install the necessary packages and load the model:

```python
# Install necessary packages
# pip install transformers==4.33.3 accelerate sentencepiece

from transformers import AutoTokenizer, AutoModelForCausalLM

# Load the tokenizer and model
tokenizer = AutoTokenizer.from_pretrained(
    "BEE-spoke-data/smol_llama-101M-GQA-python",
    use_fast=False,
)
model = AutoModelForCausalLM.from_pretrained(
    "BEE-spoke-data/smol_llama-101M-GQA-python",
    device_map="auto",
)

# The model can now be used as any other decoder
```
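For example, a short completion call on one of the widget prompts above; the generation settings here are illustrative choices, not taken from the original card:

```python
# Illustrative generation call; sampling settings are arbitrary examples
prompt = "def add_numbers(a, b):\n    return"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=64,
    do_sample=False,
    repetition_penalty=1.05,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```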

### longer code-gen example

Below is a quick script that can be used as a reference/starting point for writing your own, better one :)

<details>
<summary>🔥 Unleash the Power of Code Generation! Click to Reveal the Magic! 🔮</summary>

Are you ready to witness the incredible possibilities of code generation? Brace yourself for an exceptional journey into the world of artificial intelligence and programming. Observe a script that will change the way you create and finalize code.

This script provides entry to a planet where machines can write code with remarkable precision and imagination.

```python
"""
simple script for testing model(s) designed to generate/complete code

See details/args with:
    python textgen_inference_code.py --help
"""
import logging
import random
import time
from pathlib import Path

import fire
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

logging.basicConfig(format="%(levelname)s - %(message)s", level=logging.INFO)


class Timer:
    """Basic context-manager timer utility."""

    def __enter__(self):
        self.start_time = time.perf_counter()
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self.end_time = time.perf_counter()
        self.elapsed_time = self.end_time - self.start_time
        logging.info(f"Elapsed time: {self.elapsed_time:.4f} seconds")


def load_model(model_name, use_fast=False):
    """Load the tokenizer and model."""
    logging.info(f"Loading model: {model_name}")
    tokenizer = AutoTokenizer.from_pretrained(model_name, use_fast=use_fast)
    model = AutoModelForCausalLM.from_pretrained(
        model_name, torch_dtype="auto", device_map="auto"
    )
    model = torch.compile(model)
    return tokenizer, model


def run_inference(prompt, model, tokenizer, max_new_tokens: int = 256):
    """
    Generate a completion for `prompt` and return the decoded text.

    Args:
        prompt (str): text to complete
        model: loaded causal LM
        tokenizer: matching tokenizer
        max_new_tokens (int, optional): max new tokens to generate

    Returns:
        str: decoded model output
    """
    logging.info(f"Running inference with max_new_tokens={max_new_tokens} ...")
    with Timer() as timer:
        inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
        outputs = model.generate(
            **inputs,
            max_new_tokens=max_new_tokens,
            min_new_tokens=8,
            renormalize_logits=True,
            no_repeat_ngram_size=8,
            repetition_penalty=1.04,
            num_beams=4,
            early_stopping=True,
        )
    text = tokenizer.batch_decode(outputs, skip_special_tokens=True)[0]
    logging.info(f"Output text:\n\n{text}")
    return text


def main(
    model_name="BEE-spoke-data/smol_llama-101M-GQA-python",
    prompt: str = None,
    use_fast=False,
    n_tokens: int = 256,
):
    """
    Run a single test generation.

    Args:
        model_name (str, optional): model repo to load
        prompt (str, optional): specify the prompt directly (default: random choice from list)
        use_fast (bool, optional): use the fast tokenizer (default: False)
        n_tokens (int, optional): max new tokens to generate
    """
    logging.info(f"Inference with:\t{model_name}, max_new_tokens:{n_tokens}")

    if prompt is None:
        prompt_list = [
            '''
def print_primes(n: int):
    """
    Print all primes between 1 and n
    """''',
            "def quantum_analysis(",
            "def sanitize_filenames(target_dir:str, recursive:False, extension",
        ]
        prompt = random.SystemRandom().choice(prompt_list)

    logging.info(f"Using prompt:\t{prompt}")

    tokenizer, model = load_model(model_name, use_fast=use_fast)

    run_inference(prompt, model, tokenizer, n_tokens)


if __name__ == "__main__":
    fire.Fire(main)
```

Wowoweewa!! It can create some file cleaning utilities.
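If the script is saved as `textgen_inference_code.py` (the filename its own docstring assumes), its helpers can also be imported and called directly rather than going through `fire`; a minimal sketch:

```python
# Usage sketch (assumption): import the script above as a module
from textgen_inference_code import load_model, run_inference

tokenizer, model = load_model("BEE-spoke-data/smol_llama-101M-GQA-python")
completion = run_inference("def quantum_analysis(", model, tokenizer, max_new_tokens=128)
print(completion)
```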

</details>

---