Update README.md
README.md CHANGED
@@ -3,3 +3,33 @@ license: other
 license_name: gemma
 license_link: https://ai.google.dev/gemma/prohibited_use_policy
 ---
+# Gemma-7B-it in 8-bit with bitsandbytes
+
+This repository contains [Gemma-7B-it](https://huggingface.co/google/gemma-7b-it) quantized to 8-bit with bitsandbytes.
+The original model card and license for Gemma-7B-it can be found [here](https://huggingface.co/google/gemma-7b-it#gemma-model-card).
+This is the instruction-tuned variant of Gemma-7B, not the base model.
+
+## Usage
+
+Please see the original Gemma-7B-it [model card](https://huggingface.co/google/gemma-7b-it#usage-and-limitations) for intended uses and limitations.
+
+You can use this model as follows:
+
+```python
+from transformers import AutoModelForCausalLM, AutoTokenizer
+
+# Load the 8-bit checkpoint; this requires the bitsandbytes and accelerate packages.
+model = AutoModelForCausalLM.from_pretrained("merve/gemma-7b-it-8bit")
+tokenizer = AutoTokenizer.from_pretrained("google/gemma-7b-it")
+
+chat = [
+    {"role": "user", "content": "Write a hello world program"},
+]
+# The chat template already prepends <bos>, so do not add special tokens again.
+prompt = tokenizer.apply_chat_template(chat, tokenize=False, add_generation_prompt=True)
+inputs = tokenizer.encode(prompt, add_special_tokens=False, return_tensors="pt")
+
+# Generate up to 150 new tokens and print the decoded text.
+outputs = model.generate(input_ids=inputs.to(model.device), max_new_tokens=150)
+print(tokenizer.decode(outputs[0]))
+```
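Note on the chat-template step in the usage snippet: the following is a minimal plain-Python sketch of what `tokenizer.apply_chat_template` is expected to produce for a single user turn. The helper name `render_gemma_prompt` is hypothetical, and the marker strings (`<bos>`, `<start_of_turn>`, `<end_of_turn>`) are assumptions about the Gemma turn format rather than values read from the tokenizer, so treat the output as illustrative only. If the rendered string does start with a beginning-of-sequence marker as assumed here, encoding it with special tokens enabled would add a second one.

```python
# Hypothetical helper: a plain-Python approximation of a Gemma-style chat
# template. The marker strings below are assumptions, not values read from
# the tokenizer; prefer tokenizer.apply_chat_template in real use.
BOS = "<bos>"
START = "<start_of_turn>"
END = "<end_of_turn>"


def render_gemma_prompt(chat, add_generation_prompt=True):
    """Render a list of {"role", "content"} messages into one prompt string."""
    parts = [BOS]
    for message in chat:
        # Assumed convention: the assistant role is written as "model".
        role = "model" if message["role"] == "assistant" else message["role"]
        parts.append(f"{START}{role}\n{message['content'].strip()}{END}\n")
    if add_generation_prompt:
        # Open a model turn so that generation continues as the reply.
        parts.append(f"{START}model\n")
    return "".join(parts)


chat = [{"role": "user", "content": "Write a hello world program"}]
prompt = render_gemma_prompt(chat)
print(prompt)
```

Comparing this string with the real output of `tokenizer.apply_chat_template(chat, tokenize=False, add_generation_prompt=True)` is a quick way to check whether the assumed markers match the tokenizer actually in use.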