---
license: other
license_name: gemma
license_link: https://ai.google.dev/gemma/prohibited_use_policy
---

# Gemma-7B in 8-bit with bitsandbytes

This is the repository for [Gemma-7B-it](https://huggingface.co/google/gemma-7b-it) quantized to 8-bit using bitsandbytes.
The original model card and license for Gemma-7B can be found [here](https://huggingface.co/google/gemma-7b-it#gemma-model-card).
This is the instruction fine-tuned version of the model, not the base model.
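
To give a sense of why 8-bit quantization is worth it, here is a back-of-envelope estimate of the weight memory for a 7B-parameter model (illustrative arithmetic only, not measured numbers; actual usage also includes activations and framework overhead):

```python
# Rough weight-memory estimate: 8-bit storage halves fp16 storage.
params = 7_000_000_000  # approximate parameter count for Gemma-7B

fp16_gb = params * 2 / 1024**3  # 2 bytes per weight
int8_gb = params * 1 / 1024**3  # 1 byte per weight

print(f"fp16 weights: ~{fp16_gb:.1f} GB")  # ~13.0 GB
print(f"int8 weights: ~{int8_gb:.1f} GB")  # ~6.5 GB
```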

## Usage

Please visit the original Gemma-7B-it [model card](https://huggingface.co/google/gemma-7b-it#usage-and-limitations) for intended uses and limitations.

You can use this model as follows:
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the 8-bit quantized weights
model = AutoModelForCausalLM.from_pretrained("merve/gemma-7b-it-8bit")

# The tokenizer is unchanged, so it can be loaded from the original repository
tokenizer = AutoTokenizer.from_pretrained("google/gemma-7b-it")

chat = [
    {"role": "user", "content": "Write a hello world program"},
]
prompt = tokenizer.apply_chat_template(chat, tokenize=False, add_generation_prompt=True)

# The chat template already prepends the BOS token, so don't add special tokens again
inputs = tokenizer.encode(prompt, add_special_tokens=False, return_tensors="pt")
outputs = model.generate(input_ids=inputs.to(model.device), max_new_tokens=150)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
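
For reference, `apply_chat_template` turns the chat list into a plain string in Gemma's turn-based format before tokenization. A minimal sketch of the equivalent single-turn prompt built by hand (the turn markers follow Gemma's published chat template; treat the exact layout as an assumption and prefer `apply_chat_template`, which reads the template shipped with the tokenizer):

```python
# Hypothetical helper illustrating Gemma's chat prompt layout for one user turn.
def build_gemma_prompt(user_message: str) -> str:
    return (
        "<start_of_turn>user\n"
        f"{user_message}<end_of_turn>\n"
        "<start_of_turn>model\n"
    )

prompt = build_gemma_prompt("Write a hello world program")
print(prompt)
```

The trailing `<start_of_turn>model\n` plays the role of `add_generation_prompt=True`: it cues the model to answer as the assistant rather than continue the user's turn.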