Commit fdbb37d by yintongl
1 Parent(s): 76bf1c6

Update README.md

Add ITREX inference example

Files changed (1)
  1. README.md +19 -0
README.md CHANGED
@@ -16,6 +16,25 @@ This model is an int4 model with group_size 128 of [google/gemma-2b](https://hug


  ### Use the model
+ ### INT4 Inference with ITREX on CPU
+ Install the latest [intel-extension-for-transformers](
+ https://github.com/intel/intel-extension-for-transformers)
+ ```python
+ from intel_extension_for_transformers.transformers import AutoModelForCausalLM
+ from transformers import AutoTokenizer
+ quantized_model_dir = "Intel/gemma-2b-int4-inc"
+ model = AutoModelForCausalLM.from_pretrained(quantized_model_dir,
+ device_map="auto",
+ trust_remote_code=False,
+ use_neural_speed=False,
+ )
+ tokenizer = AutoTokenizer.from_pretrained(quantized_model_dir, use_fast=True)
+ print(tokenizer.decode(model.generate(**tokenizer("There is a girl who likes adventure,", return_tensors="pt").to(model.device),max_new_tokens=50)[0]))
+ """
+ <bos>There is a girl who likes adventure, and she is a girl who likes to travel. She is a girl who likes to explore the world and see new things. She is a girl who likes to meet new people and learn about their cultures. She is a girl who likes to take risks
+ """
+ ```
+

  ### INT4 Inference with AutoGPTQ's kernel
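
The hunk header describes the model as int4 with group_size 128. As a rough illustration of what that phrase means, the sketch below quantizes a flat list of weights in groups and keeps one scale and one offset per group. It is not the procedure used to produce this checkpoint, which the diff does not show. The function names `quantize_groupwise` and `dequantize_groupwise` and the simple asymmetric min/max scheme are assumptions made for this example, and real checkpoints typically pack the 4-bit codes rather than storing them as Python integers.

```python
def quantize_groupwise(weights, group_size=128, bits=4):
    """Asymmetric min/max quantization with one (scale, offset) per group.

    Hypothetical helper for illustration only.
    """
    qmax = (1 << bits) - 1  # 15 for 4-bit codes
    groups = []
    for start in range(0, len(weights), group_size):
        group = weights[start:start + group_size]
        lo, hi = min(group), max(group)
        # A constant group has hi == lo; use a dummy scale to avoid dividing by zero.
        scale = (hi - lo) / qmax if hi > lo else 1.0
        # Map each weight to the nearest integer code and clamp to [0, qmax].
        codes = [min(qmax, max(0, round((w - lo) / scale))) for w in group]
        groups.append((codes, scale, lo))
    return groups


def dequantize_groupwise(groups):
    """Rebuild approximate weights from the per-group codes."""
    restored = []
    for codes, scale, lo in groups:
        restored.extend(lo + c * scale for c in codes)
    return restored


if __name__ == "__main__":
    import math

    weights = [math.sin(i / 10.0) for i in range(256)]
    groups = quantize_groupwise(weights, group_size=128)
    restored = dequantize_groupwise(groups)
    max_err = max(abs(a - b) for a, b in zip(weights, restored))
    print("groups:", len(groups), "max abs error:", max_err)
```

In this scheme the per-weight reconstruction error is bounded by half of the group's scale, so a smaller group size tightens the error at the cost of storing more scales and offsets.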