Update README.md
README.md CHANGED

@@ -5,7 +5,7 @@ datasets:
 
 ## Model Details
 
-This model is an int4 model with group_size 128 and symmetric quantization of [falcon-three-7b]() generated by [intel/auto-round](https://github.com/intel/auto-round). Load the model with revision
+This model is an int4 model with group_size 128 and symmetric quantization of [falcon-three-7b]() generated by [intel/auto-round](https://github.com/intel/auto-round). Load the model with revision `a10e358` to use AutoGPTQ format, with revision `e9aa317` to use AutoAWQ format
 
 ## How To Use
 ### INT4 Inference(CPU/HPU/CUDA)
@@ -18,7 +18,7 @@ tokenizer = AutoTokenizer.from_pretrained(quantized_model_dir)
 model = AutoModelForCausalLM.from_pretrained(
     quantized_model_dir,
     device_map="auto"
-    ## revision="" ##AutoGPTQ format
+    ## revision="a10e358" ##AutoGPTQ format
     ## revision="e9aa317" ##AutoAWQ format
 )
 text = "How many r in strawberry? The answer is "