abhinavkulkarni committed
Commit b27e448
1 Parent(s): 3847451

Update README.md

Files changed (1): README.md (+3 -1)
README.md CHANGED
@@ -25,6 +25,8 @@ Please refer to the AWQ quantization license ([link](https://github.com/llm-awq/
 
 This model was successfully tested on CUDA driver v530.30.02 and runtime v11.7 with Python v3.10.11. Please note that AWQ requires NVIDIA GPUs with compute capability of 80 or higher.
 
+For Docker users, the `nvcr.io/nvidia/pytorch:23.06-py3` image is runtime v12.1 but otherwise the same as the configuration above and has also been verified to work.
+
 ## How to Use
 
 ```bash
@@ -61,7 +63,7 @@ q_config = {
 load_quant = hf_hub_download('abhinavkulkarni/mpt-7b-chat-w4-g128-awq', 'pytorch_model.bin')
 
 with init_empty_weights():
-    model = AutoModelForCausalLM.from_pretrained(model_name, config=config,
+    model = AutoModelForCausalLM.from_config(config=config,
         torch_dtype=torch.float16, trust_remote_code=True)
 
 real_quantize_model_weight(model, w_bit=w_bit, q_config=q_config, init_only=True)
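
The switch from `from_pretrained` to `from_config` is the substantive fix here: under accelerate's `init_empty_weights()`, `from_config` builds the module tree on the meta device without fetching any checkpoint, whereas `from_pretrained(model_name, ...)` would first download and materialize the full fp16 weights that the quantized checkpoint is about to replace. Below is a minimal sketch of the resulting load path. The `q_config` keys and `w_bit` value are assumptions read off the `w4-g128` model name and the llm-awq repo (the diff only shows `q_config = {` in the hunk header), and the final `load_checkpoint_and_dispatch` step is one possible way to materialize the weights, not necessarily the README's exact recipe.

```python
# Sketch only: the awq import path, q_config keys, and the dispatch step
# are assumptions based on the llm-awq repo, not shown in this diff.
import torch
from accelerate import init_empty_weights, load_checkpoint_and_dispatch
from awq.quantize.quantizer import real_quantize_model_weight
from huggingface_hub import hf_hub_download
from transformers import AutoConfig, AutoModelForCausalLM

model_name = 'abhinavkulkarni/mpt-7b-chat-w4-g128-awq'
w_bit = 4                                             # "w4" in the model name
q_config = {'zero_point': True, 'q_group_size': 128}  # "g128"; assumed keys

config = AutoConfig.from_pretrained(model_name, trust_remote_code=True)

# Build the model skeleton on the meta device: no download, no allocation.
with init_empty_weights():
    model = AutoModelForCausalLM.from_config(config=config,
        torch_dtype=torch.float16, trust_remote_code=True)

# Swap in AWQ quantized layers (init_only=True: allocate shapes, no math),
# then load the 4-bit checkpoint and place it on the available GPUs.
real_quantize_model_weight(model, w_bit=w_bit, q_config=q_config, init_only=True)
load_quant = hf_hub_download(model_name, 'pytorch_model.bin')
model = load_checkpoint_and_dispatch(model, load_quant, device_map='balanced')
```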