abhinavkulkarni committed
Commit 01ab93b
1 Parent(s): b2720ce

Update README.md

Files changed (1):
  1. README.md +3 -1
README.md CHANGED
@@ -29,6 +29,8 @@ Please refer to the AWQ quantization license ([link](https://github.com/llm-awq/
 
 This model was successfully tested on CUDA driver v530.30.02 and runtime v11.7 with Python v3.10.11. Please note that AWQ requires NVIDIA GPUs with compute capability of 80 or higher.
 
+For Docker users, the `nvcr.io/nvidia/pytorch:23.06-py3` image ships with CUDA runtime v12.1 but otherwise matches the configuration above and has also been verified to work.
+
 ## How to Use
 
 ```bash
@@ -65,7 +67,7 @@ q_config = {
 load_quant = hf_hub_download('abhinavkulkarni/psmathur-orca_mini_v2_7b-w4-g128-awq', 'pytorch_model.bin')
 
 with init_empty_weights():
-    model = AutoModelForCausalLM.from_pretrained(model_name, config=config,
+    model = AutoModelForCausalLM.from_config(config=config,
         torch_dtype=torch.float16, trust_remote_code=True)
 
 real_quantize_model_weight(model, w_bit=w_bit, q_config=q_config, init_only=True)
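A note on the hardware requirement in the diff context: compute capability 80 is CUDA compute capability 8.0 (sm_80), i.e. Ampere-class GPUs such as the A100, A10, or RTX 30-series and newer. A minimal pre-flight check, assuming PyTorch with CUDA support is installed (this snippet is illustrative and not part of the commit):

```python
import torch

# AWQ's 4-bit kernels require compute capability >= 8.0 (sm_80),
# e.g. A100, A10, RTX 30xx/40xx. Fail fast on older GPUs.
major, minor = torch.cuda.get_device_capability()
assert (major, minor) >= (8, 0), (
    f"GPU reports compute capability {major}.{minor}; AWQ needs 8.0 or higher"
)
```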
 
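For the Docker route added above, a typical invocation (assuming the NVIDIA Container Toolkit is installed; the flags are illustrative and not part of the commit) is `docker run --gpus all -it --rm nvcr.io/nvidia/pytorch:23.06-py3`.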
 
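The substantive code change swaps `from_pretrained` for `from_config`: inside `init_empty_weights()` the model skeleton is built from the config alone on the meta device, so the original fp16 weights are never downloaded or allocated; the real 4-bit weights come from the checkpoint fetched with `hf_hub_download`. A sketch of how the full load sequence fits together, with the `q_config` values and `device_map` reconstructed as assumptions rather than taken from the diff:

```python
import torch
from accelerate import init_empty_weights, load_checkpoint_and_dispatch
from awq.quantize.quantizer import real_quantize_model_weight
from huggingface_hub import hf_hub_download
from transformers import AutoConfig, AutoModelForCausalLM

model_name = 'abhinavkulkarni/psmathur-orca_mini_v2_7b-w4-g128-awq'
config = AutoConfig.from_pretrained(model_name, trust_remote_code=True)
w_bit = 4
q_config = {'zero_point': True, 'q_group_size': 128}  # assumed; matches w4-g128

# Build the architecture on the meta device: from_config needs no weight
# files, which is exactly why it replaces from_pretrained in this commit.
with init_empty_weights():
    model = AutoModelForCausalLM.from_config(
        config=config, torch_dtype=torch.float16, trust_remote_code=True
    )

# Swap Linear layers for their 4-bit AWQ equivalents (shapes only, no data).
real_quantize_model_weight(model, w_bit=w_bit, q_config=q_config, init_only=True)

# Materialize the actual quantized weights from the downloaded checkpoint.
load_quant = hf_hub_download(model_name, 'pytorch_model.bin')
model = load_checkpoint_and_dispatch(model, load_quant, device_map='balanced')
model.eval()
```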