Commit 3beeee1 by Studeni
1 Parent(s): f186bcf

Update README.md

## Problem:

If the device is not set when the pipeline is created, the inputs stay on the CPU while the model is on CUDA, and generation first emits this warning:
```
UserWarning: You are calling .generate() with the `input_ids` being on a device type different than your model's device. `input_ids` is on cpu, whereas the model is on cuda. You may experience unexpected behaviors or slower generation. Please make sure that you have put `input_ids` to the correct device by calling for example input_ids = input_ids.to('cuda') before running `.generate()`.
```
Generation then fails with this error:
```
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu! (when checking argument for argument index in method wrapper_CUDA__index_select)
```
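The warning's own suggestion, moving `input_ids` to the model's device before generating, can be sketched on a toy module. This is a minimal sketch and not the README's code: the `embedding` layer and the token IDs are hypothetical stand-ins for a real model and tokenizer output, and the CPU fallback is only there so the snippet runs without a GPU.

```python
import torch

# Assumption: a tiny embedding layer stands in for the real model.
device = "cuda:0" if torch.cuda.is_available() else "cpu"
embedding = torch.nn.Embedding(num_embeddings=10, embedding_dim=4).to(device)

# Tensors are created on the CPU unless a device is given.
input_ids = torch.tensor([[1, 2, 3]])

# Move the inputs to wherever the module's weights live before calling it.
input_ids = input_ids.to(embedding.weight.device)
output = embedding(input_ids)

print(input_ids.device, output.shape)
```

On a CUDA machine, skipping the `.to(...)` call would leave `input_ids` on the CPU while the weights are on the GPU, which is the kind of mismatch the error above reports.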

## Solution:

Set the device explicitly when loading the model and tokenizer and in the `pipeline()` call, so that all tensors end up on the same device.
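The change passes `device=device` in several places, but the hunks in this commit do not show where `device` is defined. One possible definition is sketched below. It is an assumption rather than part of the commit, and the CPU fallback only keeps the snippet runnable on machines without a GPU; AWQ inference itself may still require CUDA.

```python
import torch

# Assumption: use the first CUDA GPU when one is available, otherwise the CPU.
device = "cuda:0" if torch.cuda.is_available() else "cpu"

print(f"Using device: {device}")
```

With a definition like this placed before the model is loaded, the `device=device` arguments in the diff would all refer to the same value.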

Files changed (1)
  1. README.md +5 -3
README.md CHANGED
@@ -107,13 +107,14 @@ pip3 install git+https://github.com/casper-hansen/AutoAWQ.git@1c5ccc791fa2cb0697
 ```python
 from awq import AutoAWQForCausalLM
 from transformers import AutoTokenizer
+import torch
 
 model_name_or_path = "TheBloke/Mistral-7B-v0.1-AWQ"
 
 # Load model
 model = AutoAWQForCausalLM.from_quantized(model_name_or_path, fuse_layers=True,
-                                          trust_remote_code=False, safetensors=True)
-tokenizer = AutoTokenizer.from_pretrained(model_name_or_path, trust_remote_code=False)
+                                          trust_remote_code=False, safetensors=True, device=device)
+tokenizer = AutoTokenizer.from_pretrained(model_name_or_path, trust_remote_code=False, device=device)
 
 prompt = "Tell me about AI"
 prompt_template=f'''{prompt}
@@ -154,7 +155,8 @@ pipe = pipeline(
     temperature=0.7,
     top_p=0.95,
     top_k=40,
-    repetition_penalty=1.1
+    repetition_penalty=1.1,
+    device=device,
 )
 
 print(pipe(prompt_template)[0]['generated_text'])