Output failure

#1
by StableDiffusion69 - opened

I can load the model in oobabooga with the `cpu` and `desc_act` switches enabled on my 8GB VRAM card. But when I enter a prompt, there is no response and I get this error:

2023-07-02 09:03:45 INFO:Loading JCTN_pygmalion-13b-4bit-128g...
2023-07-02 09:03:45 INFO:The AutoGPTQ params are: {'model_basename': '4bit-128g', 'device': 'cpu', 'use_triton': False, 'inject_fused_attention': True, 'inject_fused_mlp': True, 'use_safetensors': True, 'trust_remote_code': False, 'max_memory': None, 'quantize_config': BaseQuantizeConfig(bits=4, group_size=128, damp_percent=0.01, desc_act=True, sym=True, true_sequential=True, model_name_or_path=None, model_file_base_name=None), 'use_cuda_fp16': True}
2023-07-02 09:03:45 WARNING:The model weights are not tied. Please use the tie_weights method before using the infer_auto_device function.
2023-07-02 09:03:45 WARNING:The safetensors archive passed at models\JCTN_pygmalion-13b-4bit-128g\4bit-128g.safetensors does not contain metadata. Make sure to save your model with the save_pretrained method. Defaulting to 'pt' metadata.
2023-07-02 09:04:32 WARNING:skip module injection for FusedLlamaMLPForQuantizedModel not support integrate without triton yet.
2023-07-02 09:04:32 INFO:Loaded the model in 47.16 seconds.

============================================================
Traceback (most recent call last):
  File "F:\Programme\oobabooga_windows\text-generation-webui\modules\callbacks.py", line 55, in gentask
    ret = self.mfunc(callback=_callback, *args, **self.kwargs)
  File "F:\Programme\oobabooga_windows\text-generation-webui\modules\text_generation.py", line 289, in generate_with_callback
    shared.model.generate(**kwargs)
  File "F:\Programme\oobabooga_windows\installer_files\env\lib\site-packages\auto_gptq\modeling_base.py", line 422, in generate
    with torch.inference_mode(), torch.amp.autocast(device_type=self.device.type):
  File "F:\Programme\oobabooga_windows\installer_files\env\lib\site-packages\auto_gptq\modeling_base.py", line 411, in device
    device = [d for d in self.hf_device_map.values() if d not in {'cpu', 'disk'}][0]
IndexError: list index out of range
Output generated in 0.32 seconds (0.00 tokens/s, 0 tokens, context 1056, seed 1992812162)
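For context, the failing line in the traceback filters `hf_device_map` for non-CPU devices and takes the first one. A minimal sketch of that lookup (the map contents here are hypothetical, assuming a model loaded entirely on CPU) shows why it raises `IndexError` when no GPU device is present:

```python
# Minimal reproduction of the device lookup from auto_gptq's modeling_base.py.
# Hypothetical hf_device_map for a model placed entirely on CPU:
hf_device_map = {"model.layers.0": "cpu", "model.layers.1": "cpu"}

# The library keeps only devices that are neither 'cpu' nor 'disk' ...
gpu_devices = [d for d in hf_device_map.values() if d not in {"cpu", "disk"}]

# ... and then indexes [0]. With a CPU-only load the list is empty,
# so gpu_devices[0] raises IndexError: list index out of range.
print(gpu_devices)
```

So the error suggests the CPU-only code path in that AutoGPTQ version assumes at least one layer sits on a GPU.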

Any ideas on how to fix this, please? πŸ€”
