I get expected Float found Half error

#1
by ElevenGames - opened

Trying to use this on Windows with the oobabooga GUI, I get the following error:
...
return torch.layer_norm(input, normalized_shape, weight, bias, eps, torch.backends.cudnn.enabled)
RuntimeError: expected scalar type Float but found Half

I'd appreciate any information on how to make it work. :D

I get this error using autograd_4bit.py for inference. Using plain GPTQ it generates very slowly, but it generates.

The result is worse than 8-bit:
Output generated in 71.59 seconds (0.31 tokens/s, 22 tokens, context 498, seed 300042613)

Got this error too, on Windows with an updated oobabooga:
full error log:
Traceback (most recent call last):
File "D:\AI\OOBABOOGA\text-generation-webui\modules\text_generation.py", line 246, in generate_reply_HF
output = shared.model.generate(**generate_params)[0]
File "D:\AI\OOBABOOGA\installer_files\env\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "D:\AI\OOBABOOGA\installer_files\env\lib\site-packages\transformers\generation\utils.py", line 1485, in generate
return self.sample(
File "D:\AI\OOBABOOGA\installer_files\env\lib\site-packages\transformers\generation\utils.py", line 2524, in sample
outputs = self(
File "D:\AI\OOBABOOGA\installer_files\env\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "D:\AI\OOBABOOGA\installer_files\env\lib\site-packages\transformers\models\opt\modeling_opt.py", line 938, in forward
outputs = self.model.decoder(
File "D:\AI\OOBABOOGA\installer_files\env\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "D:\AI\OOBABOOGA\installer_files\env\lib\site-packages\transformers\models\opt\modeling_opt.py", line 704, in forward
layer_outputs = decoder_layer(
File "D:\AI\OOBABOOGA\installer_files\env\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "D:\AI\OOBABOOGA\installer_files\env\lib\site-packages\transformers\models\opt\modeling_opt.py", line 326, in forward
hidden_states = self.self_attn_layer_norm(hidden_states)
File "D:\AI\OOBABOOGA\installer_files\env\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "D:\AI\OOBABOOGA\installer_files\env\lib\site-packages\torch\nn\modules\normalization.py", line 190, in forward
return F.layer_norm(
File "D:\AI\OOBABOOGA\installer_files\env\lib\site-packages\torch\nn\functional.py", line 2515, in layer_norm
return torch.layer_norm(input, normalized_shape, weight, bias, eps, torch.backends.cudnn.enabled)
RuntimeError: expected scalar type Float but found Half
Output generated in 0.08 seconds (0.00 tokens/s, 0 tokens, context 34, seed 292170800)

Bing says: this error means there is a mismatch between the data type of the input and the data type a PyTorch function expects. In this case, the function expects a Float tensor (32-bit floating-point values) but receives a Half tensor (16-bit floating-point values). This can happen when using mixed precision, a technique that speeds up computation and reduces memory usage by using lower-precision data types.
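That summary does match the traceback: `torch.layer_norm` wants its input and its weight/bias in one dtype, and here the activations are float16 while the LayerNorm parameters are float32. A minimal sketch of the mismatch and one workaround (the layer and tensor here are illustrative, not oobabooga's actual code):

```python
import torch
import torch.nn as nn

# nn.LayerNorm's weight/bias default to float32 ("Float")
ln = nn.LayerNorm(8)
x = torch.randn(2, 8).half()  # float16 ("Half") activations, as in the traceback

# Mixing those two dtypes is what triggers
# "RuntimeError: expected scalar type Float but found Half" on some builds.
# Keeping both sides in one dtype avoids it:
out = ln(x.float())           # upcast the input to match the norm's parameters
print(out.dtype)              # torch.float32

# The opposite direction (ln.half()(x)) also works, if your PyTorch build
# supports half-precision LayerNorm on your device.
```

In webui terms this usually means loading the whole model in a single dtype, e.g. calling `model.half()` (or `model.float()`) after loading so the norm layers and the activations agree.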

idk about that..

Would be fun to see this model work in OobaBooga , curious if there is anything I can do about this :O

I think what can be done about it is quantizing another model without group size... but every way I try it, the thing generates slowly. I will test more with a 3090 and see if it's passable. But there is no reason to use this model when int4 30B models like alpasta/alpacino exist.
