Error when running a model on ZeroGPU

#12
by minhdang - opened
ZeroGPU Explorers org

I am running a decoder model, but generation fails with the error below. What should I do?
Assistant:
Exception in thread Thread-10 (generate):
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/threading.py", line 1016, in _bootstrap_inner
    self.run()
  File "/usr/local/lib/python3.10/threading.py", line 953, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/local/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/transformers/generation/utils.py", line 1575, in generate
    result = self._sample(
  File "/usr/local/lib/python3.10/site-packages/transformers/generation/utils.py", line 2697, in _sample
    outputs = self(
  File "/usr/local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/transformers/models/llama/modeling_llama.py", line 1196, in forward
    outputs = self.model(
  File "/usr/local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/transformers/models/llama/modeling_llama.py", line 990, in forward
    causal_mask = self._update_causal_mask(attention_mask, inputs_embeds, cache_position)
  File "/usr/local/lib/python3.10/site-packages/transformers/models/llama/modeling_llama.py", line 1076, in _update_causal_mask
    causal_mask = torch.triu(causal_mask, diagonal=1)
RuntimeError: "triu_tril_cuda_template" not implemented for 'BFloat16'
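
This error usually means the installed PyTorch build has no BFloat16 CUDA kernel for torch.triu, which transformers calls when building the causal mask. A minimal sketch of one common workaround, assuming upgrading torch is not an option and float16 is acceptable for your model; the model id below is a placeholder, not your actual model:

```python
# Minimal sketch of a workaround, assuming the failure comes from
# torch.triu lacking a BFloat16 CUDA kernel in the installed PyTorch.
# Loading in float16 avoids the bf16 triu call; upgrading torch is the
# other common fix. The model id is a placeholder for your decoder model.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "your-org/your-decoder-model"  # placeholder
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # float16 instead of bfloat16
    device_map="auto",
)

inputs = tokenizer("Hello", return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```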

ZeroGPU Explorers org
edited Mar 29

Hi, have you tested your Space on regular (non-ZeroGPU) hardware? (T4 or A10G)
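
In case it helps while debugging: on ZeroGPU, CUDA work has to run inside a function decorated with `@spaces.GPU`, while the same code runs unchanged on regular T4/A10G hardware, so testing there first isolates ZeroGPU-specific issues. A minimal sketch of that pattern for a Gradio Space; the model id and the app wiring are placeholders:

```python
# Minimal sketch of the ZeroGPU pattern for a Gradio Space.
# spaces.GPU is the ZeroGPU decorator; the model id is a placeholder.
import spaces
import torch
import gradio as gr
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "your-org/your-decoder-model"  # placeholder
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16
).to("cuda")  # on ZeroGPU, moving to CUDA at startup is handled by spaces

@spaces.GPU  # a GPU is attached only for the duration of this call
def generate(prompt: str) -> str:
    inputs = tokenizer(prompt, return_tensors="pt").to("cuda")
    out = model.generate(**inputs, max_new_tokens=64)
    return tokenizer.decode(out[0], skip_special_tokens=True)

gr.Interface(fn=generate, inputs="text", outputs="text").launch()
```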
