HELP!! How can I deal with this problem?

#9 opened by vitm

Loading checkpoint shards: 100%|██████████████████| 2/2 [00:04<00:00, 2.05s/it]
generation_config.json: 100%|██████████████████| 137/137 [00:00<00:00, 7.52kB/s]
Traceback (most recent call last):
  File "test1.py", line 27, in <module>
    print("nexa model result:\n", inference(nexa_query))
  File "test1.py", line 9, in inference
    outputs = model.generate(
  File "/home/vipuser/miniconda3/lib/python3.8/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/home/vipuser/.local/lib/python3.8/site-packages/transformers/generation/utils.py", line 1527, in generate
    result = self._greedy_search(
  File "/home/vipuser/.local/lib/python3.8/site-packages/transformers/generation/utils.py", line 2411, in _greedy_search
    outputs = self(
  File "/home/vipuser/miniconda3/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/vipuser/.local/lib/python3.8/site-packages/transformers/models/gemma/modeling_gemma.py", line 1105, in forward
    outputs = self.model(
  File "/home/vipuser/miniconda3/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/vipuser/.local/lib/python3.8/site-packages/transformers/models/gemma/modeling_gemma.py", line 891, in forward
    causal_mask = self._update_causal_mask(attention_mask, inputs_embeds, cache_position)
  File "/home/vipuser/.local/lib/python3.8/site-packages/transformers/models/gemma/modeling_gemma.py", line 983, in _update_causal_mask
    causal_mask = torch.triu(causal_mask, diagonal=1)
RuntimeError: "triu_tril_cuda_template" not implemented for 'BFloat16'

Here's my solution: alter modeling_gemma.py. The error happens because this PyTorch build has no CUDA triu kernel for bfloat16, so cast the mask to float32 before calling torch.triu and cast it back to bfloat16 afterwards:

        causal_mask = causal_mask.to(torch.float32)                   # added: upcast so the CUDA triu kernel is available
        causal_mask = torch.triu(causal_mask, diagonal=1)             # original line 983
        causal_mask = causal_mask.to('cuda', dtype=torch.bfloat16)    # added: cast the mask back to bfloat16 on the GPU

I am just a beginner in this area. Can you explain it in more detail? I only found causal_mask = torch.triu(causal_mask, diagonal=1) on line 983 of modeling_gemma.py, and I find your modification plan difficult to follow. I don't know where to place modifications 1 and 3. If you could provide a more detailed plan, or send the working file directly to xata20010@gmail.com, I would be immensely grateful.

In modeling_gemma.py, on the line directly above line 983, add:
causal_mask = causal_mask.to(torch.float32)

On the line directly after line 983 of the original modeling_gemma.py, add:
causal_mask = causal_mask.to('cuda', dtype=torch.bfloat16)
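
If it helps, here is a minimal standalone sketch of the same workaround outside the library, so you can see why the round trip through float32 avoids the error. The tensor shape and fill value are just illustrative; only the dtype casts matter.

    import torch

    # On PyTorch builds without a bfloat16 CUDA kernel for triu, this raises:
    #   RuntimeError: "triu_tril_cuda_template" not implemented for 'BFloat16'
    # mask = torch.triu(torch.zeros(4, 4, device="cuda", dtype=torch.bfloat16), diagonal=1)

    # Workaround: upcast to float32 (where the kernel exists), take the upper
    # triangle, then cast the result back to bfloat16 on the GPU.
    mask = torch.zeros(4, 4, device="cuda", dtype=torch.bfloat16)
    mask = mask.to(torch.float32)
    mask = torch.triu(mask, diagonal=1)
    mask = mask.to("cuda", dtype=torch.bfloat16)
    print(mask.dtype)  # torch.bfloat16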

Thanks very much!!

Nexa AI org

Hi, try using the latest PyTorch and install transformers from source. I have been running into a similar problem these days, and I believe they are in the middle of a big upgrade.
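
For reference, upgrading usually comes down to something like the following pip commands (adjust the PyTorch install for your CUDA version and environment):

    pip install --upgrade torch
    pip install git+https://github.com/huggingface/transformers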
