ValueError: Custom 4D attention mask should be passed in inverted form with max==0

#12 by brian-gordon

Hi,

I am trying to run the example code snippet from the Model Card, but I am receiving the following error:

Traceback (most recent call last):
  File "<stdin>", line 2, in <module>
  File "/opt/conda/envs/paligemma-hf/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/opt/conda/envs/paligemma-hf/lib/python3.10/site-packages/transformers/generation/utils.py", line 1824, in generate
    result = self._sample(
  File "/opt/conda/envs/paligemma-hf/lib/python3.10/site-packages/transformers/generation/utils.py", line 2463, in _sample
    outputs = self(
  File "/opt/conda/envs/paligemma-hf/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/opt/conda/envs/paligemma-hf/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
  File "/opt/conda/envs/paligemma-hf/lib/python3.10/site-packages/transformers/models/paligemma/modeling_paligemma.py", line 468, in forward
    outputs = self.language_model(
  File "/opt/conda/envs/paligemma-hf/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/opt/conda/envs/paligemma-hf/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
  File "/opt/conda/envs/paligemma-hf/lib/python3.10/site-packages/transformers/models/gemma/modeling_gemma.py", line 1113, in forward
    outputs = self.model(
  File "/opt/conda/envs/paligemma-hf/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1532, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/opt/conda/envs/paligemma-hf/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1541, in _call_impl
    return forward_call(*args, **kwargs)
  File "/opt/conda/envs/paligemma-hf/lib/python3.10/site-packages/transformers/models/gemma/modeling_gemma.py", line 883, in forward
    causal_mask = self._update_causal_mask(
  File "/opt/conda/envs/paligemma-hf/lib/python3.10/site-packages/transformers/models/gemma/modeling_gemma.py", line 1003, in _update_causal_mask
    raise ValueError("Custom 4D attention mask should be passed in inverted form with max==0`")
ValueError: Custom 4D attention mask should be passed in inverted form with max==0`
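For context, the check in `_update_causal_mask` expects any custom 4D attention mask to be in additive ("inverted") form: 0 where attention is allowed and a large negative value where it is masked, so the tensor's maximum is 0. A minimal sketch of that convention (shapes and dtype here are illustrative only, not taken from the model):

```python
import torch

# Inverted / additive causal mask: 0.0 on allowed positions, dtype-min on masked ones.
seq_len = 4
min_val = torch.finfo(torch.float32).min
inverted_mask = torch.full((1, 1, seq_len, seq_len), min_val)
inverted_mask = torch.triu(inverted_mask, diagonal=1)  # keep min_val above the diagonal, 0 elsewhere

assert inverted_mask.max() == 0  # the condition the ValueError above enforces
```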

The shapes of model_inputs are:

input_ids torch.Size([1, 260])
attention_mask torch.Size([1, 260])
pixel_values torch.Size([1, 3, 224, 224])
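For reference, a minimal sketch of the kind of inference code the Model Card shows and that produces inputs with these shapes (the checkpoint id, prompt, and image path below are placeholders, not the exact values from the card):

```python
import torch
from PIL import Image
from transformers import AutoProcessor, PaliGemmaForConditionalGeneration

model_id = "google/paligemma-3b-mix-224"  # placeholder: use the checkpoint named in the Model Card
model = PaliGemmaForConditionalGeneration.from_pretrained(model_id, torch_dtype=torch.bfloat16).to("cuda")
processor = AutoProcessor.from_pretrained(model_id)

image = Image.open("example.jpg")  # placeholder image path
prompt = "caption en"

model_inputs = processor(text=prompt, images=image, return_tensors="pt").to(model.device)
# input_ids / attention_mask come out as [1, 260]: 256 image tokens for the 224px
# checkpoints plus the text tokens; pixel_values is [1, 3, 224, 224].
input_len = model_inputs["input_ids"].shape[-1]

with torch.inference_mode():
    generation = model.generate(**model_inputs, max_new_tokens=50, do_sample=False)
    print(processor.decode(generation[0][input_len:], skip_special_tokens=True))
```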

Details of my environment:

Python version: 3.10.14
GPU: A100
CUDA version: 12.0
Torch version: 2.3.0
Transformers version: 4.42.0.dev0

Note: with the same versions but on a V100 GPU, I do not see this issue.

Hi @brian-gordon, I ran into the same issue while running the PaliGemma model for VQA. As noted here https://github.com/huggingface/transformers/issues/31171#issuecomment-2145421881, it is a regression introduced by a merge and will be fixed. For now you can work around it by pinning the transformers library to version 4.41.1.
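A minimal sketch of that workaround (assuming the fix has not yet landed in your installed version):

```python
# Pin the library to a release that predates the regression:
#   pip install "transformers==4.41.1"
import transformers

print(transformers.__version__)  # should report 4.41.1 before re-running the example
```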
