onnx-community/paligemma2-3b-pt-224 · PaliGemma 2 ONNX doesn't support object detection?

Dec 26, 2024

•

edited Dec 26, 2024

Hi, thanks for sharing the ONNX weights for PaliGemma 2. They work well for image captioning, but when I tried several object detection prompts using the detect keyword, I got nothing back.
E.g., detect person was one of the prompts, but the response was null.

Are the converted model weights compatible only with captioning tasks?

Xenova

ONNX Community org Dec 26, 2024

Hmm, it should work. Could you share the code you are using?

Xenova

ONNX Community org Dec 26, 2024

Also, can you confirm the original (pytorch) version works correctly for your image/prompt?

NSTiwari

Dec 26, 2024

@Xenova: Okay, after experimenting with various prompts, I was able to get the bounding box coordinates. Unlike the original PaliGemma 2 weights, where a simple <image>detect person works, I had to use the more specific prompt <image>detect bounding box of person to make it work.
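For anyone who wants to turn the detection output into pixel boxes, here is a minimal sketch. It assumes the model answers in the usual PaliGemma detection format: four `<locNNNN>` tokens in the order y_min, x_min, y_max, x_max (each binned into 0 to 1023), followed by a label, with multiple objects separated by `;`. That format and the `parse_detections` helper are assumptions for illustration; neither is confirmed in this thread.

```python
import re

# Four <locNNNN> tokens followed by a label, e.g.
# "<loc0256><loc0128><loc0768><loc0896> person"
_BOX_PATTERN = re.compile(
    r"<loc(\d{4})><loc(\d{4})><loc(\d{4})><loc(\d{4})>\s*([^;<]+)"
)


def parse_detections(text, image_width, image_height):
    """Convert location tokens into (x_min, y_min, x_max, y_max) pixel boxes.

    Assumption: token order is y_min, x_min, y_max, x_max on a 1024-bin grid.
    """
    detections = []
    for match in _BOX_PATTERN.finditer(text):
        # Normalize each bin index to the 0..1 range.
        y_min, x_min, y_max, x_max = (int(v) / 1024 for v in match.groups()[:4])
        detections.append({
            "label": match.group(5).strip(),
            "box": (
                x_min * image_width,
                y_min * image_height,
                x_max * image_width,
                y_max * image_height,
            ),
        })
    return detections
```

An empty or caption-only response yields an empty list, which makes the "no boxes returned" case easy to spot when comparing prompts.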

NSTiwari

Dec 29, 2024

•

edited Dec 29, 2024

Hi @Xenova, is it possible to run this using vanilla JS by loading Transformers.js via a CDN?
I get an error when I load it like this:

import { AutoProcessor, PaliGemmaForConditionalGeneration } from 'https://cdn.jsdelivr.net/npm/@huggingface/transformers@3.2.4';

biswajitdevsarma

ONNX Community org Feb 10

How are you converting the model to ONNX? Optimum does not support the image-text-to-text task. Please help.

$ optimum-cli export onnx --model google/paligemma-3b-pt-224 paligemma-3b-pt-224_onnx/
KeyError: "Unknown task: image-text-to-text.

I tried specifying one of the existing tasks, image-to-text, but that throws another error:
$ optimum-cli export onnx --model google/paligemma-3b-pt-224 --task image-to-text paligemma-3b-pt-224_onnx/

ValueError: Trying to export a paligemma model, that is a custom or unsupported architecture, but no custom onnx configuration was passed as custom_onnx_configs. Please refer to Export a model to ONNX with optimum.exporters.onnx for an example on how to export custom models. Please open an issue at GitHub · Where software is built if you would like the model type paligemma to be supported natively in the ONNX export.

Xenova

ONNX Community org Feb 10

paligemma2 uses a custom conversion script, which I have added here: https://github.com/huggingface/transformers.js/issues/1126#issuecomment-2575525385

Hope that helps!

NSTiwari

Feb 11

@Xenova : I've commented on the GitHub issue about an error. Could you please check?

RuntimeError: The serialized model is larger than the 2GiB limit imposed by the protobuf library. Therefore the output file must be a file path, so that the ONNX external data can be written to the same directory. Please specify the output file name.
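Some back-of-the-envelope arithmetic shows why a model of this size hits the limit named in the error. This is only a sketch: the parameter count of roughly 3 billion is an assumption read off the "3b" in the model name, and the helper names are hypothetical. It explains why the weights cannot fit in a single protobuf file; it does not address the environment-specific failure itself.

```python
PROTOBUF_LIMIT_BYTES = 2 * 1024**3  # 2 GiB cap on a single serialized protobuf message


def weights_size_bytes(num_params, bytes_per_param):
    """Approximate size of the raw weights, ignoring graph metadata."""
    return num_params * bytes_per_param


def needs_external_data(num_params, bytes_per_param):
    """True when the weights alone exceed the protobuf limit."""
    return weights_size_bytes(num_params, bytes_per_param) > PROTOBUF_LIMIT_BYTES


# Assumed ~3 billion parameters, taken from the "3b" in the model name.
for name, width in [("fp32", 4), ("fp16", 2)]:
    size_gib = weights_size_bytes(3_000_000_000, width) / 1024**3
    print(f"{name}: {size_gib:.1f} GiB, "
          f"external data needed: {needs_external_data(3_000_000_000, width)}")
```

Under that assumption both fp32 and fp16 are well past 2 GiB, so the exporter has to write the weights as external data next to the .onnx file, which is why it insists on a file path for the output.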

biswajitdevsarma

ONNX Community org Feb 11

@Xenova Thanks. That helps.

NSTiwari

Feb 11

@biswajitdevsarma : Did it work for you?

biswajitdevsarma

ONNX Community org Feb 11

@NSTiwari Conversion to onnx worked. Haven't checked inference using onnx yet.

NSTiwari

Feb 11

@biswajitdevsarma : Do you mind sharing the notebook? When I tried doing the same, I got the above error.

biswajitdevsarma

ONNX Community org Feb 11

•

edited Feb 11

@NSTiwari I used the code above and just commented out the onnxslim part:

# Attempt to optimize the model with onnxslim
"""
try:
    onnx_model = onnxslim.slim(temp_model_path)
except Exception as e:
    print(f"Failed to slim {temp_model_path}: {e}")
    onnx_model = onnx.load(temp_model_path)
"""
# Load the exported model as-is, without slimming
onnx_model = onnx.load(temp_model_path)

Everything else is the same.
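Instead of commenting the block out, the slimming step could be made optional. The sketch below is one way to do that; `load_onnx_model` is a hypothetical helper, not part of the conversion script, and the loader and slimmer are passed in as callables so the example runs without onnx or onnxslim installed.

```python
def load_onnx_model(path, load_fn, slim_fn=None):
    """Return the model at `path`, slimmed when `slim_fn` is given and succeeds.

    Falls back to `load_fn(path)` when slimming is skipped or raises.
    """
    if slim_fn is not None:
        try:
            return slim_fn(path)
        except Exception as e:
            # Same fallback as the original script: report and load unslimmed.
            print(f"Failed to slim {path}: {e}")
    return load_fn(path)


# Intended usage in the conversion script (assumes onnx and onnxslim are installed):
#   onnx_model = load_onnx_model(temp_model_path, onnx.load)                 # skip slimming
#   onnx_model = load_onnx_model(temp_model_path, onnx.load, onnxslim.slim)  # try slimming
```

Passing the functions in keeps the decision in one place and preserves the original fallback behaviour when slimming fails.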

NSTiwari

Feb 11

•

edited Feb 11

@biswajitdevsarma
I used the same code too. Maybe I'm missing some dependencies, or there are version compatibility issues. Here's my notebook. Could you please take a look? I'd really appreciate your help.

biswajitdevsarma

ONNX Community org Feb 13

•

edited Feb 13

@NSTiwari
I installed the following packages in a Python 3.10 environment:
pip install -q --upgrade git+https://github.com/huggingface/transformers.git
pip install -q datasets lightning
pip install -q peft accelerate bitsandbytes
pip install -q --upgrade wandb
pip install Pillow
pip install tensorboardX
npm i @huggingface/transformers

biswajitdevsarma

ONNX Community org Feb 13

•

edited Feb 13

The inference code at https://huggingface.co/onnx-community/paligemma2-3b-pt-224 doesn't work when I use a local model path:
const model_path = './my_local_onnx_model';

Error: Unauthorized access to file: "https://huggingface.co/./my_local_onnx_model/resolve/main/preprocessor_config.json".

Any idea how to make it work with a local model path?
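The URL in the error hints at what happened: the local path appears to have been treated as a Hub repo ID and spliced into a download URL. The sketch below only reproduces that string. `hub_file_url` is a hypothetical helper, and the URL template is inferred from the error message, not taken from the library source. If that reading is right, the loader needs to be told to resolve models locally; Transformers.js has `env.allowLocalModels`, `env.localModelPath`, and `env.allowRemoteModels` settings that look relevant, though this thread does not confirm that they resolve the error.

```python
def hub_file_url(model_id, filename, revision="main"):
    """Build a Hub download URL the way the error message suggests it was built.

    Assumption: the template is inferred from the URL shown in the error.
    """
    return f"https://huggingface.co/{model_id}/resolve/{revision}/{filename}"


# A relative path passed as the model ID ends up inside a remote URL.
print(hub_file_url("./my_local_onnx_model", "preprocessor_config.json"))
```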

NSTiwari

Feb 13

@biswajitdevsarma
What about the ONNX libraries?

biswajitdevsarma

ONNX Community org Feb 13

@NSTiwari oh yes
pip install optimum[exporters]
pip install onnxslim

NSTiwari

Feb 14

@biswajitdevsarma: I can try the local model path at my end, but I'm not even able to convert to ONNX yet. I still get the same error as before.

Did you convert it on your local machine or in Colab?

Xenova

ONNX Community org Feb 14

@NSTiwari That is unfortunately a bug in PyTorch that affects some environments. If you could open an issue on their GitHub, that would be great!

You can override this by going to the source code and adding GLOBALS.onnx_shape_inference = False just before that line.

NSTiwari

Feb 15

@Xenova Tried overriding by adding GLOBALS.onnx_shape_inference = False, but that didn't help either.

Xenova

ONNX Community org Feb 15

•

edited Feb 15

When running in Google Colab, remember to restart the runtime after making the change.

The error pointing to the line that sets the global variable (instead of the lines below it) suggests the runtime was not restarted after the change.
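The reason a restart matters is that Python keeps already-imported modules in memory, so editing a library's source file has no effect on the running session. The sketch below demonstrates this with a throwaway module; `patched_lib_demo` and `SHAPE_INFERENCE` are made-up names for illustration. For a large package such as PyTorch, restarting the runtime is the safer choice over reloading.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def demo_stale_import():
    """Show that editing a module's source does not affect an already-imported copy."""
    with tempfile.TemporaryDirectory() as tmp:
        module_file = Path(tmp) / "patched_lib_demo.py"
        module_file.write_text("SHAPE_INFERENCE = True\n")
        sys.path.insert(0, tmp)
        try:
            importlib.invalidate_caches()
            module = importlib.import_module("patched_lib_demo")
            before = module.SHAPE_INFERENCE

            # Edit the source on disk, like patching a library file in Colab.
            module_file.write_text("SHAPE_INFERENCE = False\n")

            # Importing again returns the cached module, so the edit is invisible.
            again = importlib.import_module("patched_lib_demo")
            after_reimport = again.SHAPE_INFERENCE

            # Only re-executing the source (reload, or a runtime restart) picks it up.
            importlib.invalidate_caches()
            after_reload = importlib.reload(module).SHAPE_INFERENCE
        finally:
            sys.path.remove(tmp)
            sys.modules.pop("patched_lib_demo", None)
    return before, after_reimport, after_reload
```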

NSTiwari

Feb 15

•

edited Feb 16

Thanks, @Xenova. This partially worked. However, only the following 5 files are generated, as opposed to all the files in the official onnx-community repo:

Do I need to specify the quantization flag to get q8, q4, and fp16 files? If yes, where?
Or do I first need to quantize PaliGemma 2 using bnb and then go ahead with the normal ONNX conversion process?

The execution stopped with the message below:

/usr/local/lib/python3.11/dist-packages/transformers/models/gemma2/modeling_gemma2.py:625: TracerWarning: Converting a tensor to a Python number might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  attention_mask.shape[-1] if attention_mask.dim() == 2 else cache_position[-1].item()
/usr/local/lib/python3.11/dist-packages/transformers/models/gemma2/modeling_gemma2.py:640: TracerWarning: torch.tensor results are registered as constants in the trace. You can safely ignore this warning if you use this function to create tensors out of constant variables that would be the same every time you call this function. In any other case, this might cause the trace to be incorrect.
  normalizer = torch.tensor(self.config.hidden_size**0.5, dtype=hidden_states.dtype)
/usr/local/lib/python3.11/dist-packages/transformers/models/gemma2/modeling_gemma2.py:294: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  effective_seq_len = max(cache_position.shape[0], self.sliding_window)
Failed to slim output/google/paligemma2-3b-pt-224/temp/decoder_model_merged.onnx: Error parsing message
Failed to slim output/google/paligemma2-3b-pt-224/temp/embed_tokens.onnx: Error parsing message
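To compare a local export against the published repo, a small check like the one below can list which quantized variants are missing. This is a sketch under assumptions: the suffix convention in `DTYPE_SUFFIXES` (for example `_quantized` for q8) is assumed from how quantized ONNX files are commonly named in onnx-community repos and should be verified against the repo itself. The base names come from the log above, and `missing_variants` is a hypothetical helper.

```python
from pathlib import Path

# Assumed suffix convention for quantized ONNX files; verify against the target repo.
DTYPE_SUFFIXES = {"fp32": "", "fp16": "_fp16", "q8": "_quantized", "q4": "_q4"}


def missing_variants(onnx_dir, base_names, dtypes=("fp32", "fp16", "q8", "q4")):
    """Return the expected .onnx filenames that are not present in `onnx_dir`."""
    onnx_dir = Path(onnx_dir)
    missing = []
    for base in base_names:
        for dtype in dtypes:
            filename = f"{base}{DTYPE_SUFFIXES[dtype]}.onnx"
            if not (onnx_dir / filename).exists():
                missing.append(filename)
    return missing


# Example call (the directory is a placeholder for wherever the export was written):
#   missing_variants("path/to/exported/onnx", ["decoder_model_merged", "embed_tokens"])
```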

NSTiwari

Feb 16

Thank you, @Xenova. I'm finally able to convert PaliGemma 2 to ONNX and quantize it. Thanks for all the help.