
Error running idefics2-8b-AWQ

#7
by oliverguhr - opened

Hello,
I tried to run the demo code given in the model card. However, I get the following error:

...
/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py in convert(t)
   1148                 return t.to(device, dtype if t.is_floating_point() or t.is_complex() else None,
   1149                             non_blocking, memory_format=convert_to_format)
-> 1150             return t.to(device, dtype if t.is_floating_point() or t.is_complex() else None, non_blocking)
   1151 
   1152         return self._apply(convert)
NotImplementedError: Cannot copy out of meta tensor; no data!

You can reproduce this error with this colab:

https://colab.research.google.com/drive/1sjpD6sMGQEPx2KaGC-Oa9WmxO_sii0wB?usp=sharing

I have the same issue.

HuggingFaceM4 org

Hi!
Don't know whether that's related, but I added this link (https://github.com/casper-hansen/AutoAWQ/pull/444) for a necessary fix to run idefics2-8b-AWQ. I inadvertently did not put the link in the code snippet initially.

I'm trying to pip install directly from that source + pull request, but the error doesn't change. Could you provide a working snippet, or do we just need to wait for the PR to be merged?

I am not sure if the PR will help.
I successfully loaded the model and got rid of the "NotImplementedError: Cannot copy out of meta tensor; no data!" error by using device_map="auto".

from transformers import AutoModelForVision2Seq

# quantization_config as defined earlier (from the model card demo code)
model = AutoModelForVision2Seq.from_pretrained(
    "HuggingFaceM4/idefics2-8b-AWQ",
    quantization_config=quantization_config,
    device_map="auto",  # let accelerate place the weights instead of calling .to(DEVICE)
)#.to(DEVICE)
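
The generation step then follows the demo from the model card; roughly something like this (a sketch for context, with the processor loaded from the base repo and the image URL only an example):

from transformers import AutoProcessor
from transformers.image_utils import load_image

processor = AutoProcessor.from_pretrained("HuggingFaceM4/idefics2-8b")

# any publicly reachable image works for reproducing the issue
image = load_image("https://cdn.britannica.com/61/93061-050-99147DCE/Statue-of-Liberty-Island-New-York-Bay.jpg")

messages = [
    {
        "role": "user",
        "content": [
            {"type": "image"},
            {"type": "text", "text": "What do we see in this image?"},
        ],
    }
]
prompt = processor.apply_chat_template(messages, add_generation_prompt=True)
inputs = processor(text=prompt, images=[image], return_tensors="pt").to(model.device)

generated_ids = model.generate(**inputs, max_new_tokens=64)
print(processor.batch_decode(generated_ids, skip_special_tokens=True))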

However, during generation, I get another error.

TypeError: QuantAttentionFused.forward() missing 1 required positional argument: 'hidden_states'
HuggingFaceM4 org

Hi @oliverguhr and @jzchai, I was able to reproduce your reports. Digging in right now. It is too bad you can't run the AWQ-quantized versions.

@VictorSanh Thanks for looking into this.

Is this available at all? Transformers doesn't even recognize the arch.
ValueError: The checkpoint you are trying to load has model type idefics2 but Transformers does not recognize this architecture. This could be because of an issue with the checkpoint, or because your version of Transformers is out of date.

uv pip freeze | grep transformers
transformers==4.39.3

HuggingFaceM4 org

@knoopx please install transformers from main! idefics2 is not part of a pypi release yet :)

HuggingFaceM4 org

@oliverguhr @jzchai, can you try again? I fixed something and I can run @oliverguhr's original colab with no error (I commented out the second time the model was loaded, though, haha).

the fix I applied: https://huggingface.co/HuggingFaceM4/idefics2-8b-AWQ/commit/b56a9458693297bdd1686e4f7ad1103393d456fc
I forgot to rename some attributes after we changed the names of some modules.

@VictorSanh it's apparently working on my side, but the responses seem quite random; let me play with temperatures/prompts.

Hi @VictorSanh thanks for the fix!

Now I get a CUDA OOM using the standard Colab T4 GPU. The 4-bit quantized model should fit into the 15 GB of a T4, right?

@oliverguhr it's taking up to 17 GB on my PC. As stated in the readme, the image encoder is memory hungry.
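
A rough back-of-envelope (all numbers approximate: ~8B language-model parameters at 4-bit, plus a ~400M-parameter vision encoder assumed to stay in half precision) suggests the weights themselves are not the bottleneck:

# rough estimate only; actual usage also depends on image resolution,
# image splitting, activations, and the KV cache
lm_params = 8e9            # language-model parameters (approximate)
awq_bytes_per_param = 0.5  # 4-bit AWQ weights
vision_params = 0.4e9      # vision encoder, assumed to stay in fp16

weights_gb = (lm_params * awq_bytes_per_param + vision_params * 2) / 1e9
print(f"~{weights_gb:.1f} GB for weights alone")  # roughly 4-5 GB; the rest is activations etc.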

I am getting this error, please tell me how to solve it: ValueError: The checkpoint you are trying to load has model type idefics2 but Transformers does not recognize this architecture. This could be because of an issue with the checkpoint, or because your version of Transformers is out of date.
I am trying to debug it; I have updated transformers but the error still persists.

HuggingFaceM4 org

@kundeshwar20 What version of transformers are you running? The model is only available on the development branch at the moment. You can install from source to get it: pip install git+https://github.com/huggingface/transformers. We'll be doing a release soon, which will include the model in a stable version that can be installed directly from PyPI.
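
A quick way to confirm which build is actually in use (a small generic check, not specific to idefics2):

import transformers

# a source install from main reports a development version such as "4.40.0.dev0";
# 4.39.x and earlier PyPI releases predate idefics2 support
print(transformers.__version__)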

Name: transformers
Version: 4.39.0

but still not working

@VictorSanh Thanks for the fix. It's working fine for me now. The output seems nice with the default generation settings, but it's not very chatty.

HuggingFaceM4 org

I'm glad to hear this! The SFT model is not optimized for chatty scenarios; please stay tuned for idefics2-8b-chatty!

Hello, can anyone provide me a Google Colab link to check my data outputs? I am not able to run it on my system.

HuggingFaceM4 org

Hi @kundeshwar20,
the code snippet at https://huggingface.co/HuggingFaceM4/idefics2-8b#technical-summary provides the expected outputs. Are you able to double-check against these?

HuggingFaceM4 org

@oliverguhr et al.,
I updated the section https://huggingface.co/HuggingFaceM4/idefics2-8b#model-optimizations to include some benchmarks on how to run idefics2 with very little GPU memory.
TL;DR: there are plenty of low-lift setups that require less than 16 GB of GPU memory to run inference.
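
For example, one of the low-lift setups from that section shrinks the vision encoder's memory by lowering the image resolution and disabling image splitting on the processor side (values copied from the model card; a sketch):

from transformers import AutoProcessor

# per the model-optimizations section: smaller images and no splitting
# reduce activation memory for the vision encoder
processor = AutoProcessor.from_pretrained(
    "HuggingFaceM4/idefics2-8b",
    do_image_splitting=False,
    size={"longest_edge": 448, "shortest_edge": 378},
)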

Is this useful for Hindi text image extraction?

HuggingFaceM4 org

Hi @kundeshwar20,
we have primarily trained and evaluated on English, so I would encourage you to run your own Hindi tests and possibly fine-tune on Hindi OCR data.
You should have a lot of the tools already available here and on HF in general to do so! Let me know if you need guidance!
