No output generated with sample code on non-quantised model

#7
by Pwicke - opened

Hi and thanks for this brilliant model.

I have been running your Colab notebook and it works like a charm on Google Colab. I have also reproduced it on my server with 8x NVIDIA RTX A6000 GPUs: with the exact same code from the notebook, I receive the exact same output:

Question: What's on the picture? Answer: Kittens.

But whatever I do, if I load idefics-9b or idefics-9b-instruct without quantisation instead of the quantised model, I only ever receive:

Question: What's on the picture? Answer:

The only difference between the Colab code and my code is the removal of quantization_config=bnb_config from the IdeficsForVisionText2Text.from_pretrained(...) parameter list. A colleague of mine found their own way of running the model with the code you provided and independently reproduced the exact same issue (Question: What's on the picture? Answer:). I've tried different GPUs and different servers, but without the quantised model I am unable to produce any output. The model loads into memory and is accessed during inference; it just does not generate any new tokens (I have also increased max_new_tokens to 50 and tried other prompts, like the Pokémon example).
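To make the difference concrete, the two loading paths look roughly like this (a sketch; the bnb_config values follow the notebook, and torch_dtype=torch.bfloat16 on the non-quantised path is an assumption on my side):

```python
import torch
from transformers import IdeficsForVisionText2Text, BitsAndBytesConfig

checkpoint = "HuggingFaceM4/idefics-9b-instruct"

# Quantised path from the notebook -- this one answers "Kittens." as expected
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.bfloat16,  # assumed, per the usual notebook setup
)
model = IdeficsForVisionText2Text.from_pretrained(
    checkpoint, quantization_config=bnb_config, device_map="auto"
)

# Non-quantised path -- identical except for the dropped quantization_config;
# this is the one that returns no new tokens on my servers
model = IdeficsForVisionText2Text.from_pretrained(
    checkpoint, torch_dtype=torch.bfloat16, device_map="auto"
)
```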

Any help would be appreciated.

HuggingFaceM4 org

Hi @Pwicke ,
Indeed, that does not sound right.
Could you say more about your environment? In particular, which transformers and tokenizers versions are you on?
I'll try to reproduce the error.
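(If it helps, a quick snippet like this prints the relevant versions:)

```python
# Print the versions of the libraries relevant to this issue
import accelerate, bitsandbytes, tokenizers, torch, transformers

for mod in (accelerate, bitsandbytes, tokenizers, torch, transformers):
    print(mod.__name__, mod.__version__)
```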

Thank you for your response.

accelerate 0.24.0.dev0, bitsandbytes 0.41.1, nvidia-cublas-cu12 12.1.3.1, python 3.10.12, sentencepiece 0.1.99, tokenizers 0.14.1, torch 2.1.0, transformers 4.35.0.dev0

Could I ask for an update on this? @VictorSanh

@Pwicke Have you solved this?

@TITH unfortunately not. I have to use the 4-bit quantised version. I recently tried the full model again, but still no new tokens are being generated. Do you have the same issue?

@Pwicke Yes. But I noticed that running on CPU instead of CUDA avoids the issue. After I switched to torch 2.0.1, CUDA works as well.
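For reference, the CPU fallback looks roughly like this (a sketch; the checkpoint name is assumed, and float32 is chosen because bfloat16 support on CPU varies across torch versions):

```python
import torch
from transformers import IdeficsForVisionText2Text, AutoProcessor

checkpoint = "HuggingFaceM4/idefics-9b-instruct"  # assumed checkpoint

# Workaround: run the full-precision model on CPU instead of CUDA
model = IdeficsForVisionText2Text.from_pretrained(
    checkpoint, torch_dtype=torch.float32
).to("cpu")
processor = AutoProcessor.from_pretrained(checkpoint)
```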

Thanks for the response @TITH. I've tried CPU and it works. But since switching to torch 2.0.1, the model no longer uses my GPU even though it's specified to do so. For now, I am running my experiment on CPU, which is suboptimal.
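For completeness, this is how I check whether torch sees the GPU and where the weights actually ended up (a minimal sketch, assuming model is loaded as above):

```python
import torch

# Confirm this torch build was compiled with CUDA and can see a GPU
print(torch.__version__, torch.version.cuda, torch.cuda.is_available())

# Confirm where the model weights actually live (assumes `model` from above)
print(next(model.parameters()).device)
```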

Upgrading transformers to 4.37 (e.g. pip install -U "transformers>=4.37") can solve this problem.
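A minimal sanity check after the upgrade could look like this (a sketch; the image URL is a placeholder, and the prompt format follows the published IDEFICS examples):

```python
import torch
from transformers import IdeficsForVisionText2Text, AutoProcessor

checkpoint = "HuggingFaceM4/idefics-9b-instruct"
device = "cuda" if torch.cuda.is_available() else "cpu"

model = IdeficsForVisionText2Text.from_pretrained(
    checkpoint, torch_dtype=torch.bfloat16
).to(device)
processor = AutoProcessor.from_pretrained(checkpoint)

# Prompts interleave image URLs and text, as in the IDEFICS examples;
# the URL below is a placeholder, not the notebook's actual image.
prompts = [
    [
        "https://example.com/kittens.jpg",
        "Question: What's on the picture? Answer:",
    ]
]
inputs = processor(prompts, return_tensors="pt").to(device)
generated_ids = model.generate(**inputs, max_new_tokens=50)
print(processor.batch_decode(generated_ids, skip_special_tokens=True)[0])
```

If this prints new tokens after "Answer:", the upgrade fixed the generation issue.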
