'QWenTokenizer' object has no attribute 'IMAGE_ST'

#1
by Neman - opened

I get this error when trying to run example code snippet:
Exception has occurred: AttributeError
'QWenTokenizer' object has no attribute 'IMAGE_ST'
File "/home/neman/.cache/huggingface/modules/transformers_modules/MMInstruction_Silkie/tokenization_qwen.py", line 223, in _add_tokens
if surface_form not in SPECIAL_TOKENS + self.IMAGE_ST:
File "/home/neman/.cache/huggingface/modules/transformers_modules/MMInstruction_Silkie/tokenization_qwen.py", line 120, in __init__
super().__init__(**kwargs)
File "/home/neman/PROGRAMMING/PYTHON/QwenVL/SilkieVL_test1.py", line 6, in <module>
tokenizer = AutoTokenizer.from_pretrained(
AttributeError: 'QWenTokenizer' object has no attribute 'IMAGE_ST'

Multi-modal Multilingual Instruction org

Hello @Neman thank you for your interest in Silkie!

Regarding the issue you encountered, could you please provide more information about your environment so we can reproduce it? It is strange, since QWenTokenizer does have the attribute IMAGE_ST (see here). It might be related to dependencies. You can find the installation instructions for our environment in our GitHub repository.

I took a quick look at this, since I ran into the problem myself. The error occurs because super().__init__(**kwargs) in QWenTokenizer calls QWenTokenizer._add_tokens(), which requires IMAGE_ST; but the super().__init__() call happens before IMAGE_ST is defined in the initializer, so the _add_tokens() call crashes with an AttributeError. Commenting out lines 223 and 224 seems to fix it as a quick hack, but hopefully someone with a better understanding can fix it more properly.
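The ordering problem described above can be sketched with a minimal, self-contained example. The class names below are hypothetical stand-ins, not the real Qwen code: Base plays the role of PreTrainedTokenizer (whose initializer calls an overridable _add_tokens hook), and the two subclasses show why the attribute must be assigned before super().__init__() runs.

```python
# Minimal sketch (hypothetical classes, not the actual tokenization_qwen.py)
# of why an attribute read by an overridden hook must be set before the
# base-class initializer is called.

class Base:
    def __init__(self, **kwargs):
        # Mirrors PreTrainedTokenizer >= 4.34: the base initializer invokes a
        # hook that subclasses override.
        self._add_tokens(["<tok>"])

    def _add_tokens(self, tokens):
        pass


class BuggyTokenizer(Base):
    def __init__(self):
        super().__init__()           # _add_tokens runs here, IMAGE_ST not set yet
        self.IMAGE_ST = ("<img>",)   # defined too late

    def _add_tokens(self, tokens):
        # Reads self.IMAGE_ST, which does not exist during Base.__init__
        return [t for t in tokens if t not in self.IMAGE_ST]


class FixedTokenizer(Base):
    def __init__(self):
        self.IMAGE_ST = ("<img>",)   # define BEFORE the base initializer runs
        super().__init__()

    def _add_tokens(self, tokens):
        return [t for t in tokens if t not in self.IMAGE_ST]
```

Constructing BuggyTokenizer raises AttributeError: 'BuggyTokenizer' object has no attribute 'IMAGE_ST', while FixedTokenizer initializes cleanly, which matches the traceback reported above.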

Multi-modal Multilingual Instruction org

Hi @bob80333 I think you are right! After a deeper investigation, I found the issue comes from the refactor of PreTrainedTokenizer in Transformers release 4.34. Since v4.34, self._add_tokens is called in the initializer of PreTrainedTokenizer. We followed Qwen-VL and used transformers < 4.34, which prevented us from encountering this issue. I suggest using the same version to reproduce our experiments, or customizing tokenization_qwen.py as suggested by @bob80333 .

Thank you both. Bob's suggestion solved it.
I did a few tests with random photos and compared with Qwen-VL-Chat-Int4. It is a little bit better (needs more testing). I should have compared with the non-quantized Qwen, but just to report.
