AttributeError: 'ChatGLMTokenizer' object has no attribute 'tokenizer'

#87
by mobbsen-aiq - opened
base_model = "THUDM/chatglm2-6b"
---> tokenizer = AutoTokenizer.from_pretrained(base_model, trust_remote_code=True)
~/.cache/huggingface/modules/transformers_modules/THUDM/chatglm2-6b/8fd7fba285f7171d3ae7ea3b35c53b6340501ed1/tokenization_chatglm.py in vocab_size(self)
    106     @property
    107     def vocab_size(self):
--> 108         return self.tokenizer.n_words
    109 
    110     def get_vocab(self):

AttributeError: 'ChatGLMTokenizer' object has no attribute 'tokenizer'

This problem seems to have appeared only today; it was totally fine before.
Much appreciated if someone could verify the issue.
I'm on transformers 4.34.0.

Reverting to transformers==4.33.0 resolved the problem.
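For anyone needing a quick workaround, pinning the previous release and reloading looks roughly like this (same model id as in the traceback above; treat it as a sketch, nothing else should need to change):

pip install "transformers==4.33.0"

from transformers import AutoTokenizer

base_model = "THUDM/chatglm2-6b"
# On 4.33.x the custom ChatGLMTokenizer initializes normally; on 4.34.0 it
# raises the AttributeError shown above while resolving vocab_size.
tokenizer = AutoTokenizer.from_pretrained(base_model, trust_remote_code=True)
print(tokenizer.vocab_size)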

Loading checkpoint shards becomes very slow, do you encounter this issue?

I've only been using this model for a week or so, so I'm not sure whether it has gotten slower than before, but I did find it relatively slow when loading it from Google Drive into a Colab notebook.

This might be because the code is not compatible with the Hugging Face tokenizer class and method definitions.

moving line "self.sp_tokenizer = SPTokenizer(vocab_file, num_image_tokens=num_image_tokens) " before "super().init(" at "init" of class ChatGLMTokenizer can solve this issue.

moving line "self.sp_tokenizer = SPTokenizer(vocab_file, num_image_tokens=num_image_tokens) " before "super().init(" at "init" of class ChatGLMTokenizer can solve this issue.

Emm, I can't find this code in tokenization_chatglm.py
