Slow to load tokenizer

#2
by gptzerozero

Has anyone else noticed that it takes a long time (about 2 minutes) to load the tokenizer for this GPTQ model, while other GPTQ models like TheBloke/WizardLM-33B-V1-0-Uncensored-SuperHOT-8K-GPTQ load in well under a second (~100 ms)?

from transformers import AutoTokenizer

model_id = "path_to_downloaded_models/TheBloke_LongChat-13B-GPTQ"
tokenizer = AutoTokenizer.from_pretrained(model_id, use_fast=True)
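
If it helps to diagnose this, timing the load and checking tokenizer.is_fast shows whether a fast (Rust) tokenizer was actually found. A rough sketch, assuming the same placeholder path as above:

import time
from transformers import AutoTokenizer

model_id = "path_to_downloaded_models/TheBloke_LongChat-13B-GPTQ"  # placeholder path from above
start = time.time()
tokenizer = AutoTokenizer.from_pretrained(model_id, use_fast=True)
# If the repo has no tokenizer.json, use_fast=True converts the slow
# SentencePiece tokenizer on the fly, which can take minutes.
print(f"load time: {time.time() - start:.1f}s, is_fast: {tokenizer.is_fast}")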

Oh, I never uploaded a fast tokenizer for this. I'll sort that out now.

Done. Trigger a download of the model again and it'll fetch tokenizer.json, after which the tokenizer will load instantly.
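
For anyone who already has a cached copy, passing force_download to from_pretrained will pull the newly uploaded tokenizer.json. A minimal sketch, assuming the Hub repo id for this model is TheBloke/LongChat-13B-GPTQ:

from transformers import AutoTokenizer

# Force a re-download so the newly uploaded tokenizer.json replaces any cached files
tokenizer = AutoTokenizer.from_pretrained(
    "TheBloke/LongChat-13B-GPTQ",
    use_fast=True,
    force_download=True,
)
print(tokenizer.is_fast)  # should print True once tokenizer.json is present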

Hello, is there an 8-bit version of this GPTQ model?
