
YiTokenizer does not exist or is not currently imported.

#1
by iChrist - opened

I have a functional oobabooga install, with GPTQ working great.
I tried to run this model (installed from the model tab), and I am getting this error:

2023-11-14 12:27:30 INFO:Loading TheBloke_dolphin-2_2-yi-34b-AWQ...
Replacing layers...: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 60/60 [00:05<00:00, 10.25it/s]
Fusing layers...: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 60/60 [00:07<00:00, 7.81it/s]
2023-11-14 12:27:51 ERROR:Failed to load the model.
Traceback (most recent call last):
  File "D:\TextGen\modules\ui_model_menu.py", line 210, in load_model_wrapper
    shared.model, shared.tokenizer = load_model(shared.model_name, loader)
                                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\TextGen\modules\models.py", line 93, in load_model
    tokenizer = load_tokenizer(model_name, model)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\TextGen\modules\models.py", line 113, in load_tokenizer
    tokenizer = AutoTokenizer.from_pretrained(
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\TextGen\installer_files\env\Lib\site-packages\transformers\models\auto\tokenization_auto.py", line 765, in from_pretrained
    raise ValueError(
ValueError: Tokenizer class YiTokenizer does not exist or is not currently imported.

I also tried taking the original tokenizer from https://huggingface.co/ehartford/dolphin-2_2-yi-34b/resolve/main/tokenization_yi.py, and tried using the GPTQ repo's tokenization_yi.py file, but it didn't help (GPTQ works fine, though).
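
For reference, the same failure reproduces outside the webui with transformers alone (a minimal sketch):

# Minimal repro sketch: the repo's tokenizer_config.json names a custom
# class (YiTokenizer) that ships with the repo rather than with
# transformers itself, so AutoTokenizer's class lookup fails unless
# remote code is allowed.
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("TheBloke/dolphin-2_2-yi-34b-AWQ")
# ValueError: Tokenizer class YiTokenizer does not exist or is not currently imported.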

I made a demo with the original model and hit the same problem, which I solved. You can find it here: https://huggingface.co/spaces/Tonic1/YiTonic/tree/main - just check how the tokenizer issue is handled there if you like.
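
Roughly, the idea is to bypass AutoTokenizer's class lookup and use the repo's tokenization_yi.py directly (a sketch of the approach, not the exact code from the Space):

# Sketch: download the custom tokenizer module from the original repo
# and import YiTokenizer directly, instead of letting AutoTokenizer
# resolve the class name from tokenizer_config.json.
import importlib.util
from huggingface_hub import hf_hub_download

py_path = hf_hub_download("ehartford/dolphin-2_2-yi-34b", "tokenization_yi.py")
spec = importlib.util.spec_from_file_location("tokenization_yi", py_path)
module = importlib.util.module_from_spec(spec)
spec.loader.exec_module(module)

tokenizer = module.YiTokenizer.from_pretrained("ehartford/dolphin-2_2-yi-34b")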

Thanks for the help!
Should I also run pip install -r on your requirements file? Will it downgrade me to cu113?

Tried to copy your files and got this error:

Traceback (most recent call last):
  File "D:\TextGen\modules\ui_model_menu.py", line 210, in load_model_wrapper
    shared.model, shared.tokenizer = load_model(shared.model_name, loader)
                                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\TextGen\modules\models.py", line 85, in load_model
    output = load_func_map[loader](model_name)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "D:\TextGen\modules\models.py", line 299, in AutoAWQ_loader
    from awq import AutoAWQForCausalLM
  File "D:\TextGen\installer_files\env\Lib\site-packages\awq\__init__.py", line 2, in <module>
    from awq.models.auto import AutoAWQForCausalLM
  File "D:\TextGen\installer_files\env\Lib\site-packages\awq\models\__init__.py", line 1, in <module>
    from .mpt import MptAWQForCausalLM
  File "D:\TextGen\installer_files\env\Lib\site-packages\awq\models\mpt.py", line 1, in <module>
    from .base import BaseAWQForCausalLM
  File "D:\TextGen\installer_files\env\Lib\site-packages\awq\models\base.py", line 12, in <module>
    from awq.quantize.quantizer import AwqQuantizer
  File "D:\TextGen\installer_files\env\Lib\site-packages\awq\quantize\quantizer.py", line 11, in <module>
    from awq.modules.linear import WQLinear_GEMM, WQLinear_GEMV
  File "D:\TextGen\installer_files\env\Lib\site-packages\awq\modules\linear.py", line 4, in <module>
    import awq_inference_engine  # with CUDA kernels
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^
ImportError: DLL load failed while importing awq_inference_engine: The specified module could not be found.
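
For what it's worth, that DLL error usually means the installed AutoAWQ wheel was built against a different torch/CUDA combination than the one in the env. A quick way to check (a sketch, run with the webui's own Python):

# Check which CUDA the installed torch was built for, and whether the
# compiled AWQ kernel module (the one failing above) imports at all.
import torch

print("torch:", torch.__version__, "built for CUDA:", torch.version.cuda)

try:
    import awq_inference_engine  # compiled CUDA kernels shipped with AutoAWQ
    print("AWQ kernel module imports fine")
except ImportError as err:
    print("AWQ kernel import failed:", err)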

iChrist changed discussion status to closed
iChrist changed discussion status to open

Make sure you load with trust_remote_code=True
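
Outside the webui that looks like this (a minimal sketch following the usual AutoAWQ loading pattern; generation settings omitted):

# Both calls pass trust_remote_code=True so the repo's custom
# YiTokenizer class can be loaded.
from awq import AutoAWQForCausalLM
from transformers import AutoTokenizer

model_path = "TheBloke/dolphin-2_2-yi-34b-AWQ"

tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
model = AutoAWQForCausalLM.from_quantized(
    model_path,
    fuse_layers=True,
    safetensors=True,
    trust_remote_code=True,
)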

I get this error when I try to load the model:

File "/Downloads/text-generation-webui/modules/models.py", line 85, in load_model
output = load_func_maploader
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Downloads/text-generation-webui/modules/models.py", line 299, in AutoAWQ_loader
from awq import AutoAWQForCausalLM
ModuleNotFoundError: No module named 'awq'
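
That error just means AutoAWQ is missing from the webui's environment; it needs pip install autoawq run with that env's own Python. A quick check (a sketch):

# Verify autoawq is visible to this interpreter; ModuleNotFoundError
# usually means it was installed into a different Python environment.
import importlib.util
from importlib.metadata import version

if importlib.util.find_spec("awq") is None:
    print("autoawq is not installed here - run: pip install autoawq")
else:
    print("autoawq", version("autoawq"), "is installed")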

> Make sure you load with trust_remote_code=True

Thanks, it helped.
But for some reason the GPTQ version of this model gives much better results, while this AWQ version just repeats random words.
Same ChatML template, same context.
And the GPTQ version seems slightly faster.

I have had this experience as well, with multiple models... Maybe AutoAWQ needs some tweaking?

tokenizer = AutoTokenizer.from_pretrained(model_name_or_path, trust_remote_code=True)
