How I got this to run with oobabooga/text-generation-webui

#37
opened by socter

So there have been a few threads on this, and none of them had a conclusive answer, so I thought I would write up how I got this running.

  1. Download the model from the text-generation-webui root:
python download-model.py anon8231489123/gpt4-x-alpaca-13b-native-4bit-128g --output /path/to/your/output/folder
  2. Download the cuda branch of GPTQ-for-LLaMa and build the kernel (there is a quick import check after these steps):
git clone https://github.com/qwopqwop200/GPTQ-for-LLaMa.git -b cuda
cd GPTQ-for-LLaMa
pip install -r requirements.txt
python setup_cuda.py install
export PYTHONPATH=$PYTHONPATH:/path/to/GPTQ-for-LLaMa
  3. Start the web UI:
python server.py --wbits 4 --model-dir /path/to/your/output/folder --model_type LLaMa
  4. Rename the non-cuda model file so the web UI picks the right one. You can remove it or rename it to anything else, as long as the file that gets loaded is the *-cuda.pt one:
mv /path/to/your/output/folder/anon8231489123/gpt4-x-alpaca-13b-native-4bit-128g/gpt-x-alpaca-13b-native-4bit-128g.pt /path/to/your/output/folder/anon8231489123/gpt4-x-alpaca-13b-native-4bit-128g/0gpt-x-alpaca-13b-native-4bit-128g.pt
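A quick sanity check for step 2: quant_cuda is the extension module that setup_cuda.py builds on the cuda branch, so if the following import succeeds, the kernel compiled and is reachable on your PYTHONPATH:

python -c "import quant_cuda; print('quant_cuda OK')"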

Now go to the Models tab in the web UI and load anon8231489123/gpt4-x-alpaca-13b-native-4bit-128g.

Thanks! That works. Note that the latest oobabooga seems to have the server.py call in webui.py, as run_cmd("python server.py ..."), so the flags need to go there.
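If you need to edit that, the idea is to append the same flags to the run_cmd line inside webui.py. A rough sketch (the exact line varies by installer version, and --chat stands in for whatever flags your copy already passes):

run_cmd("python server.py --chat --wbits 4 --model-dir /path/to/your/output/folder --model_type LLaMa")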

Is it better than Vicuna?

I followed your commands, but I still hit a problem:

Traceback (most recent call last):
File "/home/ecs-user/Chatgpt/server.py", line 101, in load_model_wrapper
shared.model, shared.tokenizer = load_model(shared.model_name)
File "/home/ecs-user/Chatgpt/modules/models.py", line 229, in load_model
tokenizer = LlamaTokenizer.from_pretrained(Path(f"{shared.args.model_dir}/{model_name}/"), clean_up_tokenization_spaces=True)
File "/home/ecs-user/miniconda3/envs/textgen/lib/python3.10/site-packages/transformers-4.29.0.dev0-py3.10.egg/transformers/tokenization_utils_base.py", line 1812, in from_pretrained
return cls._from_pretrained(
File "/home/ecs-user/miniconda3/envs/textgen/lib/python3.10/site-packages/transformers-4.29.0.dev0-py3.10.egg/transformers/tokenization_utils_base.py", line 1975, in _from_pretrained
tokenizer = cls(*init_inputs, **init_kwargs)
File "/home/ecs-user/miniconda3/envs/textgen/lib/python3.10/site-packages/transformers-4.29.0.dev0-py3.10.egg/transformers/models/llama/tokenization_llama.py", line 96, in __init__
self.sp_model.Load(vocab_file)
File "/home/ecs-user/.local/lib/python3.10/site-packages/sentencepiece/__init__.py", line 905, in Load
return self.LoadFromFile(model_file)
File "/home/ecs-user/.local/lib/python3.10/site-packages/sentencepiece/__init__.py", line 310, in LoadFromFile
return _sentencepiece.SentencePieceProcessor_LoadFromFile(self, arg)
RuntimeError: Internal: src/sentencepiece_processor.cc(1101) [model_proto->ParseFromArray(serialized.data(), serialized.size())]

At first I simply thought it was a transformers version problem, so I changed the version 4.28.0 --> 4.29.0, but the problem still exists.
T_T
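This RuntimeError from sentencepiece usually means tokenizer.model itself is damaged or was never fully downloaded (for example a Git LFS pointer file instead of the binary), not a transformers version problem. A quick check in a Python shell, with the path adjusted to your model folder:

from pathlib import Path
p = Path("/path/to/your/output/folder/anon8231489123/gpt4-x-alpaca-13b-native-4bit-128g/tokenizer.model")
print(p.stat().st_size)     # a real LLaMA tokenizer.model is roughly 500 KB
print(p.read_bytes()[:64])  # binary data is expected; readable text starting with "version https://git-lfs.github.com/spec/v1" means the LFS blob was never fetched, so re-download the file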

Traceback (most recent call last):
File "C:\Users\johnm\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.11_qbz5n2kfra8p0\LocalCache\local-packages\Python311\site-packages\transformers\configuration_utils.py", line 658, in _get_config_dict
config_dict = cls._dict_from_json_file(resolved_config_file)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\johnm\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.11_qbz5n2kfra8p0\LocalCache\local-packages\Python311\site-packages\transformers\configuration_utils.py", line 745, in _dict_from_json_file
text = reader.read()
^^^^^^^^^^^^^
File "<frozen codecs>", line 322, in decode
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xce in position 4411: invalid continuation byte

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "C:\Users\johnm\Downloads\oobabooga_windows\oobabooga_windows\text-generation-webui\server.py", line 102, in load_model_wrapper
shared.model, shared.tokenizer = load_model(shared.model_name)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\johnm\Downloads\oobabooga_windows\oobabooga_windows\text-generation-webui\modules\models.py", line 71, in load_model
shared.model_type = find_model_type(model_name)
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\johnm\Downloads\oobabooga_windows\oobabooga_windows\text-generation-webui\modules\models.py", line 59, in find_model_type
config = AutoConfig.from_pretrained(Path(f'{shared.args.model_dir}/{model_name}'))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\johnm\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.11_qbz5n2kfra8p0\LocalCache\local-packages\Python311\site-packages\transformers\models\auto\configuration_auto.py", line 916, in from_pretrained
config_dict, unused_kwargs = PretrainedConfig.get_config_dict(pretrained_model_name_or_path, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\johnm\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.11_qbz5n2kfra8p0\LocalCache\local-packages\Python311\site-packages\transformers\configuration_utils.py", line 573, in get_config_dict
config_dict, kwargs = cls._get_config_dict(pretrained_model_name_or_path, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\johnm\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.11_qbz5n2kfra8p0\LocalCache\local-packages\Python311\site-packages\transformers\configuration_utils.py", line 661, in _get_config_dict
raise EnvironmentError(
OSError: It looks like the config file at 'C:\Users\johnm\Downloads\oobabooga_windows\oobabooga_windows\text-generation-webui\models\anon8231489123_gpt4-x-alpaca-13b-native-4bit-128g\tokenizer.model' is not a valid JSON file.
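The OSError at the end says transformers tried to parse tokenizer.model as a JSON config, which suggests config.json in the model folder is missing or unreadable. A quick check in a Python shell (path taken from the traceback above):

import json
from pathlib import Path
model_dir = Path(r"C:\Users\johnm\Downloads\oobabooga_windows\oobabooga_windows\text-generation-webui\models\anon8231489123_gpt4-x-alpaca-13b-native-4bit-128g")
cfg = model_dir / "config.json"
print(cfg.exists())  # False means the download is incomplete; fetch config.json again
if cfg.exists():
    print(json.loads(cfg.read_text(encoding="utf-8"))["model_type"])  # should print "llama" if the file is intact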
