No merges.txt file with text-generation-webui

#7
by ranguna - opened

When loading this model through text-generation-webui, I'm getting the following error:

2023-07-09 00:52:17 ERROR:Failed to load the model.
Traceback (most recent call last):
  File "$HOME/projects/ai/text-generation-webui/server.py", line 68, in load_model_wrapper
    shared.model, shared.tokenizer = load_model(shared.model_name, loader)
  File "$HOME/projects/ai/text-generation-webui/modules/models.py", line 87, in load_model
    tokenizer = load_tokenizer(model_name, model)
  File "$HOME/projects/ai/text-generation-webui/modules/models.py", line 104, in load_tokenizer
    tokenizer = AutoTokenizer.from_pretrained(
  File "$HOME/.miniconda3/envs/textgen/lib/python3.10/site-packages/transformers/models/auto/tokenization_auto.py", line 691, in from_pretrained
    return tokenizer_class.from_pretrained(pretrained_model_name_or_path, *inputs, **kwargs)
  File "$HOME/.miniconda3/envs/textgen/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 1825, in from_pretrained
    return cls._from_pretrained(
  File "$HOME/.miniconda3/envs/textgen/lib/python3.10/site-packages/transformers/tokenization_utils_base.py", line 1989, in _from_pretrained
    tokenizer = cls(*init_inputs, **init_kwargs)
  File "$HOME/.miniconda3/envs/textgen/lib/python3.10/site-packages/transformers/models/gpt2/tokenization_gpt2.py", line 195, in __init__
    with open(merges_file, encoding="utf-8") as merges_handle:
TypeError: expected str, bytes or os.PathLike object, not NoneType

Logging the call arguments passed to tokenization_utils_base.py, we can see that the merges file is indeed missing (merges_file is None):

{'add_prefix_space': False, 'additional_special_tokens': ['<|endoftext|>', '<fim_prefix>', '<fim_middle>', '<fim_suffix>', '<fim_pad>', '<filename>', '<gh_stars>', '<issue_start>', '<issue_comment>', '<issue_closed>', '<jupyter_start>', '<jupyter_text>', '<jupyter_code>', '<jupyter_output>', '<empty_output>', '<commit_before>', '<commit_msg>', '<commit_after>', '<reponame>'], 'bos_token': '<|endoftext|>', 'clean_up_tokenization_spaces': True, 'eos_token': '<|endoftext|>', 'model_max_length': 1000000000000000019884624838656, 'unk_token': '<|endoftext|>', 'vocab_size': 49152, 'vocab_file': 'models/TheBloke_starchat-beta-GPTQ/vocab.json', 'merges_file': None, 'special_tokens_map_file': 'models/TheBloke_starchat-beta-GPTQ/special_tokens_map.json', 'name_or_path': 'models/TheBloke_starchat-beta-GPTQ'}
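For anyone who wants to confirm this locally, here is a minimal sketch (the directory name comes from the log above; adjust it for your setup) that checks whether the two files a GPT2-style tokenizer requires are present:

import os

# GPT2-style tokenizers need both a vocabulary file and a BPE merges file.
model_dir = "models/TheBloke_starchat-beta-GPTQ"
for name in ("vocab.json", "merges.txt"):
    path = os.path.join(model_dir, name)
    print(f"{name}: {'found' if os.path.isfile(path) else 'MISSING'}")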

Is this file missing from the model, or is this an issue with my local setup?

Sorry, that was my fault. merges.txt is uploaded now. Trigger another download of the repo and it will fetch the missing file, and then it should work.
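If re-downloading the whole repo is slow, a sketch of an alternative using huggingface_hub, fetching just the missing file (the local_dir value assumes the default webui models layout):

from huggingface_hub import hf_hub_download

# Pull only merges.txt from the repo into the existing model folder.
hf_hub_download(
    repo_id="TheBloke/starchat-beta-GPTQ",
    filename="merges.txt",
    local_dir="models/TheBloke_starchat-beta-GPTQ",
)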

No worries

It's working perfectly now, thanks for the awesome work!

Btw, it seems the merges.txt file is also missing in this model: https://huggingface.co/TheBloke/starcoderplus-GPTQ

Would you like me to open a discussion there?
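For reference, one can check whether merges.txt is present in a repo without downloading anything, e.g. with this minimal sketch using huggingface_hub:

from huggingface_hub import list_repo_files

# List the files in the remote repo and look for merges.txt.
files = list_repo_files("TheBloke/starcoderplus-GPTQ")
print("merges.txt" in files)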

OK thanks, that's fixed now too.

Amazing. Thanks again!

ranguna changed discussion status to closed
