Problem when trying to convert

#1
by ubr - opened

First: thank you. This looks like a really interesting project. I can't wait to try it out!

I tried to convert it to GGUF with llama.cpp and got the following error:

~/llama.cpp/convert.py deepmoney-67b-chat

Loading model file deepmoney-67b-chat/pytorch_model-00001-of-00014.bin
Loading model file deepmoney-67b-chat/pytorch_model-00001-of-00014.bin
Loading model file deepmoney-67b-chat/pytorch_model-00002-of-00014.bin
Loading model file deepmoney-67b-chat/pytorch_model-00003-of-00014.bin
Loading model file deepmoney-67b-chat/pytorch_model-00004-of-00014.bin
Loading model file deepmoney-67b-chat/pytorch_model-00005-of-00014.bin
Loading model file deepmoney-67b-chat/pytorch_model-00006-of-00014.bin
Loading model file deepmoney-67b-chat/pytorch_model-00007-of-00014.bin
Loading model file deepmoney-67b-chat/pytorch_model-00008-of-00014.bin
Loading model file deepmoney-67b-chat/pytorch_model-00009-of-00014.bin
Loading model file deepmoney-67b-chat/pytorch_model-00010-of-00014.bin
Loading model file deepmoney-67b-chat/pytorch_model-00011-of-00014.bin
Loading model file deepmoney-67b-chat/pytorch_model-00012-of-00014.bin
Loading model file deepmoney-67b-chat/pytorch_model-00013-of-00014.bin
Loading model file deepmoney-67b-chat/pytorch_model-00014-of-00014.bin
params = Params(n_vocab=102400, n_embd=8192, n_layer=95, n_ctx=4096, n_ff=22016, n_head=64, n_head_kv=8, n_experts=None, n_experts_used=None, f_norm_eps=1e-06, rope_scaling_type=None, f_rope_freq_base=10000.0, f_rope_scale=None, n_orig_ctx=None, rope_finetuned=None, ftype=<GGMLFileType.MostlyQ8_0: 7>, path_model=PosixPath('deepmoney-67b-chat'))
Found vocab files: {'tokenizer.model': None, 'vocab.json': None, 'tokenizer.json': PosixPath('deepmoney-67b-chat/tokenizer.json')}
Loading vocab file 'deepmoney-67b-chat/tokenizer.json', type 'spm'
Traceback (most recent call last):
  File "/Users/ubr/llm/llama.cpp/convert.py", line 1471, in <module>
    main()
  File "/Users/ubr/llm/llama.cpp/convert.py", line 1439, in main
    vocab, special_vocab = vocab_factory.load_vocab(args.vocab_type, model_parent_path)
  File "/Users/ubr/llm/llama.cpp/convert.py", line 1325, in load_vocab
    vocab = SentencePieceVocab(
  File "/Users/ubr/llm/llama.cpp/convert.py", line 391, in __init__
    self.sentencepiece_tokenizer = SentencePieceProcessor(str(fname_tokenizer))
  File "/Users/ubr/Library/Python/3.9/lib/python/site-packages/sentencepiece/__init__.py", line 447, in Init
    self.Load(model_file=model_file, model_proto=model_proto)
  File "/Users/ubr/Library/Python/3.9/lib/python/site-packages/sentencepiece/__init__.py", line 905, in Load
    return self.LoadFromFile(model_file)
  File "/Users/ubr/Library/Python/3.9/lib/python/site-packages/sentencepiece/__init__.py", line 310, in LoadFromFile
    return _sentencepiece.SentencePieceProcessor_LoadFromFile(self, arg)
RuntimeError: Internal: /Users/runner/work/sentencepiece/sentencepiece/src/sentencepiece_processor.cc(1102) [model_proto->ParseFromArray(serialized.data(), serialized.size())]

Do you have any idea? I think it might be related to tokenizer.json: the log says the vocab is loaded as type 'spm', so SentencePieceProcessor tries to parse the JSON file as a serialized SentencePiece model, which can't work. I also get the following error when trying to convert it to a tokenizer model:

Exception: Vocab size mismatch (model has 102400, but deepmoney-67b-chat/tokenizer.json has 100015)

When I pad the vocab with --pad-vocab, I get the same error as above again.
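For what it's worth, here is a minimal sketch of how the 100015 figure can be double-checked, assuming the usual Hugging Face tokenizer.json layout (a BPE "model.vocab" map plus a top-level "added_tokens" list; those field names are an assumption, not taken from this repo):

import json

# Count base vocab entries plus added tokens (assumed HF tokenizers layout).
with open("deepmoney-67b-chat/tokenizer.json", encoding="utf-8") as f:
    tok = json.load(f)

base = len(tok["model"]["vocab"])          # BPE vocabulary entries
added = len(tok.get("added_tokens", []))   # special/added tokens
print(base, added, base + added)           # config.json says the model expects 102400

Any shortfall against the model's 102400 is what --pad-vocab is meant to fill with dummy tokens.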

Me too, same error.

Someone uploaded heavily quantized versions of this model:

deepmoney-xs.gguf
deepmoney-xxs.gguf

I haven't yet tried them myself.

Try using the --vocab-type bpe option for convert.py. IIRC this was needed for deepseek-coder as well, since convert.py incorrectly guesses the vocab type to be spm by default.
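For example, the invocation might look like this (untested on this model; --pad-vocab added per the earlier post in case the vocab-size mismatch still comes up):

python ~/llama.cpp/convert.py deepmoney-67b-chat --vocab-type bpe --pad-vocab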
