Problem when trying to convert to GGUF
First: thank you. This looks like a really interesting project. I can't wait to try it out!
I tried to convert it to GGUF with llama.cpp's `convert.py` and got the following error:
```
$ ~/llama.cpp/convert.py deepmoney-67b-chat
Loading model file deepmoney-67b-chat/pytorch_model-00001-of-00014.bin
Loading model file deepmoney-67b-chat/pytorch_model-00001-of-00014.bin
Loading model file deepmoney-67b-chat/pytorch_model-00002-of-00014.bin
Loading model file deepmoney-67b-chat/pytorch_model-00003-of-00014.bin
Loading model file deepmoney-67b-chat/pytorch_model-00004-of-00014.bin
Loading model file deepmoney-67b-chat/pytorch_model-00005-of-00014.bin
Loading model file deepmoney-67b-chat/pytorch_model-00006-of-00014.bin
Loading model file deepmoney-67b-chat/pytorch_model-00007-of-00014.bin
Loading model file deepmoney-67b-chat/pytorch_model-00008-of-00014.bin
Loading model file deepmoney-67b-chat/pytorch_model-00009-of-00014.bin
Loading model file deepmoney-67b-chat/pytorch_model-00010-of-00014.bin
Loading model file deepmoney-67b-chat/pytorch_model-00011-of-00014.bin
Loading model file deepmoney-67b-chat/pytorch_model-00012-of-00014.bin
Loading model file deepmoney-67b-chat/pytorch_model-00013-of-00014.bin
Loading model file deepmoney-67b-chat/pytorch_model-00014-of-00014.bin
params = Params(n_vocab=102400, n_embd=8192, n_layer=95, n_ctx=4096, n_ff=22016, n_head=64, n_head_kv=8, n_experts=None, n_experts_used=None, f_norm_eps=1e-06, rope_scaling_type=None, f_rope_freq_base=10000.0, f_rope_scale=None, n_orig_ctx=None, rope_finetuned=None, ftype=<GGMLFileType.MostlyQ8_0: 7>, path_model=PosixPath('deepmoney-67b-chat'))
Found vocab files: {'tokenizer.model': None, 'vocab.json': None, 'tokenizer.json': PosixPath('deepmoney-67b-chat/tokenizer.json')}
Loading vocab file 'deepmoney-67b-chat/tokenizer.json', type 'spm'
Traceback (most recent call last):
  File "/Users/ubr/llm/llama.cpp/convert.py", line 1471, in <module>
    main()
  File "/Users/ubr/llm/llama.cpp/convert.py", line 1439, in main
    vocab, special_vocab = vocab_factory.load_vocab(args.vocab_type, model_parent_path)
  File "/Users/ubr/llm/llama.cpp/convert.py", line 1325, in load_vocab
    vocab = SentencePieceVocab(
  File "/Users/ubr/llm/llama.cpp/convert.py", line 391, in __init__
    self.sentencepiece_tokenizer = SentencePieceProcessor(str(fname_tokenizer))
  File "/Users/ubr/Library/Python/3.9/lib/python/site-packages/sentencepiece/__init__.py", line 447, in Init
    self.Load(model_file=model_file, model_proto=model_proto)
  File "/Users/ubr/Library/Python/3.9/lib/python/site-packages/sentencepiece/__init__.py", line 905, in Load
    return self.LoadFromFile(model_file)
  File "/Users/ubr/Library/Python/3.9/lib/python/site-packages/sentencepiece/__init__.py", line 310, in LoadFromFile
    return _sentencepiece.SentencePieceProcessor_LoadFromFile(self, arg)
RuntimeError: Internal: /Users/runner/work/sentencepiece/sentencepiece/src/sentencepiece_processor.cc(1102) [model_proto->ParseFromArray(serialized.data(), serialized.size())]
```
Do you have any idea? I suspect it's related to `tokenizer.json`. I also get the following error when trying to convert it to a `tokenizer.model`:

```
Exception: Vocab size mismatch (model has 102400, but deepmoney-67b-chat/tokenizer.json has 100015)
```

When I pad it with `--pad-vocab`, I get the first error again.
Me too, same error.
Someone uploaded heavily quantized versions of this model:

- `deepmoney-xs.gguf`
- `deepmoney-xxs.gguf`

I haven't tried them myself yet.
Try using the `--vocab-type bpe` option for `convert.py`. IIRC this was needed for deepseek-coder as well, since the script incorrectly guesses the vocab type to be `spm` by default.
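For reference, an invocation along those lines might look like this (a sketch only: the output type here mirrors the `MostlyQ8_0` ftype from the log above, and flag behavior may differ across `convert.py` versions):

```shell
# Force BPE vocab handling instead of the default spm guess,
# and pad the vocab to match the model's n_vocab (102400 vs.
# the 100015 entries in tokenizer.json).
python ~/llama.cpp/convert.py deepmoney-67b-chat \
    --vocab-type bpe \
    --pad-vocab \
    --outtype q8_0
```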