tokenizer.model failing to load
#2, opened by tylerdev
Hi, I see you just updated this to include a tokenizer model.
It seems to be causing some issues with sentencepiece when trying to load the model. My error is below:
2024-02-13T00:16:16.475613939Z self.tokenizer = ExLlamaV2Tokenizer(config)
2024-02-13T00:16:16.475615049Z File "/usr/local/lib/python3.10/dist-packages/exllamav2/tokenizer.py", line 65, in __init__
2024-02-13T00:16:16.475616229Z if os.path.exists(path_spm) and not force_json: self.tokenizer = ExLlamaV2TokenizerSPM(path_spm)
2024-02-13T00:16:16.475617419Z File "/usr/local/lib/python3.10/dist-packages/exllamav2/tokenizers/spm.py", line 9, in __init__
2024-02-13T00:16:16.475618739Z self.spm = SentencePieceProcessor(model_file = tokenizer_model)
2024-02-13T00:16:16.475619839Z File "/usr/local/lib/python3.10/dist-packages/sentencepiece/__init__.py", line 447, in Init
2024-02-13T00:16:16.475620919Z self.Load(model_file=model_file, model_proto=model_proto)
2024-02-13T00:16:16.475622029Z File "/usr/local/lib/python3.10/dist-packages/sentencepiece/__init__.py", line 905, in Load
2024-02-13T00:16:16.475623039Z return self.LoadFromFile(model_file)
2024-02-13T00:16:16.475624149Z File "/usr/local/lib/python3.10/dist-packages/sentencepiece/__init__.py", line 310, in LoadFromFile
2024-02-13T00:16:16.475625379Z return _sentencepiece.SentencePieceProcessor_LoadFromFile(self, arg)
2024-02-13T00:16:16.475626769Z RuntimeError: Internal: src/sentencepiece_processor.cc(1101) [model_proto->ParseFromArray(serialized.data(), serialized.size())]
Any ideas on what might be causing this? I confirmed that tokenizer.model was downloaded, so I'm not sure what else to try.
Anyway, great model!! Thanks for publishing this.
That's because I failed to upload the correct tokenizer.model.
Should be fixed now, let me know if it isn't!
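For anyone who hits the same `ParseFromArray` error, here is a rough sketch of a quick sanity check on the downloaded file before handing it to sentencepiece. It is not part of exllamav2 or sentencepiece; the function name `diagnose_tokenizer_file` and the categories it returns are made up for illustration. It only looks for a few common ways a file can end up not being a serialized SentencePiece model (missing, empty, a Git LFS pointer, an HTML error page, or JSON), and it cannot confirm that a file is valid.

```python
from pathlib import Path

# First line of a Git LFS pointer file (a small text stub, not the real binary).
LFS_PREFIX = b"version https://git-lfs.github.com/spec/v1"


def diagnose_tokenizer_file(path):
    """Return a rough guess about what kind of file `path` is.

    Hypothetical helper: the labels below are illustrative only.
    """
    p = Path(path)
    if not p.is_file():
        return "missing"
    if p.stat().st_size == 0:
        return "empty"

    # Only the first few bytes are needed for these heuristics.
    with p.open("rb") as f:
        head = f.read(256)

    if head.startswith(LFS_PREFIX):
        return "git-lfs pointer"
    if head.lstrip().lower().startswith((b"<!doctype", b"<html")):
        return "html page"
    if head[:1] == b"{":
        return "json"
    # Nothing obviously wrong; the file may still be corrupt or the wrong model.
    return "possibly valid"
```

Calling `diagnose_tokenizer_file("tokenizer.model")` in the model directory returns one of the labels above. A result of `"possibly valid"` only means none of these heuristics fired, so the real test is still whether `SentencePieceProcessor(model_file=...)` loads it.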
That worked, thanks!
tylerdev changed discussion status to closed