tokenizer.model
Could you upload a tokenizer.model file so we can make GGUFs of fine-tunes? Thanks
Why don't we provide the tokenizer.model file for model quantization?
DeepSeek LLM utilizes the HuggingFace Tokenizer to implement the Byte-level BPE algorithm, with specially designed pre-tokenizers to ensure optimal performance. Currently, there is no direct way to convert the tokenizer into a SentencePiece tokenizer. We are contributing to open-source quantization methods to facilitate the use of the HuggingFace Tokenizer.
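Because the tokenizer ships in HuggingFace format rather than as a SentencePiece model, it can be loaded directly with the transformers library. Below is a minimal sketch; the model id "deepseek-ai/deepseek-llm-7b-base" is only an example and can be replaced with any DeepSeek checkpoint or local path.

from transformers import AutoTokenizer

# Load the Byte-level BPE tokenizer from a HuggingFace checkpoint
# ("deepseek-ai/deepseek-llm-7b-base" is an example id; a local path also works)
tokenizer = AutoTokenizer.from_pretrained("deepseek-ai/deepseek-llm-7b-base")
ids = tokenizer.encode("Hello, world!")
print(ids)                    # token ids produced by the Byte-level BPE
print(tokenizer.decode(ids))  # round-trips back to the original text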
GGUF (llama.cpp)
We have submitted a PR to the popular quantization repository llama.cpp to fully support all HuggingFace pre-tokenizers, including ours.
While waiting for the PR to be merged, you can generate your GGUF model using the following steps:
git clone https://github.com/DOGEwbx/llama.cpp.git
cd llama.cpp
git checkout regex_gpt2_preprocess
# set up the environment according to README
make
python3 -m pip install -r requirements.txt
# generate GGUF model
python convert-hf-to-gguf.py <MODEL_PATH> --outfile <GGUF_PATH> --model-name deepseekllm
# use q4_0 quantization as an example
./quantize <GGUF_PATH> <OUTPUT_PATH> q4_0
./main -m <OUTPUT_PATH> -n 128 -p <PROMPT>
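After quantization, the GGUF file can also be loaded from Python. The sketch below assumes the llama-cpp-python bindings (pip install llama-cpp-python) as an alternative to ./main; the model path is a placeholder for the file produced by the quantize step.

from llama_cpp import Llama

# "/path/to/output.gguf" is a placeholder for the quantized file produced above
llm = Llama(model_path="/path/to/output.gguf")
out = llm("Once upon a time", max_tokens=128)
print(out["choices"][0]["text"])  # generated continuation of the prompt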
GPTQ (exllamav2)
UPDATE:
exllamav2 now supports the HuggingFace Tokenizer. Please pull the latest version and try it out.
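As a starting point, here is a minimal generation sketch using exllamav2's basic generator. Class names follow the examples in the exllamav2 repository, the model path is a placeholder, and the exact API may differ between versions.

from exllamav2 import ExLlamaV2, ExLlamaV2Config, ExLlamaV2Cache, ExLlamaV2Tokenizer
from exllamav2.generator import ExLlamaV2BaseGenerator, ExLlamaV2Sampler

# "/path/to/exl2-model" is a placeholder for a quantized model directory
config = ExLlamaV2Config()
config.model_dir = "/path/to/exl2-model"
config.prepare()

model = ExLlamaV2(config)
model.load()
cache = ExLlamaV2Cache(model)

# The tokenizer is built from the HuggingFace tokenizer files in the model directory
tokenizer = ExLlamaV2Tokenizer(config)
generator = ExLlamaV2BaseGenerator(model, cache, tokenizer)

settings = ExLlamaV2Sampler.Settings()
settings.temperature = 0.8

# Generate 128 tokens from a simple prompt
print(generator.generate_simple("Hello, world!", settings, 128))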