Possible to upload Q4_K_M?
#1
by
EricTri
- opened
Hi,
Thanks for uploading this! I am currently limited on storage space and can't spare the 3TB needed to generate the Q4 optimized model. Would it be possible to upload a Q4_K_M quant? It would be much appreciated!
Thanks
Unfortunately, I no longer have access to the machine. But if you can spare more than 2TB, here is how I would do it:
export WORK_DIR=$(pwd)
python3 -m venv venv
source venv/bin/activate
pip3 install -U "huggingface_hub[cli]"
# the fp8 checkpoints are around 700GB
mkdir checkpoints
huggingface-cli download --resume-download --local-dir checkpoints/DeepSeek-R1 deepseek-ai/DeepSeek-R1
# my fork of llama.cpp, which includes PR #11446 plus some changes that allow converting fp8 HF checkpoints to bf16 GGUF directly using triton(-cpu), without intermediate checkpoints
git clone https://github.com/evshiron/llama.cpp --recursive
pushd llama.cpp
pip3 install -r requirements/requirements-convert_hf_to_gguf.txt
cmake -B build
cmake --build build --config Release
popd
# install triton-cpu for cpu-only dequant
git clone https://github.com/triton-lang/triton-cpu --recursive
pushd triton-cpu
pip3 install ninja cmake wheel pybind11
MAX_JOBS=32 pip3 install -e python
popd
# this should hopefully work; it takes an hour or more depending on your hardware, and the bf16 checkpoints are around 1.3TB
# the dequant process may take more than 64GB of RAM, but should be doable within 360GB of RAM
python3 llama.cpp/convert_hf_to_gguf.py --outtype bf16 --split-max-size 50G checkpoints/DeepSeek-R1
# move the bf16 GGUF files to their own directory, then remove the fp8 checkpoints to get 700GB back
mkdir checkpoints/DeepSeek-R1-BF16
mv checkpoints/DeepSeek-R1/*.gguf checkpoints/DeepSeek-R1-BF16
rm -r checkpoints/DeepSeek-R1
# then use llama-quantize to make the quants you want; Q4_K_M should be around 400GB (a rough guess)
./llama.cpp/build/bin/llama-quantize --keep-split checkpoints/DeepSeek-R1-BF16/<THE_FIRST_OF_DeepSeek-R1-BF16_GGUF>.gguf Q4_K_M
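If you want to script that last step, here is a rough sketch of a guard around it. It checks free disk space first and looks up the first shard with a glob instead of typing the name by hand. The function names and the `LLAMA_QUANTIZE` variable are made up for this example and are not part of llama.cpp, and the glob assumes the split files are named like `<name>-00001-of-<count>.gguf`, so check your directory listing before relying on it.

```shell
# Hypothetical helpers for the quantize step; none of these names come from llama.cpp.

# Print the free space, in whole GB, on the filesystem that holds directory $1.
free_gb() {
    df -Pk "$1" | awk 'NR == 2 { print int($4 / 1024 / 1024) }'
}

# Print the first shard found in directory $1.
# Assumption: split GGUF files are named "<name>-00001-of-<count>.gguf".
first_shard() {
    for f in "$1"/*-00001-of-*.gguf; do
        # With no match the pattern stays literal, so this test fails.
        [ -e "$f" ] || return 1
        printf '%s\n' "$f"
        return 0
    done
}

# Quantize the shards in directory $1 to type $2, but only if at least
# $3 GB are free on that filesystem.
quantize_if_room() {
    shard=$(first_shard "$1") || { echo "no first shard found in $1" >&2; return 1; }
    if [ "$(free_gb "$1")" -lt "$3" ]; then
        echo "need at least $3 GB free in $1" >&2
        return 1
    fi
    # LLAMA_QUANTIZE is a hypothetical override for the binary path.
    "${LLAMA_QUANTIZE:-./llama.cpp/build/bin/llama-quantize}" --keep-split "$shard" "$2"
}
```

After defining these, something like `quantize_if_room checkpoints/DeepSeek-R1-BF16 Q4_K_M 400` would run the same command as above, using my 400GB guess for the output size as the threshold.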
Amazing, thanks for the detailed instructions. Going to give this a shot overnight!