one gguf model

#2
by andrew-cartwheel - opened

I ran the llama.cpp conversion to get a GGUF model to test on CPU:

https://huggingface.co/andrew-cartwheel/snorkel-mistral-pairRM-DPO-q8_0.gguf

Hi Andrew,

Could you make a Q4_K_M version?
If not, could you at least give me step-by-step instructions for doing it myself? I have no idea how to do this.
Q8 is extremely slow.

Absolutely!

Here is a step-by-step guide: https://github.com/ggerganov/llama.cpp/discussions/2948

Every step is outlined there, but if you run into trouble, please let me know.
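
For a rough idea of what the guide boils down to, here is a minimal Python sketch that only assembles the two command lines (convert to an f16 GGUF, then quantize to Q4_K_M) without running them. The script and binary names (`convert.py`, `./quantize`) are assumptions that match older llama.cpp checkouts; newer versions have renamed them to `convert_hf_to_gguf.py` and `llama-quantize`, so check your checkout. The helper name `build_gguf_commands` and all paths are hypothetical.

```python
from pathlib import Path


def build_gguf_commands(model_dir, out_dir, quant_type="Q4_K_M"):
    """Return the convert and quantize command lines as argument lists.

    Nothing is executed here; the lists are meant to be run from the root
    of a llama.cpp checkout (for example with subprocess.run).
    Assumption: the checkout provides convert.py and a built ./quantize.
    """
    out = Path(out_dir)
    f16_path = out / "model-f16.gguf"              # intermediate, unquantized
    quant_path = out / f"model-{quant_type}.gguf"  # final quantized file

    # Step 1: convert the Hugging Face checkpoint directory to an f16 GGUF.
    convert_cmd = [
        "python", "convert.py", str(model_dir),
        "--outtype", "f16",
        "--outfile", str(f16_path),
    ]

    # Step 2: quantize the f16 GGUF down to the requested type.
    quantize_cmd = ["./quantize", str(f16_path), str(quant_path), quant_type]

    return [convert_cmd, quantize_cmd]
```

Calling `build_gguf_commands("models/snorkel", "out")` gives the two lists in the order they should be run; the input to the second command is the output file of the first.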

Alternatively, it looks like another user has uploaded more quants:

https://huggingface.co/brittlewis12/Snorkel-Mistral-PairRM-DPO-GGUF/tree/main

Fantastic! Thanks a lot!

Snorkel AI org

Added both the andrew-cartwheel and brittlewis12 models to the model card. Thanks, all! Appreciate it!

viethoangtranduong changed discussion status to closed
