one gguf model

by andrew-cartwheel - opened Jan 23

Discussion

andrew-cartwheel

Jan 23

I ran the llama.cpp conversion to get a GGUF model to test on CPU:

https://huggingface.co/andrew-cartwheel/snorkel-mistral-pairRM-DPO-q8_0.gguf

Pumba2

Jan 24

edited Jan 24

Hi Andrew,

Could you make a Q4_K_M version?
If not, could you at least give me step-by-step instructions for doing it myself? I have no idea how to do this.
Q8 is extremely slow.

andrew-cartwheel

Jan 24

edited Jan 24

Absolutely!

Here is a step-by-step guide: https://github.com/ggerganov/llama.cpp/discussions/2948

Every step is outlined there, but if you run into trouble, please let me know.
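For orientation, here is a rough sketch of the two commands the guide boils down to: convert the Hugging Face checkpoint to an f16 GGUF, then quantize that file. It only builds and prints the commands so you can check them before running anything. The directory names and output filenames are placeholders I made up, and the script and binary names (`convert.py`, `quantize`) are assumptions that match the llama.cpp checkout from around the time of this thread; newer checkouts call them `convert_hf_to_gguf.py` and `llama-quantize`, so adjust to whatever your copy has.

```python
from pathlib import Path
import shlex


def build_quantize_commands(llama_cpp_dir, hf_model_dir, out_dir, quant_type="Q4_K_M"):
    """Return [convert_cmd, quantize_cmd] as argument lists (nothing is executed)."""
    llama_cpp = Path(llama_cpp_dir)
    out = Path(out_dir)

    # Placeholder output names; any filenames work.
    f16_path = out / "model-f16.gguf"
    quant_path = out / f"model-{quant_type.lower()}.gguf"

    # Step 1: convert the Hugging Face checkpoint to an unquantized f16 GGUF.
    # Assumes the older script name; newer checkouts use convert_hf_to_gguf.py.
    convert_cmd = [
        "python", str(llama_cpp / "convert.py"), str(hf_model_dir),
        "--outtype", "f16",
        "--outfile", str(f16_path),
    ]

    # Step 2: quantize the f16 file down to the requested type.
    # Assumes the older binary name; newer builds call it llama-quantize.
    quantize_cmd = [str(llama_cpp / "quantize"), str(f16_path), str(quant_path), quant_type]

    return [convert_cmd, quantize_cmd]


if __name__ == "__main__":
    # Hypothetical local paths: a llama.cpp checkout and a downloaded model folder.
    for cmd in build_quantize_commands("llama.cpp", "Snorkel-Mistral-PairRM-DPO", "out"):
        print(shlex.join(cmd))
```

Once the printed commands look right for your setup, you can paste them into a terminal or pass each list to `subprocess.run(cmd, check=True)`.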

Alternatively, it looks like another user has uploaded more quants:

https://huggingface.co/brittlewis12/Snorkel-Mistral-PairRM-DPO-GGUF/tree/main
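If you would rather grab one of those files from a script than from the browser, something like the sketch below should work. It assumes the `huggingface_hub` package is installed and that the repo contains a `.gguf` file with the quant name in its filename; I have not checked the exact filenames there, so the helper searches the file list instead of hard-coding one. `pick_quant_file` and `download_quant` are names made up for this example.

```python
def pick_quant_file(filenames, quant="Q4_K_M"):
    """Return the first .gguf filename that mentions the quant type, or None."""
    wanted = quant.lower()
    for name in filenames:
        lowered = name.lower()
        if lowered.endswith(".gguf") and wanted in lowered:
            return name
    return None


def download_quant(repo_id, quant="Q4_K_M"):
    """Download the matching GGUF file and return its local cache path."""
    # Imported here so pick_quant_file stays usable without the package.
    from huggingface_hub import hf_hub_download, list_repo_files

    filename = pick_quant_file(list_repo_files(repo_id), quant)
    if filename is None:
        raise FileNotFoundError(f"No {quant} .gguf file found in {repo_id}")
    return hf_hub_download(repo_id=repo_id, filename=filename)


if __name__ == "__main__":
    # Needs network access; prints where the file was cached locally.
    print(download_quant("brittlewis12/Snorkel-Mistral-PairRM-DPO-GGUF"))
```

The returned path can then be passed to llama.cpp with `-m`.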

Pumba2

Jan 24

Fantastic! Thanks a lot!

viethoangtranduong

Snorkel AI org Mar 5

Added both the andrew-cartwheel and brittlewis12 models to the model card. Thanks, all! Appreciate it!

viethoangtranduong changed discussion status to closed Mar 5
