Prompt-Refine Quantized to GGUF Format

#2
by tuolaku - opened

To make local loading and inference easier, I have quantized Prompt-Refine to GGUF. Two variants are currently available: Q4_K_M and Q6_K. Q4_K_M can run on a GPU with 24GB of VRAM, while Q6_K requires a GPU with 32GB of VRAM. Both models have been verified in LM Studio.
The download link is: https://huggingface.co/tuolaku/Prompt-Refine-GGUF/tree/main
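
For anyone who prefers to script the download rather than use the browser, here is a minimal sketch. It picks a variant from the VRAM figures stated above and fetches only the matching files with `huggingface_hub`. The helper names (`pick_quant`, `download_quant`) and the `models` directory are hypothetical. The filename pattern is an assumption: it only works if the GGUF filenames in the repository contain the quant label, so check the file list first.

```python
from typing import Optional

REPO_ID = "tuolaku/Prompt-Refine-GGUF"  # taken from the download link in the post

# VRAM figures as stated in the post (GB), larger variant first.
VRAM_REQUIREMENTS_GB = [("Q6_K", 32), ("Q4_K_M", 24)]


def pick_quant(vram_gb: float) -> Optional[str]:
    """Return the largest variant that fits in vram_gb, or None if neither fits."""
    for name, required in VRAM_REQUIREMENTS_GB:
        if vram_gb >= required:
            return name
    return None


def download_quant(quant: str, local_dir: str = "models") -> str:
    """Download only the files whose names contain the quant label.

    Assumption: the GGUF filenames include the label (e.g. "Q4_K_M").
    Adjust the pattern if the repository names its files differently.
    """
    # Imported lazily so pick_quant works without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id=REPO_ID,
        allow_patterns=[f"*{quant}*"],
        local_dir=local_dir,
    )


if __name__ == "__main__":
    # Example: a 24GB card selects the Q4_K_M variant.
    print(pick_quant(24))
```

Calling `download_quant(pick_quant(24))` would then place the files under `models/`, from where they can be loaded in LM Studio or any other GGUF-compatible runtime.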
