Hi, how do I run it?

#1
by carlosbdw - opened

I am very interested in this medical model. Could you please provide some demo Python code? Thanks!

Hi, to run this model you'll need to load the 4-bit GPTQ quantization and then apply the 4-bit "monkey-patch" so that 4-bit LoRAs can be used with this model.

I used text-generation-webui to run these models: https://github.com/oobabooga/text-generation-webui/tree/main
They provide good steps for getting 4-bit models and 4-bit LoRAs working: https://github.com/oobabooga/text-generation-webui/blob/main/docs/GPTQ-models-(4-bit-mode).md
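If you want a standalone script rather than the web UI, below is a rough sketch of one possible loading path. It does not use the monkey-patch route described above; instead it assumes a recent transformers + optimum + auto-gptq + peft stack that can load GPTQ checkpoints directly. The directory names and the prompt are placeholders, so swap in the actual base model and LoRA repos you downloaded.

```python
# Rough sketch, not the exact monkey-patch route described above.
# Assumes a recent transformers + optimum + auto-gptq stack that can load
# GPTQ checkpoints directly; directory names below are placeholders.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_model_dir = "path/to/4bit-gptq-base-model"  # placeholder: quantized base weights
lora_dir = "path/to/4bit-lora-adapter"           # placeholder: this model's LoRA weights

tokenizer = AutoTokenizer.from_pretrained(base_model_dir)

# transformers detects the GPTQ quantization_config in the checkpoint and
# loads the 4-bit weights onto the GPU.
model = AutoModelForCausalLM.from_pretrained(
    base_model_dir,
    device_map="auto",
    torch_dtype=torch.float16,
)

# Attach the LoRA adapter on top of the quantized base model.
model = PeftModel.from_pretrained(model, lora_dir)
model.eval()

prompt = "What are the first-line treatments for community-acquired pneumonia?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    output = model.generate(**inputs, max_new_tokens=200, do_sample=True, temperature=0.7)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```

If you stick with text-generation-webui instead, the linked GPTQ guide covers the equivalent flags for loading the 4-bit model and applying the LoRA.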

nmitchko changed discussion status to closed
