Regarding the device?
Can we use this on a CPU-only machine, i.e. without CUDA or MPS? I'd really appreciate a reply.
Yeah, TheBloke has kindly quantized all the orca-minis as a service to the community. Respect.
Please follow the instructions in his repos, and ask the community about running the GGML/GPTQ versions on CPU.
https://huggingface.co/TheBloke/orca_mini_3B-GPTQ https://huggingface.co/TheBloke/orca_mini_3B-GGML
https://huggingface.co/TheBloke/orca_mini_7B-GPTQ https://huggingface.co/TheBloke/orca_mini_7B-GGML
https://huggingface.co/TheBloke/orca_mini_13B-GPTQ https://huggingface.co/TheBloke/orca_mini_13B-GGML
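
If it helps, here is a rough sketch for picking one of the repos above based on the machine you have. The helper names (`detect_device`, `suggest_repo`) are made up for illustration, and the mapping is my assumption rather than anything stated in the repos: GGML files are generally the ones aimed at CPU inference (llama.cpp-style runtimes), while GPTQ files usually expect a CUDA GPU. PyTorch is optional here; without it the sketch simply falls back to "cpu".

```python
# Hypothetical helper, not taken from TheBloke's repos.
# Assumption: GGML is the format to try on CPU (and MPS), GPTQ on CUDA.
try:
    import torch  # optional; only used to detect an accelerator
except ImportError:
    torch = None

BASE_URL = "https://huggingface.co/TheBloke/orca_mini_{size}-{fmt}"
SIZES = ("3B", "7B", "13B")


def detect_device():
    """Return 'cuda', 'mps', or 'cpu' depending on what PyTorch reports."""
    if torch is not None:
        if torch.cuda.is_available():
            return "cuda"
        mps = getattr(torch.backends, "mps", None)
        if mps is not None and mps.is_available():
            return "mps"
    return "cpu"


def suggest_repo(size, device):
    """Build the repo URL for a model size and a device string."""
    if size not in SIZES:
        raise ValueError(f"unknown size {size!r}, expected one of {SIZES}")
    fmt = "GPTQ" if device == "cuda" else "GGML"
    return BASE_URL.format(size=size, fmt=fmt)


if __name__ == "__main__":
    device = detect_device()
    print(device, suggest_repo("3B", device))
```

On a machine without CUDA or MPS this points you at the GGML repo for the chosen size; how to actually load the files is covered in each repo's own instructions.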