
Do I require config.json to run in OLLAMA with FP16.gguf? If so, where can I find it?

#1 by NeevrajKB - opened

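For context, this is the kind of setup I mean: as far as I understand, Ollama doesn't read config.json at all — it builds a model from a Modelfile that points at the GGUF. A minimal sketch (assuming the weights file is named FP16.gguf and sits in the same directory):

```
# Modelfile -- Ollama reads this instead of config.json
FROM ./FP16.gguf
```

Then something like `ollama create mymodel -f Modelfile` followed by `ollama run mymodel` (where "mymodel" is just a placeholder name) should load it.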
BTW, how much accuracy loss does FP16 have compared to the original non-GGUF model with the same specs?
Also, any idea which model gives the best speed:accuracy:size trade-off on modest hardware?
I'm trying Ollama because of errors with Hugging Face and Unsloth. If I hit errors in Ollama, can I ask you guys about them here?
Also, how can I do RAG with this in Python?
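To be concrete, here's the kind of minimal RAG loop I have in mind (a toy sketch: bag-of-words cosine similarity stands in for real embeddings, and the model name "mymodel" plus the generation call are placeholders, not from this repo):

```python
# Toy RAG sketch: retrieve the most similar docs, stuff them into a
# prompt, then send that prompt to a local model for generation.
import math
import re
from collections import Counter

def vectorize(text):
    # Lowercase word counts as a crude document vector.
    return Counter(re.findall(r"\w+", text.lower()))

def cosine(a, b):
    # Cosine similarity between two sparse count vectors.
    num = sum(a[t] * b[t] for t in set(a) & set(b))
    den = math.sqrt(sum(v * v for v in a.values())) * \
          math.sqrt(sum(v * v for v in b.values()))
    return num / den if den else 0.0

def retrieve(query, docs, k=2):
    # Rank documents by similarity to the query and keep the top k.
    qv = vectorize(query)
    return sorted(docs, key=lambda d: cosine(qv, vectorize(d)), reverse=True)[:k]

def build_prompt(query, docs):
    # Concatenate retrieved context ahead of the question.
    context = "\n".join(retrieve(query, docs))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

docs = [
    "GGUF is a single-file format for LLM weights.",
    "Ollama builds models from a Modelfile.",
    "Paris is the capital of France.",
]
prompt = build_prompt("What is GGUF?", docs)
# Generation would then be one call to the local model, e.g. with the
# ollama Python package: ollama.generate(model="mymodel", prompt=prompt)
```

For anything real I'd swap the bag-of-words step for proper embeddings, but is this roughly the right shape?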
Thanks!
