vicuna 1.1 13b q4_1 failed to load (bad float16)
#9
by couchpotato888 - opened
Try updating your llama.cpp: run `git pull` and then `make`.
Also check your sha256; maybe you've got a corrupted file.
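A minimal sketch of the checksum check suggested above. It creates a tiny stand-in file so the snippet runs anywhere; in practice, point `MODEL` at your `ggml-vicuna-13b-4bit-rev1.bin` and set `EXPECTED` to the hash published on the model's download page (both names here are placeholders):

```shell
# Verify a downloaded model file against its published SHA-256 (sketch).
MODEL=demo.bin
printf 'dummy model bytes' > "$MODEL"           # stand-in for the real model file
EXPECTED=$(sha256sum "$MODEL" | cut -d' ' -f1)  # normally copied from the download page

ACTUAL=$(sha256sum "$MODEL" | cut -d' ' -f1)
if [ "$ACTUAL" = "$EXPECTED" ]; then
    echo "checksum OK"
else
    echo "checksum MISMATCH: re-download the file"
fi
```

If the hashes differ, re-download the file before debugging anything else; a truncated or corrupted download is the usual cause of "bad float16" load errors.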
Hi there, I am using the model ggml-vicuna-13b-4bit-rev1.bin, but it takes too much time to return completion tokens: almost 30 minutes per token. Is there an optimized way to run the Vicuna model with llama.cpp?
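Thirty minutes per token usually means the machine is swapping or using too few threads. A sketch of the usual first checks, assuming a llama.cpp build of this era (verify the flags against `./main --help` on your build; the invocation is commented out since it needs the compiled binary and model file):

```shell
# Match llama.cpp's thread count to the number of available cores;
# oversubscribing or undersubscribing both hurt token throughput.
THREADS=$(nproc)
echo "using $THREADS threads"

# Also make sure the ~8 GB q4 13B model fits in free RAM, so the OS
# is not paging model weights in and out on every token:
free -h

# ./main -m ggml-vicuna-13b-4bit-rev1.bin -t "$THREADS" -n 128 -p "Hello"
```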