Inference error: The current context does not support K-shift
#3 opened 4 days ago by lollmaolol
Tested Q6, uses 567 GB RAM
5
#2 opened 10 days ago by krustik
Using -ctk q4_0 -ctv q4_0 with llama.cpp server throws flash_attn error
#1 opened 11 days ago by softwareweaver