Possible to add 4_0_4_4 and 4_0_4_8 quants?
#1 · opened by asdfsdfssddf
Hi, as the title says: I recently discovered that my phone can run LLMs pretty well, and I even managed to get koboldcpp running on it to share text generation with Horde. Would you consider providing the ARM-optimized inference quants?
Is there a reason you couldn't just use the imatrix quants instead? They should be the same size and speed, but much higher quality.
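For anyone following along, here is a minimal sketch of grabbing an imatrix GGUF from the Hub with the huggingface_hub library; the repo and file names are hypothetical placeholders following the usual -i1-GGUF naming convention, not files from this discussion:

```python
# Minimal sketch: download an imatrix GGUF quant from the Hugging Face Hub.
# Assumes `pip install huggingface_hub`; repo_id and filename below are
# hypothetical placeholders, not real files from this repo.
from huggingface_hub import hf_hub_download

path = hf_hub_download(
    repo_id="mradermacher/Example-Model-i1-GGUF",  # hypothetical imatrix repo
    filename="Example-Model.i1-Q4_0.gguf",         # hypothetical quant file
)
print(f"Downloaded to {path}")  # point koboldcpp/llama.cpp at this file
```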
mradermacher changed discussion status to closed