Compared Stepfun Q4K_S vs Unsloth Q4K_S

#5
by slavap5 - opened

Their answers are practically the same, but when using tools (in Cline), Stepfun Q4K_S is mostly successful while Unsloth Q4K_S often fails.
Another strange thing: PP (prompt processing) speed is similar, but TG (token generation) is about 15% faster with Stepfun Q4K_S (on StrixHalo, with the same llama.cpp settings).
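
For anyone who wants to check a similar TG difference on their own machine, here is a minimal sketch of the arithmetic. It assumes you already have a tokens-per-second reading for each quant; the function name and the two numbers below are hypothetical placeholders, not measurements from this thread.

```python
def relative_gain(baseline_tps: float, candidate_tps: float) -> float:
    """Return how much faster the candidate is than the baseline, in percent."""
    if baseline_tps <= 0:
        raise ValueError("baseline_tps must be positive")
    return (candidate_tps - baseline_tps) / baseline_tps * 100.0


# Placeholder token-generation speeds (tokens/s); replace with your own readings.
unsloth_tg = 20.0
stepfun_tg = 23.0

gain = relative_gain(unsloth_tg, stepfun_tg)
print(f"TG gain of Stepfun over Unsloth: {gain:.1f}%")
```

Both readings should come from runs with identical settings and prompt lengths, otherwise the percentage is not comparable.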

Hello, did you load the model on Windows 11 or Linux? I ran into many difficulties loading models with weights exceeding 100G on Windows (full GPU loading). Which system did you use? Could you share your experience working around the limitations of StrixHalo? Thank you!

Ubuntu 24.04. It is possible to load models up to 126 GB in size; see https://github.com/kyuz0/amd-strix-halo-toolboxes#kernel-parameters-tested-on-fedora-42

Thank you very much for taking the time to answer my question. This may solve many of my problems. Thanks again for your help!
