Crashes when I try to load it in koboldcpp
Basically the title; the same issue happens with bartowski's quants too. Maybe koboldcpp doesn't have the required upstream merges from llama.cpp yet? Wondering if someone can confirm. I tested LostRuins' koboldcpp with both OpenBLAS and Vulkan, and YellowRose's hipBLAS fork; neither can load this model. Tested with Q4_K_M.
Yes, we'll have to wait a bit until KoboldCPP pulls in this change:
https://github.com/LostRuins/koboldcpp/commit/889bdd76866ea31a7625ec2dcea63ff469f3e981
@Elfrino I compiled llama.cpp with the latest code. The error persists.
Hmm. The same error, I presume, or a different one?
My error is:
`check_tensor_dims: tensor 'token_embd.weight' has wrong shape`
🤔 Did you load successfully?
I just tried it with the newly released KoboldCPP, works fine :)
Wow! I know what's happening.
GGUFs in bartowski/35b-beta-long-GGUF can be loaded correctly.
GGUFs in this repo are mistakenly recognized as type `llama` (they should be `command-r`).
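If you want to check this yourself, the declared architecture sits in the GGUF metadata under the `general.architecture` key. Below is a minimal sketch (assuming the GGUF v2/v3 little-endian layout; the file path argument is a placeholder) that parses just the header and key/value section and prints the architecture:

```python
# Minimal GGUF metadata reader: prints general.architecture.
# Sketch only, assuming the GGUF v2/v3 little-endian layout.
import struct
import sys

# Byte sizes of the scalar metadata value types (uint8..float64).
SCALAR_SIZES = {0: 1, 1: 1, 2: 2, 3: 2, 4: 4, 5: 4, 6: 4, 7: 1, 10: 8, 11: 8, 12: 8}

def read_string(f):
    # GGUF strings are a uint64 length followed by UTF-8 bytes.
    (n,) = struct.unpack("<Q", f.read(8))
    return f.read(n).decode("utf-8", errors="replace")

def skip_value(f, vtype):
    # Skip over a metadata value we don't care about.
    if vtype in SCALAR_SIZES:
        f.read(SCALAR_SIZES[vtype])
    elif vtype == 8:   # string
        read_string(f)
    elif vtype == 9:   # array: element type, count, then elements
        (etype,) = struct.unpack("<I", f.read(4))
        (count,) = struct.unpack("<Q", f.read(8))
        for _ in range(count):
            skip_value(f, etype)
    else:
        raise ValueError(f"unknown GGUF value type {vtype}")

with open(sys.argv[1], "rb") as f:
    assert f.read(4) == b"GGUF", "not a GGUF file"
    version, n_tensors, n_kv = struct.unpack("<IQQ", f.read(20))
    for _ in range(n_kv):
        key = read_string(f)
        (vtype,) = struct.unpack("<I", f.read(4))
        if key == "general.architecture" and vtype == 8:
            print("architecture:", read_string(f))
            break
        skip_value(f, vtype)
```

If the explanation above is right, a quant from this repo should print `llama`, while the bartowski quants should print `command-r`.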
You mean it works with the new version of KoboldCPP? Or does it work with the older version too?
Only the new versions of KoboldCPP and llama.cpp work. They support the `command-r` architecture, which CausalLM-35B is based on.
This repo is deprecated, as the model needs to be re-quantized; bartowski/35b-beta-long-GGUF is correct.
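For anyone who wants to sanity-check a replacement quant from Python, here's a small sketch using the llama-cpp-python bindings (assuming a recent release built against a llama.cpp with command-r support; the model path is a placeholder for one of the bartowski files):

```python
# Quick load test via llama-cpp-python (pip install llama-cpp-python).
# Assumes a recent release with command-r support; the model path is a
# placeholder for one of the bartowski/35b-beta-long-GGUF quants.
from llama_cpp import Llama

llm = Llama(model_path="35b-beta-long-Q4_K_M.gguf", n_ctx=2048)

# If the architecture is recognized, loading succeeds and we get a
# completion; a mis-typed GGUF would instead fail at load time with a
# shape error like the one quoted above.
out = llm("Hello,", max_tokens=8)
print(out["choices"][0]["text"])
```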