Panic when trying to load the model bin files
#2 by chenhunghan - opened
There seem to be Rust panics when loading the model (using llm-rs==0.1.1).
I tried to load the model with `model = Llama("./mpt-7b-q4_0-ggjt.bin")`, but got:

```
thread '<unnamed>' panicked at 'called `Result::unwrap()` on an `Err` value: InvalidFormatVersion { container_type: Ggjt, version: 2 }', src/models.rs:5:1
```
I also tried:

```python
from llm_rs import Llama

# load the model
model = Llama("cache/mpt-7b-q4_0.bin")

# generate
print(model.generate("The meaning of life is"))
```

but got:

```
thread '<unnamed>' panicked at 'called `Result::unwrap()` on an `Err` value: Io(Error { kind: UnexpectedEof, message: "failed to fill whole buffer" })', src/models.rs:5:1
```
Are there any additional instructions for loading the models?
There were breaking changes in the ggml format, so you need to use llm-rs==0.2.0 or greater (see here).

Also, MPT isn't a LLaMA model, so you need to load it via the `Mpt` class; you can also see this in the model card of this repo.
```python
from llm_rs import Mpt

model = Mpt("cache/mpt-7b-q4_0.bin")
```
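If you want to check which container format and version a `.bin` file actually uses before loading it (e.g. to tell an old file apart from a new one), you can inspect the file header yourself. Here's a minimal sketch; the magic constants are taken from the ggml loaders and should be treated as assumptions, not llm-rs API:

```python
import struct

# Known ggml container magics, stored as a little-endian u32 at offset 0.
# These values are assumptions based on the ggml/llama.cpp loaders.
MAGICS = {
    0x67676D6C: "ggml",  # legacy, unversioned (no version field follows)
    0x67676D66: "ggmf",  # versioned container
    0x67676A74: "ggjt",  # versioned, mmap-friendly container
}

def read_container_header(data: bytes):
    """Return (container_type, version) from the first bytes of a model file."""
    (magic,) = struct.unpack_from("<I", data, 0)
    name = MAGICS.get(magic)
    if name is None:
        raise ValueError(f"unknown magic 0x{magic:08x}")
    if name == "ggml":
        return name, None  # legacy files carry no version number
    (version,) = struct.unpack_from("<I", data, 4)
    return name, version

# Example: a ggjt v2 header (what the InvalidFormatVersion panic above reports)
header = struct.pack("<II", 0x67676A74, 2)
print(read_container_header(header))  # -> ('ggjt', 2)
```

In practice you'd pass the first 8 bytes of the model file, e.g. `read_container_header(open("cache/mpt-7b-q4_0.bin", "rb").read(8))`.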
Thank you, works well!
LLukas22 changed discussion status to closed