Text Generation
Transformers
English
mpt
llm-rs
ggml
text-generation-inference

Panic when try to load the model bin files

#2
by chenhunghan - opened

There seem to be Rust panics when loading the model.

(Using llm-rs==0.1.1)

I tried to load the model with model = Llama("./mpt-7b-q4_0-ggjt.bin"), but got:

thread '<unnamed>' panicked at 'called `Result::unwrap()` on an `Err` value: InvalidFormatVersion { container_type: Ggjt, version: 2 }', src/models.rs:5:1

also tried

from llm_rs import Llama

# load the model
model = Llama("cache/mpt-7b-q4_0.bin")

# generate
print(model.generate("The meaning of life is"))

but got

thread '<unnamed>' panicked at 'called `Result::unwrap()` on an `Err` value: Io(Error { kind: UnexpectedEof, message: "failed to fill whole buffer" })', src/models.rs:5:1

Are there any additional instructions for loading the models?

rustformers org

There were breaking changes in the ggml format; you need to use llm-rs==0.2.0 or greater (see here).
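As a quick sanity check, independent of llm-rs, you can read the container type and format version straight from the file header to see which ggml variant you actually have. This is a minimal sketch: the magic constants below are the ones used by the ggml family of loaders, and `inspect_header` is a hypothetical helper name, not part of any library.

```python
import struct
import tempfile

# Known ggml container magics (stored as a little-endian u32 at file start).
CONTAINERS = {0x67676D6C: "ggml", 0x67676D66: "ggmf", 0x67676A74: "ggjt"}

def inspect_header(path):
    """Return (container_type, version) from the first 8 bytes of a model file.

    Plain 'ggml' files carry no version field, so the second value is only
    meaningful for ggmf/ggjt containers.
    """
    with open(path, "rb") as f:
        magic, version = struct.unpack("<II", f.read(8))
    return CONTAINERS.get(magic, hex(magic)), version

# Demo on a synthetic ggjt version-2 header (the version the panic reports):
with tempfile.NamedTemporaryFile(suffix=".bin", delete=False) as f:
    f.write(struct.pack("<II", 0x67676A74, 2))
    tmp = f.name

print(inspect_header(tmp))  # ('ggjt', 2)
```

If this prints a ggjt version newer than your installed llm-rs understands, that matches the InvalidFormatVersion panic above and the fix is upgrading the library.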

Also, MPT isn't a Llama model, so you need to load it via the Mpt model; you can also see this in the model card of this repo.

from llm_rs import Mpt

model = Mpt("cache/mpt-7b-q4_0.bin")

Thank you, works well!

LLukas22 changed discussion status to closed
