#1
by l0d0v1c - opened

Hi,

I got "libc++abi: terminating due to uncaught exception of type std::runtime_error: unexpectedly reached end of file" with llama.cpp

rustformers org

@luc18 The quantization format in llama.cpp was recently changed (see this); you can use the f16 weights, which will still work.
I will probably upload the new converted weights later today and mark them with a V2.

The weights in this repo were created for development purposes in the rustformers/llm repo.

Great! Thank you. I'll try f16

I tried a former version of llama.cpp. Same error, f16 too. With rustformers/llm I get (f16 and q4_0):

llm llama infer -m mpt-7b-q4_0.bin -p "Tell me how to make handmade soap"
⣾ Loading model...Error:
0: Could not load model
1: unsupported f16_: 13

โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ” BACKTRACE โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”โ”
⋮ 4 frames hidden ⋮
5: llm::cli_args::ModelLoad::load::h41b58f148731b191
at :
6: llm::main::h0bbee8362f52fbc1
at :
7: std::sys_common::backtrace::__rust_begin_short_backtrace::h2fe9760f1b0b902d
at :
8: std::rt::lang_start::{{closure}}::h37a98b48e88897d6
at :
9: core::ops::function::impls::<impl core::ops::function::FnOnce for &F>::call_once::hf2f6b444963da11f
at /rustc/84c898d65adf2f39a5a98507f1fe0ce10a2b8dbc/library/core/src/ops/function.rs:287
10: std::panicking::try::do_call::h9152231fddd58858
at /rustc/84c898d65adf2f39a5a98507f1fe0ce10a2b8dbc/library/std/src/panicking.rs:487
11: std::panicking::try::hcc27eab3b8ee3cb1
at /rustc/84c898d65adf2f39a5a98507f1fe0ce10a2b8dbc/library/std/src/panicking.rs:451
12: std::panic::catch_unwind::hca546a4311ab9871
at /rustc/84c898d65adf2f39a5a98507f1fe0ce10a2b8dbc/library/std/src/panic.rs:140
13: std::rt::lang_start_internal::{{closure}}::h4e65aa71fe685c85
at /rustc/84c898d65adf2f39a5a98507f1fe0ce10a2b8dbc/library/std/src/rt.rs:148
14: std::panicking::try::do_call::h61aea55fbdf97fc2
at /rustc/84c898d65adf2f39a5a98507f1fe0ce10a2b8dbc/library/std/src/panicking.rs:487
15: std::panicking::try::hcfc3b62fb8f6215e
at /rustc/84c898d65adf2f39a5a98507f1fe0ce10a2b8dbc/library/std/src/panicking.rs:451
16: std::panic::catch_unwind::h61a201e98b56a743
at /rustc/84c898d65adf2f39a5a98507f1fe0ce10a2b8dbc/library/std/src/panic.rs:140
17: std::rt::lang_start_internal::h91996717d3eb1d2a
at /rustc/84c898d65adf2f39a5a98507f1fe0ce10a2b8dbc/library/std/src/rt.rs:148
18: _main
at :

Run with COLORBT_SHOW_HIDDEN=1 environment variable to disable frame filtering.
Run with RUST_BACKTRACE=full to include source snippets.

rustformers org

@luc18 Oh sorry, I'm always thinking from the developer perspective 😅. MPT support is still in development and has some bugs. llama.cpp also won't include it, as it is not based on the LLaMA architecture. GGML will include it (see this pull-request) and Rustformers is also working on an implementation (see this pull-request).

I will add a disclaimer to this repo to hint at its 'still in development' status.

Expect it to be finished in a few days. I will then add instructions on how to use these models and '@' you again to signal it's ready.

Also really looking forward to this!

@LLukas22 Sorry, I didn't get it... Looking forward to your release.

Not working yet with koboldcpp (which uses llama.cpp). Waiting for a new release... thanks, Lukas.

The chat version works with neither koboldcpp nor llama.cpp. The checksum of the bin file is OK. I'm using the master version of both programs.

rustformers org
•
edited May 18, 2023

@darxkies MPT will not be supported in llama.cpp, as it is not based on the LLaMA architecture. Currently it is only supported as an example in GGML directly; the usage is described in the README.
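
For reference, running the GGML example looks roughly like this (a sketch based on GGML's CMake build; the exact target name and options may differ, so check the GGML README):

# build the mpt example from the GGML repo
git clone https://github.com/ggerganov/ggml
cd ggml && mkdir build && cd build
cmake .. && make mpt
# run inference against a local MPT GGML file (the model path is a placeholder)
./bin/mpt -m /path/to/mpt-7b-q4_0.bin -p "Tell me how to make handmade soap"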

Simpler-to-use Python/Rust implementations are not ready yet.

If you want this supported in koboldcpp, you should probably open an issue there.

Ok. Thank you.

rustformers org

@Azamorn @luc18 @gabluz Development has progressed to a point where most MPT models can be run. I updated the README with instructions on how to run these models in Python/Rust/C.
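
For example, the Rust CLI invocation from earlier in this thread changes from the llama subcommand to an mpt subcommand, roughly like this (a sketch; see the README for the exact command):

llm mpt infer -m mpt-7b-q4_0.bin -p "Tell me how to make handmade soap"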

Great! Thanks. It works with GGML, but not with Rust or Python (Mac M2).

rustformers org

@luc18 Hmm, that's weird. Are you using the latest versions of the Python package/Rust project? It should be tested on all platforms; if you still have issues on the newest versions, please create an issue here.
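
Updating the Python package should be something like this (assuming the bindings are published on PyPI as llm-rs, matching this repo's tags):

pip install --upgrade llm-rs

For the Rust project, rebuild the CLI from the latest state of the rustformers/llm repository.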
