Split/shard support

#38 opened by phymbert (ggml.ai org)

Will it be possible to support the model sharding recently introduced in llama.cpp?

reach-vb (ggml.ai org)

Heya @phymbert - definitely yes! Do you mind pointing me to the relevant snippet?

We're currently just quantizing and uploading to the Hub: https://huggingface.co/spaces/ggml-org/gguf-my-repo/blob/main/app.py#L63
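For context, the upload step is roughly the following (a minimal sketch assuming huggingface_hub and a single quantized file; the file names and repo id are placeholders, see the linked app.py for the real code):

```python
from huggingface_hub import HfApi

api = HfApi()

# Illustrative only: file names and repo id are placeholders,
# not the values the Space actually uses.
api.upload_file(
    path_or_fileobj="model-Q4_K_M.gguf",   # quantized GGUF produced earlier
    path_in_repo="model-Q4_K_M.gguf",      # file name inside the destination repo
    repo_id="username/model-Q4_K_M-GGUF",  # destination model repo on the Hub
    repo_type="model",
)
```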

Happy for suggestions!

phymbert (ggml.ai org)

Hi @reach-vb,

I wrote a tutorial here: https://github.com/ggerganov/llama.cpp/discussions/6404

The --split-max-size option has been fixed recently.
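Roughly, the Space could split the quantized file and then push all the shards, something like this (a sketch only; the binary path, the 2G limit, file names, and repo id are assumptions, not what gguf-my-repo actually does):

```python
import subprocess

from huggingface_hub import HfApi

# Split the quantized GGUF into shards capped at roughly 2 GB each.
# gguf-split names the outputs like model-Q4_K_M-00001-of-00003.gguf.
subprocess.run(
    [
        "./gguf-split",            # path to llama.cpp's gguf-split binary (assumed location)
        "--split",
        "--split-max-size", "2G",  # the recently fixed size-based split option
        "model-Q4_K_M.gguf",       # input: the single quantized GGUF
        "model-Q4_K_M",            # output prefix for the shards
    ],
    check=True,
)

# Push every shard to the target model repo on the Hub.
api = HfApi()
api.upload_folder(
    folder_path=".",
    repo_id="username/model-Q4_K_M-GGUF",    # placeholder repo id
    repo_type="model",
    allow_patterns=["model-Q4_K_M-*.gguf"],  # upload only the shard files
)
```

The shards can be merged back into a single file with gguf-split --merge, and llama.cpp loads a split model by pointing it at the first shard.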

Please ping me if you need additional explanations.

Thanks
