Split/shard support (#38)
opened by phymbert
Will it be possible to support the model sharding recently introduced in llama.cpp?
+1
Heya! @phymbert - definitely yes, do you mind pointing me to the relevant snippet?
We're currently just quantizing and uploading to the Hub: https://huggingface.co/spaces/ggml-org/gguf-my-repo/blob/main/app.py#L63
Happy for suggestions!
Hi @reach-vb,
I wrote a tutorial here: https://github.com/ggerganov/llama.cpp/discussions/6404
The --split-max-size option was fixed recently.
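To connect this to the Space's quantize-and-upload flow, here is a minimal sketch of how app.py might build a gguf-split invocation with --split-max-size before uploading. The file paths, the 5G limit, and the helper names are illustrative assumptions, not code from the Space; the flag names follow the linked tutorial.

```python
# Hedged sketch (not the actual app.py code): construct the gguf-split
# command that shards a quantized GGUF into pieces of at most `max_size`.
def build_split_command(model_path: str, out_prefix: str,
                        max_size: str = "5G") -> list[str]:
    return [
        "./gguf-split",          # built from llama.cpp
        "--split",
        "--split-max-size", max_size,
        model_path,              # input GGUF to shard
        out_prefix,              # prefix for the output shards
    ]

# gguf-split names the resulting shards <prefix>-00001-of-00003.gguf, etc.
def shard_name(prefix: str, index: int, total: int) -> str:
    return f"{prefix}-{index:05d}-of-{total:05d}.gguf"
```

The Space could then run the command with subprocess and upload each `shard_name(...)` file to the Hub instead of the single large GGUF.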
Please ping me if you need additional explanation.
Thanks!