ExLlamaV2 looks very cool!

by julien-c HF staff - opened

re. pushing different quant configs on different branches, i'm wondering if it might not be better to push all to main with different filenames? WDYT?

There are advantages to having them on different branches. For instance, you can download each version with huggingface-cli download turboderp/Gemma-7B-exl2 --revision=4.0bpw --local-dir . or some such. And you don't risk people downloading all the weights at once and confusing the loader, since it treats every .safetensors file in the model dir as part of the model.
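For anyone who prefers Python to the CLI, here is a rough sketch of the same per-branch download using huggingface_hub's snapshot_download. The helper names (branch_for, download_quant, weight_files) are made up for illustration, and the branch naming pattern is an assumption generalized from the single "4.0bpw" example above. weight_files only mimics the "every .safetensors file counts" behavior described above; it is not ExLlamaV2's actual loader code.

```python
from pathlib import Path


def branch_for(bpw: float) -> str:
    # Assumption: quant branches are named like "4.0bpw", as in the
    # CLI example above. Other repos may use a different scheme.
    return f"{bpw:.1f}bpw"


def download_quant(repo_id: str, bpw: float, local_dir: str) -> str:
    # Imported lazily so the other helpers work without huggingface_hub.
    from huggingface_hub import snapshot_download

    # Fetches only the files on the chosen branch (revision).
    return snapshot_download(
        repo_id=repo_id,
        revision=branch_for(bpw),
        local_dir=local_dir,
    )


def weight_files(model_dir: str) -> list[str]:
    # Illustration only: lists what a loader that picks up every
    # .safetensors file in the directory would see.
    return sorted(p.name for p in Path(model_dir).glob("*.safetensors"))


if __name__ == "__main__":
    path = download_quant("turboderp/Gemma-7B-exl2", 4.0, "./gemma-4.0bpw")
    print(weight_files(path))
```

With one quant per branch, the target directory only ever holds one set of weights, which is the point made above about not confusing the loader.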

i see! makes sense.
