Weights are not present in the repo

#1 opened by julien-c (HF staff)

see title

Qualcomm org (edited Feb 27)

We are aware of this issue. We currently can't distribute the quantized weights, so we are working on a way to produce these weights automatically. Stay tuned! This is the highest priority for us.

Do you have any guidance on converting the Llama model to a QNN model, @srikris?

Qualcomm org

@eoe @julien-c We are working on publishing recipes so you can easily quantize this model and produce the weights yourself. We should have something soon.

@srikris Hi Krishna, any progress on this? How can we convert the model and use it?

Qualcomm org

Almost there, @ShaneRush. We have figured it out and are now testing and polishing.

exciting stuff!

Any news?

Qualcomm org

Targeting the next release (we release roughly every two weeks).

Qualcomm org

Llama2 with pre-computed quantization encodings is released.
You can now use the AI Hub Llama2 model to export the model to a QNN context binary and verify it with demo.py on-device.
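For anyone looking for the concrete commands: a minimal sketch of the export-and-demo flow described above, assuming the `qai-hub-models` Python package with a model-specific extra (the exact package extra and module paths are assumptions here; confirm them against the AI Hub model page linked below):

```shell
# Install AI Hub Models with the Llama2 quantized extra
# (extra name assumed; check the AI Hub model page for the exact spelling).
pip install "qai_hub_models[llama_v2_7b_chat_quantized]"

# Export the model (assumed module path) to a QNN context binary via AI Hub.
python -m qai_hub_models.models.llama_v2_7b_chat_quantized.export

# Verify the exported model on-device with the bundled demo script.
python -m qai_hub_models.models.llama_v2_7b_chat_quantized.demo
```

Running the export requires an AI Hub account and API token configured locally.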

@bhushans @srikris Could you share what you have released? How can we get the release instructions? Is there a documentation page you could share here? We'd appreciate it.

Qualcomm org

Sorry for the late response. This page has all the information: https://aihub.qualcomm.com/mobile/models/llama_v2_7b_chat_quantized?domain=Generative+AI
