Weights are not present in the repo

#1 opened by julien-c (HF staff)

see title

Qualcomm org (edited Feb 27)

We are aware of this issue. We currently can't distribute the quantized weights, so we are working on a way to produce these weights automatically. Stay tuned! This is the highest priority for us.

Do you have any guidance on converting the Llama model to a QNN model, @srikris?

Qualcomm org

@eoe @julien-c We are working on publishing recipes so you can easily quantize this model and produce the weights yourself. We should have something soon.

@srikris Hi Krishna, any progress on this? How can we convert the model and use it?

Qualcomm org

Almost there, @ShaneRush. We have figured it out and are now testing and polishing.

exciting stuff!

Any news?

Qualcomm org

Targeting the next release (we release roughly every two weeks).

Qualcomm org

Llama2 with pre-computed quantization encodings is released.
You can now use the AI Hub Llama2 model to export the model to a QNN context binary and verify it with demo.py on-device.
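For anyone looking for the concrete commands: a minimal sketch of the export-and-demo flow described above, assuming the `qai-hub-models` Python package with a model-specific extra (the exact package extra and module paths are assumptions here; confirm them against the AI Hub model page linked below):

```shell
# Install AI Hub Models with the Llama2 quantized extra
# (extra name assumed; check the AI Hub model page for the exact spelling).
pip install "qai_hub_models[llama_v2_7b_chat_quantized]"

# Export the model (assumed module path) to a QNN context binary via AI Hub.
python -m qai_hub_models.models.llama_v2_7b_chat_quantized.export

# Verify the exported model on-device with the bundled demo script.
python -m qai_hub_models.models.llama_v2_7b_chat_quantized.demo
```

Running the export requires an AI Hub account and API token configured locally.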

@bhushans @srikris Could you share what you have released? How can we get the release instructions? Is there a documentation page you could share here? We'd appreciate it.

Qualcomm org

Sorry for the late response. This page has all the information: https://aihub.qualcomm.com/mobile/models/llama_v2_7b_chat_quantized?domain=Generative+AI
