Loading with ggml llava through llama.cpp (PR)

#1 opened by cmp-nct
This comment has been hidden
cmp-nct changed discussion title from Can you provide the full CLIP model, not just the extracted vision part ? to never mind
cmp-nct changed discussion status to closed

@cmp-nct Can you let me know how you fixed this issue?
(screenshot attached)

cmp-nct changed discussion title from never mind to Loading with ggml llava

Sure. I am using ggml llava inference in llama.cpp, which loads the CLIP model first for conversion. The conversion code expects a full CLIP model and extracts the vision part from it, but this checkpoint is already the vision-only part.
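For context, here is a minimal sketch of the distinction the converter trips over, assuming a standard transformers-style checkpoint (the repo path is a placeholder, not part of the patch): a full CLIP checkpoint bundles both the text and vision towers, while a vision-only export loads directly as `CLIPVisionModel` and has nothing left to extract.

```python
from transformers import AutoConfig, CLIPModel, CLIPVisionModel

repo = "path/to/checkpoint"  # placeholder repo id or local path

cfg = AutoConfig.from_pretrained(repo)
if "CLIPVisionModel" in (cfg.architectures or []):
    # Vision-only export: usable as-is, there is nothing to extract.
    vision = CLIPVisionModel.from_pretrained(repo)
else:
    # Full CLIP: both towers are present; take just the vision tower.
    vision = CLIPModel.from_pretrained(repo).vision_model
```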

I wrote a patch: https://github.com/ggerganov/llama.cpp/pull/4172

Anything that looks for "clip_vision_model" is code that expects a full CLIP model; in that case you just need to skip the extraction code.
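A hedged sketch of that skip, keyed off the state-dict layout rather than the exact patch (the weight file name and key prefixes are assumptions based on standard transformers CLIP checkpoints):

```python
import torch

# Weight file name is a placeholder for whatever the checkpoint ships.
state = torch.load("pytorch_model.bin", map_location="cpu")

if any(k.startswith("text_model.") for k in state):
    # Full CLIP: run the extraction, keeping only the vision tower.
    state = {k: v for k, v in state.items() if k.startswith("vision_model.")}
# Otherwise the checkpoint is already vision-only and the extraction
# step can simply be skipped.
```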

cmp-nct changed discussion status to open
cmp-nct changed discussion title from Loading with ggml llava to Loading with ggml llava through llama.cpp (PR)
