Quants for InternVL2.5?

#1
by Koitenshin - opened

It's a big ask, but it's the "little sister" of this model. It's only 26B but it uses the same 6B Vision Transformer as this 38B model.

https://huggingface.co/OpenGVLab/InternVL2_5-26B

EDIT: I've currently Frankensteined InternVL3-14B together with the 11 GB FP16 file, and my god, I can tell it's Qwen.

It wrote a very good system prompt for captioning images, but the moment you start using that prompt, it complains non-stop about adhering to ethical guidelines. I'm just trying to caption a blasted dataset for finetuning a model, and I need the best VL model there is that can handle high-res images. (╯°□°)╯︵ ┻━┻
