Ollama and llama.cpp

#13
by KeilahElla - opened

First of all, congrats on this incredibly good model. I'm running it on the CPU of my laptop and can get captions at a rate of about 2-3 images per minute. The caption quality is comparable to LLaVA 1.6 running at 4-bit quantization with Ollama; if anything, moondream hallucinates a little less than LLaVA.

Would you be interested in sharing this model in the Ollama library? Ollama (and its backend, llama.cpp) now supports a Vulkan backend, which means I would be able to run this on my laptop's iGPU. With LLaVA 1.6, the speedup is more than 2x.

@vikhyatk , I see that moondream is now in the Ollama library: https://www.ollama.com/library/moondream

Do you know which version this is? I prefer to use it through Ollama because it's much faster than transformers, but I also want to stay on the latest version and not fall behind.
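For anyone else wondering, a rough way to check what the Ollama library is serving is to inspect the model locally with the Ollama CLI. This is just a sketch: it assumes the `ollama` CLI is installed and uses the `moondream` model name from the library link above; the output won't map directly to a Hugging Face revision, but it does show the parameter count, quantization, and base weights, which you can compare against the model card.

```shell
# Sketch: inspect the moondream build that Ollama serves.
# Assumes the `ollama` CLI is installed and the daemon is running;
# falls back to a message if it is not.
if command -v ollama >/dev/null 2>&1; then
  ollama pull moondream                 # fetch/refresh the library build
  info=$(ollama show moondream)         # architecture, parameters, quantization
  ollama show moondream --modelfile     # Modelfile, including the FROM line
else
  info="ollama CLI not found"
fi
echo "$info"
```

Comparing the reported parameter count and quantization against the latest release notes is about the best you can do until the library page states the upstream version explicitly.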
