Lame question about the multimodality.

#6
by 010O11 - opened

Sorry if I'm asking in the wrong place... I'm not familiar with the multimodality of the model/models and am interested in trying it. I'm using koboldcpp + sillytavern. How can I do it? The model card suggests using an 'mmproj' file, so I downloaded it and was able to set it up as described in the model card, but now what? Can you please explain the use case? Something needs to be set up in sillytavern, am I right? Under Extras/image generation? Or is it not for generating images, but for image recognition?

The Chaotic Neutrals org
edited May 6

@010O11 No worries, try the steps below:

Load the mmproj along with a Llama 3 8B model of your choice inside kcpp (KoboldCpp).

  • Then inside ST (SillyTavern), go to the Image Captioning extension and set kcpp as the source.
  • After that, send an image in chat to be captioned.
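
If you launch kcpp from a terminal rather than the GUI, the first step can be sketched roughly like this. This is only a minimal sketch: it assumes your KoboldCpp build accepts the `--model` and `--mmproj` flags, and the executable name and both file names are hypothetical placeholders, so swap in your own paths.

```python
import shlex


def build_kcpp_command(model_path, mmproj_path, executable="koboldcpp"):
    """Build the argument list for starting KoboldCpp with a vision projector.

    Assumption: the build in use accepts --model and --mmproj.
    The executable name is a placeholder for wherever your binary lives.
    """
    return [executable, "--model", model_path, "--mmproj", mmproj_path]


# Hypothetical file names, for illustration only.
cmd = build_kcpp_command("llama-3-8b.Q4_K_M.gguf", "llama-3-mmproj.gguf")

# Print the command as a single shell-safe line so it can be copied into a terminal.
print(shlex.join(cmd))
```

Building the command as a list (rather than one long string) keeps paths with spaces intact if you later pass it to `subprocess.run(cmd)`.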

OK, thanks, just tried it... it works. Is it the expected behaviour that the image appears in the web UI while the caption shows up only in the CLI? Never mind, that's not your problem. Thank you again for the explanation.

Nitral-AI changed discussion status to closed
