Med-Flamingo-9B (CLIP ViT-L/14, Llama-7B)
Med-Flamingo is a medical vision-language model with multimodal in-context learning abilities.
This model is based on OpenFlamingo-9B (v1), which uses the CLIP ViT-L/14 vision encoder and the Llama-7B language model as frozen backbones.
Med-Flamingo was trained on paired and interleaved image-text data from the medical literature.
Check out our GitHub repository for more details on setup and a demo.
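Below is a minimal loading-and-inference sketch using the open_flamingo package's v1 API, which the OpenFlamingo-9B backbone was released with. The local Llama-7B path, the example image filenames, and the prompt text are placeholders; the Hub repo id `med-flamingo/med-flamingo` and checkpoint filename `model.pt` are assumptions to verify against the repository's demo.

```python
import torch
from PIL import Image
from huggingface_hub import hf_hub_download
from open_flamingo import create_model_and_transforms

llama_path = "path/to/llama-7b-hf"  # assumption: local Llama-7B weights in HF format

# Build the OpenFlamingo-9B architecture with its frozen CLIP and Llama backbones.
model, image_processor, tokenizer = create_model_and_transforms(
    clip_vision_encoder_path="ViT-L-14",
    clip_vision_encoder_pretrained="openai",
    lang_encoder_path=llama_path,
    tokenizer_path=llama_path,
    cross_attn_every_n_layers=4,  # OpenFlamingo-9B setting
)

# Overlay the Med-Flamingo fine-tuned weights on the OpenFlamingo architecture.
ckpt = hf_hub_download("med-flamingo/med-flamingo", "model.pt")
model.load_state_dict(torch.load(ckpt, map_location="cpu"), strict=False)
model.eval()

# Few-shot, interleaved prompt: each <image> token aligns with one image,
# and <|endofchunk|> separates in-context examples.
images = [Image.open(p) for p in ["example1.jpg", "query.jpg"]]  # hypothetical files
vision_x = torch.stack([image_processor(im) for im in images], dim=0)
vision_x = vision_x.unsqueeze(0).unsqueeze(2)  # (batch, num_images, frames, C, H, W)

tokenizer.padding_side = "left"
prompt = (
    "<image>Question: What is shown? Answer: a chest X-ray.<|endofchunk|>"
    "<image>Question: What is shown? Answer:"
)
lang_x = tokenizer([prompt], return_tensors="pt")

generated = model.generate(
    vision_x=vision_x,
    lang_x=lang_x["input_ids"],
    attention_mask=lang_x["attention_mask"],
    max_new_tokens=20,
)
print(tokenizer.decode(generated[0]))
```

The interleaved prompt format is what enables the multimodal in-context learning described above: prepending worked image-question-answer examples conditions the model's answer for the final query image.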