ContactDoctor/Bio-Medical-MultiModal-Llama-3-8B-V1 Image-Text-to-Text • Updated Oct 17, 2024 • 1.36k • 114
Running • 136 • SmolLM 360M Instruct WebGPU • A blazingly fast and powerful AI chatbot that runs locally.
Enhancing the Reasoning Ability of Multimodal Large Language Models via Mixed Preference Optimization Paper • 2411.10442 • Published Nov 15, 2024 • 80
OpenGVLab/InternViT-300M-448px-V2_5 Image Feature Extraction • Updated Dec 9, 2024 • 27.4k • 30
InternVL2.0 Collection Expanding Performance Boundaries of Open-Source MLLM • 15 items • Updated Jan 10 • 91
SigLIP Collection Contrastive (sigmoid) image-text models from https://arxiv.org/abs/2303.15343 • 10 items • Updated 2 days ago • 55
OFA-Sys/chinese-clip-vit-large-patch14-336px Zero-Shot Image Classification • Updated Dec 9, 2022 • 1.14k • 23
google/siglip-base-patch16-224 Zero-Shot Image Classification • Updated Sep 26, 2024 • 242k • 41