AIMv2 Collection A collection of AIMv2 vision encoders covering several fixed resolutions, native resolution, and a distilled checkpoint. • 19 items • Updated Nov 22, 2024 • 74
LLM2CLIP Collection LLM2CLIP makes SOTA pretrained CLIP models more SOTA than ever. • 11 items • Updated 2 days ago • 55
BRAVE: Broadening the visual encoding of vision-language models Paper • 2404.07204 • Published Apr 10, 2024 • 19
Llama 3.2 Collection This collection hosts the transformers and original repos of Llama 3.2 and Llama Guard 3. • 15 items • Updated Dec 6, 2024 • 576
NIM Serverless Inference API Collection Models in this collection are available for inference via a serverless API powered by NVIDIA NIM. • 8 items • Updated Jan 17 • 23
Qwen2-VL Collection Vision-language model series based on Qwen2 • 16 items • Updated Dec 6, 2024 • 208
InternVL2.0 Collection Expanding Performance Boundaries of Open-Source MLLM • 15 items • Updated Jan 10 • 91