ReflectiVA Collection Models and data for ReflectiVA: Augmenting Multimodal LLMs with Self-Reflective Tokens for Knowledge-based Visual Question Answering [CVPR 2025] • 2 items • Updated 11 days ago
LLaVA-MORE: A Comparative Study of LLMs and Visual Backbones for Enhanced Visual Instruction Tuning Paper • 2503.15621 • Published 17 days ago
Augmenting Multimodal LLMs with Self-Reflective Tokens for Knowledge-based Visual Question Answering Paper • 2411.16863 • Published Nov 25, 2024
ELSA EU Project Collection Dataset and models created within the ELSA (European Lighthouse on Secure and Safe AI) project for the multimedia use case. • 4 items • Updated Nov 25, 2024
LLaVA-MORE Collection LLaVA-MORE: Enhancing Visual Instruction Tuning with LLaMA 3.1 • 2 items • Updated Aug 31, 2024
aimagelab/LLaVA_MORE-llama_3_1-8B-S2-siglip-finetuning Image-Text-to-Text • Updated Aug 16, 2024 • 6 • 2
LLaVA-MORE Collection LLaVA-MORE: A Comparative Study of LLMs and Visual Backbones for Enhanced Visual Instruction Tuning • 8 items • Updated 10 days ago • 2