Augmenting Multimodal LLMs with Self-Reflective Tokens for Knowledge-based Visual Question Answering Paper • 2411.16863 • Published 29 days ago
ELSA EU Project Collection Dataset and models created within the ELSA (European Lighthouse on Secure and Safe AI) project for the Multimedia use case. • 4 items • Updated 29 days ago
LLaVA-MORE Collection LLaVA-MORE: Enhancing Visual Instruction Tuning with LLaMA 3.1 • 2 items • Updated Aug 31
LLaVA-MORE Collection LLaVA-MORE: Enhancing Visual Instruction Tuning with LLaMA 3.1 • 8 items • Updated Aug 16