EVLM: An Efficient Vision-Language Model for Visual Understanding Paper • 2407.14177 • Published 14 days ago • 41
ChartGemma: Visual Instruction-tuning for Chart Reasoning in the Wild Paper • 2407.04172 • Published 28 days ago • 21
E5-V: Universal Embeddings with Multimodal Large Language Models Paper • 2407.12580 • Published 16 days ago • 38
Wolf: Captioning Everything with a World Summarization Framework Paper • 2407.18908 • Published 6 days ago • 28