InstructDoc: A Dataset for Zero-Shot Generalization of Visual Document Understanding with Instructions • Paper • 2401.13313 • Published Jan 24
Jina CLIP: Your CLIP Model Is Also Your Text Retriever • Paper • 2405.20204 • Published 16 days ago
Vision Mamba: Efficient Visual Representation Learning with Bidirectional State Space Model • Paper • 2401.09417 • Published Jan 17