CoLLM: A Large Language Model for Composed Image Retrieval Paper • 2503.19910 • Published 23 days ago
AlignVLM: Bridging Vision and Language Latent Spaces for Multimodal Understanding Paper • 2502.01341 • Published Feb 3