Vision Search Assistant: Empower Vision-Language Models as Multimodal Search Engines Paper • 2410.21220 • Published Oct 28 • 10
LongReward: Improving Long-context Large Language Models with AI Feedback Paper • 2410.21252 • Published Oct 28 • 17
GrounDiT: Grounding Diffusion Transformers via Noisy Patch Transplantation Paper • 2410.20474 • Published Oct 27 • 14