FastRM: An efficient and automatic explainability framework for multimodal generative models • Paper • 2412.01487 • Published 20 days ago • 1
LLaVA-Gemma: Accelerating Multimodal Foundation Models with a Compact Language Model • Paper • 2404.01331 • Published Mar 29 • 25
Getting it Right: Improving Spatial Consistency in Text-to-Image Models • Paper • 2404.01197 • Published Apr 1 • 30