Paint by Inpaint: Learning to Add Image Objects by Removing Them First • Paper • 2404.18212 • Published Apr 28
Question Aware Vision Transformer for Multimodal Reasoning • Paper • 2402.05472 • Published Feb 8
Surface Reconstruction from Gaussian Splatting via Novel Stereo Views • Paper • 2404.01810 • Published Apr 2
FuseCap: Leveraging Large Language Models to Fuse Visual Data into Enriched Image Captions • Paper • 2305.17718 • Published May 28, 2023