Img-txt
Question Aware Vision Transformer for Multimodal Reasoning
Paper • 2402.05472 • Published Feb 8 • 7

LM
How to Train Data-Efficient LLMs
Paper • 2402.09668 • Published Feb 15 • 38
An Introduction to Vision-Language Modeling
Paper • 2405.17247 • Published May 27 • 84