AwesomeLLMs - a Hanyu66 Collection

updated Mar 22

Zero-Shot and Few-Shot Video Question Answering with Multi-Modal Prompts

Paper • 2309.15915 • Published Sep 27, 2023

Reformulating Vision-Language Foundation Models and Datasets Towards Universal Multimodal Assistants

Paper • 2310.00653 • Published Oct 1, 2023

Qwen-VL: A Frontier Large Vision-Language Model with Versatile Abilities

Paper • 2308.12966 • Published Aug 24, 2023

An Empirical Study of Scaling Instruct-Tuned Large Multimodal Models

Paper • 2309.09958 • Published Sep 18, 2023

Position-Enhanced Visual Instruction Tuning for Multimodal Large Language Models

Paper • 2308.13437 • Published Aug 25, 2023