-
ShareGPT4Video: Improving Video Understanding and Generation with Better Captions
Paper • 2406.04325 • Published • 74 -
MoE-LLaVA: Mixture of Experts for Large Vision-Language Models
Paper • 2401.15947 • Published • 51 -
Video-LLaVA: Learning United Visual Representation by Alignment Before Projection
Paper • 2311.10122 • Published • 27 -
Video-Bench: A Comprehensive Benchmark and Toolkit for Evaluating Video-based Large Language Models
Paper • 2311.16103 • Published • 1
Mikhail
Dremin
·
AI & ML interests
None yet
Organizations
None yet
Collections
1
models
None public yet
datasets
None public yet