yinanhe

ynhe

AI & ML interests

None yet

Recent Activity

updated a dataset about 15 hours ago

Vchitect/VBench-2.0_human_anomaly

published a dataset about 15 hours ago

Vchitect/VBench-2.0_human_anomaly

updated a dataset 1 day ago

Vchitect/VBench_human_annotation

ynhe's activity

upvoted a paper 5 days ago

VBench-2.0: Advancing Video Generation Benchmark Suite for Intrinsic Faithfulness

Paper • 2503.21755 • Published 5 days ago • 30

upvoted a paper about 1 month ago

InternVideo2.5: Empowering Video MLLMs with Long and Rich Context Modeling

Paper • 2501.12386 • Published Jan 21 • 1

upvoted 2 papers 3 months ago

VideoChat-Flash: Hierarchical Compression for Long-Context Video Modeling

Paper • 2501.00574 • Published Dec 31, 2024 • 6

Task Preference Optimization: Improving Multimodal Large Language Models with Vision Task Alignment

Paper • 2412.19326 • Published Dec 26, 2024 • 18

upvoted 2 papers 4 months ago

Causal Diffusion Transformers for Generative Modeling

Paper • 2412.12095 • Published Dec 16, 2024 • 23

VBench++: Comprehensive and Versatile Benchmark Suite for Video Generative Models

Paper • 2411.13503 • Published Nov 20, 2024 • 34

upvoted a paper 7 months ago

InternVideo2: Scaling Video Foundation Models for Multimodal Video Understanding

Paper • 2403.15377 • Published Mar 22, 2024 • 25

upvoted 2 collections 11 months ago

InternVL1.0

Scaling up Vision Foundation Models and Aligning for Generic Visual-Linguistic Tasks • 16 items • Updated Jan 10 • 18

InternVideo2

InternVideo2 • 20 items • Updated Feb 27 • 19

upvoted a collection about 1 year ago

VideoMamba

State Space Model for Efficient Video Understanding • 5 items • Updated Jan 10 • 5

upvoted a paper about 1 year ago

VideoMamba: State Space Model for Efficient Video Understanding

Paper • 2403.06977 • Published Mar 11, 2024 • 29

upvoted 7 papers over 1 year ago

MVBench: A Comprehensive Multi-modal Video Understanding Benchmark

Paper • 2311.17005 • Published Nov 28, 2023 • 2

VBench: Comprehensive Benchmark Suite for Video Generative Models

Paper • 2311.17982 • Published Nov 29, 2023 • 9

LAVIE: High-Quality Video Generation with Cascaded Latent Diffusion Models

Paper • 2309.15103 • Published Sep 26, 2023 • 42

The All-Seeing Project: Towards Panoptic Visual Recognition and Understanding of the Open World

Paper • 2308.01907 • Published Aug 3, 2023 • 12

VideoChat: Chat-Centric Video Understanding

Paper • 2305.06355 • Published May 10, 2023 • 3

InternChat: Solving Vision-Centric Tasks by Interacting with Chatbots Beyond Language

Paper • 2305.05662 • Published May 9, 2023 • 4

InternVid: A Large-scale Video-Text Dataset for Multimodal Understanding and Generation

Paper • 2307.06942 • Published Jul 13, 2023 • 23