- ILLUME: Illuminating Your LLMs to See, Draw, and Self-Enhance (arXiv:2412.06673, published 12 days ago)
- Expanding Performance Boundaries of Open-Source Multimodal Models with Model, Data, and Test-Time Scaling (arXiv:2412.05271, published 15 days ago)
- WF-VAE: Enhancing Video VAE by Wavelet-Driven Energy Flow for Latent Video Diffusion Model (arXiv:2411.17459, published 25 days ago)
- Open-Sora Plan: Open-Source Large Video Generation Model (arXiv:2412.00131, published 23 days ago)
- ChronoMagic-Bench: A Benchmark for Metamorphic Evaluation of Text-to-Time-lapse Video Generation (arXiv:2406.18522, published Jun 26)
- ShareGPT4Video: Improving Video Understanding and Generation with Better Captions (arXiv:2406.04325, published Jun 6)
- MoE-LLaVA: Mixture of Experts for Large Vision-Language Models (arXiv:2401.15947, published Jan 29)
- Video-Bench: A Comprehensive Benchmark and Toolkit for Evaluating Video-based Large Language Models (arXiv:2311.16103, published Nov 27, 2023)
- LanguageBind 1.5 model collection: LanguageBind models trained on VIDAL-45M (2 items, updated May 23)
- LanguageBind 1.0 model collection: LanguageBind models trained on VIDAL-10M (9 items, updated Jan 28)
- LanguageBind: Extending Video-Language Pretraining to N-modality by Language-based Semantic Alignment (arXiv:2310.01852, published Oct 3, 2023)
- Video-LLaVA: Learning United Visual Representation by Alignment Before Projection (arXiv:2311.10122, published Nov 16, 2023)