Kehan Li

lkhl

AI & ML interests

None yet

Recent Activity

new activity 1 day ago

DAMO-NLP-SG/VL3-SigLIP-NaViT:what is the difference between this model and "DAMO-NLP-SG/SigLIP-NaViT"?

updated a model 5 days ago

DAMO-NLP-SG/VideoLLaMA3-2B-Image

updated a model 5 days ago

DAMO-NLP-SG/VideoLLaMA3-7B-Image

lkhl's activity

upvoted an article 7 days ago

Article

From DeepSpeed to FSDP and Back Again with Hugging Face Accelerate

Jun 13, 2024

• 47

upvoted a paper 19 days ago

LongPO: Long Context Self-Evolution of Large Language Models through Short-to-Long Preference Optimization

Paper • 2502.13922 • Published 20 days ago • 25

upvoted a collection about 2 months ago

VideoLLaMA3

Frontier Multimodal Foundation Models for Video Understanding • 14 items • Updated 3 minutes ago • 13

upvoted 2 papers about 2 months ago

VideoLLaMA 3: Frontier Multimodal Foundation Models for Image and Video Understanding

Paper • 2501.13106 • Published Jan 22 • 85

MiniMax-01: Scaling Foundation Models with Lightning Attention

Paper • 2501.08313 • Published Jan 14 • 275

upvoted 2 papers 2 months ago

VideoRefer Suite: Advancing Spatial-Temporal Object Understanding with Video LLM

Paper • 2501.00599 • Published Dec 31, 2024 • 41

2.5 Years in Class: A Multimodal Textbook for Vision-Language Pretraining

Paper • 2501.00958 • Published Jan 1 • 100

upvoted 2 papers 4 months ago

EgoVid-5M: A Large-Scale Video-Action Dataset for Egocentric Video Generation

Paper • 2411.08380 • Published Nov 13, 2024 • 25

Large Language Models Can Self-Improve in Long-context Reasoning

Paper • 2411.08147 • Published Nov 12, 2024 • 64

upvoted 3 collections 5 months ago

Inf-CL

The corresponding demos/checkpoints/papers/datasets of Inf-CL. • 2 items • Updated 3 minutes ago • 3

VideoLLaMA

The first edition of VideoLLaMA • 6 items • Updated 3 minutes ago • 3

VideoLLaMA2

Optimized VideoLLaMA with improved spatial-temporal modeling and better audio understanding capability • 13 items • Updated 3 minutes ago • 20

upvoted 2 papers 5 months ago

Breaking the Memory Barrier: Near Infinite Batch Size Scaling for Contrastive Loss

Paper • 2410.17243 • Published Oct 22, 2024 • 90

The Curse of Multi-Modalities: Evaluating Hallucinations of Large Multimodal Models across Language, Visual, and Audio

Paper • 2410.12787 • Published Oct 16, 2024 • 31