Zesen Cheng

ClownRat

AI & ML interests

Multi-modal foundation models; segmentation, detection, and tracking

ClownRat's activity

upvoted a paper 7 days ago

MagicComp: Training-free Dual-Phase Refinement for Compositional Video Generation

Paper • 2503.14428 • Published 14 days ago • 8

authored a paper 7 days ago

MagicComp: Training-free Dual-Phase Refinement for Compositional Video Generation

Paper • 2503.14428 • Published 14 days ago • 8

upvoted 2 papers 13 days ago

LLaVA-o1: Let Vision Language Models Reason Step-by-Step

Paper • 2411.10440 • Published Nov 15, 2024 • 123

Infinity-MM: Scaling Multimodal Performance with Large-Scale and High-Quality Instruction Data

Paper • 2410.18558 • Published Oct 24, 2024 • 20

upvoted a paper 18 days ago

Transformers without Normalization

Paper • 2503.10622 • Published 19 days ago • 146

upvoted 2 papers about 1 month ago

LongRoPE2: Near-Lossless LLM Context Window Scaling

Paper • 2502.20082 • Published Feb 27 • 36

Self-rewarding correction for mathematical reasoning

Paper • 2502.19613 • Published Feb 26 • 82

authored a paper about 1 month ago

Instance Brownian Bridge as Texts for Open-vocabulary Video Instance Segmentation

Paper • 2401.09732 • Published Jan 18, 2024

upvoted 2 articles about 1 month ago

Mixture of Experts Explained

Article • Published Dec 11, 2023 • 499

SigLIP 2: A better multilingual vision language encoder

Article • Published Feb 21 • 148

upvoted a paper about 1 month ago

Qwen2.5-VL Technical Report

Paper • 2502.13923 • Published Feb 19 • 176

authored a paper about 1 month ago

Qwen2.5-VL Technical Report

Paper • 2502.13923 • Published Feb 19 • 176

upvoted 2 papers about 1 month ago

Small Models Struggle to Learn from Strong Reasoners

Paper • 2502.12143 • Published Feb 17 • 33

LongPO: Long Context Self-Evolution of Large Language Models through Short-to-Long Preference Optimization

Paper • 2502.13922 • Published Feb 19 • 25

upvoted 2 papers about 2 months ago

Kimi k1.5: Scaling Reinforcement Learning with LLMs

Paper • 2501.12599 • Published Jan 22 • 112

DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning

Paper • 2501.12948 • Published Jan 22 • 367

liked a dataset about 2 months ago

OpenGVLab/OmniCorpus-YT

Updated 12 days ago • 214 • 12

upvoted 3 papers 2 months ago