Kunchang Li

Andy1621

https://github.com/Andy1621

AI & ML interests

computer vision

Andy1621's activity

authored a paper 14 days ago

Task Preference Optimization: Improving Multimodal Large Language Models with Vision Task Alignment

Paper • 2412.19326 • Published 18 days ago • 18

upvoted a paper 25 days ago

Qwen2.5 Technical Report

Paper • 2412.15115 • Published 25 days ago • 339

upvoted a paper 27 days ago

VividFace: A Diffusion-Based Hybrid Framework for High-Fidelity Video Face Swapping

Paper • 2412.11279 • Published 29 days ago • 12

authored a paper 28 days ago

Causal Diffusion Transformers for Generative Modeling

Paper • 2412.12095 • Published 28 days ago • 23

upvoted a paper 28 days ago

Causal Diffusion Transformers for Generative Modeling

Paper • 2412.12095 • Published 28 days ago • 23

commented on a paper 28 days ago

Causal Diffusion Transformers for Generative Modeling

Paper • 2412.12095 • Published 28 days ago • 23

authored a paper 28 days ago

Bootstrapping Language-Guided Navigation Learning with Self-Refining Data Flywheel

Paper • 2412.08467 • Published Dec 11, 2024 • 5

upvoted 2 papers about 1 month ago

EasyRef: Omni-Generalized Group Image Reference for Diffusion Models via Multimodal LLM

Paper • 2412.09618 • Published Dec 12, 2024 • 21

StreamChat: Chatting with Streaming Video

Paper • 2412.08646 • Published Dec 11, 2024 • 18

authored a paper 3 months ago

TransAgent: Transfer Vision-Language Foundation Models with Heterogeneous Agent Collaboration

Paper • 2410.12183 • Published Oct 16, 2024 • 3

liked a model 4 months ago

OpenGVLab/UMT

Video Classification • Updated Aug 17, 2024 • 1

updated a model 6 months ago

Andy1621/VideoChat2_VicunaV0_7B_stage3_noLoRA

Updated Jul 30, 2024

liked a model 7 months ago

stabilityai/stable-diffusion-3-medium

Text-to-Image • Updated Aug 12, 2024 • 20.1k • 4.66k

upvoted a paper 7 months ago

Autoregressive Model Beats Diffusion: Llama for Scalable Image Generation

Paper • 2406.06525 • Published Jun 10, 2024 • 66

liked 2 models 9 months ago

internlm/internlm2-chat-20b

Text Generation • Updated Aug 20, 2024 • 4.11k • 87

OpenGVLab/InternVL-Chat-V1-5

Image-Text-to-Text • Updated 27 days ago • 2.55k • 405

upvoted a paper 9 months ago

Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction

Paper • 2404.02905 • Published Apr 3, 2024 • 65

authored a paper 10 months ago

InternVideo2: Scaling Video Foundation Models for Multimodal Video Understanding

Paper • 2403.15377 • Published Mar 22, 2024 • 22

upvoted a paper 10 months ago

InternVideo2: Scaling Video Foundation Models for Multimodal Video Understanding

Paper • 2403.15377 • Published Mar 22, 2024 • 22

New activity in OpenGVLab/VideoMamba 10 months ago

Local demo on the repo

#4 opened 10 months ago by ysharma