Zhe Gan

zhegan27

http://zhegan27.github.io/

AI & ML interests

multimodal learning, vision and language

Recent Activity

upvoted a paper 15 days ago

STIV: Scalable Text and Image Conditioned Video Generation

authored a paper about 1 month ago

Multimodal Autoregressive Pre-training of Large Vision Encoders

Organizations

None yet

Papers 8

arxiv:2411.14402

arxiv:2410.07177

arxiv:2409.20566

arxiv:2407.15841

Models

None public yet

Datasets

None public yet