larry's picture

larry

szh

·

AI & ML interests

None yet

Organizations

szh's activity

upvoted a paper about 1 month ago

Goku: Flow Based Video Generative Foundation Models

Paper • 2502.04896 • Published Feb 7 • 97

upvoted a paper about 2 months ago

VideoJAM: Joint Appearance-Motion Representations for Enhanced Motion Generation in Video Models

Paper • 2502.02492 • Published Feb 4 • 62

upvoted 2 collections 4 months ago

PaliGemma 2 Release

Vision-Language Models available in multiple 3B, 10B and 28B variants. • 32 items • Updated 11 days ago • 145

Sana

⚡️Sana: Efficient High-Resolution Image Synthesis with Linear Diffusion Transformer • 21 items • Updated 7 days ago • 89

upvoted 2 papers 5 months ago

HART: Efficient Visual Generation with Hybrid Autoregressive Transformer

Paper • 2410.10812 • Published Oct 14, 2024 • 17

Progressive Autoregressive Video Diffusion Models

Paper • 2410.08151 • Published Oct 10, 2024 • 16

upvoted a paper 6 months ago

PhysGen: Rigid-Body Physics-Grounded Image-to-Video Generation

Paper • 2409.18964 • Published Sep 27, 2024 • 26

upvoted a paper 7 months ago

Img-Diff: Contrastive Data Synthesis for Multimodal Large Language Models

Paper • 2408.04594 • Published Aug 8, 2024 • 15

upvoted a paper 9 months ago

Cambrian-1: A Fully Open, Vision-Centric Exploration of Multimodal LLMs

Paper • 2406.16860 • Published Jun 24, 2024 • 60

upvoted a paper 11 months ago

Revisiting Text-to-Image Evaluation with Gecko: On Metrics, Prompts, and Human Ratings

Paper • 2404.16820 • Published Apr 25, 2024 • 17

upvoted 6 papers over 1 year ago

Q-Instruct: Improving Low-level Visual Abilities for Multi-modality Foundation Models

Paper • 2311.06783 • Published Nov 12, 2023 • 28

Auto-Instruct: Automatic Instruction Generation and Ranking for Black-Box Language Models

Paper • 2310.13127 • Published Oct 19, 2023 • 12

Improved Baselines with Visual Instruction Tuning

Paper • 2310.03744 • Published Oct 5, 2023 • 37

Leveraging Unpaired Data for Vision-Language Generative Models via Cycle Consistency

Paper • 2310.03734 • Published Oct 5, 2023 • 15

Aligning Large Multimodal Models with Factually Augmented RLHF

Paper • 2309.14525 • Published Sep 25, 2023 • 30

An Empirical Study of Scaling Instruct-Tuned Large Multimodal Models

Paper • 2309.09958 • Published Sep 18, 2023 • 19