20 480

Zhanliang Liu

zliu

https://zliu.org

liuzl

AI & ML interests

None yet

Recent Activity

liked a dataset 11 days ago

saiyan-world/Goku-MovieGenBench

upvoted a paper 19 days ago

OmniHuman-1: Rethinking the Scaling-Up of One-Stage Conditioned Human Animation Models

liked a Space about 1 month ago

ysharma/Make_Custom_Voices_With_KokoroTTS

Organizations

zliu's activity

upvoted a paper 19 days ago

OmniHuman-1: Rethinking the Scaling-Up of One-Stage Conditioned Human Animation Models

Paper • 2502.01061 • Published 20 days ago • 180

upvoted a collection 5 months ago

Moshi v0.1 Release

MLX, Candle & PyTorch model checkpoints released as part of the Moshi release from Kyutai. Run inference via: https://github.com/kyutai-labs/moshi • 13 items • Updated Sep 18, 2024 • 227

upvoted a paper 6 months ago

LLaVA-OneVision: Easy Visual Task Transfer

Paper • 2408.03326 • Published Aug 6, 2024 • 60

upvoted a collection 6 months ago

LLaVA-OneVision

a model good at arbitrary types of visual input • 15 items • Updated Oct 5, 2024 • 22

upvoted 2 collections 7 months ago

Qwen2-Audio

Audio-language model series based on Qwen2 • 4 items • Updated Nov 28, 2024 • 51

LLaVa-Interleave

LLaVa models that extend the model's capabilities to multi-image, multi-frame (video), and multi-patch (single-image) scenarios. • 3 items • Updated Jul 10, 2024 • 14

upvoted a paper 12 months ago

Universal Manipulation Interface: In-The-Wild Robot Teaching Without In-The-Wild Robots

Paper • 2402.10329 • Published Feb 15, 2024 • 15

upvoted a collection about 1 year ago

Quyen

State-of-the-art general LLMs based on Qwen1.5 • 26 items • Updated Feb 13, 2024 • 12

upvoted 2 papers about 1 year ago

MobileDiffusion: Subsecond Text-to-Image Generation on Mobile Devices

Paper • 2311.16567 • Published Nov 28, 2023 • 21

Proactive Detection of Voice Cloning with Localized Watermarking

Paper • 2401.17264 • Published Jan 30, 2024 • 18

upvoted a collection about 1 year ago

LLaVA-1.6

A collection of LLaVA-1.6 checkpoints • 4 items • Updated Jan 31, 2024 • 69

upvoted 2 papers about 1 year ago

YOLO-World: Real-Time Open-Vocabulary Object Detection

Paper • 2401.17270 • Published Jan 30, 2024 • 36

Pheme: Efficient and Conversational Speech Generation

Paper • 2401.02839 • Published Jan 5, 2024 • 18

upvoted 3 collections about 1 year ago

Trained Models 🏋️

They may be small, but they're training like giants! • 8 items • Updated Dec 3, 2024 • 17

🐍 Mamba fine-tuned models

A collection with ClibrAIn's Mamba fine-tuned models • 3 items • Updated Dec 18, 2023 • 11

LMM

1 item • Updated Dec 19, 2023 • 1

upvoted a paper about 1 year ago

G-LLaVA: Solving Geometric Problem with Multi-Modal Large Language Model

Paper • 2312.11370 • Published Dec 18, 2023 • 20

upvoted a collection about 1 year ago

Seamless Communication

A significant step towards removing language barriers through expressive, fast and high-quality AI translation. • 16 items • Updated Jan 16, 2024 • 153

upvoted 2 papers over 1 year ago

Kosmos-2.5: A Multimodal Literate Model

Paper • 2309.11419 • Published Sep 20, 2023 • 50

Scaling Up and Distilling Down: Language-Guided Robot Skill Acquisition

Paper • 2307.14535 • Published Jul 26, 2023 • 14