Pichao Wang

wangpichao

AI & ML interests

None yet

Recent Activity

upvoted a paper 2 days ago

Hourglass Tokenizer for Efficient Transformer-Based 3D Human Pose Estimation

upvoted a paper 2 days ago

The Amazon Nova Family of Models: Technical Report and Model Card

upvoted a paper 2 days ago

H_{2}OT: Hierarchical Hourglass Tokenizer for Efficient Video Pose Transformers

Organizations

None yet

upvoted 17 papers 2 days ago

Do Audio LLMs Really LISTEN, or Just Transcribe? Measuring Lexical vs. Acoustic Emotion Cues Reliance

Paper • 2510.10444 • Published Oct 17, 2025 • 1

Ctrl&Shift: High-Quality Geometry-Aware Object Manipulation in Visual Generation

Paper • 2602.11440 • Published Feb 11 • 1

Factorized Visual Tokenization and Generation

Paper • 2411.16681 • Published Nov 25, 2024 • 20

Revisit Parameter-Efficient Transfer Learning: A Two-Stage Paradigm

Paper • 2303.07910 • Published Mar 14, 2023 • 2

Making Vision Transformers Efficient from A Token Sparsification View

Paper • 2303.08685 • Published Mar 15, 2023 • 1

Audio-Enhanced Text-to-Video Retrieval using Text-Conditioned Feature Alignment

Paper • 2307.12964 • Published Jul 24, 2023 • 1

TransReID: Transformer-based Object Re-Identification

Paper • 2102.04378 • Published Feb 8, 2021 • 1

Revisiting Vision Transformer from the View of Path Ensemble

Paper • 2308.06548 • Published Aug 12, 2023 • 1

Hallucination of Multimodal Large Language Models: A Survey

Paper • 2404.18930 • Published Apr 29, 2024 • 1

FlexDiT: Dynamic Token Density Control for Diffusion Transformer

Paper • 2412.06028 • Published Dec 8, 2024 • 1

Selective Structured State-Spaces for Long-Form Video Understanding

Paper • 2303.14526 • Published Mar 25, 2023 • 1

Social Structure Matters in 3D Human-Human Interaction Generation

Paper • 2606.24255 • Published 10 days ago • 1

SCT: A Simple Baseline for Parameter-Efficient Fine-Tuning via Salient Channels

Paper • 2309.08513 • Published Sep 15, 2023 • 3

One Token to Seg Them All: Language Instructed Reasoning Segmentation in Videos

Paper • 2409.19603 • Published Sep 29, 2024 • 19

upvoted a paper over 1 year ago

Impossible Videos

Paper • 2503.14378 • Published Mar 18, 2025 • 61

authored a paper over 1 year ago

Factorized Visual Tokenization and Generation

Paper • 2411.16681 • Published Nov 25, 2024 • 20

authored a paper almost 2 years ago

One Token to Seg Them All: Language Instructed Reasoning Segmentation in Videos

Paper • 2409.19603 • Published Sep 29, 2024 • 19
