Weihao Yu

whyu

https://scholar.google.com/citations?user=LYxjt1QAAAAJ

AI & ML interests

Computer Vision, NLP and AI

Recent Activity

updated a Space 4 months ago

whyu/MambaOut

whyu's activity

upvoted a paper 28 days ago

Long-Context Autoregressive Video Modeling with Next-Frame Prediction

Paper • 2503.19325 • Published 29 days ago • 72

upvoted a paper about 1 month ago

Painting with Words: Elevating Detailed Image Captioning with Benchmark and Alignment Learning

Paper • 2503.07906 • Published Mar 10 • 4

upvoted 2 papers 5 months ago

ROICtrl: Boosting Instance Control for Visual Generation

Paper • 2411.17949 • Published Nov 27, 2024 • 88

OminiControl: Minimal and Universal Control for Diffusion Transformer

Paper • 2411.15098 • Published Nov 22, 2024 • 60

upvoted a paper 6 months ago

TemporalBench: Benchmarking Fine-grained Temporal Understanding for Multimodal Video Models

Paper • 2410.10818 • Published Oct 14, 2024 • 17

upvoted 4 papers 7 months ago

Differential Transformer

Paper • 2410.05258 • Published Oct 7, 2024 • 178

Attention Prompting on Image for Large Vision-Language Models

Paper • 2409.17143 • Published Sep 25, 2024 • 7

FlashSplat: 2D to 3D Gaussian Splatting Segmentation Solved Optimally

Paper • 2409.08270 • Published Sep 12, 2024 • 12

Gated Slot Attention for Efficient Linear-Time Sequence Modeling

Paper • 2409.07146 • Published Sep 11, 2024 • 21

upvoted a paper 8 months ago

LinFusion: 1 GPU, 1 Minute, 16K Image

Paper • 2409.02097 • Published Sep 3, 2024 • 35

upvoted 2 papers 9 months ago

MM-Vet v2: A Challenging Benchmark to Evaluate Large Multimodal Models for Integrated Capabilities

Paper • 2408.00765 • Published Aug 1, 2024 • 14

KAN or MLP: A Fairer Comparison

Paper • 2407.16674 • Published Jul 23, 2024 • 44

upvoted 2 papers 10 months ago

Compositional Video Generation as Flow Equalization

Paper • 2407.06182 • Published Jun 10, 2024 • 14

Video-Infinity: Distributed Long Video Generation

Paper • 2406.16260 • Published Jun 24, 2024 • 30

upvoted a paper over 1 year ago

Exponentially Faster Language Modelling

Paper • 2311.10770 • Published Nov 15, 2023 • 119