Shuai Wang

Shuaiii

AI & ML interests

None yet

Organizations

None yet

Shuaiii's activity

upvoted a paper 1 day ago

MMMU-Pro: A More Robust Multi-discipline Multimodal Understanding Benchmark

Paper • 2409.02813 • Published Sep 4, 2024 • 30

upvoted 2 papers 5 days ago

CoRe^2: Collect, Reflect and Refine to Generate Better and Faster

Paper • 2503.09662 • Published 7 days ago • 29

Transformers without Normalization

Paper • 2503.10622 • Published 6 days ago • 126

upvoted a paper 7 days ago

MagicInfinite: Generating Infinite Talking Videos with Your Words and Voice

Paper • 2503.05978 • Published 12 days ago • 32

upvoted a paper 12 days ago

START: Self-taught Reasoner with Tools

Paper • 2503.04625 • Published 13 days ago • 96

upvoted a paper 26 days ago

How Much Knowledge Can You Pack into a LoRA Adapter without Harming LLM?

Paper • 2502.14502 • Published 27 days ago • 85

upvoted a paper 27 days ago

Qwen2.5-VL Technical Report

Paper • 2502.13923 • Published 28 days ago • 166

upvoted a paper 29 days ago

Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention

Paper • 2502.11089 • Published about 1 month ago • 145

upvoted a paper 30 days ago

Step-Video-T2V Technical Report: The Practice, Challenges, and Future of Video Foundation Model

Paper • 2502.10248 • Published Feb 14 • 51

upvoted a paper about 1 month ago

Analyze Feature Flow to Enhance Interpretation and Steering in Language Models

Paper • 2502.03032 • Published Feb 5 • 58

upvoted a paper about 2 months ago

SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training

Paper • 2501.17161 • Published Jan 28 • 109

upvoted a collection about 2 months ago

Qwen2.5-VL

Collection

Vision-language model series based on Qwen2.5 • 8 items • Updated 23 days ago • 400

upvoted 4 papers about 2 months ago

Humanity's Last Exam

Paper • 2501.14249 • Published Jan 24 • 67

Kimi k1.5: Scaling Reinforcement Learning with LLMs

Paper • 2501.12599 • Published Jan 22 • 104

Test-Time Preference Optimization: On-the-Fly Alignment via Iterative Textual Feedback

Paper • 2501.12895 • Published Jan 22 • 57

DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning

Paper • 2501.12948 • Published Jan 22 • 352

upvoted 2 papers 2 months ago

Inference-Time Scaling for Diffusion Models beyond Scaling Denoising Steps

Paper • 2501.09732 • Published Jan 16 • 70

2.5 Years in Class: A Multimodal Textbook for Vision-Language Pretraining

Paper • 2501.00958 • Published Jan 1 • 100

liked 2 models 3 months ago

Qwen/QVQ-72B-Preview

Image-Text-to-Text • Updated Jan 12 • 172k • 566

deepseek-ai/DeepSeek-V3-Base

Updated 23 days ago • 623k • 1.6k