Wei Liu

PeterV09

https://vpeterv.github.io

AI & ML interests

Machine Learning, Natural Language Processing

Recent Activity

updated a model 8 days ago

hkustnlpvlm/qwenvl-warmup-ckpt445-public

published a model 8 days ago

hkustnlpvlm/qwenvl-warmup-ckpt445-public

PeterV09's activity

upvoted a paper 5 days ago

Harnessing the Reasoning Economy: A Survey of Efficient Reasoning for Large Language Models

Paper • 2503.24377 • Published 6 days ago • 17

upvoted 3 collections 12 days ago

M-STAR

Resources of M-STAR (Multimodal Self-Evolving Training for Reasoning) https://mstar-lmm.github.io/ • 2 items • Updated Dec 25, 2024 • 4

SimpleRL

The collection for the Project "Simple Reinforcement Learning for Reasoning" • 2 items • Updated Feb 19 • 6

SimpleRL-Zoo

The collection for the Paper "SimpleRL-Zoo: Investigating and Taming Zero Reinforcement Learning for Open Base Models in the Wild" • 12 items • Updated 5 days ago • 6

upvoted a paper 13 days ago

SimpleRL-Zoo: Investigating and Taming Zero Reinforcement Learning for Open Base Models in the Wild

Paper • 2503.18892 • Published 13 days ago • 28

upvoted 3 papers about 1 month ago

START: Self-taught Reasoner with Tools

Paper • 2503.04625 • Published Mar 6 • 105

Language Models can Self-Improve at State-Value Estimation for Better Search

Paper • 2503.02878 • Published Mar 4 • 9

Predictive Data Selection: The Data That Predicts Is the Data That Teaches

Paper • 2503.00808 • Published Mar 2 • 57

upvoted 3 papers about 2 months ago

MoM: Linear Sequence Modeling with Mixture-of-Memories

Paper • 2502.13685 • Published Feb 19 • 34

LASP-2: Rethinking Sequence Parallelism for Linear Attention and Its Hybrid

Paper • 2502.07563 • Published Feb 11 • 24

CodeI/O: Condensing Reasoning Patterns via Code Input-Output Prediction

Paper • 2502.07316 • Published Feb 11 • 48

upvoted a paper 2 months ago

Test-Time Preference Optimization: On-the-Fly Alignment via Iterative Textual Feedback

Paper • 2501.12895 • Published Jan 22 • 60

upvoted 7 papers 3 months ago

Sa2VA: Marrying SAM2 with LLaVA for Dense Grounded Understanding of Images and Videos

Paper • 2501.04001 • Published Jan 7 • 47

Cosmos World Foundation Model Platform for Physical AI

Paper • 2501.03575 • Published Jan 7 • 78

REINFORCE++: A Simple and Efficient Approach for Aligning Large Language Models

Paper • 2501.03262 • Published Jan 4 • 99

PRMBench: A Fine-grained and Challenging Benchmark for Process-Level Reward Models

Paper • 2501.03124 • Published Jan 6 • 14

OS-Genesis: Automating GUI Agent Trajectory Construction via Reverse Task Synthesis

Paper • 2412.19723 • Published Dec 27, 2024 • 89

Diving into Self-Evolving Training for Multimodal Reasoning

Paper • 2412.17451 • Published Dec 23, 2024 • 44

B-STaR: Monitoring and Balancing Exploration and Exploitation in Self-Taught Reasoners

Paper • 2412.17256 • Published Dec 23, 2024 • 48

upvoted a paper 6 months ago

CLIP-MoE: Towards Building Mixture of Experts for CLIP with Diversified Multiplet Upcycling

Paper • 2409.19291 • Published Sep 28, 2024 • 19