Shaobai Jiang
shaobaij
AI & ML interests
None yet
Recent Activity
upvoted a paper 17 days ago
Test-Time Preference Optimization: On-the-Fly Alignment via Iterative Textual Feedback
upvoted a paper 21 days ago
START: Self-taught Reasoner with Tools
upvoted a paper 21 days ago
Light-R1: Curriculum SFT, DPO and RL for Long COT from Scratch and Beyond
Organizations
None yet
models
None public yet
datasets
None public yet