DuoGuard: A Two-Player RL-Driven Framework for Multilingual LLM Guardrails • Paper 2502.05163 • Published Feb 7, 2025
PILAF: Optimal Human Preference Sampling for Reward Modeling • Paper 2502.04270 • Published Feb 6, 2025
Teaching Large Language Models to Reason with Reinforcement Learning • Paper 2403.04642 • Published Mar 7, 2024