arxiv:2403.13684
whj363636
AI & ML interests
None yet
Recent Activity
upvoted a paper 26 days ago
Uni-OPD: Unifying On-Policy Distillation with a Dual-Perspective Recipe
upvoted a paper 3 months ago
MHPO: Modulated Hazard-aware Policy Optimization for Stable Reinforcement Learning