Hanchi Sun
MasterGodzilla
AI & ML interests
None yet
Recent Activity
published a model 6 days ago: MasterGodzilla/Qwen2.5-0.5B-Open-R1-GRPO
published a model 6 days ago: MasterGodzilla/Qwen2.5-1.5B-Open-R1-GRPO
upvoted a paper 3 months ago: HelpSteer2-Preference: Complementing Ratings with Preferences
Organizations
None yet
MasterGodzilla's activity
Why not use the Plackett-Luce Model version of DPO when K=4 ranked responses are present?
#18 opened over 1 year ago by MasterGodzilla