shijie
wodaxia
AI & ML interests
None yet
Recent Activity
upvoted an article about 1 month ago
The Engineering Handbook for GRPO + LoRA with Verl: Training Qwen2.5 on Multi-GPU
upvoted an article 7 months ago
Simplifying Alignment: From RLHF to Direct Preference Optimization (DPO)
Organizations
None yet