2

shijie

wodaxia

AI & ML interests

None yet

Recent Activity

upvoted an article about 1 month ago

The Engineering Handbook for GRPO + LoRA with Verl: Training Qwen2.5 on Multi-GPU

upvoted an article 7 months ago

Simplifying Alignment: From RLHF to Direct Preference Optimization (DPO)

Organizations

None yet

upvoted an article about 1 month ago

Article

The Engineering Handbook for GRPO + LoRA with Verl: Training Qwen2.5 on Multi-GPU

Weyaxi • Jan 2 • 23

upvoted an article 7 months ago

Article

Simplifying Alignment: From RLHF to Direct Preference Optimization (DPO)

ariG23498 • Jan 19, 2025 • 53