Yihua Zhang

NormalUhr

AI & ML interests

None yet

Organizations

OPTML Group @ MSU

NormalUhr's activity

published an article 15 days ago

DualPipe Explained: A Comprehensive Guide to DualPipe That Anyone Can Understand—Even Without a Distributed Training Background

By NormalUhr
2
published an article about 1 month ago

Navigating the RLHF Landscape: From Policy Gradients to PPO, GAE, and DPO for LLM Alignment

By NormalUhr
13
published an article about 1 month ago

DeepSeek-R1 Dissection: Understanding PPO & GRPO Without Any Prior Reinforcement Learning Knowledge

By NormalUhr
71
published an article about 1 month ago

A Review on the Evolvement of Load Balancing Strategy in MoE LLMs: Pitfalls and Lessons

By NormalUhr
3
published an article about 1 month ago

From Zero to Reasoning Hero: How DeepSeek-R1 Leverages Reinforcement Learning to Master Complex Reasoning

By NormalUhr
12
published an article about 1 month ago

MLA: Redefining KV-Cache Through Low-Rank Projections and On-Demand Decompression

By NormalUhr
6
upvoted an article 6 months ago

Optimizing your LLM in production

16
New activity in OPTML-Group/UnlearnCanvas 9 months ago