
Yihua Zhang

NormalUhr

https://www.yihua-zhang.com



NormalUhr's activity

published an article 15 days ago

Article

DualPipe Explained: A Comprehensive Guide to DualPipe That Anyone Can Understand—Even Without a Distributed Training Background

15 days ago • 2

published an article about 1 month ago

Article

Navigating the RLHF Landscape: From Policy Gradients to PPO, GAE, and DPO for LLM Alignment

Feb 11 • 13

published an article about 1 month ago

Article

DeepSeek-R1 Dissection: Understanding PPO & GRPO Without Any Prior Reinforcement Learning Knowledge

Feb 7 • 71

published an article about 1 month ago

Article

A Review on the Evolvement of Load Balancing Strategy in MoE LLMs: Pitfalls and Lessons

Feb 4 • 3

published an article about 1 month ago

Article

From Zero to Reasoning Hero: How DeepSeek-R1 Leverages Reinforcement Learning to Master Complex Reasoning

Feb 4 • 12

published an article about 1 month ago

Article

MLA: Redefining KV-Cache Through Low-Rank Projections and On-Demand Decompression

Feb 4 • 6

upvoted an article 6 months ago

Article

Optimizing your LLM in production

Sep 15, 2023 • 16

New activity in OPTML-Group/UnlearnCanvas 9 months ago

NonMatchingSplitsSizeError

#2 opened 10 months ago by yuyang-xue-ed

authored a paper about 1 year ago

UnlearnCanvas: A Stylized Image Dataset to Benchmark Machine Unlearning for Diffusion Models

Paper • 2402.11846 • Published Feb 19, 2024 • 1

updated a dataset about 1 year ago

OPTML-Group/UnlearnCanvas

Viewer • Updated Mar 6, 2024 • 1.76k • 4.13k • 2

liked a dataset about 1 year ago

OPTML-Group/UnlearnCanvas

Viewer • Updated Mar 6, 2024 • 1.76k • 4.13k • 2

liked a Space about 1 year ago

UnlearnCanvas Benchmark

🎨

Filter and compare unlearning methods for benchmarking

liked a Space almost 2 years ago

4.82k

MusicGen

🎵

Generate music from text and melody descriptions