Fix RMSNorm weight initialization bug.

#59
by Shan1990 - opened

Use torch.ones to initialize the RMSNorm weight. torch.empty returns an uninitialized tensor with arbitrary values, which may fall outside the valid float range.
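For context, here is a minimal sketch of the kind of change this PR describes, assuming an RMSNorm module shaped like the one in the ChatGLM modeling code; the class layout and parameter names are illustrative, not the exact diff:

```python
import torch
from torch import nn


class RMSNorm(nn.Module):
    def __init__(self, hidden_size, eps=1e-5, device=None, dtype=None):
        super().__init__()
        # Before the fix: torch.empty leaves the weight uninitialized, so it
        # can hold arbitrary (even non-finite) values straight from memory.
        # self.weight = nn.Parameter(torch.empty(hidden_size, device=device, dtype=dtype))
        #
        # After the fix: initialize to ones, the identity scale for RMSNorm,
        # so a freshly constructed model is usable for training.
        self.weight = nn.Parameter(torch.ones(hidden_size, device=device, dtype=dtype))
        self.eps = eps

    def forward(self, hidden_states):
        # Standard RMSNorm: scale x by 1/rms(x), computed in float32 for
        # numerical stability, then apply the learned per-channel weight.
        input_dtype = hidden_states.dtype
        variance = hidden_states.to(torch.float32).pow(2).mean(-1, keepdim=True)
        hidden_states = hidden_states * torch.rsqrt(variance + self.eps)
        return (self.weight * hidden_states).to(input_dtype)
```

The torch.empty version is only safe when pretrained weights are loaded over the parameter immediately afterwards; with torch.ones the module also behaves sensibly when constructed from scratch.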

@chielo pls review this pr, thx.

@Shan1990 Well, I don't belong to the organization, nor do I have permission to merge.
But this PR looks good, and it makes it friendlier to initialize chatglm3 from scratch for training, etc.
It would be better to have this fix.

@zRzRzRzRzRzRzR pls review this pr, thx.


check now

zRzRzRzRzRzRzR changed pull request status to merged
