Fix RMSNorm weight initialization bug.

#59
by Shan1990 - opened

Use torch.ones to initialize the RMSNorm weight. torch.empty returns an uninitialized tensor with arbitrary values, which may fall outside the valid float range.
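For context, here is a minimal sketch of the kind of change this PR describes, assuming an RMSNorm module shaped like the one in the ChatGLM modeling code; the class layout and parameter names are illustrative, not the exact diff:

```python
import torch
from torch import nn


class RMSNorm(nn.Module):
    def __init__(self, hidden_size, eps=1e-5, device=None, dtype=None):
        super().__init__()
        # Before the fix: torch.empty leaves the weight uninitialized, so it
        # can hold arbitrary (even non-finite) values straight from memory.
        # self.weight = nn.Parameter(torch.empty(hidden_size, device=device, dtype=dtype))
        #
        # After the fix: initialize to ones, the identity scale for RMSNorm,
        # so a freshly constructed model is usable for training.
        self.weight = nn.Parameter(torch.ones(hidden_size, device=device, dtype=dtype))
        self.eps = eps

    def forward(self, hidden_states):
        # Standard RMSNorm: scale x by 1/rms(x), computed in float32 for
        # numerical stability, then apply the learned per-channel weight.
        input_dtype = hidden_states.dtype
        variance = hidden_states.to(torch.float32).pow(2).mean(-1, keepdim=True)
        hidden_states = hidden_states * torch.rsqrt(variance + self.eps)
        return (self.weight * hidden_states).to(input_dtype)
```

The torch.empty version is only safe when pretrained weights are loaded over the parameter immediately afterwards; with torch.ones the module also behaves sensibly when constructed from scratch.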

@chielo pls review this pr, thx.

@Shan1990 Well, I don't belong to the organization, nor do I have permission to merge.
But this PR looks good, and it makes it friendlier to initialize chatglm3 from scratch for training, etc.
It would be better to have this fix.

@zRzRzRzRzRzRzR pls review this pr, thx.


check now

zRzRzRzRzRzRzR changed pull request status to merged
