li daner (daner)
AI & ML interests
None yet
Recent Activity
upvoted a paper about 2 months ago: The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits
Organizations
None yet
daner's activity
Your request to access this repo has been rejected by the repo's authors.
1
#17 opened 4 months ago by daner
At this point, hyperparameters such as the learning rate and warmup should not be reduced; instead, they should be scaled up to pretraining scale.
3
#2 opened over 1 year ago by daner