Loading the model with DeepSpeed for fine-tuning exhausts memory and fails with the error: Step 1 exited with non-zero status 247 exits with return code = -9
#3 opened 8 days ago
by
x-lin
Adding `safetensors` variant of this model
#2 opened 3 months ago
by
SFconvertbot

About token trimming
1
#1 opened 3 months ago
by
YeungNLP