fireballoon
fireballoon's activity
Adding `safetensors` variant of this model
#5 opened 2 months ago
by
SFconvertbot
Is it normal for the loss to fluctuate this much? At what point during the 3 epochs does the loss start to decrease and stay stable?
3
#13 opened over 1 year ago
by
Aibet
How are the DeepSpeed ZeRO-3 parameters configured?
1
#14 opened over 1 year ago
by
Aibet
Could you provide working LeetCode data or the processing code? Thanks
2
#8 opened over 1 year ago
by
Aibet
Could you provide the training code?
11
#5 opened over 1 year ago
by
puppet1988
Is there code for running the benchmarks?
3
#7 opened over 1 year ago
by
endNone
How to fix "ValueError: Tokenizer class LlamaTokenizer does not exist or is not currently imported." when tokenizer_config.json already sets "tokenizer_class": "LlamaTokenizer"?
2
#6 opened over 1 year ago
by
lishuangxiu-nuannuan
Loss is 0 during training
1
#5 opened over 1 year ago
by
deerluffy
Is Vicuna-v1.3 not supported?
1
#4 opened over 1 year ago
by
acupofespresso
The problem of pad_token
1
#10 opened over 1 year ago
by
kang1
About baichuan-13b model conversion
1
#4 opened over 1 year ago
by
greatzane
Baichuan-13B please!
2
#3 opened over 1 year ago
by
greatzane
The model performs better than expected, great work!!
1
#8 opened over 1 year ago
by
oscar325
Which data was used for this SFT, and roughly how much in total?
1
#7 opened over 1 year ago
by
Kuaixueshiqing
Has it been verified that the converted LLaMA-format weights can be used for SFT?
4
#1 opened over 1 year ago
by
nnz
About the model's Chinese-language performance
1
#6 opened over 1 year ago
by
reedhs
Tends to answer in English, even when explicitly prompted to use Chinese
1
#2 opened over 1 year ago
by
huashiyiqike
Fast tokenizer issue
4
#3 opened over 1 year ago
by
JaheimLee