I want to add a programming language to santacoder using the pretrain_gpt_1B_santacoder.sh script file
#34, opened by iawen
I want to add a programming language to santacoder using the pretrain_gpt_1B_santacoder.sh script file, but the results are not ideal.
How far does the loss need to drop before the results get better? Could you provide some information or guidance? Thank you.
Also, how do I convert model_optim_rng.pt files to pytorch_model.bin? Using transformers and Megatron-LM, I get the error: KeyError: 'self_attention.query'
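To help narrow down the KeyError, here is a minimal sketch of how the parameter names stored in the checkpoint could be listed, so they can be compared with the names the conversion code expects. It assumes the checkpoint loads as a nested dict. The helpers flatten_keys and find_attention_keys are hypothetical names of mine, and the key layout in toy_checkpoint is only an illustrative stand-in, not the real layout of model_optim_rng.pt.

```python
from collections.abc import Mapping


def flatten_keys(obj, prefix=""):
    """Yield a dotted path for every leaf value in a nested mapping."""
    if isinstance(obj, Mapping):
        for key, value in obj.items():
            path = f"{prefix}.{key}" if prefix else str(key)
            yield from flatten_keys(value, path)
    else:
        yield prefix


def find_attention_keys(checkpoint, needle="self_attention"):
    """Return the sorted key paths that contain `needle`."""
    return sorted(k for k in flatten_keys(checkpoint) if needle in k)


# Toy stand-in for a loaded checkpoint (ASSUMED layout, for illustration only).
toy_checkpoint = {
    "iteration": 100,
    "model": {
        "language_model": {
            "transformer": {
                "layers.0.self_attention.query.weight": 0,
                "layers.0.self_attention.key_value.weight": 0,
                "layers.0.mlp.dense_h_to_4h.weight": 0,
            }
        }
    },
}

for name in find_attention_keys(toy_checkpoint):
    print(name)
```

With the real file, toy_checkpoint would be replaced by the result of torch.load("model_optim_rng.pt", map_location="cpu"). If the listing shows separate query and key_value entries while the converter only knows a fused name, that mismatch would be one possible explanation for the error.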