I want to add a programming language to santacoder using the pretrain_gpt_1B_santacoder.sh script file

#34
by iawen - opened

I want to add a programming language to santacoder using the pretrain_gpt_1B_santacoder.sh script file, but the results are not ideal.
How far does the loss need to drop before the results improve? Can you provide some information or guidance? Thank you.

Also, how do I convert the model_optim_rng.pt files to pytorch_model.bin? I am using transformers and Megatron-LM, and I get the error: KeyError: 'self_attention.query'
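
One way to narrow down a KeyError like this is to list the parameter names the checkpoint actually contains and compare them with the names the conversion script expects. The sketch below is only an illustration: it assumes the checkpoint loads as a nested dict, and the helper names (`flatten_keys`, `find_keys`) and the parameter names in `fake_checkpoint` are hypothetical stand-ins, not the real layout of a Megatron-LM checkpoint.

```python
from collections.abc import Mapping


def flatten_keys(tree, prefix=""):
    """Return dotted paths to every non-mapping leaf in a nested mapping."""
    paths = []
    for key, value in tree.items():
        path = f"{prefix}.{key}" if prefix else str(key)
        if isinstance(value, Mapping):
            # Descend into nested dicts, extending the dotted path.
            paths.extend(flatten_keys(value, path))
        else:
            paths.append(path)
    return paths


def find_keys(tree, needle):
    """Return the flattened paths that contain the substring `needle`."""
    return [p for p in flatten_keys(tree) if needle in p]


# Hypothetical stand-in for a loaded checkpoint. A real one would come from
# something like torch.load("model_optim_rng.pt", map_location="cpu") and
# would hold tensors instead of None; the names below are assumptions.
fake_checkpoint = {
    "model": {
        "language_model": {
            "encoder": {
                "layers.0.self_attention.query.weight": None,
                "layers.0.self_attention.key_value.weight": None,
                "layers.0.self_attention.dense.weight": None,
            }
        }
    }
}

# Substring match, so this would also catch names such as "query_key_value".
matches = find_keys(fake_checkpoint, "self_attention.query")
print(matches)
```

If the real checkpoint turns out to hold separate query and key/value weights, a conversion script written for a fused query/key/value layout may not know how to map them, which would be one possible cause of the error above.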
