How to train the model

#1
by mike2000 - opened

Hello, may I ask how you trained the model? I tried using DeepSpeed-Chat to train gpt2-large as a reward model, but the accuracy is only about 67%.
Could you share details about the model structure and data format?

Owner

Hello, based on my experience, there are two things you can try: train separate reward models for harmlessness and helpfulness, and use LoRA instead of full fine-tuning.
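
In case it helps, here is a rough sketch of the first suggestion, not the exact pipeline used for this model. It assumes each preference pair carries a `category` tag (a hypothetical field name; your dataset may label this differently), splits the pairs into one subset per category, and shows the standard pairwise ranking loss and the accuracy metric usually reported for reward models. The scores are made-up numbers standing in for reward model outputs.

```python
import math
from collections import defaultdict


def split_by_category(pairs, key="category"):
    """Group preference pairs by a category tag (hypothetical field name)."""
    groups = defaultdict(list)
    for pair in pairs:
        groups[pair[key]].append(pair)
    return dict(groups)


def pairwise_loss(chosen_score, rejected_score):
    """Pairwise ranking loss: -log(sigmoid(chosen - rejected)), computed stably."""
    diff = chosen_score - rejected_score
    if diff >= 0:
        return math.log1p(math.exp(-diff))
    return -diff + math.log1p(math.exp(diff))


def pairwise_accuracy(score_pairs):
    """Fraction of (chosen, rejected) score pairs where chosen scores higher."""
    correct = sum(1 for chosen, rejected in score_pairs if chosen > rejected)
    return correct / len(score_pairs)


# Toy preference pairs; the texts and tags are illustrative only.
pairs = [
    {"category": "helpful", "chosen": "good answer A", "rejected": "bad answer A"},
    {"category": "harmless", "chosen": "safe answer B", "rejected": "unsafe answer B"},
    {"category": "helpful", "chosen": "good answer C", "rejected": "bad answer C"},
]

# One subset per category, so each can train its own reward model.
subsets = split_by_category(pairs)

# Made-up (chosen, rejected) scores standing in for reward model outputs.
toy_scores = [(1.2, 0.3), (0.1, 0.4), (2.0, 1.5)]
accuracy = pairwise_accuracy(toy_scores)
mean_loss = sum(pairwise_loss(c, r) for c, r in toy_scores) / len(toy_scores)
```

Each subset would then be used to train its own reward model. For the second suggestion, the LoRA adapters could come from a library such as PEFT, but the rank and target modules are something you would need to tune for your setup.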
