How to train the model
#1, opened by mike2000
Hello, may I ask how you trained the model? I tried using deepspeed-chat to train gpt2-large as a reward model, but the accuracy is only about 67%.
Could you share details about the model structure and the data format?
Hello, based on my experience, there are two things you can try: train separate reward models for harmlessness and helpfulness instead of one model on the mixed data, and use LoRA instead of full fine-tuning.
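To make the first suggestion concrete, here is a minimal sketch of how one might split preference pairs by type and check accuracy on each subset separately. It assumes the usual pairwise setup, where "accuracy" means the fraction of pairs in which the chosen response scores higher than the rejected one. The `subset` field, the scores, and all function names are made up for illustration; they are not taken from this model or from deepspeed-chat.

```python
import math
from collections import defaultdict

# Hypothetical preference pairs. The "subset" tag and the reward scores
# are invented for illustration, not real model outputs.
pairs = [
    {"subset": "helpful", "chosen_score": 1.2, "rejected_score": 0.3},
    {"subset": "helpful", "chosen_score": -0.1, "rejected_score": 0.4},
    {"subset": "harmless", "chosen_score": 0.8, "rejected_score": -0.5},
    {"subset": "harmless", "chosen_score": 0.2, "rejected_score": 0.1},
]


def pairwise_loss(chosen, rejected):
    """Pairwise ranking loss: -log(sigmoid(chosen - rejected))."""
    diff = chosen - rejected
    # Numerically stable softplus(-diff).
    return math.log1p(math.exp(-abs(diff))) + max(-diff, 0.0)


def split_by_subset(items):
    """Group pairs by their (assumed) subset tag, e.g. helpful vs. harmless."""
    groups = defaultdict(list)
    for item in items:
        groups[item["subset"]].append(item)
    return dict(groups)


def pairwise_accuracy(items):
    """Fraction of pairs where the chosen response outscores the rejected one."""
    correct = sum(p["chosen_score"] > p["rejected_score"] for p in items)
    return correct / len(items)


if __name__ == "__main__":
    for name, group in split_by_subset(pairs).items():
        mean_loss = sum(
            pairwise_loss(p["chosen_score"], p["rejected_score"]) for p in group
        ) / len(group)
        print(name, "accuracy:", pairwise_accuracy(group), "loss:", round(mean_loss, 4))
```

Looking at the two subsets separately can show whether a low overall number comes mostly from one kind of data, which is one reason training them independently may help.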