How to train the model

by mike2000 - opened Mar 26

mike2000

Mar 26

Hello, may I ask how you trained the model? I tried using DeepSpeed-Chat to train gpt2-large as a reward model, but the accuracy is only about 67%.
Could you share some details about the model structure and data format?

Ray2333

Owner Mar 26

Hello, based on my experience, there are two things you can try: train separate reward models for the harmlessness data and the helpfulness data, and use LoRA instead of full fine-tuning.
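To make the first suggestion concrete, here is a minimal sketch in plain Python. It assumes each preference pair carries a hypothetical `subset` field ("harmless" or "helpful") and that the accuracy in question is the usual pairwise accuracy, i.e. the fraction of pairs where the chosen response scores higher than the rejected one. The field names, the toy scores, and the helper names are illustrative assumptions, not the actual training code or data format used for this model, and the sketch does not cover the LoRA part.

```python
import math
from collections import defaultdict


def pairwise_loss(r_chosen, r_rejected):
    """Pairwise ranking loss: -log(sigmoid(r_chosen - r_rejected))."""
    margin = r_chosen - r_rejected
    # Numerically stable form of log(1 + exp(-margin)).
    return max(-margin, 0.0) + math.log1p(math.exp(-abs(margin)))


def pairwise_accuracy(score_pairs):
    """Fraction of (chosen, rejected) score pairs ranked correctly."""
    if not score_pairs:
        return 0.0
    correct = sum(1 for chosen, rejected in score_pairs if chosen > rejected)
    return correct / len(score_pairs)


def split_by_subset(examples):
    """Group preference pairs by their (hypothetical) 'subset' field."""
    groups = defaultdict(list)
    for example in examples:
        groups[example["subset"]].append(example)
    return dict(groups)


# Toy data with made-up reward scores, for illustration only.
examples = [
    {"subset": "harmless", "chosen_score": 1.2, "rejected_score": 0.3},
    {"subset": "harmless", "chosen_score": -0.5, "rejected_score": 0.1},
    {"subset": "helpful", "chosen_score": 0.9, "rejected_score": -0.4},
]

# Evaluate each subset separately, mirroring one reward model per subset.
for name, group in split_by_subset(examples).items():
    pairs = [(ex["chosen_score"], ex["rejected_score"]) for ex in group]
    mean_loss = sum(pairwise_loss(c, r) for c, r in pairs) / len(pairs)
    print(name, "accuracy:", pairwise_accuracy(pairs), "loss:", round(mean_loss, 4))
```

Reporting accuracy per subset like this also shows whether a low overall number comes from one subset dragging down the other.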
