Hyperparameters?
#23, opened by ekurtulus
What are the dataset size and the PPO hyperparameters?
The dataset is here: https://huggingface.co/datasets/berkeley-nest/Nectar. It has 183K prompts with 7 responses each. The PPO hyperparameters are similar to those in the trlx repo (https://github.com/CarperAI/trlx), except that we changed the learning rate to 1e-7. We'll open-source the paper and code base soon!
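For anyone who wants the numbers in one place, here is a rough sketch of what the reply implies. The 183K figure is rounded, so the total is approximate. The `with_learning_rate` helper and the `learning_rate` key are hypothetical, made up for illustration; they are not trlx's actual config layout, so check the trlx repo for the real field names and defaults.

```python
# Figures quoted in the reply (183K is rounded, so the total is approximate).
NUM_PROMPTS = 183_000
RESPONSES_PER_PROMPT = 7
LEARNING_RATE = 1e-7  # the one PPO setting the reply says was changed

# Approximate number of responses in the dataset.
total_responses = NUM_PROMPTS * RESPONSES_PER_PROMPT


def with_learning_rate(config: dict, lr: float = LEARNING_RATE) -> dict:
    """Return a copy of a config dict with only the learning rate replaced.

    Hypothetical helper: the flat "learning_rate" key is an assumption,
    not the real trlx config structure.
    """
    updated = dict(config)  # shallow copy so the caller's dict is untouched
    updated["learning_rate"] = lr
    return updated


if __name__ == "__main__":
    print(f"~{total_responses:,} responses in total")
```

So that is roughly 1.28M responses across the dataset, with every other PPO setting left at whatever your trlx baseline uses.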