YzZ-George/DeepSpeed-Chat-OPT-1.3B-3-3-3datasets


license: apache-2.0

We train OPT-1.3B using three datasets: Dahoas/rm-static, Dahoas/full-hh-rlhf, and yitingxie/rlhf-reward-datasets.

Dahoas/synthetic-instruct-gptj-pairwise is not used because it has no test split.
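The selection rule above (keep only datasets that ship a test split) can be sketched as a small filter. This is an illustrative sketch, not the script used to train this model: the names `CANDIDATES`, `datasets_with_test_split`, and `list_splits` are hypothetical, and the code assumes the test split is literally named `test`.

```python
from typing import Callable, Iterable, List

# The four candidate datasets named in this README.
CANDIDATES = [
    "Dahoas/rm-static",
    "Dahoas/full-hh-rlhf",
    "yitingxie/rlhf-reward-datasets",
    "Dahoas/synthetic-instruct-gptj-pairwise",
]


def datasets_with_test_split(
    candidates: Iterable[str],
    list_splits: Callable[[str], Iterable[str]],
) -> List[str]:
    """Return the candidates whose split names include "test", in input order.

    `list_splits` is any callable that maps a dataset name to its split names
    (hypothetical helper parameter, injected so the filter needs no network).
    """
    return [name for name in candidates if "test" in list_splits(name)]
```

With the Hugging Face `datasets` library installed and network access available, `datasets.get_dataset_split_names` could be passed as `list_splits`, for example `datasets_with_test_split(CANDIDATES, get_dataset_split_names)`.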