Has this released checkpoint already been fine-tuned following the three steps outlined in the InstructGPT paper?

by Eamymao - opened Feb 20, 2023

Feb 20, 2023

The README says this model is fine-tuned on webgpt and prompt_dialogue (version v2), but it doesn't describe the fine-tuning procedure in any detail. So it's unclear whether this model has gone through the RLHF steps from InstructGPT, or what the fine-tuning process actually was. Does anyone know more about this?
