Should we use the 5th dimension of the output only?

by liangqxx - opened Oct 18, 2024

Oct 18, 2024

Hi Zhilin,

As the paper mentions, this reward model was trained only on the helpfulness attribute of HelpSteer2. Should we therefore use only the 5th dimension of the output?

Thank you!

NVIDIA org Oct 19, 2024

Yes, this is correct (the 5th dimension means index 4, since indexing starts at zero).
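To make the indexing concrete, here is a minimal sketch of picking the 5th dimension from an output vector. It assumes the model output has already been converted to a plain Python sequence of floats. The function name `helpfulness_score` and the example values are hypothetical and are not part of the model's API.

```python
def helpfulness_score(output):
    """Return the 5th dimension (zero-based index 4) of an output vector.

    Hypothetical helper: `output` is assumed to be a sequence of floats.
    """
    if len(output) < 5:
        raise ValueError("expected at least 5 output dimensions")
    return output[4]


# Made-up values for illustration only, not real model outputs.
example_output = [0.1, 0.2, 0.3, 0.4, 0.5]
score = helpfulness_score(example_output)
print(score)
```

The same zero-based index applies if the output is a tensor or array instead of a list.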
