The output of the reward model is a two-dimensional vector. What does each dimension mean?
#3 opened 5 months ago by Lily912
More details on training data for reward model
#2 opened 7 months ago by reign12
Where is the input file for augment_oasst?
#1 opened 9 months ago by LetsJumP