The output of the reward model is a two-dimensional vector. What does each dimension mean?
#3 opened 6 months ago by Lily912
More details on training data for reward model
#2 opened 8 months ago by reign12
Where is the input file for augment_oasst?
#1 opened 9 months ago by LetsJumP