OpenAssistant/reward-model-deberta-v3-large-v2

Text Classification

What score is high quality?

#11 opened about 1 month ago by

Hyperparameter training settings

#10 opened about 1 year ago by

synthetic-instruct-gptj-pairwise: how to pre-process the pairwise data for training?

#9 opened about 1 year ago by

How to fine tune this model with the Trainer API?

#8 opened about 1 year ago by

How to score a <instruction, input, output> pair?

#7 opened about 1 year ago by

Validation split indices?

#6 opened over 1 year ago by

np.int deprecation issue

#5 opened over 1 year ago by

Question about evaluating this reward model on Anthropic/hh-rlhf

#4 opened over 1 year ago by

Adding `safetensors` variant of this model

#3 opened almost 2 years ago by

How to optimize loss function?

#1 opened almost 2 years ago by