Skywork-Reward: Bag of Tricks for Reward Modeling in LLMs
Paper • 2410.18451
Skywork reward model series
Note: A new version of our 27B reward model, trained on Skywork-Reward-Preference-80K-v0.2, the decontaminated version of Skywork-Reward-Preference-80K-v0.1.
Note: A new version of our 8B reward model, trained on Skywork-Reward-Preference-80K-v0.2, the decontaminated version of Skywork-Reward-Preference-80K-v0.1.