natolambert committed
Commit 87b1f9b
1 Parent(s): 874c0c9

update about

Files changed (1): src/md.py (+1, -0)
@@ -12,6 +12,7 @@ We average over 4 core sections (per prompt weighting):
 
 For Reasoning, we increase the weight of the PRM-Math subset so code and math abilities are weighed equally in the final number, rather than increasing the relevance of code.
 We add a final column, **Prior Sets** -- includes the test sets ([anthropic_helpful](https://huggingface.co/datasets/Anthropic/hh-rlhf), [anthropic_hhh](https://huggingface.co/datasets/HuggingFaceH4/hhh_alignment), [shp](https://huggingface.co/datasets/stanfordnlp/SHP), [summarize](https://huggingface.co/datasets/openai/summarize_from_feedback))
+Prior sets is weighted 0.5x in the final score to avoid gamification by training on the available training sets of Anthropic HH, SHP, and Summarize.
 
 Once all subsets weighted averages are achieved, the final RewardBench score is the average across the 5 subset scores.
 
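The combination rule described by the diff — average the subset scores, with Prior Sets counted at 0.5x weight — can be sketched as a weighted mean. This is an illustrative sketch, not the repository's actual code: the section names and the exact weighting scheme (weight 1 for each core section, 0.5 for Prior Sets, normalized by the total weight) are assumptions for illustration.

```python
# Hypothetical sketch of a RewardBench-style final score:
# a weighted mean of subset scores, with Prior Sets down-weighted 0.5x
# to discourage training on the publicly available prior training sets.
SUBSET_WEIGHTS = {
    "Chat": 1.0,
    "Chat Hard": 1.0,
    "Safety": 1.0,
    "Reasoning": 1.0,
    "Prior Sets": 0.5,  # 0.5x weight, per the added line in the diff
}


def final_score(subset_scores: dict) -> float:
    """Weighted mean of the per-subset scores using SUBSET_WEIGHTS."""
    total = sum(SUBSET_WEIGHTS[name] * s for name, s in subset_scores.items())
    weight = sum(SUBSET_WEIGHTS[name] for name in subset_scores)
    return total / weight


scores = {"Chat": 0.9, "Chat Hard": 0.6, "Safety": 0.8, "Reasoning": 0.7, "Prior Sets": 0.75}
print(final_score(scores))  # (0.9 + 0.6 + 0.8 + 0.7 + 0.5 * 0.75) / 4.5
```

Because the mean is normalized by the summed weights, a model cannot raise its final score by overfitting Prior Sets: that subset contributes at most half as much as any core section.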