gentaiscool committed on
Commit
75a689a
1 Parent(s): 9342c3c

Create README.md

Files changed (1)
  1. README.md +36 -0
README.md ADDED
@@ -0,0 +1,36 @@
+ ---
+ datasets:
+ - natolambert/skywork-preferences-80k-v0.1-cleaned
+ - allenai/preference-test-sets
+ ---
+
+ # MetaMetrics-RM-v1.0
+
+ + **Authors**: [Genta Indra Winata](https://gentawinata.com/), David Anugraha, Lucky Susanto, Garry Kuwanto, Derry Tanti Wijaya
+ + **Paper**: https://arxiv.org/abs/2410.02381
+ + **Model**: [meta-metrics/MetaMetrics-RM-v1.0](https://huggingface.co/meta-metrics/MetaMetrics-RM-v1.0)
+ + **Datasets**:
+   - [natolambert/skywork-preferences-80k-v0.1-cleaned](https://huggingface.co/datasets/natolambert/skywork-preferences-80k-v0.1-cleaned)
+   - [allenai/preference-test-sets](https://huggingface.co/datasets/allenai/preference-test-sets)
+ + **Code Repository**: https://github.com/meta-metrics/metametrics
+
+ ## RewardBench Leaderboard
+
+ | Model | Score | Chat | Chat Hard | Safety | Reasoning |
+ |:---|:---:|:---:|:---:|:---:|:---:|
+ | nvidia/Llama-3.1-Nemotron-70B-Reward | **94.1** | 97.5 | 85.7 | **95.1** | 98.1 |
+ | meta-metrics/MetaMetrics-RM-v1.0 | 93.5 | **98.9** | 86.2 | 90.7 | **98.2** |
+ | SF-Foundation/TextEval-Llama3.1-70B | 93.5 | 94.1 | **90.1** | 93.2 | 96.4 |
+ | RLHFlow/ArmoRM-Llama3-8B-v0.1 | 90.4 | 96.9 | 76.8 | 90.5 | 97.3 |
+
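The Score column in the table above appears consistent with an unweighted mean of the four category scores (Chat, Chat Hard, Safety, Reasoning), rounded to one decimal place. The sketch below checks that reading against the four rows. It is an observation about these numbers only, not a statement of how RewardBench computes its official score; the names `rows` and `mean_score` are illustrative and do not come from the repository.

```python
# Values copied from the leaderboard table: reported Score, then
# (Chat, Chat Hard, Safety, Reasoning) category scores.
rows = {
    "nvidia/Llama-3.1-Nemotron-70B-Reward": (94.1, (97.5, 85.7, 95.1, 98.1)),
    "meta-metrics/MetaMetrics-RM-v1.0": (93.5, (98.9, 86.2, 90.7, 98.2)),
    "SF-Foundation/TextEval-Llama3.1-70B": (93.5, (94.1, 90.1, 93.2, 96.4)),
    "RLHFlow/ArmoRM-Llama3-8B-v0.1": (90.4, (96.9, 76.8, 90.5, 97.3)),
}


def mean_score(categories):
    """Unweighted mean of the category scores (assumed aggregation)."""
    return sum(categories) / len(categories)


for model, (reported, categories) in rows.items():
    computed = mean_score(categories)
    # A gap below 0.06 is what one-decimal rounding of the mean allows.
    gap = abs(computed - reported)
    print(f"{model}: reported={reported}, mean={computed:.3f}, gap={gap:.3f}")
```

Under this assumption, every reported Score sits within rounding distance of the category mean, so the two models listed at 93.5 are tied only after rounding.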
+ ## Citation
+
+ If you find this work useful for your research, please consider citing:
+ ```
+ @article{winata2024metametrics,
+   title={MetaMetrics: Calibrating Metrics For Generation Tasks Using Human Preferences},
+   author={Winata, Genta Indra and Anugraha, David and Susanto, Lucky and Kuwanto, Garry and Wijaya, Derry Tanti},
+   journal={arXiv preprint arXiv:2410.02381},
+   year={2024}
+ }
+ ```