Dongfu Jiang committed on
Commit 8d9ead8 • 1 Parent(s): 00b7e60

Update README.md

README.md CHANGED
@@ -80,7 +80,8 @@ We test the pairwise comparison on
 | PairRM | **84.75** | 84.48 | **80.33** | **90.7** | **84.62** | **59** |
 | GPT-4-0613 | 91.53 | 93.1 | 85.25 | 83.72 | 88.69 | 63.87 |

-While PairRM is an extremely small model (0.4B) based on DeBERTa, its pairwise comparison agreement performance approaches GPT-4's performance
+**While PairRM is an extremely small model (0.4B) based on DeBERTa, its pairwise comparison agreement performance approaches GPT-4's performance!**
+
 We attribute this to two reasons:
 - PairRM's model architecture is specifically designed for pairwise comparison through bidirectional attention (see the paper for more details)
 - The high-quality, large-scale human preference annotation data it was trained on (see the tags for the list)
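
The table rows in this hunk report pairwise comparison agreement scores. The README does not show how they are computed, so the following is only a minimal sketch, assuming "agreement" means the percentage of candidate pairs on which a model's preferred candidate matches the reference preference label. The function name `pairwise_agreement` and the toy labels are hypothetical and are not taken from the PairRM code or data.

```python
from typing import Sequence


def pairwise_agreement(model_prefs: Sequence[int], reference_prefs: Sequence[int]) -> float:
    """Return the percentage of pairs where the model agrees with the reference.

    Each entry is 0 if the first candidate of the pair is preferred and 1 if
    the second is preferred (an assumed encoding, for illustration only).
    """
    if len(model_prefs) != len(reference_prefs):
        raise ValueError("model_prefs and reference_prefs must have the same length")
    if len(model_prefs) == 0:
        raise ValueError("at least one pair is required")
    # Count the pairs on which both sides picked the same candidate.
    matches = sum(m == r for m, r in zip(model_prefs, reference_prefs))
    return 100.0 * matches / len(model_prefs)


# Hypothetical toy labels: the model agrees with the reference on 3 of 4 pairs.
reference = [0, 1, 1, 0]
model = [0, 1, 0, 0]
score = pairwise_agreement(model, reference)
print(f"agreement: {score:.2f}")  # prints "agreement: 75.00"
```

Reporting a percentage keeps the number on the same scale as the table entries, but the actual evaluation behind those entries may handle ties or multi-annotator labels differently.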