yuchenlin committed on
Commit
96cc13f
1 Parent(s): e066c87

Update README.md

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -40,7 +40,7 @@ Apart from that, one can also use PairRM to further align instruction-tuned LLMs
 
 Unlike the other RMs that encode and score each candidate respectively,
 PairRM takes a pair of candidates and compares them side-by-side to indentify the subtle differences between them.
-Also, PairRM is based on DeBERTa-large, and thus it is super efficient: 0.4B.
+Also, PairRM is based on [`microsoft/deberta-v3-large`](https://huggingface.co/microsoft/deberta-v3-large), and thus it is super efficient: 0.4B.
 We trained PairRM on a diverse collection of human preference datasets such as UltraFeedback, HH-RLHF, chatbot-arena, etc.
 PairRM is part of the LLM-Blender project (ACL 2023). Please see our paper linked above to know more.
 