yuchenlin committed on
Commit d6bc040
1 Parent(s): 36f2044

Update README.md

Files changed (1):
  1. README.md +4 -0
README.md CHANGED
@@ -30,6 +30,10 @@ pipeline_tag: text-generation
 - Space Demo: [https://huggingface.co/spaces/llm-blender/LLM-Blender](https://huggingface.co/spaces/llm-blender/LLM-Blender)
 
 
+## News
+
+- Check out our results on the AlpacaEval leaderboard: [Twitter](https://x.com/billyuchenlin/status/1732198787354067380?s=20) [Leaderboard](https://tatsu-lab.github.io/alpaca_eval/)
+
 ## Introduction
 
 Pairwise Reward Model (PairRM) takes an instruction and a **pair** of output candidates as the input,
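
The context line above describes PairRM's pairwise interface: one instruction plus two candidate outputs. For readers of this commit, here is a minimal sketch of how such a pairwise comparison can be invoked, assuming the `llm-blender` package from the linked repo exposes a `Blender` class with `loadranker` and `compare` methods (those calls are not shown in this hunk, so treat the exact signatures as an assumption):

```python
# Hedged sketch: assumes the llm-blender package (from the LLM-Blender
# project linked above) provides Blender.loadranker and Blender.compare.
# pip install git+https://github.com/yuchenlin/LLM-Blender.git
import llm_blender

blender = llm_blender.Blender()
blender.loadranker("llm-blender/PairRM")  # load the PairRM checkpoint

# One instruction and a *pair* of candidate outputs, per the description above.
inputs = ["What is the capital of France?"]
candidates_A = ["The capital of France is Paris."]
candidates_B = ["France is a country in Europe."]

# compare() is assumed to return, for each input, whether candidate A
# is judged better than candidate B.
results = blender.compare(inputs, candidates_A, candidates_B)
print(results)  # e.g. [True] if A is the preferred response
```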