sfairXC/FsfairX-Zephyr-Chat-v0.1

hendrydong committed on Apr 22, 2024

Commit aa409da · verified · 1 Parent(s): f5e02b8

Update README.md

Files changed (1)

README.md +2 -0

@@ -6,6 +6,8 @@ We perform GSHF algorithm on SFT baseline. The external signals include (1) Rewa
 **We obtain 35.95% win-rate (34.79% LC win-rate) on Alpaca Eval v2.** The win-rate of the base model is only 4.63%.
+For MT-bench, it obtained about 7.5, where the base model is only 5.3.
 We have demonstrated the significant potential of the iterative RLHF algorithm for LLMs to deliver appropriate and well-structured responses,
 even without any external responses.