Commit 97d7a28, committed by yhyu13
1 Parent(s): b07ed04
Add chatgpt alpacaeval result
- alpaca_eval/chatgpt_fn_--phi-2-alpaca-gpt4/alpaca_eval_log.txt +0 -0
- alpaca_eval/chatgpt_fn_--phi-2-alpaca-gpt4/annotation_chatgpt_fn.json +0 -0
- alpaca_eval/chatgpt_fn_--phi-2-alpaca-gpt4/leaderboard.csv +15 -0
- alpaca_eval/chatgpt_fn_--phi-2-alpaca-gpt4/model_outputs.json +0 -0
- alpaca_eval/chatgpt_fn_--phi-2-alpaca-gpt4/reference_outputs.json +0 -0
alpaca_eval/chatgpt_fn_--phi-2-alpaca-gpt4/alpaca_eval_log.txt
ADDED
The diff for this file is too large to render.
alpaca_eval/chatgpt_fn_--phi-2-alpaca-gpt4/annotation_chatgpt_fn.json
ADDED
The diff for this file is too large to render.
alpaca_eval/chatgpt_fn_--phi-2-alpaca-gpt4/leaderboard.csv
ADDED
@@ -0,0 +1,15 @@
+,win_rate,standard_error,n_wins,n_wins_base,n_draws,n_total,mode,avg_length
+gpt4,73.7888198757764,1.5359801545073597,588,205,12,805,minimal,1365
+claude,70.37267080745342,1.599519507147828,562,234,9,805,minimal,1082
+chatgpt,66.08695652173913,1.6626479994330317,529,270,6,805,minimal,811
+wizardlm-13b,65.15527950310559,1.670034107787565,520,276,9,805,minimal,985
+vicuna-13b,64.09937888198758,1.6895185863153146,515,288,2,805,minimal,1037
+guanaco-65b,62.36024844720497,1.7086348811605765,502,303,0,805,minimal,1249
+oasst-rlhf-llama-33b,62.0496894409938,1.7080028976103514,498,304,3,805,minimal,1079
+alpaca-farm-ppo-human,60.24844720496895,1.7169496733548772,481,316,8,805,minimal,803
+falcon-40b-instruct,56.52173913043478,1.7438750520312944,453,348,4,805,minimal,662
+phi-2-alpaca-gpt4,54.22885572139303,1.7537289814567856,434,366,4,804,community,1138
+text_davinci_003,50.0,0.0,0,0,805,805,minimal,307
+alpaca-7b,45.21739130434783,1.7375846781579476,356,433,16,805,minimal,396
+phi-2,43.7888198757764,1.7441776958694664,350,450,5,805,community,924
+text_davinci_001,28.07453416149068,1.5602183426587484,216,569,20,805,minimal,296
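The `win_rate` and `standard_error` columns in the leaderboard rows above look consistent with a simple scoring rule: a win counts as 1, a draw as 0.5, a loss as 0, and the result is reported as a percentage. The sketch below recomputes both columns from the count columns under that assumption. The rule is inferred from the numbers in this file, not taken from the evaluator's source, and the function names are hypothetical.

```python
import math


def win_rate(n_wins: int, n_draws: int, n_total: int) -> float:
    """Percentage score, assuming a draw is worth half a win."""
    return 100.0 * (n_wins + 0.5 * n_draws) / n_total


def standard_error(n_wins: int, n_wins_base: int, n_draws: int) -> float:
    """Standard error of the mean score, in percentage points.

    Assumes per-example scores of 1 (win), 0 (baseline win) and 0.5 (draw),
    and the sample variance with an n - 1 denominator.
    """
    n_total = n_wins + n_wins_base + n_draws
    mean = (n_wins + 0.5 * n_draws) / n_total
    sum_sq = (
        n_wins * (1.0 - mean) ** 2
        + n_wins_base * (0.0 - mean) ** 2
        + n_draws * (0.5 - mean) ** 2
    )
    sample_var = sum_sq / (n_total - 1)
    return 100.0 * math.sqrt(sample_var / n_total)


if __name__ == "__main__":
    # Counts copied from the phi-2-alpaca-gpt4 row of leaderboard.csv.
    print(win_rate(434, 4, 804))
    print(standard_error(434, 366, 4))
```

Under this reading, the `text_davinci_003` row (805 draws out of 805) is the baseline compared against itself, which is why its win rate is exactly 50.0 with zero standard error.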
alpaca_eval/chatgpt_fn_--phi-2-alpaca-gpt4/model_outputs.json
ADDED
The diff for this file is too large to render.
alpaca_eval/chatgpt_fn_--phi-2-alpaca-gpt4/reference_outputs.json
ADDED
The diff for this file is too large to render.