leaderboard / data / score.csv
Jae-Won Chung
Add Salesforce xgen scores
9bd3aab
model,arc,hellaswag,truthfulqa
BAIR/koala-13b,52.901023890784984,77.54431388169687,50.091065219059125
BAIR/koala-7b,47.098976109215016,73.70045807608047,45.997635958147875
lmsys/vicuna-7B,53.49829351535836,77.53435570603465,48.997614637055264
metaai/llama-13B,56.31399317406144,80.8603863772157,39.90298264801161
tatsu-lab/alpaca-7B,52.64505119453925,76.90699063931487,39.552770976749336
OpenAssistant/oasst-sft-1-pythia-12b,45.563139931740615,69.92630950009958,39.1893543136912
databricks/dolly-v2-12b,42.15017064846416,71.82832105158336,33.37136000408915
h2oai/h2ogpt-gm-oasst1-en-2048-open-llama-7b-preview-300bt-v2,36.86006825938566,61.551483768173675,37.9421602393762
lmsys/fastchat-t5-3b-v1.0,35.92150170648464,46.355307707627965,48.787610045893985
nomic-ai/gpt4all-13b-snoozy,56.058020477815695,78.68950408285203,48.35948664919701
openaccess-ai-collective/manticore-13b-chat-pyg,58.703071672354945,81.95578570005975,48.86009773651491
lmsys/vicuna-13B,52.901023890784984,80.12348137821151,51.81653185716687
metaai/llama-7B,51.10921501706485,77.74347739494125,34.0786227034917
StabilityAI/stablelm-tuned-alpha-7b,31.91126279863481,53.59490141406095,40.22458364155103
project-baize/baize-v2-7B,48.4641638225256,75.00497908783112,41.66264911575524
FreedomIntelligence/phoenix-inst-chat-7b,44.965870307167236,63.2244572794264,47.084372288512725
camel-ai/CAMEL-13B-Combined-Data,55.54607508532423,79.29695279824736,47.33219922854091
Neutralzz/BiLLa-7B-SFT,27.730375426621162,26.04062935670185,49.045640164325754
togethercomputer/RedPajama-INCITE-7B-Chat,42.15017064846416,70.8424616610237,36.10055989611241
metaai/Llama-2-7b-chat-hf,52.73037542662116,78.48038239394542,45.32519554457334
metaai/Llama-2-13b-chat-hf,59.129692832764505,81.94582752439753,43.9572591900371
RWKV/rwkv-raven-7b,39.419795221843,66.45090619398526,38.544263035922036
Salesforce/xgen-7b-8k-inst,46.67235494880546,74.84564827723561,41.892174853317556