
WizardLM-8x22B Evaluation failed

#823

by llama-anon - opened Jul 4, 2024

Discussion

llama-anon

Jul 4, 2024

https://huggingface.co/datasets/open-llm-leaderboard/requests/blob/main/alpindale/WizardLM-2-8x22B_eval_request_False_float16_Original.json

alozowski

Open LLM Leaderboard org Jul 5, 2024

•

edited Jul 5, 2024

Hi @llama-anon,

Thanks for providing the request file!
Currently, our cluster is quite full, so we're only evaluating models that can run on a single node. This approach helps us evaluate more models concurrently. However, if there's enough interest from the community, we're open to manually evaluating models that require more than one node, like the one you've submitted.

bullerwins

Jul 6, 2024

Another interested user here. Wizard 8x22 is my current go-to open-source model; I use it for pretty much everything.

placebomancer

Jul 6, 2024

Definitely interested in seeing how WizardLM-2 8x22B stacks up. It seems vastly better than the other fine-tunes of 8x22B, including Mistral's own. I think the only reason it hasn't gotten more attention is that it was never put on LMSYS Arena. It's been in the top ten most used models on OpenRouter for a while now, and I think it would have a solid chance of topping the leaderboard.

smcleod

Jul 6, 2024

Would be good to get this added. It's been out for quite some time, and people still really rave about it.

isr431

Jul 6, 2024

Very capable finetune by Microsoft, would love to see it added (and potentially the 7b variant)!

freegheist

Jul 6, 2024

Strongest FOSS model/finetune, except for coding. Crazy this isn't on the leaderboard. Yes, it needs a lot of VRAM, but it would be really good to showcase the best of open source IMO.

SnailInf1

Jul 6, 2024

WizardLM-2-8x22b is one of the most powerful open-source language models. It would be really great to see how it performs compared to other open-source large language models on the Open-LLM-Leaderboard.

sloopbun

Jul 6, 2024

Voting WizardLM-2-8x22b

Duckycode

Jul 6, 2024

Would love to see Wizard ranked! It'd be really good to see how it compares to other Wizard and non-Wizard models.

Novetteus

Jul 6, 2024

•

edited Jul 6, 2024

Would also love to see it ranked. Was a fantastic model when I tried online hostings of it. Still worthwhile on the lobotomized local usage my setup can get out of it, which was a pleasant surprise.

SomeOddCodeGuy

Jul 6, 2024

I definitely have an interest in seeing the Wizard benchmarks. This topic has come up a few times on LocalLlama, but none of us have really known how to get it up here and just assumed it wouldn't happen.

I think you'd make a few people pretty happy if you were able to squeeze this one in.

OpenLeecher

Jul 6, 2024

Wiz 8x22 5bpw is still my daily driver. Its writing, contextual awareness, and fringe knowledge are still unmatched IMO. Would love to see how it stacks up against the other top dogs.

ricced

Jul 6, 2024

Voting WizardLM-2-8x22b

pedalnomica

Jul 6, 2024

Add my vote!

MrVodnik

Jul 6, 2024

Voting WizardLM-2-8x22b

qwp4w3hyb

Jul 6, 2024

Vote from me as well!

mahirzukic2

Jul 6, 2024

I would like to see it too.

pszemraj

Jul 7, 2024

•

edited Jul 7, 2024

jsfs11

Jul 7, 2024

Very interested to see it benchmarked as well. +1

nichedreams

Jul 7, 2024

Another vote from me

Tom-Neverwinter

Jul 7, 2024

I'd rather we run the highest-quality models first to get the baseline going, then proceed to quantity, as the goal is to reach the top score as soon as possible so we stop the plateau.

Get WizardLM in.

alozowski

Open LLM Leaderboard org Jul 8, 2024

Hi everyone,

Thanks for your messages and activity! Let's start WizardLM-2-8x22B evaluation! 🚀

bullerwins

Jul 8, 2024

Great news! I'm really curious how it stacks up. I'm also glad the feedback was heard.

alozowski

Open LLM Leaderboard org Jul 15, 2024

•

edited Jul 15, 2024

Thanks everyone for your activity and patience! WizardLM-2-8x22B is now 8th on the Leaderboard with an average score of 32.61!

alozowski changed discussion status to closed Jul 15, 2024

SnailInf1

Jul 18, 2024

Great, thank you for the evaluation!
