Eval Requests

#31

by isr431 - opened Aug 24

Aug 24

Can you please add anthracite-org/magnum-v2-4b, nbeerbower/Lyra-Gutenberg-mistral-nemo-12B and nbeerbower/Gutensuppe-mistral-nemo-12B to the leaderboard? Thanks!

isr431

Aug 25

Would love to see cognitivecomputations/dolphin-2.9.4-gemma2-2b added too!

isr431

Aug 26

Sao10K/L3.1-70B-Euryale-v2.2 was just released!

DontPlanToEnd

Owner Aug 26

Sao10K/L3.1-70B-Euryale-v2.2 was just released!

Added.
From the model page:
"May be less 'uncensored' zero-shot due to removal of c2 samples"
Sadge

isr431

Aug 28

Thanks! Can you also add Sao10K/MN-12B-Lyra-v3 & anthracite-org/magnum-v2-4b? Just a question, what format do you use for testing models (GGUF, AWQ etc.)?

DontPlanToEnd

Owner Aug 29

Thanks! Can you also add Sao10K/MN-12B-Lyra-v3 & anthracite-org/magnum-v2-4b? Just a question, what format do you use for testing models (GGUF, AWQ etc.)?

I test all models as Q4_K_M GGUF quants, both because it's cheaper and because most people run quants rather than full-precision models. I run the models on an oobabooga RunPod instance, and it doesn't seem like support for those 4B models has been added yet, so I'm still waiting on that. I'll test the new Lyra once the right quant has been made 👍

isr431

Aug 31

Can you add an 'Unknown' option to the model sizes? It should show only the closed-source models and models with unknown parameter counts.

isr431

Sep 2

Can you add TheDrummer/Hubble-4B-v1? Thx

isr431

Sep 3

TheDrummer/UnslopNemo-v1-GGUF was just released

isr431

Sep 4

Can you pls add maywell/PiVoT-0.1-Evil-a?

isr431

Sep 8

Sao10K/MN-12B-Lyra-v4 from sao!

isr431

Sep 8

anthracite-org/magnum-v3-9b-chatml-gguf and anthracite-org/magnum-v3-9b-customgemma2-gguf were released, would be interested to see how they compare!

isr431

Sep 9

edited Sep 9

nbeerbower/Lyra4-Gutenberg-12B and nbeerbower/gemma2-gutenberg-27B released

isr431

Sep 10

TheDrummer/UnslopNemo-v2-GGUF

isr431

Sep 12

nbeerbower/Lyra4-Gutenberg-12B just released, wasn't released when I posted about it originally

isr431

Sep 13

anthracite-org/magnum-v3-27b-kto

Hampetiudo

about 1 month ago

edited about 1 month ago

Hey thank you for your work! Could you please add these models? They recently received support in llama.cpp
https://huggingface.co/allenai/OLMoE-1B-7B-0924
https://huggingface.co/allenai/OLMoE-1B-7B-0924-SFT
https://huggingface.co/allenai/OLMoE-1B-7B-0924-Instruct

isr431

30 days ago

Can you please evaluate this model: nbeerbower/mistral-nemo-gutades-12B

isr431

23 days ago

Hey there! Can you add the newly released Gemini 1.5 Pro 002 to the leaderboard? The previous Gemini models were pretty uncensored, interesting to see how this performs.

isr431

18 days ago

Can you please add bartowski/Mistral-Nemo-Gutenberg-Doppel-12B-GGUF? Thanks

isr431

15 days ago

Can you please add nbeerbower/Lyra4-Gutenberg2-12B and nbeerbower/Gemma2-Gutenberg-Doppel-9B?

isr431

13 days ago

Please also add Tiger Gemma v3

janedoe83

11 days ago

Rocinante v2 (formerly UnslopNemo) has a 10/10 W rating on the leaderboard. But I have run across a lot of unwillingness or disclaimers. It needs reevaluation. The model is frequently updated. https://huggingface.co/TheDrummer/UnslopNemo-v2-GGUF

IronPike

11 days ago

Can you please add ZeusLabs/Chronos-Platinum-72B? Thanks.

DontPlanToEnd

Owner 11 days ago

It needs reevaluation. The model is frequently updated.

It doesn't seem like the files have been updated when I look at the commits.

IronPike

8 days ago

Can you please evaluate this model: smelborp/StellarDong-72b? Thanks.

Hampetiudo

7 days ago

edited 7 days ago

Rocinante v2 (formerly UnslopNemo) has a 10/10 W rating on the leaderboard. But I have run across a lot of unwillingness or disclaimers. It needs reevaluation. The model is frequently updated. https://huggingface.co/TheDrummer/UnslopNemo-v2-GGUF

In my tests with this model on text-generation-webui + llama.cpp and SillyTavern + KoboldCpp, both with temp=1.0, topk=1, the ChatML instruct format, and the recommended system prompt, I observed different behaviors. Text-generation-webui responds to all prompts with little to no disclaimers, while SillyTavern makes the model behave like a censored one. I think tokenization might be broken in either text-generation-webui or llama.cpp/llama-cpp-python, because it breaks tokens like "<|im_start|>" into multiple tokens instead of a single one.
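
For anyone who wants to check the tokenization point, here is a minimal Python sketch. It builds a single-turn ChatML prompt and reports which special markers a given tokenizer splits into more than one token. The helper names (`build_chatml_prompt`, `split_special_tokens`) and the two toy tokenizers are hypothetical and only stand in for a real backend; this is not how either frontend actually tokenizes.

```python
from typing import Callable, Dict, List

CHATML_MARKERS = ["<|im_start|>", "<|im_end|>"]


def build_chatml_prompt(system: str, user: str) -> str:
    """Assemble a single-turn ChatML prompt that ends with an open assistant turn."""
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{user}<|im_end|>\n"
        "<|im_start|>assistant\n"
    )


def split_special_tokens(
    tokenize: Callable[[str], List[int]], markers: List[str]
) -> Dict[str, int]:
    """Map each marker that is NOT encoded as exactly one token to its token count."""
    broken: Dict[str, int] = {}
    for marker in markers:
        count = len(tokenize(marker))
        if count != 1:
            broken[marker] = count
    return broken


# Toy stand-ins for a real backend (assumptions, for illustration only).
def naive_tokenize(text: str) -> List[int]:
    """Pretend backend that ignores special tokens: one id per character."""
    return [ord(ch) for ch in text]


def aware_tokenize(text: str) -> List[int]:
    """Pretend backend that maps a known marker to a single reserved id."""
    if text in CHATML_MARKERS:
        return [100000 + CHATML_MARKERS.index(text)]
    return naive_tokenize(text)


if __name__ == "__main__":
    print(build_chatml_prompt("You are a helpful assistant.", "Hello"))
    print("naive:", split_special_tokens(naive_tokenize, CHATML_MARKERS))
    print("aware:", split_special_tokens(aware_tokenize, CHATML_MARKERS))
```

To run the check against a real model, pass that backend's own tokenize function in place of the toy ones. With llama-cpp-python this might look like `lambda s: llm.tokenize(s.encode(), add_bos=False, special=True)`, assuming the installed version exposes the `special` flag. An empty result means every marker came back as a single token.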

isr431

7 days ago

Can you add TheDrummer/UnslopNemo-12B-v3-GGUF?
