💬 Discussion thread: Model contamination techniques 💬
pinned
33
#472 opened 6 months ago
by
clefourrier
Future feature: system prompt and chat support
pinned
21
#459 opened 6 months ago
by
clefourrier
💬 Discussion thread: Model scores and model performances 💬
pinned
70
#265 opened 9 months ago
by
clefourrier
💎 Resources and community initiatives around the Leaderboard! 💎
pinned
#174 opened 10 months ago
by
clefourrier
Model "Amu/dpo-qlora-Qwen1.5-0.5B-Chat-xtuner" was not found or misconfigured on the hub!
3
#772 opened about 13 hours ago
by
Amu
Model "X" was not found or misconfigured on the hub!
#771 opened 1 day ago
by
picAIso
GSM8K Evaluation has a serious bug/oversight, that is negatively impacting score of all Llama 3 models. Please consider updating to the latest commit of lm-evaluation-harness which fixes it.
#770 opened 1 day ago
by
ArkaAbacus
models submitted for eval are finished, but do not appear on the leaderboard
#768 opened 2 days ago
by
giannisan
FLAG: saltlux/luxia-21.4b-alignment-v1.2 GSM8k v1.0 to v1.2 29% GSM Tests contamination
3
#767 opened 2 days ago
by
fblgit
loading_from_contents
14
#766 opened 4 days ago
by
clefourrier
gradientai/Llama-3-70B-Instruct-Gradient-1048k FAILED, can I investigate the logs?
2
#765 opened 4 days ago
by
leo-pekelis-gradient
Feature request: Add toggle to only show models with linked dataset
1
#763 opened 5 days ago
by
ThiloteE
Feature request: Hide models with insufficient model card from default view in leaderboard
4
#762 opened 5 days ago
by
ThiloteE
Discussion: naming pattern to converge on to better identify fine-tunes
7
#761 opened 5 days ago
by
ThiloteE
Model disappeared from app (I can't find the dataset related to the model either)
5
#759 opened 6 days ago
by
kamilmuratyilmaz
reclassify some ORPO models as chat 💬
1
#758 opened 7 days ago
by
CombinHorizon
Models that used Nectar dataset
9
#749 opened 14 days ago
by
Stark2008
TRI-ML/mamba-7b-rw failed
9
#704 opened about 1 month ago
by
devingulliver
GPTQ and Mixtral models will need to be relaunched
6
#692 opened about 1 month ago
by
CombinHorizon
ALL Jamba models failing
16
#690 opened about 1 month ago
by
devingulliver
No good way to identify number of activated parameters causes Mixtral evaluation failures
28
#680 opened about 2 months ago
by
0-hero
Crowd-Source Hardware for the LeaderBoard?
4
#570 opened 4 months ago
by
ibivibiv
Eval models for data contamination?
2
#561 opened 4 months ago
by
liyucheng
Feature request: Run 100B + models automatically
15
#434 opened 6 months ago
by
ChuckMcSneed
Feature Request for Leaderboard: date added to hub
2
#425 opened 6 months ago
by
madmaxbr5
Feature request: Using weights hash to identify duplicates
1
#422 opened 6 months ago
by
mrfakename
Feature request: Add non AutoModelForCausalLM models
3
#391 opened 6 months ago
by
KnutJaegersberg
Tool: Adding evaluation results to model cards
46
#370 opened 7 months ago
by
Weyaxi
Feature suggestion: average of selected (rather than all) columns
4
#368 opened 7 months ago
by
Minus0
Tool: Open LLM Leaderboard Model Renamer
31
#310 opened 8 months ago
by
Weyaxi
Checking for toxicity too
9
#53 opened 12 months ago
by
ronald-d-rogers