[Feature request] Add an option for the native quantization format (as default) in the model submission form
Why?
It might help reduce clutter caused by spurious duplicate submissions of the same model in different non-native quantization formats.
(The same goes for float32: if a leaderboard does not use it (some do, e.g. the medical leaderboard), bfloat16 might be the better default, since it prevents over/underflows.)
Perhaps also (though this may be more complicated on the back end) make it harder (e.g. via rate-limit gating) to submit formats other than the native quantization until the model has either failed or run successfully (without forgetting GPTQ compatibility). This matters most for larger models, which consume more compute time and resources.
Hi!
We might consider adding something like this in the future!
When the leaderboard started, the stored models were in most cases high-precision versions, so we added a mechanism to convert them to 4-bit/8-bit on the fly using bitsandbytes, to allow comparing quantized and original versions. However, many models are now stored already quantized, so we could try to load this information from the model configuration.
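To illustrate, detecting a model's native format from its configuration could look roughly like this. This is a minimal sketch, not the leaderboard's actual code: `detect_native_format` is a hypothetical helper, and it assumes the model's `config.json` follows the transformers convention of an optional `quantization_config` block carrying a `quant_method` key.

```python
def detect_native_format(config: dict) -> str:
    """Guess a model's native precision/quantization from its config.json contents.

    Hypothetical helper: assumes the transformers-style layout where quantized
    checkpoints carry a "quantization_config" block and full-precision ones
    declare their dtype under "torch_dtype".
    """
    quant_cfg = config.get("quantization_config")
    if quant_cfg is not None:
        # Quantized checkpoints (GPTQ, AWQ, bitsandbytes, ...) usually
        # record their method under "quant_method".
        return quant_cfg.get("quant_method", "unknown-quantized")
    # Otherwise fall back to the declared torch dtype, defaulting to float32.
    return config.get("torch_dtype", "float32")

# Examples:
print(detect_native_format({"torch_dtype": "bfloat16"}))  # bfloat16
print(detect_native_format({"quantization_config": {"quant_method": "gptq", "bits": 4}}))  # gptq
```

The submission form could then pre-select the detected format and flag any submission whose requested format differs from the native one.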