Crowd-Source Hardware for the LeaderBoard?

#570
by ibivibiv - opened

We all use this service, and I read that there isn't sufficient A100 capacity to evaluate the really big models. I'd chip in for @clefourrier 's H4 team to get more hardware, and I certainly can't be alone. I appreciate the resource and would gladly throw in to expand it. Would that be possible? Anyone else interested? It would probably open up more capacity for people to run more models through.

Open LLM Leaderboard org

Hi @ibivibiv ,
That's super kind of you! We might add an option for people to pay for their own eval compute using inference endpoints if they can, but it's a bit of engineering work and mostly something we'll do in Q2.

No problem @clefourrier . Hey, I rent time on runpod for their giant piles of A100s etc. for training, and I know it isn't cheap. It was an idea I had to help "feed the need". I'm a huge fan and love what Huggingface is and does. I'm at IBM, and when I heard that we were partnering I was doing backflips. We don't always do "cool things" over here, but we did this time!

I like the idea of a "pay for" eval service, ESPECIALLY for larger models (which I'm very interested in). It seems unfair that the eval of larger models would be "free". If you're running evals on a 34B-or-smaller model, it makes sense for that to be a limited open service for the community. But if you're stretching to 70B and above? That is not a cheap operation you're performing; it takes so much more time and consumes so many more resources. It's almost "rude" to run it through the same pipeline as someone with no resources just trying to get their first 7B model evaluated. If you're in the business of large models, then you obviously have the resources to back up what you're doing, so it seems intuitive that people in that position would accept that an evaluation test suite, with results published in a very well-known public forum, would have a cost attached. I'm funding my own work out of my pocket, and I wouldn't consider paying for evaluation of my larger models to be out of the ordinary. Again, I feel like I'm being almost rude when I send one in to be evaluated. It's just such a great and easy-to-use service that there's no better way out there to get that great report.

Let me know if you need any help internally with that pay-for option. I'd gladly hop on a call and talk. I'm sure given the relationship between IBM and Huggingface there is an NDA somewhere. Cheers.

I would happily pay a sub to expedite evals

No idea if this is possible, but couldn't there be some sort of "FoldingAtHome", "LeelaChess", "SETI@home", or "StableHorde" (you get the idea, I'll stop with the examples now) kind of setup, where a master server hands out small evaluation tasks to users who volunteer their GPU cycles? The master server could of course take the available hardware into account and match certain tasks to the donated hardware.

I'm saying this like it's easy to do, and I realise that it probably isn't; I'm just not quite sure how hard it really is, lol.
Because if it's doable, it sounds like a nice way to get a lot of people to volunteer some of their hardware to you.

You could even implement a reward system where people collect points for the hardware and time they donate, which they could later spend to, for example, push a model they want evaluated to the front of the queue. Might be nice for when you want a model evaluated TODAY, but usually don't mind donating some of your hardware to others.
