What about inference speed?

#35
by Pitchboy - opened

I think it would be interesting to have the tokens/sec on different hardware: CPU / GPU (A10, T4...)

It would be a very useful measure of how usable a model is for heavier workloads.

Open LLM Leaderboard org

Hi! A new leaderboard was introduced just for this here!

clefourrier changed discussion status to closed
