Proposal for new column

by Yuma42 - opened

I think it would be interesting to have an average score / CO₂ cost column. We already have both numbers, so why not combine them to show the most efficient models?

If there were such a column, the top 20 of the leaderboard, sorted by it in descending order, would currently look like this:

                                fullname Average ⬆️ CO₂ cost (kg) avg_per_kg_co2
                                    gpt2  5.977737    0.03924517      152.31776
              cpayne1303/cp2024-instruct  4.319731    0.03216190      134.31209
               cpayne1303/llama-43m-beta  5.288332    0.05839185       90.56627
               cpayne1303/llama-43m-beta  5.347100    0.05991588       89.24346
                      JackFram/llama-68m  4.862635    0.06055790       80.29728
                       cpayne1303/cp2024  3.614016    0.04761306       75.90388
                   openai-community/gpt2  6.510807    0.08594126       75.75881
                            sumink/ftgpt  3.951784    0.05281752       74.81957
                  cpayne1303/smallcp2024  3.455732    0.04730794       73.04760
            postbot/gpt2-medium-emailgen  4.743048    0.07818635       60.66338
          unsloth/Phi-3-mini-4k-instruct 27.178374    0.46953311       57.88383
              tiiuae/Falcon3-7B-Instruct 34.906699    0.61876067       56.41389
              tiiuae/Falcon3-3B-Instruct 26.551992    0.48046365       55.26327
        SultanR/SmolTulu-1.7b-Reinforced 15.756606    0.28961603       54.40516
             h2oai/h2o-danube3.1-4b-chat 16.210718    0.29914058       54.19097
                   openai-community/gpt2  6.296471    0.11738690       53.63862
 suayptalha/HomerCreativeAnvita-Mix-Qw7B 34.620978    0.64988069       53.27282
          newsbang/Homer-v0.3-Qwen2.5-7B 31.088203    0.58560348       53.08746
          newsbang/Homer-v0.4-Qwen2.5-7B 33.918837    0.63972041       53.02134
               icefog72/Ice0.37-18.11-RP 21.913941    0.41451281       52.86674

Sorted in ascending order (least efficient first), it would look like this:

                                fullname Average ⬆️ CO₂ cost (kg) avg_per_kg_co2
           WizardLMTeam/WizardLM-13B-V1.0  4.546092    70.9775871     0.06404968
          NAPS-ai/naps-gemma-2-27b-v0.1.0  1.679602    22.6642492     0.07410799
         NAPS-ai/naps-gemma-2-27b-v-0.1.0  1.679602    11.2248610     0.14963231
       NousResearch/Yarn-Llama-2-13b-128k  8.418618    51.9357833     0.16209668
                 PygmalionAI/pygmalion-6b  5.392360    31.9231193     0.16891707
            togethercomputer/GPT-JT-6B-v1  6.827354    37.9588107     0.17986218
                  TencentARC/LLaMA-Pro-8B  8.778934    47.8077336     0.18363000
                      Qwen/Qwen2-57B-A14B 25.033873   107.0314775     0.23389262
             mistralai/Mixtral-8x22B-v0.1 25.728348   104.6973163     0.24574028
                allknowingroger/Quen2-65B  3.531344    13.3174236     0.26516723
               alpindale/WizardLM-2-8x22B 32.983523    93.3052217     0.35350136
                   bigcode/starcoder2-15b 12.551764    35.0445477     0.35816594
                   teknium/OpenHermes-13B 12.169676    31.1191167     0.39106753
                   Qwen/Qwen1.5-110B-Chat 29.224837    72.5652931     0.40273849
                        Qwen/Qwen1.5-110B 29.846266    71.2708884     0.41877218
                         Qwen/Qwen1.5-32B 27.021817    59.9671594     0.45061026
        deepseek-ai/deepseek-llm-67b-chat 26.995929    59.8218087     0.45127236
                davidkim205/Rhea-72b-v0.5  4.224031     8.6886909     0.48615279
     mistral-community/mixtral-8x22B-v0.3 25.789407    52.4944852     0.49127840
          allknowingroger/Qwen2.5-42B-AGI  4.470830     8.8569811     0.50478030
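
For reference, the extra column is just the average score divided by the CO₂ cost in kg. Here is a minimal Python sketch of that calculation, using three rows copied from the tables above. The tuple layout and the function name `add_efficiency` are only illustrative assumptions, not the leaderboard's actual code:

```python
# Three rows copied from the tables above:
# (fullname, average score, CO2 cost in kg).
rows = [
    ("unsloth/Phi-3-mini-4k-instruct", 27.178374, 0.46953311),
    ("WizardLMTeam/WizardLM-13B-V1.0", 4.546092, 70.9775871),
    ("gpt2", 5.977737, 0.03924517),
]


def add_efficiency(rows):
    """Append avg_per_kg_co2 = average / CO2 cost to each row.

    Rows with a non-positive CO2 cost are skipped to avoid dividing by zero.
    (Hypothetical helper, written for this example only.)
    """
    out = []
    for name, avg, co2_kg in rows:
        if co2_kg <= 0:
            continue
        out.append((name, avg, co2_kg, avg / co2_kg))
    return out


# Most efficient first; drop reverse=True for the ascending view.
ranked = sorted(add_efficiency(rows), key=lambda r: r[3], reverse=True)

for name, avg, co2_kg, eff in ranked:
    print(f"{name:>40} {avg:10.6f} {co2_kg:12.8f} {eff:12.5f}")
```

Note that a plain ratio rewards very small models: a tiny CO₂ cost in the denominator outweighs a low score, which is why gpt2 ends up on top.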

Quite interesting, actually.

Open LLM Leaderboard org

Yep, @alozowski has been working on a blog post to feature something like this! Lots of interesting results from computing CO₂ cost :)
We'll probably add the column once the blog post is ready, but with the holidays arriving it will take a couple of weeks :)
