
# ML.ENERGY Leaderboard


How much energy do LLMs consume?

This README explains how to run the benchmark yourself. The leaderboard itself is hosted at https://ml.energy/leaderboard.

## Setup

### Model weights

- For models that are directly accessible on the Hugging Face Hub, you don't need to do anything.
- For other models, convert them to Hugging Face format and put them under your data directory, e.g., /data/leaderboard/weights/lmsys/vicuna-13B. The last two path components (e.g., lmsys/vicuna-13B) are taken as the name of the model (see the example layout after this list).
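
For reference, here is a hypothetical layout for one converted model. The exact files depend on the model; only the path structure matters to the benchmark:

```sh
# Hypothetical example: the last two path components (lmsys/vicuna-13B) are
# used as the model name by the benchmark.
$ ls /data/leaderboard/weights/lmsys/vicuna-13B
config.json  generation_config.json  tokenizer.model  pytorch_model-00001-of-00003.bin  ...
```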

### Docker container

```sh
$ git clone https://github.com/ml-energy/leaderboard.git
$ cd leaderboard
$ docker build -t ml-energy:latest .
# Replace /data/leaderboard with your data directory.
$ docker run -it \
    --name leaderboard \
    --gpus all \
    -v /data/leaderboard:/data/leaderboard \
    -v $(pwd):/workspace/leaderboard \
    ml-energy:latest bash
```
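
Once inside the container, it can help to sanity-check that the GPUs and the data directory are visible before running anything (a quick check, not part of the benchmark itself):

```sh
# Inside the container
$ nvidia-smi                      # GPUs passed through with --gpus all should be listed
$ ls /data/leaderboard/weights    # converted model weights should be visible here
```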

## Running the benchmark

We run benchmarks across multiple nodes and GPUs using Pegasus. Take a look at `pegasus/` for details.

You can still run benchmarks without Pegasus like this:

```sh
# Inside the container
$ cd /workspace/leaderboard
$ python scripts/benchmark.py --model-path /data/leaderboard/weights/lmsys/vicuna-13B --input-file sharegpt/sg_90k_part1_html_cleaned_lang_first_sampled.json
$ python scripts/benchmark.py --model-path databricks/dolly-v2-12b --input-file sharegpt/sg_90k_part1_html_cleaned_lang_first_sampled.json
```
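
To benchmark several models back to back without Pegasus, a plain shell loop over the same script also works; the model list below is only an example:

```sh
# Inside the container: run the benchmark for a few models sequentially.
$ for model in /data/leaderboard/weights/lmsys/vicuna-13B databricks/dolly-v2-12b; do
    python scripts/benchmark.py \
      --model-path "$model" \
      --input-file sharegpt/sg_90k_part1_html_cleaned_lang_first_sampled.json
  done
```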