

About

This directory contains a script for running benchmarks (including energy consumption) on models that are hosted on a dedicated inference server. The script is adapted from vLLM.

The current script supports TGI and vLLM. Before running the benchmark script, make sure the inference server hosting the relevant model is up and running.
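
One way to confirm the server is reachable before starting a benchmark is to poll a health route until it responds. The following is a minimal sketch, not part of the benchmark script: `wait_for_server` is a hypothetical helper, and the `/health` path, host, and port in the usage comment are assumptions that should be checked against your TGI or vLLM deployment.

```python
import http.client
import time
import urllib.error
import urllib.request


def wait_for_server(url: str, timeout_s: float = 60.0, interval_s: float = 2.0) -> bool:
    """Poll `url` until it answers with HTTP 200 or `timeout_s` elapses."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=interval_s) as resp:
                if resp.status == 200:
                    return True
        except (urllib.error.URLError, http.client.HTTPException, OSError):
            pass  # server not reachable yet; retry after a short pause
        time.sleep(interval_s)
    return False


# Example usage (assumed address; adjust to your deployment):
# if not wait_for_server("http://localhost:8080/health", timeout_s=30.0):
#     raise SystemExit("inference server is not reachable")
```

The helper uses only the standard library, so it can run in the same environment as the benchmark without extra dependencies.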