	# Load Test

	This folder contains the files needed to load test gradio apps and check how well they handle many simultaneous connections.


	## Setup

	For a realistic "production" load test, run gradio behind nginx. Install nginx and place `nginx.conf` in `/etc/nginx/conf.d/` on the machine running gradio (nginx's stock configuration typically includes `/etc/nginx/conf.d/*.conf`). Consider running gradio on a separate machine from the one driving the load test, so that the results include the effect of network latency.

	`gradio.py` contains a simple gradio chat app that streams 500 tokens at a rate of 100 tokens/sec. This app is compatible with gradio 3.x as well.
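
A minimal sketch of the kind of streaming function such a chat app could use, assuming the usual gradio chat convention in which a generator yields the reply accumulated so far. The name `stream_reply` and the placeholder token text are hypothetical; this is not the contents of `gradio.py`.

```python
import time

NUM_TOKENS = 500       # tokens per reply, as described above
TOKENS_PER_SEC = 100   # streaming rate, as described above


def stream_reply(message, history, num_tokens=NUM_TOKENS, tokens_per_sec=TOKENS_PER_SEC):
    """Yield the reply so far after each token (hypothetical helper)."""
    partial = ""
    for i in range(num_tokens):
        time.sleep(1 / tokens_per_sec)  # pace the stream at the target rate
        partial += f"token{i} "         # placeholder token text
        yield partial


# To serve it with gradio (assumes gradio is installed):
#   import gradio as gr
#   gr.ChatInterface(stream_reply).launch()
```

At the default settings one reply takes roughly 5 seconds (500 tokens at 100 tokens/sec), which gives a fixed baseline to compare against under load.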

	`simple.py` contains a simple FastAPI app that streams 500 tokens at a rate of 100 tokens/sec, with both a WebSocket (WS) endpoint and a server-sent events (SSE) endpoint. It does not use gradio. Its purpose is to provide a baseline: the performance of raw WebSocket and SSE streaming, for comparison with the same streaming under gradio's overhead.
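
For orientation, the SSE side of such an endpoint comes down to emitting `data: ...` lines, each followed by a blank line. The sketch below shows only that framing with the standard library; the `[DONE]` sentinel and the function names are assumptions for illustration, not taken from `simple.py`.

```python
import asyncio


async def sse_events(num_tokens=500, tokens_per_sec=100):
    """Yield SSE-framed tokens at a fixed rate (hypothetical helper)."""
    for i in range(num_tokens):
        await asyncio.sleep(1 / tokens_per_sec)
        yield f"data: token{i}\n\n"   # one SSE event per token
    yield "data: [DONE]\n\n"          # assumed end-of-stream marker


async def collect(agen):
    """Drain an async generator into a list (handy for quick checks)."""
    return [event async for event in agen]
```

In a FastAPI app, a generator like this would typically be wrapped in a `StreamingResponse` with `media_type="text/event-stream"`.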

	`workers.py` contains a FastAPI app that uses queues and worker threads to stream 500 tokens at a rate of 100 tokens/sec, with both a WS and an SSE endpoint. Its purpose is to measure WebSocket and SSE streaming with a gradio-style implementation (queues and worker threads) but without the rest of gradio's overhead.
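
The queue-and-worker pattern can be sketched with the standard library alone. This is one possible arrangement under stated assumptions (a shared job queue, one output queue per request, a sentinel object marking the end of a stream); every name below is hypothetical, and it does not reproduce `workers.py` or gradio's internals.

```python
import queue
import threading
import time

_DONE = object()  # sentinel marking the end of one request's stream


def worker(jobs):
    """Pull jobs off the shared queue and stream tokens into each job's output queue."""
    while True:
        job = jobs.get()
        if job is None:  # shutdown signal
            break
        out, num_tokens, tokens_per_sec = job
        for i in range(num_tokens):
            time.sleep(1 / tokens_per_sec)
            out.put(f"token{i}")
        out.put(_DONE)


def start_workers(jobs, n):
    """Start n daemon worker threads that all consume from the same job queue."""
    threads = [threading.Thread(target=worker, args=(jobs,), daemon=True) for _ in range(n)]
    for t in threads:
        t.start()
    return threads


def submit(jobs, num_tokens=500, tokens_per_sec=100):
    """Enqueue one request and return the queue its tokens will arrive on."""
    out = queue.Queue()
    jobs.put((out, num_tokens, tokens_per_sec))
    return out


def drain(out):
    """Yield tokens for one request until the sentinel arrives."""
    while (item := out.get()) is not _DONE:
        yield item
```

With a fixed number of workers, requests beyond that number wait in the job queue, so the time to first token grows with the number of concurrent clients. That waiting is the effect a load test of this design would surface.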


	`load.ipynb` runs load tests against `chat.py` with gradio 3.x and 4.0, as well as against `app.py`. Set the URL in the notebook to point to where the app is running.
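
The core of a load test like this is running many streaming clients concurrently and timing each one. The sketch below shows that shape with `asyncio`; `fake_client` is a stand-in that only simulates a stream, and all names are hypothetical rather than taken from `load.ipynb`. A real run would replace `fake_client` with a WebSocket or SSE client pointed at the configured URL.

```python
import asyncio
import statistics
import time


async def fake_client(num_tokens=500, tokens_per_sec=100):
    """Stand-in for a real streaming client; yields tokens at a fixed rate."""
    for i in range(num_tokens):
        await asyncio.sleep(1 / tokens_per_sec)
        yield f"token{i}"


async def run_one(make_stream):
    """Consume one stream, recording time to first token and total time."""
    start = time.perf_counter()
    first = None
    count = 0
    async for _ in make_stream():
        if first is None:
            first = time.perf_counter() - start
        count += 1
    return {"first_token_s": first, "total_s": time.perf_counter() - start, "tokens": count}


async def load_test(make_stream, concurrency):
    """Run `concurrency` clients at once and collect their timings."""
    return await asyncio.gather(*(run_one(make_stream) for _ in range(concurrency)))


def summarize(results):
    """Reduce per-client timings to a few headline numbers."""
    totals = [r["total_s"] for r in results]
    return {
        "clients": len(results),
        "mean_total_s": statistics.mean(totals),
        "max_total_s": max(totals),
    }
```

For example, `summarize(asyncio.run(load_test(fake_client, concurrency=50)))` simulates 50 simultaneous clients. Comparing time to first token and total time across concurrency levels shows where queueing starts to dominate.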