⚡ WebGPU Benchmark Results (183.84x speedup)
#50, opened by omaryshchenko
| Batch Size | WASM (int8) | WASM (fp16) | WASM (fp32) | WebGPU (int8) | WebGPU (fp16) | WebGPU (fp32) |
| --- | --- | --- | --- | --- | --- | --- |
| 1 | 298.30 | 481.00 | 460.60 | 340.80 | 33.50 | 52.10 |
| 2 | 609.80 | 978.90 | 928.30 | 679.30 | 88.70 | 115.90 |
| 4 | 1209.90 | 1873.70 | 1763.80 | 1257.70 | 124.00 | 34.30 |
| 8 | 2425.40 | 3794.40 | 3536.90 | 2285.10 | 152.10 | 190.00 |
| 16 | 4977.40 | 7933.20 | 7318.60 | 3498.40 | 207.40 | 88.00 |
| 32 | 10469.90 | 19541.80 | 15913.80 | 7471.70 | 370.30 | 106.30 |
- Model: Xenova/all-MiniLM-L6-v2
- Tests run: WASM (int8), WASM (fp16), WASM (fp32), WebGPU (int8), WebGPU (fp16), WebGPU (fp32)
- Sequence length: 512
- Browser: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/122.0.0.0 Safari/537.36
- GPU: vendor=nvidia, architecture=ampere, device=, description=
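
The post does not say how the 183.84x in the title was derived. One reading that is consistent with the table is the slowest WASM timing divided by the fastest WebGPU timing at batch size 32 (19541.80 / 106.30). The sketch below reproduces that arithmetic; `timings` and `speedupAt` are hypothetical names introduced for illustration, and the values are copied from the table (the unit is not stated in the post, but the ratio is unit-free).

```typescript
// Timings copied from the table, keyed by batch size.
// Column order: WASM int8, WASM fp16, WASM fp32, WebGPU int8, WebGPU fp16, WebGPU fp32.
const timings: Record<number, number[]> = {
  1: [298.30, 481.00, 460.60, 340.80, 33.50, 52.10],
  2: [609.80, 978.90, 928.30, 679.30, 88.70, 115.90],
  4: [1209.90, 1873.70, 1763.80, 1257.70, 124.00, 34.30],
  8: [2425.40, 3794.40, 3536.90, 2285.10, 152.10, 190.00],
  16: [4977.40, 7933.20, 7318.60, 3498.40, 207.40, 88.00],
  32: [10469.90, 19541.80, 15913.80, 7471.70, 370.30, 106.30],
};

// Hypothetical helper (an assumption, not taken from the benchmark's source):
// ratio of the slowest WASM run to the fastest WebGPU run for one batch size.
function speedupAt(batchSize: number): number {
  const row = timings[batchSize];
  if (!row) {
    throw new Error(`no data for batch size ${batchSize}`);
  }
  const slowestWasm = Math.max(...row.slice(0, 3));
  const fastestWebgpu = Math.min(...row.slice(3));
  return slowestWasm / fastestWebgpu;
}

console.log(speedupAt(32).toFixed(2)); // 183.84
```

Under this definition the ratio compares different precisions (WASM fp16 against WebGPU fp32 at batch size 32), so it is a best-case figure rather than a like-for-like comparison.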