⚡ WebGPU Benchmark Results (39.03x speedup) – M1 Max
#48, opened by pcuenq (HF staff)
| Batch Size | WASM (int8) | WASM (fp16) | WASM (fp32) | WebGPU (int8) | WebGPU (fp16) | WebGPU (fp32) |
|---|---|---|---|---|---|---|
| 1 | 372.50 | 392.80 | 375.00 | 358.40 | 21.00 | 16.30 |
| 2 | 743.50 | 782.90 | 749.90 | 654.00 | 54.00 | 26.50 |
| 4 | 1482.00 | 1561.00 | 1510.40 | 1235.70 | 46.10 | 45.90 |
| 8 | 3009.80 | 3164.60 | 3049.50 | 2440.60 | 108.40 | 77.20 |
| 16 | 6134.60 | 6451.50 | 6127.90 | 4888.50 | 146.80 | 156.30 |
| 32 | 12237.00 | 13093.60 | 12447.60 | 10082.60 | 335.50 | 343.60 |
- Model: Xenova/all-MiniLM-L6-v2
- Tests run: WASM (int8), WASM (fp16), WASM (fp32), WebGPU (int8), WebGPU (fp16), WebGPU (fp32)
- Sequence length: 512
- Browser: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/122.0.0.0 Safari/537.36
- GPU: vendor=apple, architecture=common-3, device=, description=
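
For anyone who wants to check the numbers: the headline 39.03x appears to match the fp16 columns at batch size 32 (WASM time divided by WebGPU time). That is my reading of the table, not something the benchmark states. A minimal Python sketch that recomputes the ratio for every dtype and batch size, with the timings copied from the table and a hypothetical helper name `speedups`:

```python
# Timings copied verbatim from the results table (units as reported by the benchmark).
BATCH_SIZES = [1, 2, 4, 8, 16, 32]

WASM = {
    "int8": [372.50, 743.50, 1482.00, 3009.80, 6134.60, 12237.00],
    "fp16": [392.80, 782.90, 1561.00, 3164.60, 6451.50, 13093.60],
    "fp32": [375.00, 749.90, 1510.40, 3049.50, 6127.90, 12447.60],
}

WEBGPU = {
    "int8": [358.40, 654.00, 1235.70, 2440.60, 4888.50, 10082.60],
    "fp16": [21.00, 54.00, 46.10, 108.40, 146.80, 335.50],
    "fp32": [16.30, 26.50, 45.90, 77.20, 156.30, 343.60],
}


def speedups(dtype):
    """Return {batch_size: WASM time / WebGPU time} for one dtype.

    Assumption: "speedup" means the same-dtype ratio at the same batch size.
    """
    return {
        batch: wasm / webgpu
        for batch, wasm, webgpu in zip(BATCH_SIZES, WASM[dtype], WEBGPU[dtype])
    }


if __name__ == "__main__":
    for dtype in ("int8", "fp16", "fp32"):
        row = ", ".join(f"bs={b}: {s:.2f}x" for b, s in speedups(dtype).items())
        print(f"{dtype}: {row}")
```

Under that assumption, fp16 at batch size 32 reproduces the title figure and fp32 lands close behind it, while int8 on WebGPU is only about 1.2x faster than WASM on this machine.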