---
title: TensorBend
emoji:
colorFrom: gray
colorTo: green
sdk: static
pinned: false
---

# TensorBend

Run LLMs entirely in your browser. Model weights are fetched as raw SafeTensors files from the HuggingFace Hub and loaded directly into WebGPU compute buffers. No ONNX export step, no server.
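To illustrate why SafeTensors can be streamed straight into GPU buffers: the format is an 8-byte little-endian length prefix followed by a JSON header that maps each tensor name to its dtype, shape, and byte offsets in the data section, so each weight is a contiguous slice ready for upload. A minimal sketch of the header parse (the function name `parseSafetensorsHeader` is ours, not part of this repo's API):

```javascript
// Parse a SafeTensors buffer: 8-byte LE u64 header length, then a JSON
// header of { tensorName: { dtype, shape, data_offsets } } entries whose
// offsets are relative to the start of the data section.
function parseSafetensorsHeader(buffer) {
  const view = new DataView(buffer);
  const headerLen = Number(view.getBigUint64(0, true)); // little-endian u64
  const jsonBytes = new Uint8Array(buffer, 8, headerLen);
  const header = JSON.parse(new TextDecoder().decode(jsonBytes));
  delete header.__metadata__; // optional metadata entry, not a tensor
  return { header, dataStart: 8 + headerLen };
}
```

In the browser, each tensor's byte range can then be copied into a `GPUBuffer` created with `mappedAtCreation: true`, with no intermediate format conversion.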

**Requires:** Chrome or Edge with WebGPU support (macOS, Windows, ChromeOS). Apple Silicon is recommended for best performance.
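Since WebGPU availability varies by browser and platform, it is worth feature-detecting before attempting to load weights. A hedged sketch (the helper name and the injected `nav` parameter are ours; in a page you would pass `globalThis.navigator`):

```javascript
// Return a GPUDevice if WebGPU is available, else null. The navigator-like
// object is injected as a parameter so the check can also be exercised
// outside a browser.
async function getWebGPUDevice(nav) {
  if (!nav || !nav.gpu) return null; // browser has no WebGPU at all
  const adapter = await nav.gpu.requestAdapter();
  if (!adapter) return null; // WebGPU present but no usable adapter
  return adapter.requestDevice();
}
```

On unsupported browsers this resolves to `null`, which lets the app show a clear "WebGPU required" message instead of failing mid-download.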

**Supported models:** Qwen3.5 family (0.8B, 2B, 4B, 9B), INT4-quantized via AutoRound.