TokForge — SD1.5 NPU "Fast" (DreamShaper-7, Qualcomm Hexagon)

Stable Diffusion 1.5 for the Qualcomm Hexagon NPU (HTP), packaged for on-device image generation in the TokForge Android app (dev.tokforge). This is the "Fast" tier, the quickest on-device image route: a coherent 512×512 image in roughly 9–16 s, with no root required.

The shipping checkpoint is DreamShaper-7 (Lykon/dreamshaper-7), a license-clean (CreativeML-OpenRAIL-M) SD1.5 finetune. It is a quality and composition upgrade over base Stable Diffusion 1.5, with stronger aesthetics and better prompt-following on compositional prompts (for example, on "two robots playing chess" base SD1.5 drops a robot, while DreamShaper renders both). It shares the SD1.5 architecture and the temb NPU IO contract with base SD1.5; only the UNet, the VAE decoder, and the DreamShaper-specific time-embedding MLP weights differ. The CPU CLIP front-end is bit-identical to base SD1.5 and is reused.

The model is quantized to W8A16 (8-bit weights, 16-bit activations) and compiled to QNN HTP context binaries that run image generation directly on the phone's Hexagon DSP.
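To make the W8A16 label concrete, here is a minimal sketch of symmetric per-tensor quantization at two bit widths. It is an illustration only: the quantization scheme actually used by the QNN toolchain for these binaries is not described in this card, and the function names `quantize_symmetric` and `dequantize` are hypothetical.

```python
def quantize_symmetric(values, bits):
    """Symmetric per-tensor quantization to signed `bits`-bit integers.

    Illustrative assumption, not the QNN toolchain's actual scheme.
    Returns (integers, scale) such that value ~= integer * scale.
    """
    qmax = 2 ** (bits - 1) - 1  # 127 for 8 bits, 32767 for 16 bits
    max_abs = max(abs(v) for v in values)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    # Round to the nearest integer step and clamp to the signed range.
    quantized = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    return quantized, scale


def dequantize(quantized, scale):
    """Map quantized integers back to approximate float values."""
    return [q * scale for q in quantized]


# Same sample tensor quantized as "weights" (8-bit) and "activations" (16-bit).
sample = [0.5, -1.0, 0.25]
w_q, w_scale = quantize_symmetric(sample, bits=8)
a_q, a_scale = quantize_symmetric(sample, bits=16)
```

The 16-bit scale is much finer than the 8-bit one, which is why a W8A16 layout keeps weights compact while leaving more precision for activations.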

Based on

Lykon/dreamshaper-7 — itself an SD1.5 finetune of stable-diffusion-v1-5/stable-diffusion-v1-5.

Format

These are per-architecture QNN HTP context binaries, one set per Hexagon arch (V73, V75, V79, V81). They are not a portable format like GGUF — each binary is compiled for a specific Hexagon generation. The app reads the device's Hexagon arch and selects the matching set.

Binaries are forward-compatible: a set built for a lower Hexagon arch also runs on a higher-arch DSP, while native-arch sets are preferred for best performance.
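The selection rule described above (prefer the native-arch set, otherwise fall back to a set built for a lower arch) can be sketched as follows. This is a hedged illustration, not the app's actual code: the helper `select_arch_set` and the lowercase `v73`-style names are assumptions made for the example.

```python
# Hexagon arch sets listed in this repo, ordered oldest to newest.
KNOWN_ARCHS = ["v73", "v75", "v79", "v81"]


def select_arch_set(device_arch, available_sets):
    """Pick the binary set to use for a device (hypothetical helper).

    Prefers the native-arch set; otherwise falls back to the newest set
    built for a lower arch, relying on forward compatibility.
    Returns None when no compatible set exists.
    """
    def arch_number(name):
        # "v75" or "V75" -> 75
        return int(name.lower().lstrip("v"))

    device_num = arch_number(device_arch)
    # A set is usable only if it targets the device arch or an older one.
    compatible = [a for a in available_sets if arch_number(a) <= device_num]
    if not compatible:
        return None
    # The highest compatible arch is the native set whenever it is present.
    return max(compatible, key=arch_number)
```

For example, a V79 device offered only `v73` and `v75` sets would get `v75`, while a V73 device offered only a `v75` set would get nothing, since a set built for a higher arch is not expected to run on a lower one.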

The DreamShaper-7 "Fast" bins live under the dreamshaper7/ directory, which has its own manifest.json; this is the variant the app downloads. The repo also retains the original base SD1.5 ours-temb sets (top-level v73/…v81/ plus the root manifest.json) and the AI-Hub off-the-shelf SD1.5 sets for breadth.

| File (per `dreamshaper7/<arch>/` dir) | Role |
| --- | --- |
| `unet.bin` | UNet HTP context binary (DreamShaper-7 W8A16) |
| `vae_decoder.bin` | VAE decoder HTP context binary |
| `text_encoder.bin` | CLIP text-encoder QNN binary |
| `time_mlp.bin` | host time-embedding weights (DreamShaper-specific) |
| `tokenizer.json`, `config.json` | tokenizer + pipeline config |

The arch-independent CPU CLIP front-end (clip_sd15_base/clip_v2.mnn, token_emb.bin, pos_emb.bin) is shared by every arch and reused across the base + DreamShaper variants.

See dreamshaper7/manifest.json for the authoritative per-arch file set (with per-file size + md5) that the app uses to download the correct binaries for the device.
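Because the manifest records a size and md5 for each file, a download can be checked before use. The sketch below shows one way to do that check. The entry shape `{"size": ..., "md5": ...}` and the helper names are assumptions for illustration; the real manifest.json schema may differ.

```python
import hashlib
import os


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to bound memory."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, entry):
    """Check a downloaded file against a manifest entry (hypothetical schema).

    `entry` is assumed to look like {"size": <bytes>, "md5": "<hex digest>"}.
    """
    # Cheap size check first; hash only when the size already matches.
    if os.path.getsize(path) != entry["size"]:
        return False
    return md5_of_file(path) == entry["md5"].lower()
```

Checking the size before hashing avoids reading a large context binary in full when the download was obviously truncated.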

Usage

This bundle is loaded automatically by the TokForge Android app; it is not a standalone diffusers checkpoint. The app detects the device's Hexagon arch, looks up the matching file set in the manifest, downloads those binaries, and runs them on the device NPU.

License & attribution

Released under CreativeML OpenRAIL-M, matching its base models.

This model is a derivative of Lykon/dreamshaper-7, itself a finetune of stable-diffusion-v1-5/stable-diffusion-v1-5. Please retain this attribution and observe the CreativeML OpenRAIL-M use restrictions.
