How to use SceneWorks/lens-mlx with MLX:

    # Download the model from the Hub
    pip install "huggingface_hub[hf_xet]"
    huggingface-cli download --local-dir lens-mlx SceneWorks/lens-mlx
Lens (base): MLX pre-quantized tiers (SceneWorks)
Native-MLX, pre-quantized re-host of the base Lens model (microsoft/Lens, MIT)
for on-device Apple-Silicon inference via mlx-gen's
mlx-gen-lens provider (SceneWorks). The heavy components are packed offline so a tier loads
directly with no dense transient and no in-app quantization (epic 8506, sc-8767).
Microsoft removed microsoft/Lens from the Hub; the base DiT here was recovered from the public
ungated re-package Comfy-Org/Lens
(diffusion_models/lens_bf16.safetensors), whose keys are byte-identical to the diffusers
LensTransformer2DModel state dict. Base Lens and Lens-Turbo differ only in the DiT weights;
this re-host reuses the shared gpt-oss-20b text encoder + Flux.2 VAE + tokenizer + scheduler
from SceneWorks/lens-turbo-mlx.
Base Lens is undistilled: use a higher step count (~20–26) with CFG ~5.0 (the mlx-gen-lens
lens id defaults to 20 steps / CFG 5.0), unlike the distilled Turbo (4 steps / guidance 1.0).
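
The difference in sampler settings can be made concrete with a small sketch. It assumes the standard classifier-free guidance combination and assumes that a guidance scale of exactly 1.0 lets the sampler skip the unconditional pass; the source does not state how mlx-gen-lens implements either, and `cfg_combine` and `denoiser_calls` are illustrative names, not part of its API.

```python
def cfg_combine(uncond, cond, scale):
    """Standard classifier-free guidance: uncond + scale * (cond - uncond)."""
    return [u + scale * (c - u) for u, c in zip(uncond, cond)]


def denoiser_calls(steps, scale):
    """Rough count of DiT forward passes for one image.

    Assumption: at scale == 1.0 the combination collapses to `cond`,
    so only the conditional pass is needed.
    """
    passes_per_step = 1 if scale == 1.0 else 2
    return steps * passes_per_step


base_calls = denoiser_calls(steps=20, scale=5.0)   # base Lens defaults
turbo_calls = denoiser_calls(steps=4, scale=1.0)   # Lens-Turbo defaults
```

Under these assumptions the base model costs about 40 DiT passes per image against 4 for Turbo, which is the practical price of using the undistilled weights.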
Tiers
Each subdirectory is a full, self-contained turnkey snapshot of the diffusers multi-component tree
(transformer/, text_encoder/, vae/, tokenizer/, scheduler/, model_index.json):

| Tier | Dir | What is packed |
|---|---|---|
| Q4 (default) | q4/ | DiT + gpt-oss encoder MoE experts as MLX group-64 affine 4-bit |
| Q8 | q8/ | DiT + gpt-oss encoder MoE experts as MLX group-64 affine 8-bit |
| bf16 | bf16/ | dense mirror of the source (no quantization) |
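
Because every tier is self-contained, it should be enough to fetch a single subdirectory. The sketch below is one way to do that with `huggingface_hub.snapshot_download` and its `allow_patterns` filter, followed by a sanity check of the layout described above. The helper names are hypothetical, and the `"<tier>/*"` pattern is assumed to match nested files inside the tier directory.

```python
from pathlib import Path

REPO_ID = "SceneWorks/lens-mlx"
TIERS = ("q4", "q8", "bf16")
# Entries each tier directory is described as containing.
EXPECTED = ("transformer", "text_encoder", "vae", "tokenizer", "scheduler", "model_index.json")


def tier_patterns(tier):
    """Glob patterns that restrict a Hub download to a single tier directory."""
    if tier not in TIERS:
        raise ValueError(f"unknown tier {tier!r}; expected one of {TIERS}")
    return [f"{tier}/*"]


def missing_entries(tier_dir):
    """Return the expected entries that are absent from a local tier directory."""
    root = Path(tier_dir)
    return [name for name in EXPECTED if not (root / name).exists()]


def download_tier(tier, local_dir="lens-mlx"):
    """Fetch one tier with huggingface_hub (needs network access)."""
    from huggingface_hub import snapshot_download  # imported lazily on purpose

    snapshot_download(REPO_ID, allow_patterns=tier_patterns(tier), local_dir=local_dir)
    return Path(local_dir) / tier
```

A typical use would be `missing_entries(download_tier("q4"))`, which should come back empty if the snapshot is complete.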
Two components are quantized (matching the load-time .quantize scope):
- DiT: img_in/txt_in/proj_out + every block's fused-QKV attention projections (img_qkv/txt_qkv/to_out.0/to_add_out) and SwiGLU MLPs. The timestep embedder, AdaLN modulations, and all norms stay full precision.
- gpt-oss-20b encoder MoE experts: the source ships these as MXFP4; the packed tiers store them as MLX group-64 affine Q4/Q8 (stacked experts.{gate_up,down}_proj.{weight,scales,biases}). The router / attention / embeddings / norms stay dense.
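
To illustrate what "group-64 affine" means and why each packed tensor carries `weight`, `scales`, and `biases`, here is a pure-Python reference sketch. It follows the affine scheme described in the MLX `mx.quantize` documentation (each group of 64 values is stored as integer codes plus one scale and one bias), but it is not the MLX kernel: bit-packing of the codes is omitted, the function names are illustrative, and the 16-bit size of scale and bias is an assumption.

```python
def quantize_group(values, bits):
    """Affine-quantize one group: value ~= scale * code + bias."""
    levels = (1 << bits) - 1
    lo, hi = min(values), max(values)
    scale = (hi - lo) / levels if hi > lo else 1.0
    codes = [round((v - lo) / scale) for v in values]
    return codes, scale, lo


def dequantize_group(codes, scale, bias):
    """Reconstruct approximate values from codes, scale, and bias."""
    return [scale * c + bias for c in codes]


def quantize_row(row, bits, group_size=64):
    """Split a row into groups and quantize each with its own scale and bias."""
    if len(row) % group_size:
        raise ValueError("row length must be a multiple of group_size")
    return [quantize_group(row[i:i + group_size], bits)
            for i in range(0, len(row), group_size)]


def bits_per_weight(bits, group_size=64, param_bits=16):
    """Storage per weight, assuming one 16-bit scale and bias per group."""
    return bits + 2 * param_bits / group_size


row = [i / 127 - 0.5 for i in range(128)]   # two groups of 64
groups = quantize_row(row, bits=4)
restored = [v for g in groups for v in dequantize_group(*g)]
max_error = max(abs(a - b) for a, b in zip(row, restored))
```

Under the 16-bit assumption, the per-group overhead works out to 0.5 bits per weight, so the Q4 and Q8 tiers cost roughly 4.5 and 8.5 bits per quantized weight rather than exactly 4 and 8.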
The VAE (the shared Flux.2 decoder) always runs in f32 and is shipped dense in every tier.
The pack is byte-identical to what the load-time quantizer produces (bf16 cast, group 64), verified
in-repo (mlx-gen-lens convert/quant byte-identity tests) and by an on-device render gate.
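
A byte-identity check of this kind can be sketched as a per-tensor digest comparison. This is not the repository's actual test; the function names are hypothetical, and both inputs are assumed to be dicts mapping tensor names to their raw bytes (for example, read from the packed file and from the load-time quantizer's output).

```python
import hashlib


def tensor_digests(tensors):
    """Map each tensor name to the SHA-256 of its raw bytes."""
    return {name: hashlib.sha256(data).hexdigest() for name, data in tensors.items()}


def byte_mismatches(packed, reference):
    """Names that are missing on either side or whose bytes differ."""
    a, b = tensor_digests(packed), tensor_digests(reference)
    return sorted(name for name in a.keys() | b.keys() if a.get(name) != b.get(name))
```

An empty result means every tensor matches byte for byte; any returned name points at the tensor to inspect.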
License
MIT, inherited from microsoft/Lens. The shared text encoder is openai/gpt-oss-20b (Apache-2.0)
and the VAE is black-forest-labs/FLUX.2-dev (Apache-2.0). This is a format re-host; all model
weights and credit belong to the original authors (Microsoft Research; OpenAI; Black Forest Labs).