SwinIR: Image Restoration (ONNX)

ONNX exports of SwinIR (Swin Transformer for Image Restoration). Two variants cover the two most common SwinIR use cases: real-world super-resolution and color denoising.

Re-exported from upstream PyTorch weights. Provenance trail: Liang et al. → JingyunLiang/SwinIR (cloned source) + pinned .pth checkpoints from the v0.0 GitHub release → torch.onnx.export (one pass per variant) → these files.

Toolchain: torch 2.4.x (CUDA 12.4), timm latest, onnx latest, onnxruntime>=1.17, opset 17, do_constant_folding=True, dynamo=False (forces the legacy TorchScript-based exporter; SwinIR's .type_as() buffer coercions trip the dynamo path's name-lineage tracking on torch >=2.5). Full conversion script: scripts/export-swinir.ps1 in the DatumIngest repo.
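The settings above map onto a torch.onnx.export call roughly as follows. This is a minimal sketch, not the actual scripts/export-swinir.ps1: the helper names export_kwargs and export_variant are hypothetical, model is assumed to be an already-constructed SwinIR module with its checkpoint loaded, and the dynamo keyword is passed only when the installed torch accepts it.

```python
import inspect


def export_kwargs(output_name):
    """Exporter settings from the toolchain note (hypothetical helper)."""
    return {
        "input_names": ["image"],
        "output_names": [output_name],
        "opset_version": 17,
        "do_constant_folding": True,
        # Only the batch axis is dynamic; H and W stay fixed.
        "dynamic_axes": {"image": {0: "batch"}, output_name: {0: "batch"}},
    }


def export_variant(model, size, output_name, path):
    """Export one variant using a fixed size x size dummy input."""
    import torch  # imported lazily so export_kwargs works without torch

    kwargs = export_kwargs(output_name)
    # Assumption: not every torch release has a `dynamo` keyword, so it is
    # set to False (legacy exporter) only when the signature exposes it.
    if "dynamo" in inspect.signature(torch.onnx.export).parameters:
        kwargs["dynamo"] = False

    dummy = torch.rand(1, 3, size, size)  # NCHW float32 in [0, 1]
    model.eval()
    with torch.no_grad():
        torch.onnx.export(model, (dummy,), path, **kwargs)
```

Under these assumptions, the denoiser would be exported with export_variant(model, 128, "denoised", "swinir_denoising_color_25.onnx") and the SR model with size 64 and output name "upscaled".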

Credit: Jingyun Liang, Jiezhang Cao, Guolei Sun, Kai Zhang, Luc Van Gool, Radu Timofte (ETH Zurich and collaborators). Paper: "SwinIR: Image Restoration Using Swin Transformer", ICCV 2021.

What this repo contains

| File | Variant | Input → Output | Use |
| --- | --- | --- | --- |
| swinir_realsr_x4.onnx | SwinIR-L real-SR (4×) | 64×64 RGB → 256×256 RGB | Real-world image super-resolution (handles compression artifacts, sensor noise, mild blur as a side effect). ~110 MB. |
| swinir_denoising_color_25.onnx | SwinIR-M color DN | 128×128 RGB → 128×128 RGB | Color denoising at Gaussian noise σ=25, the standard denoising-benchmark reference. ~45 MB. |

Both files share the same general I/O signature (NCHW float32 RGB in [0, 1]); only the spatial dims differ.

Input / output

| | swinir_realsr_x4.onnx | swinir_denoising_color_25.onnx |
| --- | --- | --- |
| Input name | image | image |
| Input shape | [batch, 3, 64, 64] | [batch, 3, 128, 128] |
| Input dtype | float32 | float32 |
| Input range | [0, 1] RGB | [0, 1] RGB |
| Output name | upscaled | denoised |
| Output shape | [batch, 3, 256, 256] | [batch, 3, 128, 128] |
| Dynamic axes | batch only | batch only |
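
A small pre-flight check against the table above can catch layout mistakes (HWC instead of NCHW, uint8 instead of float32) before they reach ONNX Runtime. The sketch below is illustrative only: EXPECTED_HW and check_input are hypothetical names, not part of this repo.

```python
import numpy as np

# Fixed spatial size per file, taken from the input-shape row of the table.
EXPECTED_HW = {
    "swinir_realsr_x4.onnx": 64,
    "swinir_denoising_color_25.onnx": 128,
}


def check_input(arr, model_file):
    """Raise ValueError if arr does not match the documented input signature."""
    size = EXPECTED_HW[model_file]
    if arr.dtype != np.float32:
        raise ValueError(f"expected float32, got {arr.dtype}")
    if arr.ndim != 4 or arr.shape[1:] != (3, size, size):
        raise ValueError(
            f"expected [batch, 3, {size}, {size}], got {list(arr.shape)}"
        )
    if arr.size and (arr.min() < 0.0 or arr.max() > 1.0):
        raise ValueError("expected values in [0, 1]")
    return arr
```

Calling check_input(arr, "swinir_denoising_color_25.onnx") just before sess.run returns the array unchanged when it matches, so it can be dropped into an existing pipeline.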

Spatial dims are fixed by design: SwinIR's windowed attention is brittle under dynamic H/W in ONNX Runtime's window-shift op. To process larger images, tile the input into 64×64 (SR) or 128×128 (DN) patches with some overlap, run inference per tile, and stitch the outputs.
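
The tile, run, and stitch loop can be sketched in plain NumPy. This is one simple way to do it, not necessarily what the upstream script does: tile_starts and run_tiled are hypothetical helpers, the 8-pixel overlap is an arbitrary default, overlapping regions are blended by uniform averaging, and the image is assumed to be at least one tile in each dimension.

```python
import numpy as np


def tile_starts(length, tile, overlap):
    """Tile start offsets covering [0, length); the last tile ends at length."""
    if length <= tile:
        return [0]
    stride = tile - overlap
    starts = list(range(0, length - tile, stride))
    starts.append(length - tile)  # final tile is flush with the edge
    return starts


def run_tiled(image, run_tile, tile, scale=1, overlap=8):
    """Apply run_tile to overlapping tiles of a 1x3xHxW image and average overlaps.

    run_tile maps a 1x3 x tile x tile float32 array to
    1x3 x (tile*scale) x (tile*scale).
    """
    _, channels, h, w = image.shape
    if h < tile or w < tile:
        raise ValueError("image is smaller than one tile; pad it first")
    out = np.zeros((1, channels, h * scale, w * scale), dtype=np.float32)
    weight = np.zeros_like(out)
    t = tile * scale
    for y in tile_starts(h, tile, overlap):
        for x in tile_starts(w, tile, overlap):
            patch = np.ascontiguousarray(image[:, :, y:y + tile, x:x + tile])
            ys, xs = y * scale, x * scale
            out[:, :, ys:ys + t, xs:xs + t] += run_tile(patch)
            weight[:, :, ys:ys + t, xs:xs + t] += 1.0
    return out / weight  # every pixel is covered at least once
```

With a loaded session, the SR model would be driven as run_tiled(arr, lambda p: sess.run(None, {"image": p})[0], tile=64, scale=4) and the denoiser with tile=128 and the default scale=1. Uniform averaging keeps the sketch short; a feathered weight window may hide seams better.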

How to use

import onnxruntime as ort
import numpy as np
from PIL import Image

# Pick the variant
sess = ort.InferenceSession("swinir_denoising_color_25.onnx")
# or:
# sess = ort.InferenceSession("swinir_realsr_x4.onnx")

img = Image.open("noisy.jpg").convert("RGB").resize((128, 128))  # use (64, 64) for the SR model
arr = np.asarray(img, dtype=np.float32) / 255.0          # HWC, [0,1]
arr = np.ascontiguousarray(arr.transpose(2, 0, 1)[None, ...])  # 1x3xHxW

result = sess.run(None, {"image": arr})[0][0]            # 3xHxW, first batch item
result = np.clip(result, 0.0, 1.0).transpose(1, 2, 0)    # back to HWC
result_img = Image.fromarray(np.round(result * 255.0).astype(np.uint8))  # round, don't truncate

For larger images, see the upstream main_test_swinir.py for a reference tiling implementation.

Which one should I use?

  • swinir_denoising_color_25.onnx: when you specifically want the Gaussian σ=25 reference denoiser (research papers, benchmark reproduction, comparing against other denoisers).
  • swinir_realsr_x4.onnx: when you want 4× super-resolution on real-world photos and don't mind that it will also clean up some noise and compression artifacts in the process.

For blind real-world denoising (unknown noise level), SCUNet is the better fit: SwinIR's noise25 variant is trained for a specific noise level and degrades when the input noise pattern differs.

License

Apache-2.0, same as the upstream JingyunLiang/SwinIR repo. LICENSE file included.
