Text Super-Resolution & Restoration GGUF Models

Lightweight super-resolution and image restoration models converted to GGUF for CrispEmbed OCR preprocessing.

Models

| File | Architecture | Params | Scale | Size | License | Paper |
|---|---|---|---|---|---|---|
| tbsrn-telescope-f16.gguf | TBSRN (text-line SR) | 1.13M | 2x | 2.2 MB | Apache-2.0 | CVPR 2021 |
| pan-x4-f16.gguf | PAN (pixel attention) | 272K | 4x | 0.5 MB | Apache-2.0 | ECCV 2020W |
| hat-sr-x4-f16.gguf | HAT (hybrid attention transformer) | 21M | 4x | 40 MB | MIT | CVPR 2023 |
| dat-light-x2-f16.gguf | DAT-light (dual aggregation transformer) | 830K | 2x | 38 MB | Apache-2.0 | ICCV 2023 |
| restormer-denoise-f16.gguf | Restormer (denoising) | 26M | 1x | 50 MB | Apache-2.0 | CVPR 2022 |

TBSRN Telescope (text-line SR)

  • Task: Enhance individual detected text lines before recognition
  • Input: Text-line crop resized to 16x64 -> Output: 32x128 (2x)
  • Source: PaddleOCR sr_telescope (Apache-2.0)
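
The fixed input size above implies a resize step before the model runs. The following is a minimal sketch of that step, not CrispEmbed's actual preprocessing: it assumes "16x64" means height 16 and width 64, works on a flat row-major grayscale buffer, and uses nearest-neighbor sampling purely for brevity (the card does not say which interpolation is used). The helper name `resize_nearest` is hypothetical.

```python
def resize_nearest(pixels, src_w, src_h, dst_w=64, dst_h=16):
    """Resize a row-major grayscale image (flat list) with nearest-neighbor sampling.

    The 64x16 default (width x height) is an assumption about how the card's
    "16x64" input size should be read.
    """
    if len(pixels) != src_w * src_h:
        raise ValueError("pixel buffer does not match src_w * src_h")
    out = []
    for y in range(dst_h):
        sy = min(src_h - 1, y * src_h // dst_h)  # nearest source row
        for x in range(dst_w):
            sx = min(src_w - 1, x * src_w // dst_w)  # nearest source column
            out.append(pixels[sy * src_w + sx])
    return out


# Hypothetical crop: height 4, width 8, filled with a horizontal ramp 0..7.
crop = [x for _ in range(4) for x in range(8)]
line = resize_nearest(crop, src_w=8, src_h=4)

# A 2x model would then produce a 128x32 (width x height) output.
sr_w, sr_h = 64 * 2, 16 * 2
```

In practice a smoother filter (bilinear or bicubic) from an imaging library would likely be preferable; the point here is only the fixed target shape.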

PAN (whole-image 4x SR)

  • Task: Upscale full document pages (helps recover text from low-resolution scans, e.g. 75 dpi)
  • Input: Any RGB image (tiled) -> Output: 4x upscale
  • Source: PaddleGAN pan_x4 (Apache-2.0)
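
The whole-image models on this card are described as tiled, but the tiling scheme itself is not documented. As an illustration only, here is one common way to cover an image with overlapping tiles and map each tile to its place in the upscaled output. The tile size (128), overlap (16), and both function names are hypothetical choices, not CrispEmbed parameters.

```python
def tile_boxes(width, height, tile=128, overlap=16):
    """Return (x0, y0, x1, y1) boxes that cover the image with overlapping tiles."""
    if overlap >= tile:
        raise ValueError("overlap must be smaller than tile")
    step = tile - overlap

    def starts(size):
        # A single tile is enough when the image is smaller than one tile.
        if size <= tile:
            return [0]
        offsets = list(range(0, size - tile, step))
        offsets.append(size - tile)  # last tile sits flush with the image edge
        return offsets

    return [
        (x, y, min(x + tile, width), min(y + tile, height))
        for y in starts(height)
        for x in starts(width)
    ]


def scaled_box(box, scale=4):
    """Map an input-space box to output-space coordinates for a given SR scale."""
    return tuple(v * scale for v in box)


boxes = tile_boxes(300, 200)
out_boxes = [scaled_box(b, scale=4) for b in boxes]
```

The overlap gives each tile some context at its borders; how the overlapping regions are blended (or cropped) after upscaling is a separate design choice not covered here.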

HAT (hybrid attention transformer, 4x SR)

  • Task: High-quality 4x upscaling (the HAT paper, CVPR 2023, reported state-of-the-art results on multiple SR benchmarks)
  • Input: Any RGB image (tiled) -> Output: 4x upscale
  • Architecture: Swin Transformer + overlapping cross-attention + channel attention
  • Source: XPixelGroup/HAT (MIT)

DAT-light (dual aggregation transformer, 2x SR)

  • Task: High-quality 2x upscaling with dual spatial+channel attention
  • Input: Any RGB image (tiled) -> Output: 2x upscale
  • Architecture: Split-channel windowed spatial attention + L2-normalized transposed channel attention + AIM + SGFN
  • Source: zhengchen1999/DAT (Apache-2.0)

Restormer (image denoising/restoration)

  • Task: Remove noise from document scans
  • Input: Any RGB image -> Output: Denoised (same size)
  • Architecture: Multi-Dconv head transposed attention, U-Net encoder-decoder
  • Source: swz30/Restormer (Apache-2.0)

Parity Verification

All models pass the CrispEmbed diff harness (Python reference vs C++ engine):

| Model | cos_sim | Status |
|---|---|---|
| TBSRN | 0.999985 | PASS |
| PAN | 0.999654 | PASS |
| HAT | 0.999990 | PASS |
| DAT-light | 0.999956 | PASS |
| Restormer | 1.000000 | PASS |
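
For readers unfamiliar with the metric, the sketch below shows how a cosine similarity between two flattened output tensors can be computed. It is not the CrispEmbed diff harness: the sample vectors are made up, and the 0.999 pass threshold is a hypothetical value chosen for illustration (the card does not state the harness's actual threshold).

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length sequences of floats."""
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Made-up stand-ins for a reference output and an engine output.
reference = [0.10, 0.50, 0.90, 0.30]
candidate = [0.10, 0.50, 0.90, 0.31]

score = cosine_similarity(reference, candidate)
passed = score >= 0.999  # hypothetical threshold, not the harness's
```

Cosine similarity ignores overall scale, so a check like this is often paired with a maximum absolute difference when exact magnitudes matter.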

Usage with CrispEmbed

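from crispembed import CrispPanSr, CrispDatSr

# pixels, width, height: the input image buffer and its dimensions, supplied by the caller.
# process() returns the output buffer together with its width and height.

# PAN: 4x upscale
sr = CrispPanSr("pan-x4-f16.gguf")
out, ow, oh = sr.process(pixels, width, height)

# DAT: 2x upscale (higher quality)
sr = CrispDatSr("dat-light-x2-f16.gguf")
out, ow, oh = sr.process(pixels, width, height)

# CLI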
crispembed --pan-model pan-x4-f16.gguf --pan-sr input.png > output.ppm
crispembed --dat-model dat-light-x2-f16.gguf --dat-sr input.png > output.ppm
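
The CLI examples redirect the result to a .ppm file. As a sketch for inspecting such a file from Python without extra dependencies, the parser below reads a binary PPM. It assumes the output is the standard P6 variant with 8-bit samples, which the card does not confirm; `parse_ppm` is a hypothetical helper, not part of CrispEmbed.

```python
def parse_ppm(data):
    """Parse a binary PPM (P6, maxval <= 255) into (width, height, rgb_bytes)."""
    tokens = []
    pos = 0
    # The header holds four whitespace-separated tokens: magic, width, height, maxval.
    while len(tokens) < 4:
        while pos < len(data) and data[pos:pos + 1].isspace():
            pos += 1
        if pos >= len(data):
            raise ValueError("truncated PPM header")
        if data[pos:pos + 1] == b"#":  # comment runs to the end of the line
            while pos < len(data) and data[pos:pos + 1] != b"\n":
                pos += 1
            continue
        start = pos
        while pos < len(data) and not data[pos:pos + 1].isspace():
            pos += 1
        tokens.append(data[start:pos])

    magic = tokens[0]
    width, height, maxval = int(tokens[1]), int(tokens[2]), int(tokens[3])
    if magic != b"P6" or not 0 < maxval <= 255:
        raise ValueError("only 8-bit binary PPM (P6) is handled in this sketch")

    pos += 1  # exactly one whitespace byte separates the header from the raster
    rgb = data[pos:pos + width * height * 3]
    if len(rgb) != width * height * 3:
        raise ValueError("truncated PPM raster")
    return width, height, rgb


# Tiny in-memory example: a 2x1 image with one red and one green pixel.
sample = b"P6\n2 1\n255\n" + bytes([255, 0, 0, 0, 255, 0])
w, h, rgb = parse_ppm(sample)
```

For a real file, the same function can be applied to the bytes read from output.ppm opened in binary mode.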

License

Apache-2.0 for all models except HAT (MIT). Both licenses are permissive.
