LUCID-CC0 v2 High Complexity (256×256): Finetuning Dataset for SISR
A high-complexity subset of lucid-cc0-v2 containing only the most detailed, information-rich tiles. Designed for finetuning pretrained SISR models to push final quality metrics.
Overview
| Property | Value |
|---|---|
| Source | lucid-cc0-v2 (filtered subset) |
| Filtering | ICNet complexity ≥ 0.85 (highest-detail tiles only) |
| Tile size | 256×256 pixels |
| Mean complexity | ~0.917 |
| Total tiles | ~271,000 |
| Disk size | ~35 GB |
| License | CC0-1.0 (public domain) |
Intended Use
This dataset is designed for finetuning SISR models that have been pretrained on the full lucid-cc0-v2 dataset. The high-complexity tiles contain the sharpest edges, finest textures, and most intricate details, which is exactly what a model needs to refine its super-resolution capabilities.
Recommended Training Strategy
| Stage | Dataset | Purpose |
|---|---|---|
| 1. Pretrain from scratch | lucid-cc0-v2 (200GB) | Learn general image representations |
| 2. Finetune | This dataset (27GB) | Refine on highest-quality tiles |
| 3. Second finetune | lucid-cc0-v2-hc-512 (512×512) | Push quality with max patch size |
Why High Complexity?
- Transformer models (HAT, SwinIR) are data-hungry but also benefit from quality-focused finetuning
- High-complexity tiles contain more high-frequency information per pixel
- Finetuning on these tiles specifically improves texture recovery and edge sharpness
- The 256×256 patch size is compatible with most SISR training frameworks
Dataset Structure
lucid-cc0-v2-hc/
├── train/
│   ├── 000/
│   │   ├── 00000.png
│   │   └── ...
│   ├── 001/
│   └── ...
├── LR/
│   ├── x2/          # Bicubic downscaled ×2
│   └── x4/          # Bicubic downscaled ×4
└── metrics.csv      # Per-image complexity statistics
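A minimal sketch of how HR tiles might be paired with their LR counterparts for training. It assumes that `LR/x2/` and `LR/x4/` mirror the subfolder and file names used under `train/`, which the tree above does not confirm; `pair_hr_lr` is a hypothetical helper name, not part of the dataset.

```python
from pathlib import Path


def pair_hr_lr(root, scale):
    """Return (hr_path, lr_path) pairs for one scale factor.

    Assumption: LR/x<scale>/ mirrors the relative paths under train/.
    HR tiles without a matching LR file are skipped.
    """
    root = Path(root)
    hr_dir = root / "train"
    lr_dir = root / "LR" / f"x{scale}"
    pairs = []
    for hr_path in sorted(hr_dir.rglob("*.png")):
        # Same relative path (e.g. 000/00000.png) under the LR folder.
        lr_path = lr_dir / hr_path.relative_to(hr_dir)
        if lr_path.is_file():
            pairs.append((hr_path, lr_path))
    return pairs
```

Calling `pair_hr_lr("lucid-cc0-v2-hc", 4)` would then give the ×4 pairs, provided the layout assumption holds. If the LR folders turn out to be flat, only the `lr_path` line needs to change.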
Filtering Criteria
Tiles were selected from lucid-cc0-v2 based on ICNet complexity score:
- Threshold: ≥ 0.85 (out of 1.0)
- Distribution: Mean 0.917, all tiles at or above 0.85
- What this captures: Images with strong edges, fine textures, intricate patterns, high local contrast
- What this excludes: Smooth gradients, blurry regions, low-detail areas, sky/water surfaces
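The threshold rule above can be sketched against `metrics.csv` as follows. The column names `filename` and `complexity` are assumptions, since the header of `metrics.csv` is not documented here, and `select_high_complexity` is a hypothetical helper. The sample rows are made-up values for illustration only.

```python
import csv
import io


def select_high_complexity(csv_file, threshold=0.85,
                           name_col="filename", score_col="complexity"):
    """Return (names, mean_score) for rows with score >= threshold.

    name_col and score_col are assumed column names; check the real
    header of metrics.csv and adjust them as needed.
    """
    names, scores = [], []
    for row in csv.DictReader(csv_file):
        score = float(row[score_col])
        if score >= threshold:
            names.append(row[name_col])
            scores.append(score)
    mean = sum(scores) / len(scores) if scores else float("nan")
    return names, mean


# Tiny in-memory example with made-up values.
sample = io.StringIO(
    "filename,complexity\n"
    "a.png,0.91\n"
    "b.png,0.60\n"
    "c.png,0.85\n"
)
kept, mean_score = select_high_complexity(sample)
print(kept, round(mean_score, 3))
```

Since every tile in this dataset already passes 0.85, the same function is mainly useful with a stricter `threshold` to carve out a smaller subset.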
Bicubic Downscaling
LR images were generated with MATLAB-compatible bicubic interpolation, matching the standard used in SISR benchmarks.
Scale factors: ×2 and ×4.
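When training on random patches, the LR and HR crops have to cover the same image region. The sketch below computes matching crop boxes, assuming the LR tiles are exactly the HR size divided by the scale factor (128×128 at ×2 and 64×64 at ×4 for 256×256 tiles). `paired_crop_boxes` is a hypothetical helper; the boxes use the `(left, top, right, bottom)` convention that Pillow's `Image.crop` accepts.

```python
import random


def paired_crop_boxes(hr_size, scale, lr_patch, rng=random):
    """Pick a random LR crop and the HR crop covering the same region.

    hr_size  -- (width, height) of the HR tile, e.g. (256, 256)
    scale    -- integer downscale factor (2 or 4 for this dataset)
    lr_patch -- side length of the square LR crop in pixels
    Returns (lr_box, hr_box) as (left, top, right, bottom) tuples.
    """
    hr_w, hr_h = hr_size
    lr_w, lr_h = hr_w // scale, hr_h // scale
    if lr_patch > min(lr_w, lr_h):
        raise ValueError("lr_patch is larger than the LR image")
    # Choose the top-left corner in LR coordinates...
    left = rng.randint(0, lr_w - lr_patch)
    top = rng.randint(0, lr_h - lr_patch)
    lr_box = (left, top, left + lr_patch, top + lr_patch)
    # ...and scale it up so both crops stay pixel-aligned.
    hr_box = tuple(v * scale for v in lr_box)
    return lr_box, hr_box
```

Sampling in LR coordinates first keeps the HR box on multiples of `scale`, so the two crops never drift by a fractional pixel.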
Citation
@dataset{lucid_cc0_v2_hc,
title={LUCID-CC0 v2 High Complexity: Finetuning Dataset for SISR},
author={Phips},
year={2026},
license={CC0-1.0},
url={https://huggingface.co/datasets/Phips/lucid-cc0-v2-hc}
}