FlowSR: Fast Image Super-Resolution via Consistency Rectified Flow
This repository hosts the reproduced model checkpoint for FlowSR, a single-step real-world image super-resolution model based on the ICCV 2025 paper "Fast Image Super-Resolution via Consistency Rectified Flow."
FlowSR reformulates super-resolution as a rectified flow that bridges low-resolution (LR) and high-resolution (HR) images, and uses HR-regularized consistency learning with a fast-slow time scheduling strategy to deliver high-quality results in as few as one inference step.
- Paper (ICCV 2025): openaccess.thecvf.com
- arXiv: arxiv.org/abs/2605.12377
- Inference code: github.com/springXIACJ/FlowSR (unofficial third-party implementation)
Files
flowsr.safetensors: the model checkpoint. It stores LoRA adapter weights (rank 32) for the UNet on top of a Stable Diffusion 2.1-base backbone, together with the FlowSR-specific metadata needed to rebuild the adapters at load time.
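To see what the checkpoint actually contains before loading it, the file header can be read directly. The sketch below relies only on the public safetensors layout (an 8-byte little-endian header length followed by a JSON header) and the Python standard library. The function name is illustrative, and the specific metadata keys FlowSR writes are not documented here, so the code just lists whatever it finds.

```python
import json
import struct
from pathlib import Path


def read_safetensors_header(path):
    """Return (metadata, tensors) parsed from a .safetensors file header.

    Only the JSON header is read; tensor data is never loaded into memory.
    """
    with open(path, "rb") as f:
        # First 8 bytes: header length as an unsigned little-endian 64-bit int.
        (header_len,) = struct.unpack("<Q", f.read(8))
        header = json.loads(f.read(header_len).decode("utf-8"))
    # "__metadata__" is an optional string-to-string map; all other keys
    # describe tensors (dtype, shape, data_offsets).
    metadata = header.pop("__metadata__", {})
    return metadata, header


if __name__ == "__main__":
    # Path assumed from the download step; nothing happens if it is absent.
    ckpt = Path("checkpoints/flowsr.safetensors")
    if ckpt.exists():
        metadata, tensors = read_safetensors_header(ckpt)
        print("metadata keys:", sorted(metadata))
        print("tensor count:", len(tensors))
```

This is handy for checking the stored LoRA rank or tensor shapes without installing the full inference stack.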
How to use
The checkpoint is consumed by the FlowSR inference package. Download the weights into a local checkpoints/ directory:
pip install -U huggingface_hub
hf download chunjie-spring/FlowSR flowsr.safetensors --local-dir checkpoints
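The same download can be done from Python, which may be more convenient inside a script or notebook. This is a minimal sketch using `hf_hub_download` from `huggingface_hub`; the wrapper function name is illustrative, and the repository ID and filename are taken from the command above.

```python
REPO_ID = "chunjie-spring/FlowSR"
FILENAME = "flowsr.safetensors"


def download_checkpoint(local_dir="checkpoints"):
    """Download the FlowSR checkpoint and return its local file path."""
    # Imported lazily so this module can be imported without huggingface_hub.
    from huggingface_hub import hf_hub_download

    return hf_hub_download(
        repo_id=REPO_ID,
        filename=FILENAME,
        local_dir=local_dir,
    )


# Example (requires network access):
#   path = download_checkpoint()
#   print(path)
```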
Then run single-image or folder inference (see the inference repository for full setup):
python -m flowsr.infer \
--input path/to/lr.png \
--output outputs \
--checkpoint checkpoints/flowsr.safetensors
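When inference needs to be driven from another Python program, one option is to wrap the command above in a subprocess call. The sketch below uses only the three flags shown in this README; the helper names are illustrative, and it assumes the flowsr package from the inference repository is installed in the current environment.

```python
import subprocess
import sys


def build_infer_command(input_path, output_dir="outputs",
                        checkpoint="checkpoints/flowsr.safetensors"):
    """Return the argv list for one `python -m flowsr.infer` call."""
    return [
        sys.executable, "-m", "flowsr.infer",
        "--input", str(input_path),
        "--output", str(output_dir),
        "--checkpoint", str(checkpoint),
    ]


def run_inference(input_path, **kwargs):
    """Run inference in a subprocess and raise if the command fails."""
    subprocess.run(build_infer_command(input_path, **kwargs), check=True)


# Example (assumes flowsr is installed and the checkpoint is downloaded):
#   run_inference("path/to/lr.png")
```

Passing the arguments as a list avoids shell quoting problems with paths that contain spaces.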
Hardware: the model targets a CUDA GPU. Upscaling a single image by 4× to 512×512 takes roughly 0.14 s on a modern GPU.
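For planning purposes, the figures above translate into an input size and a rough throughput. The sketch below assumes that 512×512 is the output resolution (so the LR input is 128×128 at scale 4) and that images are processed one at a time at the quoted latency; both are readings of the sentence above, not measured numbers.

```python
def lr_size(hr_size, scale=4):
    """Return the (width, height) of the LR input for a given HR output."""
    width, height = hr_size
    return width // scale, height // scale


def images_per_second(seconds_per_image):
    """Convert a per-image latency into sequential throughput."""
    return 1.0 / seconds_per_image


print(lr_size((512, 512)))                 # LR input size at scale 4
print(round(images_per_second(0.14), 2))   # images per second at 0.14 s each
```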
Model details
- Backbone: Stable Diffusion 2.1-base (Manojb/stable-diffusion-2-1-base, a re-upload of the original stabilityai/stable-diffusion-2-1-base weights, which were removed from the Hub).
- Scheduler: FlowMatchEulerDiscreteScheduler (rectified flow).
- Adapters: PEFT LoRA, rank 32, injected into the UNet.
- Default inference: 1 step, scale ×4, guidance_scale = 1.0, wavelet color correction.
- Training data: LSDIR + the first 10K FFHQ face images, with LR-HR pairs synthesized via the Real-ESRGAN degradation pipeline; image-quality captions generated with Qwen2-VL.
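To build intuition for why a rectified flow can get away with a single step, here is a toy sketch in plain Python. It is not the FlowSR network: the velocity is a hand-written constant pointing from a made-up "LR" vector to a made-up "HR" vector, whereas the real model predicts the velocity with the LoRA-adapted UNet in latent space, and the scheduler's time convention may differ. The point is only that when the path is a straight line, one Euler step lands on the same endpoint as many.

```python
def euler_integrate(x0, velocity_fn, num_steps):
    """Integrate dx/dt = velocity_fn(x, t) from t=0 to t=1 with Euler steps."""
    x = list(x0)
    dt = 1.0 / num_steps
    for i in range(num_steps):
        t = i * dt
        v = velocity_fn(x, t)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x


# Toy stand-ins for an LR and an HR latent (hypothetical values).
lr = [0.2, 0.5, 0.9]
hr = [0.1, 0.7, 1.0]


def straight_velocity(x, t):
    """Constant velocity of the straight path from lr to hr."""
    return [h - l for h, l in zip(hr, lr)]


one_step = euler_integrate(lr, straight_velocity, num_steps=1)
four_steps = euler_integrate(lr, straight_velocity, num_steps=4)
print(one_step)
print(four_steps)
```

With a curved trajectory the one-step and four-step results would differ, which is why straightening the flow matters for one-step inference.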
Evaluation
Quantitative comparison on RealSR and DRealSR (StableSR real-world test sets). FlowSR runs in a single step:
| Dataset | Steps | PSNR ↑ | SSIM ↑ | LPIPS ↓ | DISTS ↓ | FID ↓ | NIQE ↓ | MUSIQ ↑ | MANIQA ↑ | CLIPIQA ↑ |
|---|---|---|---|---|---|---|---|---|---|---|
| RealSR | 1 | 25.54 | 0.7434 | 0.2728 | 0.2013 | 112.60 | 5.28 | 69.22 | 0.6486 | 0.6701 |
| DRealSR | 1 | 28.50 | 0.7859 | 0.2975 | 0.2115 | 130.30 | 6.13 | 65.46 | 0.6172 | 0.7074 |
Metrics follow common SR conventions (PSNR/SSIM on the Y channel in YCbCr). Evaluation test sets: Iceclear/StableSR-TestSets.
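For readers unfamiliar with the Y-channel convention, the sketch below shows one common way to compute PSNR on luma. It assumes the BT.601 "studio range" conversion used by many SR toolkits (Y between 16 and 235 for RGB in [0, 1]) and a peak value of 255; the exact evaluation script behind the table is not given here and may differ, for example by cropping image borders.

```python
import math


def rgb_to_y(r, g, b):
    """Luma (Y of YCbCr, BT.601 range 16..235) from RGB values in [0, 1]."""
    return 16.0 + 65.481 * r + 128.553 * g + 24.966 * b


def psnr_y(img_a, img_b):
    """PSNR in dB between two same-sized images, computed on the Y channel.

    Each image is a list of rows; each row is a list of (r, g, b) in [0, 1].
    """
    sq_err, count = 0.0, 0
    for row_a, row_b in zip(img_a, img_b):
        for pixel_a, pixel_b in zip(row_a, row_b):
            diff = rgb_to_y(*pixel_a) - rgb_to_y(*pixel_b)
            sq_err += diff * diff
            count += 1
    mse = sq_err / count
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(255.0 ** 2 / mse)
```

In practice the same computation would be vectorized over image arrays; the nested-list version is only meant to make the formula explicit.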
Limitations
- Trained for 4× real-world super-resolution; other scales/degradations are out of distribution.
- Requires a GPU; CPU inference is not a supported path.
License
This checkpoint is released under the PolyForm Noncommercial License 1.0.0 for non-commercial research use. For commercial use, please contact the authors.
Citation
If you find FlowSR useful, please cite the paper:
@inproceedings{xu2025fast,
title={Fast Image Super-Resolution via Consistency Rectified Flow},
author={Xu, Jiaqi and Li, Wenbo and Sun, Haoze and Li, Fan and Wang, Zhixin and Peng, Long and Ren, Jingjing and Yang, Haoran and Hu, Xiaowei and Pei, Renjing and Heng, Pheng-Ann},
booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
pages={11755--11765},
year={2025}
}