NAFNet-GoPro-width32: LiteRT (on-device image deblur, fully-GPU)
NAFNet (Nonlinear Activation Free Network, ECCV 2022) is an image
restoration model. This build is converted to LiteRT and runs fully on the CompiledModel GPU backend (ML Drift) on Android.
NAFNet is a U-Net of NAFBlocks with no activation functions at all (SimpleGate = channel-split
multiply), so the whole network is a clean CNN on the GPU delegate. This is the GoPro-width32 variant,
for motion deblur.
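To make "channel-split multiply" concrete, here is a minimal Kotlin sketch of SimpleGate on a planar, channel-major array. The helper name `simpleGate` and the flat `[C, H*W]` layout are assumptions for illustration only; in the shipped model this happens inside the converted graph, not in app code.

```kotlin
// Illustrative sketch of SimpleGate (hypothetical helper, not part of LiteRT).
// x is a planar [channels, planeSize] tensor flattened channel by channel.
fun simpleGate(x: FloatArray, channels: Int, planeSize: Int): FloatArray {
    require(channels % 2 == 0) { "channel count must be even" }
    require(x.size == channels * planeSize) { "x must hold channels * planeSize values" }
    val half = channels / 2
    val out = FloatArray(half * planeSize)
    for (i in out.indices) {
        // Element i of the first channel half times the matching element of the second half.
        out[i] = x[i] * x[i + half * planeSize]
    }
    return out
}
```

The output has half as many channels as the input, and the only arithmetic is an element-wise multiply, which is why no activation kernel is needed.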
On-device (Pixel 8a, Tensor G3, verified)
| metric | value |
|---|---|
| nodes on GPU | 2179 / 2179 LITERT_CL (full residency) |
| inference | ~42 ms (256×256) |
| size | 38 MB (fp16) |
| accuracy | device output == PyTorch (corr 1.000000); the re-authoring is numerically exact |
image[1,3,256,256] (RGB [0,1]) → [GPU: NAFNet U-Net] → restored[1,3,256,256]
Usage (Android, LiteRT CompiledModel)
// Compile the model for the GPU accelerator.
val model = CompiledModel.create(modelPath, CompiledModel.Options(Accelerator.GPU), null)
val input = model.createInputBuffers()
val output = model.createOutputBuffers()
input[0].writeFloat(chw) // [1,3,256,256] RGB in [0,1], NCHW
model.run(input, output)
val restored = output[0].readFloat() // [1,3,256,256] in [0,1]
A complete Android sample (image picker + before/after) is in the official
google-ai-edge/litert-samples repo under
compiled_model_api/image_restoration.
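The snippet above takes a `chw` float array and returns a planar float array, but does not show how to build one. As a rough sketch, assuming the pixels arrive as packed ARGB ints (for example from Android's `Bitmap.getPixels`), the conversion could look like this. `argbToChw` and `chwToArgb` are hypothetical helper names, not LiteRT APIs.

```kotlin
// Hypothetical helpers, not part of LiteRT: convert between packed ARGB pixels
// and the planar NCHW float layout (batch of 1) the model expects.
fun argbToChw(pixels: IntArray, width: Int, height: Int): FloatArray {
    require(pixels.size == width * height) { "pixel count must equal width * height" }
    val plane = width * height
    val chw = FloatArray(3 * plane)
    for (i in 0 until plane) {
        val p = pixels[i]
        chw[i] = ((p shr 16) and 0xFF) / 255f          // R plane
        chw[plane + i] = ((p shr 8) and 0xFF) / 255f   // G plane
        chw[2 * plane + i] = (p and 0xFF) / 255f       // B plane
    }
    return chw
}

fun chwToArgb(chw: FloatArray, width: Int, height: Int): IntArray {
    val plane = width * height
    require(chw.size == 3 * plane) { "expected 3 planes of width * height floats" }
    // Clamp to [0,1] first: a restoration network can slightly overshoot the range.
    fun toByte(v: Float): Int = (v.coerceIn(0f, 1f) * 255f + 0.5f).toInt()
    return IntArray(plane) { i ->
        (0xFF shl 24) or
            (toByte(chw[i]) shl 16) or
            (toByte(chw[plane + i]) shl 8) or
            toByte(chw[2 * plane + i])
    }
}
```

With these, `argbToChw(pixels, 256, 256)` would produce the array passed to `writeFloat`, and `chwToArgb(restored, 256, 256)` would turn the output back into opaque ARGB pixels.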
How it converts (litert-torch)
NAFNet is fully convolutional (any size that is a multiple of 16; exported here at 256×256). Three numerically-exact GPU re-authorings:
- LayerNorm2d → fp16-safe channel LayerNorm. NAFNet's residual stream grows large (|x|≈175 at the bottleneck), so the LayerNorm channel reductions Σ_c x and Σ_c (x−μ)² (~15M) overflow fp16 (max 65504) on the Mali delegate (which computes in fp16 regardless of the model dtype), producing a grid artifact. Doing the reductions in a down-scaled x/S domain (S=128) and rescaling is numerically exact and fp16-safe.
- Simplified Channel Attention: AdaptiveAvgPool2d(1) → mean(3).mean(2) (two single-axis means).
- Upsample: Conv2d(1×1)+PixelShuffle(2) → Conv2d + depth-to-space ZeroStuffConvT2d.
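The LayerNorm re-authoring can be illustrated with a small Kotlin sketch. It is not the converter's code: it runs in fp32 and only tracks how large the running reductions get, to show why dividing by S keeps them under the fp16 limit while leaving the normalized result unchanged. The function name, the choice to scale eps by 1/S², and the omission of the learned weight and bias are all assumptions made for the example.

```kotlin
import kotlin.math.abs
import kotlin.math.sqrt

val fp16Max = 65504f // largest finite fp16 value

// Channel LayerNorm for one pixel, with both reductions done on x / s.
// Returns the normalized values and the largest running-sum magnitude seen.
// Assumption: eps is scaled by 1 / s^2 so the result matches the unscaled norm.
fun scaledChannelNorm(x: FloatArray, s: Float, eps: Float = 1e-6f): Pair<FloatArray, Float> {
    val c = x.size
    var peak = 0f
    var sum = 0f
    for (v in x) {
        sum += v / s
        peak = maxOf(peak, abs(sum))
    }
    val mean = sum / c
    var sq = 0f
    for (v in x) {
        val d = v / s - mean
        sq += d * d
        peak = maxOf(peak, sq)
    }
    val inv = 1f / sqrt(sq / c + eps / (s * s))
    // (x/s - mean/s) / (std/s) equals (x - mean) / std, so no rescale is left to apply here.
    val out = FloatArray(c) { (x[it] / s - mean) * inv }
    return out to peak
}
```

For 512 channels alternating between +175 and −175, the squared-deviation sum reaches about 15.7M with s = 1, far above `fp16Max`, and about 957 with s = 128, while both calls return the same normalized values to within fp32 rounding.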
Result: banned ops NONE, all tensors ≤4D, tflite-vs-torch corr 1.0, device-vs-torch corr 1.0.
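For the Upsample re-authoring to be exact, the replacement has to reproduce PyTorch's PixelShuffle element order. As a reference for that order, here is a plain Kotlin sketch of the PyTorch PixelShuffle definition on a planar array. The helper name is hypothetical, and this is not how the converted graph implements it.

```kotlin
// Reference PixelShuffle(r) on a planar [outChannels * r * r, h, w] array, following
// the PyTorch definition: out[c, y*r + i, x*r + j] = in[c*r*r + i*r + j, y, x].
// Hypothetical helper for illustration only.
fun pixelShuffle(input: FloatArray, outChannels: Int, h: Int, w: Int, r: Int): FloatArray {
    require(input.size == outChannels * r * r * h * w) { "input size does not match shape" }
    val oh = h * r
    val ow = w * r
    val out = FloatArray(outChannels * oh * ow)
    for (c in 0 until outChannels) {
        for (i in 0 until r) {
            for (j in 0 until r) {
                val inC = c * r * r + i * r + j // source channel for sub-pixel (i, j)
                for (y in 0 until h) {
                    for (x in 0 until w) {
                        out[(c * oh + y * r + i) * ow + x * r + j] = input[(inC * h + y) * w + x]
                    }
                }
            }
        }
    }
    return out
}
```

Comparing a candidate depth-to-space implementation against a reference like this on a small tensor is one way to catch a channel-ordering mismatch before running on device.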
License
MIT. Upstream: megvii-research/NAFNet. Original weights: NAFNet-GoPro-width32 from the official release.
