Underwater U-Net Color Restore

A lightweight U-Net for underwater image and video color restoration. This is an experimental baseline trained only on synthetically degraded images, not on a physically calibrated or paired real-world underwater dataset.

The model is useful for quick enhancement tests, preprocessing experiments, and learning-oriented image restoration work. It can improve strong blue/green/yellow casts, but it may overcorrect, invent color, or fail on scenes unlike the synthetic training distribution.

Model Files

  • model.safetensors: tensor-only U-Net weights for inference.
  • checkpoints/best.pt: full PyTorch training checkpoint with model, optimizer, scaler, epoch, and validation metadata.
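To check what model.safetensors contains without installing PyTorch, the file header can be read directly. The sketch below relies only on the published safetensors layout (an 8-byte little-endian header length followed by a JSON header); the function names `read_safetensors_header` and `summarize_header` are illustrative and not part of this repository.

```python
import json
import struct
from math import prod


def read_safetensors_header(path):
    """Return the JSON header of a .safetensors file without loading tensor data."""
    with open(path, "rb") as f:
        # First 8 bytes: header length as an unsigned little-endian 64-bit integer.
        (header_len,) = struct.unpack("<Q", f.read(8))
        header = json.loads(f.read(header_len).decode("utf-8"))
    # "__metadata__" is an optional free-form entry, not a tensor.
    header.pop("__metadata__", None)
    return header


def summarize_header(header):
    """Return (total parameter count, sorted list of dtype strings)."""
    total = sum(prod(entry["shape"]) for entry in header.values())
    dtypes = sorted({entry["dtype"] for entry in header.values()})
    return total, dtypes
```

Calling `summarize_header(read_safetensors_header("model.safetensors"))` should report a parameter count and dtype in line with the 31.4M params and F32 listed for this model.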

Training summary:

  • Architecture: medium U-Net, RGB input/output, sigmoid output.
  • Synthetic source images: COCO val2017 subset, 500 clean images.
  • Synthetic pairs: 8 underwater variants per source image, 4,000 paired examples.
  • Training resolution: 320 px crops.
  • Best validation loss recorded in checkpoint: 0.05987411916255951.
  • Epoch recorded in checkpoint: 35.
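The synthetic degradation used to build the pairs is not documented here. As a rough illustration only, the sketch below applies a common simplified underwater image formation model, I_c = J_c * t_c + B_c * (1 - t_c) with t_c = exp(-beta_c * depth), to a single RGB pixel and draws 8 random variants. All parameter ranges and the names `degrade_pixel` and `make_variants` are assumptions, not the values or code used for training.

```python
import math
import random


def degrade_pixel(rgb, depth, betas, backscatter):
    """Attenuate each channel and mix in a backscatter color.

    rgb, betas, and backscatter are (R, G, B) tuples; rgb and backscatter lie in [0, 1].
    """
    out = []
    for clean, beta, veil in zip(rgb, betas, backscatter):
        transmission = math.exp(-beta * depth)
        out.append(clean * transmission + veil * (1.0 - transmission))
    return tuple(out)


def make_variants(rgb, n_variants=8, seed=0):
    """Draw n_variants random degradations of one clean pixel (hypothetical ranges)."""
    rng = random.Random(seed)
    variants = []
    for _ in range(n_variants):
        depth = rng.uniform(1.0, 5.0)
        # Red is attenuated fastest in water, so it gets the largest coefficient.
        betas = (rng.uniform(0.4, 0.8), rng.uniform(0.05, 0.2), rng.uniform(0.05, 0.2))
        backscatter = (0.05, rng.uniform(0.3, 0.5), rng.uniform(0.4, 0.6))
        variants.append(degrade_pixel(rgb, depth, betas, backscatter))
    return variants
```

Applied per pixel to 500 clean images with 8 variants each, a scheme like this would yield the 4,000 pairs listed above.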

Install

git clone https://huggingface.co/sjdata/underwater-unet-color-restore
cd underwater-unet-color-restore
python3 -m venv .venv
source .venv/bin/activate
pip install --upgrade pip
pip install torch torchvision --index-url https://download.pytorch.org/whl/cu128
pip install -r requirements.txt safetensors

For CPU-only inference, skip the CUDA wheel command above and install PyTorch by following the standard instructions on pytorch.org.

Image Inference

The original scripts load .pt checkpoints:

python scripts/infer_image.py \
  --checkpoint checkpoints/best.pt \
  --input path/to/underwater.jpg \
  --output restored.png \
  --model_size medium \
  --mode tile \
  --tile_size 320 \
  --overlap 80

For high-resolution photos, tiled mode usually preserves detail better than resizing the whole image to a square.
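With --tile_size 320 and --overlap 80, consecutive tiles are 240 px apart. The sketch below shows one plausible way to place tiles along a single image dimension, clamping the last tile to the image edge. How scripts/infer_image.py actually positions its tiles is not shown here, so treat `tile_starts` as a hypothetical helper.

```python
def tile_starts(length, tile_size=320, overlap=80):
    """Return start offsets of tiles covering `length` pixels along one axis."""
    if not 0 <= overlap < tile_size:
        raise ValueError("overlap must be non-negative and smaller than tile_size")
    if length <= tile_size:
        # A single tile; an image smaller than the tile would need padding.
        return [0]
    stride = tile_size - overlap
    starts = list(range(0, length - tile_size, stride))
    # Clamp the final tile so it ends exactly at the image border.
    starts.append(length - tile_size)
    return starts
```

For an 800 px side this gives tiles starting at 0, 240, and 480. For a 1000 px side the last tile is shifted back to 680, so it overlaps its neighbor by more than 80 px.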

Safetensors Inference

python infer_safetensors_image.py \
  --weights model.safetensors \
  --input path/to/underwater.jpg \
  --output restored.png \
  --mode tile \
  --tile_size 320 \
  --overlap 80
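To restore a whole folder, the command above can be wrapped in a small driver. This sketch only reuses the flags shown in this section; the output naming scheme and the helper names `build_command` and `restore_folder` are assumptions.

```python
import subprocess
import sys
from pathlib import Path


def build_command(image_path, output_dir, weights="model.safetensors",
                  tile_size=320, overlap=80):
    """Build the argv list for one call to infer_safetensors_image.py."""
    image_path = Path(image_path)
    output_path = Path(output_dir) / f"{image_path.stem}_restored.png"
    return [
        sys.executable, "infer_safetensors_image.py",
        "--weights", str(weights),
        "--input", str(image_path),
        "--output", str(output_path),
        "--mode", "tile",
        "--tile_size", str(tile_size),
        "--overlap", str(overlap),
    ]


def restore_folder(input_dir, output_dir, extensions=(".jpg", ".jpeg", ".png")):
    """Run the inference script once per image in input_dir."""
    Path(output_dir).mkdir(parents=True, exist_ok=True)
    for image_path in sorted(Path(input_dir).iterdir()):
        if image_path.suffix.lower() in extensions:
            # check=True stops the loop on the first failing image.
            subprocess.run(build_command(image_path, output_dir), check=True)
```

Run it from the repository root, for example `restore_folder("photos", "restored")`. Each call reloads the weights, so this is convenient rather than fast.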

Video Inference

python scripts/infer_video.py \
  --checkpoint checkpoints/best.pt \
  --input input.mp4 \
  --output restored_video.mp4 \
  --model_size medium \
  --mode tile \
  --tile_size 320 \
  --overlap 80
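Tiled video inference can be slow because every frame is split into many tiles. The estimate below assumes each frame is tiled independently with a stride of tile_size minus overlap and a final tile clamped to the border; the script's real tiling may differ, and `tiles_along` and `forward_passes` are hypothetical names.

```python
def tiles_along(length, tile_size=320, overlap=80):
    """Number of tiles needed to cover `length` pixels along one axis."""
    if not 0 <= overlap < tile_size:
        raise ValueError("overlap must be non-negative and smaller than tile_size")
    if length <= tile_size:
        return 1
    stride = tile_size - overlap
    # Integer ceiling of (length - tile_size) / stride, plus the first tile.
    return -(-(length - tile_size) // stride) + 1


def forward_passes(width, height, n_frames, tile_size=320, overlap=80):
    """Total model calls for a clip when every frame is tiled on its own."""
    per_frame = tiles_along(width, tile_size, overlap) * tiles_along(height, tile_size, overlap)
    return per_frame * n_frames
```

Under these assumptions a 1920x1080 frame needs 8 x 5 = 40 tiles, so a 300-frame clip takes 12,000 forward passes.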

Examples

Example before/after grids are included in examples/:

  • examples/alphie_fireworks_before_after.png
  • examples/open_water_diver_before_after.png
  • examples/synthetic_underwater_before_after.png

Limitations

This model was trained on synthetic underwater effects. It does not know true underwater scene colors, depth, lighting, camera response, turbidity, or spectral attenuation. Treat outputs as visual enhancements rather than measurements.

Known failure modes:

  • color hallucination,
  • aggressive green/yellow correction,
  • weak recovery in very hazy scenes,
  • tiling artifacts if overlap is too small,
  • poor transfer to scenes outside the synthetic filter distribution.
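The tiling artifacts mentioned above usually show up as visible seams where neighboring tiles disagree. One common remedy is to feather tile edges and average the overlaps, shown here in one dimension for clarity. This is a generic sketch, not necessarily how the repository's scripts merge tiles, and both function names are illustrative.

```python
def feather_weights(tile_size, overlap):
    """Weights that ramp linearly over `overlap` samples at each tile edge."""
    weights = []
    for i in range(tile_size):
        edge_distance = min(i, tile_size - 1 - i)
        weights.append(min(1.0, (edge_distance + 1) / (overlap + 1)))
    return weights


def blend_tiles_1d(length, tiles, tile_size, overlap):
    """Weighted average of overlapping 1-D tiles given as (start, values) pairs."""
    weights = feather_weights(tile_size, overlap)
    acc = [0.0] * length
    weight_sum = [0.0] * length
    for start, values in tiles:
        for i, value in enumerate(values):
            acc[start + i] += weights[i] * value
            weight_sum[start + i] += weights[i]
    # Assumes every position is covered by at least one tile.
    return [a / w for a, w in zip(acc, weight_sum)]
```

With a wider overlap the ramp is longer and the transition between two disagreeing tiles is more gradual. With overlap 0 every weight is 1 and the seam is a hard step.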

Intended Use

Good fits:

  • local image/video enhancement experiments,
  • educational U-Net restoration baseline,
  • preprocessing before manual editing,
  • synthetic-to-real transfer experiments.

Not good fits:

  • scientific color reconstruction,
  • safety-critical marine analysis,
  • measuring true biological colors,
  • replacing calibrated underwater imaging workflows.

License

Code and model weights are released under the MIT License.

Safetensors

  • Model size: 31.4M params.
  • Tensor type: F32.