🖋️ Calligraphy Repair cGAN

A two-stage system for repairing damaged handwriting and calligraphy using a combined pathfinding and conditional GAN (cGAN) approach.

Stage 1 uses a deterministic A*/Bezier pathfinding algorithm to repair large structural gaps in strokes. Stage 2 uses a conditional GAN to re-apply the artistic style, producing visually coherent results.

Architecture Overview

┌──────────────────────────────────────────────────────────────┐
│                  INPUT: Damaged Calligraphy                  │
│                         (RGB, H×W×3)                         │
└────────────────────────┬─────────────────────────────────────┘
                         │
         ┌───────────────▼───────────────┐
         │      STAGE 1: PATHFINDING     │
         │   (Deterministic Algorithm)   │
         │                               │
         │  1. Binarize → ink mask       │
         │  2. Skeletonize (Zhang-Suen)  │
         │  3. Find gap endpoints        │
         │  4. A*/Bezier gap bridging    │
         │  5. Variable-width rendering  │
         └───────┬───────────┬───────────┘
                 │           │
         stroke_mask    gap_mask
           (1,H,W)     (1,H,W)
                 │           │
         ┌───────▼───────────▼───────────┐
         │   STAGE 2: cGAN REFINEMENT    │
         │   (Learned Style Transfer)    │
         │                               │
         │  Generator Input (5ch):       │
         │  [damaged_RGB(3) +            │
         │   stroke_mask(1) +            │
         │   gap_mask(1)]                │
         │                               │
         │  ┌─────────────────────────┐  │
         │  │   U-Net Generator       │  │
         │  │   + Gated Convolutions  │  │
         │  │   + 8 Dilated ResBlocks │  │
         │  │   + Skip Connections    │  │
         │  └─────────────────────────┘  │
         │                               │
         │  ┌─────────────────────────┐  │
         │  │  70×70 SN-PatchGAN      │  │
         │  │  Discriminator          │  │
         │  │  (Spectral Normalized)  │  │
         │  └─────────────────────────┘  │
         └───────────────┬───────────────┘
                         │
         ┌───────────────▼───────────────┐
         │    OUTPUT: Repaired Image     │
         │         (RGB, H×W×3)          │
         └───────────────────────────────┘

Key Features

  • 🔍 Pathfinding Stage: A*/Bezier curve gap bridging with direction-aware endpoint matching, variable stroke thickness estimation, and smooth calligraphic curve generation
  • 🎨 cGAN Stage: EdgeConnect-inspired architecture with gated convolutions, dilated residual blocks, and multi-loss training (adversarial + L1 + perceptual VGG + style Gram + feature matching)
  • 📊 Synthetic Data Pipeline: Generates training pairs from clean calligraphy with realistic damage (erosion, gaps, fading, bleeding, stains, scratches)
  • ⚡ Flexible: Works with any writing system (Latin, Chinese, Arabic, Japanese, etc.)

Installation

pip install -r requirements.txt

Quick Start

1. Generate Training Data

# Generate 5000 synthetic training pairs
python damage_generator.py \
    --output_dir data \
    --num_train 5000 \
    --num_val 500 \
    --image_size 256

# Or use your own clean calligraphy images:
python damage_generator.py \
    --output_dir data \
    --source_dir /path/to/clean/calligraphy/ \
    --num_train 5000

This creates:

data/
├── train/
│   ├── clean/      # Ground truth images
│   ├── damaged/    # Synthetically damaged images
│   └── mask/       # Damage location masks
└── val/
    ├── clean/
    ├── damaged/
    └── mask/
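To make the relationship between the three folders concrete, here is a minimal pure-Python sketch of how one (clean, damaged, mask) triple relates. It models only a single rectangular gap on a toy grayscale grid; the function name `make_damaged_pair` and its parameters are hypothetical and are not taken from `damage_generator.py`, which applies many more damage types.

```python
import random

def make_damaged_pair(clean, gap_h=2, gap_w=3, seed=0, background=255):
    """Erase one random rectangle from `clean` and return (damaged, mask).

    `clean` is a list of rows of grayscale values (0 = ink, 255 = paper).
    The mask is 1 where pixels were erased and 0 elsewhere.
    """
    rng = random.Random(seed)
    height, width = len(clean), len(clean[0])
    top = rng.randint(0, height - gap_h)
    left = rng.randint(0, width - gap_w)

    damaged = [row[:] for row in clean]      # copy so the input stays untouched
    mask = [[0] * width for _ in range(height)]
    for y in range(top, top + gap_h):
        for x in range(left, left + gap_w):
            damaged[y][x] = background
            mask[y][x] = 1
    return damaged, mask

# A 6x8 toy "image" with one horizontal ink stroke on row 3.
clean = [[255] * 8 for _ in range(6)]
clean[3] = [0] * 8
damaged, mask = make_damaged_pair(clean, seed=1)
```

Outside the masked rectangle, `damaged` is identical to `clean`, which is the property the masked loss terms rely on.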

2. Train the Model

# Full training (recommended)
python train.py \
    --data_dir data \
    --epochs 200 \
    --batch_size 8 \
    --image_size 256 \
    --generator_type unet \
    --gan_type lsgan \
    --lr_g 1e-4 \
    --lr_d 1e-4

# Quick test run
python train.py \
    --data_dir data \
    --epochs 5 \
    --batch_size 4 \
    --generate_data \
    --num_train 100 \
    --num_val 20

# With on-the-fly damage generation (only needs clean/ directory)
python train.py \
    --data_dir data \
    --on_the_fly_damage \
    --epochs 200

# Resume from checkpoint
python train.py \
    --data_dir data \
    --resume checkpoints/best_model.pth \
    --epochs 300

3. Repair Damaged Images

# With trained GAN (best quality)
python inference.py \
    --input damaged_calligraphy.png \
    --output repaired.png \
    --checkpoint checkpoints/generator_best.pth

# Pathfinding only (no GAN needed, instant)
python inference.py \
    --input damaged_calligraphy.png \
    --output repaired.png \
    --pathfinding_only

# Batch repair a directory
python inference.py \
    --input_dir damaged_images/ \
    --output_dir repaired_images/ \
    --checkpoint checkpoints/generator_best.pth

# Save all intermediate stages for visualization
python inference.py \
    --input damaged_calligraphy.png \
    --output repaired.png \
    --checkpoint checkpoints/generator_best.pth \
    --save_stages

Training Recipe

Based on literature from EdgeConnect, pix2pix, DE-GAN, and pix2pixHD:

| Parameter | Value | Source |
| --- | --- | --- |
| Optimizer | Adam(β1=0.0, β2=0.9) | EdgeConnect |
| Learning Rate | 1e-4 (G and D) | EdgeConnect |
| LR Schedule | Constant first half, linear decay second half | pix2pix |
| GAN Type | LSGAN (MSE loss) | pix2pixHD |
| λ_adversarial | 1.0 | EdgeConnect |
| λ_L1 | 100.0 | pix2pix |
| λ_feature_matching | 10.0 | pix2pixHD |
| λ_perceptual | 0.1 | EdgeConnect |
| λ_style | 250.0 | EdgeConnect |
| λ_masked_L1 | 50.0 | DiffHDR |
| Image Size | 256×256 | Standard |
| Batch Size | 8 | EdgeConnect |
| Epochs | 200 | pix2pix |
| Weight Init | Gaussian(0, 0.02) | pix2pix |
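The "constant first half, linear decay second half" schedule can be sketched as a plain function of the epoch index. This is an illustration under stated assumptions: the name `lr_at_epoch` is hypothetical, epochs are 0-indexed, and the decay is taken to reach zero exactly at `total_epochs`; the rule in `train.py` may differ by an off-by-one.

```python
def lr_at_epoch(epoch, total_epochs=200, base_lr=1e-4):
    """Learning rate for a 0-indexed epoch.

    Constant at base_lr for the first half of training, then a linear ramp
    that would hit zero at epoch == total_epochs.
    """
    half = total_epochs // 2
    if epoch < half:
        return base_lr
    remaining = total_epochs - epoch
    return base_lr * remaining / (total_epochs - half)

# Sample a few points of the schedule for the default 200-epoch run.
schedule = {epoch: lr_at_epoch(epoch) for epoch in (0, 99, 100, 150, 199)}
```

With the defaults, the rate stays at 1e-4 through epoch 100 and is halved by epoch 150.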

Loss Functions

The combined generator loss:

L_G = λ_adv · L_adversarial           (fool the discriminator)
    + λ_L1 · L_L1                     (pixel-level reconstruction)
    + λ_FM · L_feature_matching       (stabilize training via D features)
    + λ_perc · L_perceptual           (VGG relu1_2, relu2_2, relu3_3, relu4_3)
    + λ_style · L_style               (Gram matrices for texture matching)
    + λ_mask · L_masked_L1            (focus on damaged regions)
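As a rough illustration of how the weighted sum is assembled, the sketch below uses the weights from the Training Recipe table and computes only the two pixel-level terms on flat lists of values. The helper names are hypothetical, and normalizing the masked L1 by the number of masked pixels is an assumption; `losses.py` may normalize differently.

```python
# Weights copied from the Training Recipe table.
LOSS_WEIGHTS = {
    "adversarial": 1.0,
    "l1": 100.0,
    "feature_matching": 10.0,
    "perceptual": 0.1,
    "style": 250.0,
    "masked_l1": 50.0,
}

def l1(pred, target):
    """Mean absolute error over flat lists of pixel values."""
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(pred)

def masked_l1(pred, target, mask):
    """Mean absolute error restricted to pixels where mask == 1 (assumed form)."""
    total = sum(m * abs(p - t) for p, t, m in zip(pred, target, mask))
    return total / max(sum(mask), 1)

def generator_loss(terms, weights=LOSS_WEIGHTS):
    """Weighted sum of individual loss terms; `terms` maps name -> scalar."""
    return sum(weights[name] * value for name, value in terms.items())

pred = [0.0, 0.5, 1.0, 1.0]
target = [0.0, 0.0, 1.0, 0.0]
mask = [0, 1, 0, 1]

terms = {
    "l1": l1(pred, target),                      # 1.5 / 4 = 0.375
    "masked_l1": masked_l1(pred, target, mask),  # 1.5 / 2 = 0.75
}
total = generator_loss(terms)                    # 100*0.375 + 50*0.75 = 75.0
```

The masked term is larger than the plain L1 here because all of the error sits inside the damaged region, which is the behaviour the extra term is meant to emphasize.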

The discriminator loss:

L_D = L_adversarial_D + λ_R1 · L_R1_regularization
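For the LSGAN setting listed in the recipe, the adversarial terms are least-squares distances between patch scores and the targets 1 (real) and 0 (fake). The sketch below shows that standard formulation on plain lists; the 0.5 factor on the discriminator loss follows the pix2pix convention and is an assumption here, and the R1 term is omitted because it needs gradients of D with respect to real inputs.

```python
def mse_to(values, target):
    """Mean squared distance between each patch score and a constant target."""
    return sum((v - target) ** 2 for v in values) / len(values)

def lsgan_d_loss(d_real, d_fake):
    """Discriminator: push real patch scores toward 1 and fake scores toward 0."""
    return 0.5 * (mse_to(d_real, 1.0) + mse_to(d_fake, 0.0))

def lsgan_g_loss(d_fake):
    """Generator: push the discriminator's scores on fakes toward 1."""
    return mse_to(d_fake, 1.0)

d_real = [1.0, 1.0]   # D is confident on both real patches
d_fake = [0.0, 0.5]   # D is unsure about the second fake patch

d_loss = lsgan_d_loss(d_real, d_fake)   # 0.5 * (0 + 0.125) = 0.0625
g_loss = lsgan_g_loss(d_fake)           # (1 + 0.25) / 2 = 0.625
```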

Architecture Details

Generator: U-Net with Dilated Residual Bottleneck

Encoder:
  [GatedConv 5→64, 7×7, stride 1]           # Level 0
  [GatedConv 64→128, 4×4, stride 2]         # Level 1 (↓2×)
  [GatedConv 128→256, 4×4, stride 2]        # Level 2 (↓2×)

Bottleneck:
  8× [DilatedResBlock 256→256, dilation=2]  # ~200px receptive field

Decoder:
  [ConvTranspose 512→128, 4×4, stride 2]    # Skip from Level 2 (↑2×)
  [ConvTranspose 256→64, 4×4, stride 2]     # Skip from Level 1 (↑2×)
  [Conv 128→3, 7×7, Tanh]                   # Skip from Level 0

All layers: InstanceNorm + ReLU (encoder: LeakyReLU)
Input: 5 channels (damaged_RGB + stroke_mask + gap_mask)
Output: 3 channels (repaired RGB in [-1, 1])
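The channel counts and spatial sizes in the listing can be checked with the usual convolution arithmetic. The sketch below traces a 256×256 input through the layers, assuming padding 3 for the 7×7 convolutions and padding 1 for the 4×4 ones (paddings are not stated above, so these are assumptions), and assuming skip connections are concatenated along the channel axis.

```python
def conv_out(size, kernel, stride, padding):
    """Spatial output size of a standard convolution."""
    return (size + 2 * padding - kernel) // stride + 1

def deconv_out(size, kernel, stride, padding):
    """Spatial output size of a transposed convolution (no output padding)."""
    return (size - 1) * stride - 2 * padding + kernel

def trace_generator(size=256):
    """Return (name, out_channels, spatial_size) after each generator stage."""
    level0 = conv_out(size, kernel=7, stride=1, padding=3)
    level1 = conv_out(level0, kernel=4, stride=2, padding=1)
    level2 = conv_out(level1, kernel=4, stride=2, padding=1)
    up2 = deconv_out(level2, kernel=4, stride=2, padding=1)
    up1 = deconv_out(up2, kernel=4, stride=2, padding=1)
    out = conv_out(up1, kernel=7, stride=1, padding=3)
    return [
        ("input", 5, size),
        ("enc0", 64, level0),
        ("enc1", 128, level1),
        ("enc2", 256, level2),
        ("bottleneck", 256, level2),  # residual blocks keep the shape
        ("dec2", 128, up2),           # input: 256 bottleneck + 256 skip = 512
        ("dec1", 64, up1),            # input: 128 + 128 skip = 256
        ("output", 3, out),           # input: 64 + 64 skip = 128
    ]

for name, channels, spatial in trace_generator(256):
    print(f"{name:<10} {channels:>3} ch  {spatial}x{spatial}")
```

Under these assumptions each decoder stage returns to the resolution of the encoder level it is paired with, which is what makes the skip concatenation possible.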

Discriminator: 70×70 SN-PatchGAN

C64(no norm) → C128 → C256 → C512 → Conv→1
All convs: 4×4, spectral normalized
InstanceNorm on layers 2-4
LeakyReLU(0.2) throughout
Input: 8 channels (damaged + output + masks)
Output: H/16 × W/16 patch predictions
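The "70×70" in the name refers to the receptive field of a single output patch score. The sketch below shows where that number comes from under the original pix2pix stride layout (three stride-2 convolutions followed by two stride-1 convolutions); the strides are not listed above, so this layout is an assumption and `models.py` may use a different one.

```python
def receptive_field(layers):
    """Receptive field of one output unit for a stack of (kernel, stride) convs."""
    field = 1
    # Walk from the output back to the input, growing the field at each layer.
    for kernel, stride in reversed(layers):
        field = (field - 1) * stride + kernel
    return field

# Assumed pix2pix-style layout: C64, C128, C256 at stride 2; C512 and the
# final 1-channel conv at stride 1. All kernels are 4x4.
PATCHGAN_LAYERS = [(4, 2), (4, 2), (4, 2), (4, 1), (4, 1)]

rf = receptive_field(PATCHGAN_LAYERS)   # 70
```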

Pathfinding Algorithm

The deterministic Stage 1 pipeline:

  1. Binarization: Otsu's automatic (global) thresholding + morphological cleanup
  2. Skeletonization: Zhang-Suen thinning to 1-pixel-wide skeleton
  3. Thickness Estimation: Distance transform to measure local stroke width
  4. Endpoint Detection: Find degree-1 nodes (gap openings) via 3×3 convolution kernel
  5. Direction Analysis: Trace back along skeleton to compute tangent direction at each endpoint
  6. Endpoint Matching: Score pairs by distance + direction alignment + collinearity; greedy matching
  7. Gap Bridging:
    • A* mode: Cost = distance + direction_continuity + curvature_penalty + ink_proximity
    • Bezier mode: Cubic Bezier with control points guided by endpoint tangents
  8. Stroke Rendering: Variable-width circular brush matching estimated local thickness
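Steps 4 and 7 can be sketched in a few lines of pure Python. This is a minimal illustration, not the code in `pathfinding.py`: `find_endpoints` counts 8-connected neighbours directly instead of using a convolution, and `bezier_bridge` places the inner control points one third of the gap length along each tangent, which is an assumed heuristic. Both function names are hypothetical.

```python
import math

def find_endpoints(skeleton):
    """Return (row, col) of skeleton pixels with exactly one 8-connected neighbour."""
    height, width = len(skeleton), len(skeleton[0])
    endpoints = []
    for r in range(height):
        for c in range(width):
            if not skeleton[r][c]:
                continue
            neighbours = sum(
                skeleton[rr][cc]
                for rr in range(max(r - 1, 0), min(r + 2, height))
                for cc in range(max(c - 1, 0), min(c + 2, width))
                if (rr, cc) != (r, c)
            )
            if neighbours == 1:
                endpoints.append((r, c))
    return endpoints

def bezier_bridge(p0, t0, p3, t3, steps=8, pull=1 / 3):
    """Sample a cubic Bezier from p0 to p3.

    t0 and t3 are unit tangents pointing out of each stroke end into the gap;
    the inner control points sit `pull * gap_length` along those tangents.
    """
    gap = math.dist(p0, p3)
    p1 = (p0[0] + pull * gap * t0[0], p0[1] + pull * gap * t0[1])
    p2 = (p3[0] + pull * gap * t3[0], p3[1] + pull * gap * t3[1])
    points = []
    for i in range(steps + 1):
        t = i / steps
        a, b = (1 - t) ** 3, 3 * t * (1 - t) ** 2
        c, d = 3 * t ** 2 * (1 - t), t ** 3
        points.append(tuple(a * p0[k] + b * p1[k] + c * p2[k] + d * p3[k]
                            for k in range(2)))
    return points

# A horizontal stroke on row 2 with a gap at columns 3-5.
skeleton = [[0] * 9 for _ in range(5)]
for col in (0, 1, 2, 6, 7, 8):
    skeleton[2][col] = 1

ends = find_endpoints(skeleton)          # includes the outer stroke tips too
bridge = bezier_bridge((2, 2), (0, 1), (2, 6), (0, -1))
```

Note that the detector also reports the outer tips of the stroke; it is the matching step that decides which endpoints actually face each other across a gap.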

Project Structure

├── pathfinding.py          # Stage 1: Deterministic gap repair
├── damage_generator.py     # Synthetic training data generation
├── models.py               # cGAN architecture (Generator + Discriminator)
├── losses.py               # Loss functions + metrics
├── dataset.py              # Dataset loader with pathfinding integration
├── train.py                # Training pipeline
├── inference.py            # Inference / repair script
├── requirements.txt        # Dependencies
└── README.md               # This file

Using Your Own Data

Option A: Clean calligraphy images only (recommended)

Place clean calligraphy images in a directory and the system will synthetically damage them:

# Generate paired data from your clean images
python damage_generator.py \
    --source_dir /path/to/your/clean/calligraphy/ \
    --output_dir data \
    --num_train 5000

# Or train with on-the-fly damage
python train.py \
    --data_dir data \
    --on_the_fly_damage

Option B: Pre-paired damaged/clean images

Organize your data as:

data/
├── train/
│   ├── clean/001.png, 002.png, ...
│   └── damaged/001.png, 002.png, ...
└── val/
    ├── clean/001.png, 002.png, ...
    └── damaged/001.png, 002.png, ...

Filenames must match between clean/ and damaged/.
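Before training on pre-paired data, it can help to verify the filename requirement. The following standard-library sketch reports files that appear in only one of the two folders; the helper name `unmatched_files` is hypothetical and not part of the repository, and the demo runs on a throwaway directory.

```python
import tempfile
from pathlib import Path

def unmatched_files(split_dir):
    """Return (only_in_clean, only_in_damaged) filename lists for one split."""
    split_dir = Path(split_dir)
    clean = {p.name for p in (split_dir / "clean").iterdir() if p.is_file()}
    damaged = {p.name for p in (split_dir / "damaged").iterdir() if p.is_file()}
    return sorted(clean - damaged), sorted(damaged - clean)

# Demo on a temporary directory; point this at data/train or data/val in practice.
with tempfile.TemporaryDirectory() as tmp:
    layout = (("clean", ["001.png", "002.png"]), ("damaged", ["001.png"]))
    for sub, names in layout:
        (Path(tmp) / sub).mkdir()
        for name in names:
            (Path(tmp) / sub / name).touch()
    only_clean, only_damaged = unmatched_files(tmp)
```

Both returned lists should be empty for a usable split; in the demo, `002.png` is flagged because it has no damaged counterpart.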

Tips for Best Results

  1. More data = better results: Aim for at least 5,000 training pairs
  2. Use your own calligraphy: The model learns styles from training data; train on the style you want to repair
  3. Pathfinding as baseline: Even without the GAN, Stage 1 gives usable structural repairs
  4. Monitor training: Check checkpoints/samples/ for visual progress
  5. Adjust gap distance: --max_gap_distance controls how large a gap the pathfinder will attempt to bridge
  6. GPU recommended: Training takes ~4-8 hours on a single GPU (RTX 3080 or better)
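To see why the gap-distance setting in tip 5 matters, consider a simplified version of the endpoint matching step: pairs farther apart than the limit are never considered, and the rest are scored by distance plus a direction penalty. The sketch below is an illustration only; the cost formula, the `direction_weight` value, and the function names are assumptions, and the collinearity term mentioned in the algorithm description is left out.

```python
import math

def pair_cost(a, b, max_gap_distance, direction_weight=10.0):
    """Cost of bridging endpoints a and b, or None if the gap is too long.

    Each endpoint is ((row, col), (d_row, d_col)), where the second item is a
    unit tangent pointing out of the stroke end.
    """
    (pa, ta), (pb, tb) = a, b
    dist = math.dist(pa, pb)
    if dist == 0 or dist > max_gap_distance:
        return None
    to_b = ((pb[0] - pa[0]) / dist, (pb[1] - pa[1]) / dist)
    # Each alignment is 1.0 when the tangent points straight at the other end.
    align_a = ta[0] * to_b[0] + ta[1] * to_b[1]
    align_b = -(tb[0] * to_b[0] + tb[1] * to_b[1])
    return dist + direction_weight * ((1 - align_a) + (1 - align_b))

def greedy_match(endpoints, max_gap_distance):
    """Repeatedly accept the cheapest remaining pair; each endpoint is used once."""
    candidates = []
    for i in range(len(endpoints)):
        for j in range(i + 1, len(endpoints)):
            cost = pair_cost(endpoints[i], endpoints[j], max_gap_distance)
            if cost is not None:
                candidates.append((cost, i, j))
    used, pairs = set(), []
    for _, i, j in sorted(candidates):
        if i not in used and j not in used:
            pairs.append((i, j))
            used.update((i, j))
    return pairs

endpoints = [
    ((2, 2), (0, 1)),    # right end of a left stroke fragment
    ((2, 6), (0, -1)),   # left end of the matching right fragment
    ((9, 2), (1, 0)),    # unrelated stroke end pointing down
]

matched = greedy_match(endpoints, max_gap_distance=5)
too_strict = greedy_match(endpoints, max_gap_distance=3)
```

With a limit of 5 the two facing ends four pixels apart are paired; with a limit of 3 nothing is bridged, so a limit that is too small leaves real gaps unrepaired while one that is too large invites spurious bridges.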

References

  • EdgeConnect: Two-stage edge + completion GAN
  • pix2pix: Conditional adversarial image-to-image translation
  • DE-GAN: Document enhancement with conditional GAN
  • pix2pixHD: High-resolution synthesis with feature matching
  • DeepFillv2: Gated convolutions for inpainting
  • DiffHDR: Historical document repair with masked perceptual loss

License

MIT
