Calligraphy Repair cGAN
A two-stage system for repairing damaged handwriting and calligraphy using a combined pathfinding and conditional GAN approach.
Stage 1 uses a deterministic A*/Bezier pathfinding algorithm to repair large structural gaps in strokes. Stage 2 uses a conditional GAN to re-apply the artistic style, producing visually coherent results.
Architecture Overview
┌─────────────────────────────────────────────────────────────┐
│                 INPUT: Damaged Calligraphy                  │
│                        (RGB, H×W×3)                         │
└──────────────────────────────┬──────────────────────────────┘
                               │
               ┌───────────────▼───────────────┐
               │  STAGE 1: PATHFINDING         │
               │  (Deterministic Algorithm)    │
               │                               │
               │  1. Binarize → ink mask       │
               │  2. Skeletonize (Zhang-Suen)  │
               │  3. Find gap endpoints        │
               │  4. A*/Bezier gap bridging    │
               │  5. Variable-width rendering  │
               └───────┬───────────┬───────────┘
                       │           │
                  stroke_mask   gap_mask
                    (1,H,W)     (1,H,W)
                       │           │
               ┌───────▼───────────▼───────────┐
               │  STAGE 2: cGAN REFINEMENT     │
               │  (Learned Style Transfer)     │
               │                               │
               │  Generator Input (5ch):       │
               │    [damaged_RGB(3) +          │
               │     stroke_mask(1) +          │
               │     gap_mask(1)]              │
               │                               │
               │  ┌─────────────────────────┐  │
               │  │  U-Net Generator        │  │
               │  │  + Gated Convolutions   │  │
               │  │  + 8 Dilated ResBlocks  │  │
               │  │  + Skip Connections     │  │
               │  └─────────────────────────┘  │
               │                               │
               │  ┌─────────────────────────┐  │
               │  │  70×70 SN-PatchGAN      │  │
               │  │  Discriminator          │  │
               │  │  (Spectral Normalized)  │  │
               │  └─────────────────────────┘  │
               └───────────────┬───────────────┘
                               │
               ┌───────────────▼───────────────┐
               │    OUTPUT: Repaired Image     │
               │         (RGB, H×W×3)          │
               └───────────────────────────────┘
Key Features
- Pathfinding Stage: A*/Bezier curve gap bridging with direction-aware endpoint matching, variable stroke thickness estimation, and smooth calligraphic curve generation
- cGAN Stage: EdgeConnect-inspired architecture with gated convolutions, dilated residual blocks, and multi-loss training (adversarial + L1 + perceptual VGG + style Gram + feature matching)
- Synthetic Data Pipeline: Generates training pairs from clean calligraphy with realistic damage (erosion, gaps, fading, bleeding, stains, scratches)
- Flexible: Works with any writing system (Latin, Chinese, Arabic, Japanese, etc.)
Installation
pip install -r requirements.txt
Quick Start
1. Generate Training Data
# Generate 5000 synthetic training pairs
python damage_generator.py \
--output_dir data \
--num_train 5000 \
--num_val 500 \
--image_size 256
# Or use your own clean calligraphy images:
python damage_generator.py \
--output_dir data \
--source_dir /path/to/clean/calligraphy/ \
--num_train 5000
This creates:
data/
├── train/
│   ├── clean/      # Ground truth images
│   ├── damaged/    # Synthetically damaged images
│   └── mask/       # Damage location masks
└── val/
    ├── clean/
    ├── damaged/
    └── mask/
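To make the relationship between the three folders concrete, here is a minimal sketch of how one clean/damaged/mask triple could be related. It is not the code in damage_generator.py: the helper name `erase_gap`, the rectangular gap shape, and the 1 = ink / 0 = paper convention are assumptions made for illustration, and only the simplest damage type (a gap) is shown.

```python
import random

def erase_gap(clean, gap_h, gap_w, rng):
    """Erase one rectangular gap from a binary image (1 = ink, 0 = paper).

    Hypothetical helper: returns (damaged, mask), where mask is 1 over the
    whole erased rectangle, mirroring the clean/ damaged/ mask/ folders.
    """
    h, w = len(clean), len(clean[0])
    top = rng.randrange(0, h - gap_h + 1)
    left = rng.randrange(0, w - gap_w + 1)
    damaged = [row[:] for row in clean]      # copy so `clean` stays intact
    mask = [[0] * w for _ in range(h)]
    for y in range(top, top + gap_h):
        for x in range(left, left + gap_w):
            damaged[y][x] = 0                # remove any ink inside the gap
            mask[y][x] = 1                   # record the damaged location
    return damaged, mask

# A tiny 8x8 "image" with one horizontal stroke on row 4.
clean = [[1 if y == 4 else 0 for _ in range(8)] for y in range(8)]
damaged, mask = erase_gap(clean, gap_h=3, gap_w=2, rng=random.Random(0))
```

Outside the mask the damaged image equals the clean one, which is the property a masked loss can rely on during training.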
2. Train the Model
# Full training (recommended)
python train.py \
--data_dir data \
--epochs 200 \
--batch_size 8 \
--image_size 256 \
--generator_type unet \
--gan_type lsgan \
--lr_g 1e-4 \
--lr_d 1e-4
# Quick test run
python train.py \
--data_dir data \
--epochs 5 \
--batch_size 4 \
--generate_data \
--num_train 100 \
--num_val 20
# With on-the-fly damage generation (only needs clean/ directory)
python train.py \
--data_dir data \
--on_the_fly_damage \
--epochs 200
# Resume from checkpoint
python train.py \
--data_dir data \
--resume checkpoints/best_model.pth \
--epochs 300
3. Repair Damaged Images
# With trained GAN (best quality)
python inference.py \
--input damaged_calligraphy.png \
--output repaired.png \
--checkpoint checkpoints/generator_best.pth
# Pathfinding only (no GAN needed, instant)
python inference.py \
--input damaged_calligraphy.png \
--output repaired.png \
--pathfinding_only
# Batch repair a directory
python inference.py \
--input_dir damaged_images/ \
--output_dir repaired_images/ \
--checkpoint checkpoints/generator_best.pth
# Save all intermediate stages for visualization
python inference.py \
--input damaged_calligraphy.png \
--output repaired.png \
--checkpoint checkpoints/generator_best.pth \
--save_stages
Training Recipe
Based on literature from EdgeConnect, pix2pix, DE-GAN, and pix2pixHD:
| Parameter | Value | Source |
|---|---|---|
| Optimizer | Adam(β1=0.0, β2=0.9) | EdgeConnect |
| Learning Rate | 1e-4 (G and D) | EdgeConnect |
| LR Schedule | Constant first half, linear decay second half | pix2pix |
| GAN Type | LSGAN (MSE loss) | pix2pixHD |
| λ_adversarial | 1.0 | EdgeConnect |
| λ_L1 | 100.0 | pix2pix |
| λ_feature_matching | 10.0 | pix2pixHD |
| λ_perceptual | 0.1 | EdgeConnect |
| λ_style | 250.0 | EdgeConnect |
| λ_masked_L1 | 50.0 | DiffHDR |
| Image Size | 256×256 | Standard |
| Batch Size | 8 | EdgeConnect |
| Epochs | 200 | pix2pix |
| Weight Init | Gaussian(0, 0.02) | pix2pix |
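The "constant first half, linear decay second half" schedule in the table can be sketched as a small function. The name `lr_at_epoch` is hypothetical, and the exact decay endpoint (reaching zero only after the last epoch) is an assumption; train.py may implement the schedule differently.

```python
def lr_at_epoch(epoch, total_epochs=200, base_lr=1e-4):
    """Constant for the first half, then linear decay toward zero.

    `epoch` is 0-based. Defaults are taken from the table above; the
    decay endpoint is an assumption of this sketch.
    """
    half = total_epochs // 2
    if epoch < half:
        return base_lr
    # Fraction of the decay phase still remaining, in (0, 1].
    remaining = (total_epochs - epoch) / (total_epochs - half)
    return base_lr * remaining

for e in (0, 99, 100, 150, 199):
    print(e, lr_at_epoch(e))
```

In a PyTorch training loop, the ratio `lr_at_epoch(e) / base_lr` is the kind of multiplicative factor that `torch.optim.lr_scheduler.LambdaLR` expects from its `lr_lambda` callback.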
Loss Functions
The combined generator loss:
L_G = λ_adv   · L_adversarial       (fool the discriminator)
    + λ_L1    · L_L1                (pixel-level reconstruction)
    + λ_FM    · L_feature_matching  (stabilize training via D features)
    + λ_perc  · L_perceptual        (VGG relu1_2, relu2_2, relu3_3, relu4_3)
    + λ_style · L_style             (Gram matrices for texture matching)
    + λ_mask  · L_masked_L1         (focus on damaged regions)
The discriminator loss:
L_D = L_adversarial_D + λ_R1 · L_R1_regularization
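As a worked example of the generator objective, the sketch below combines already-computed loss terms using the weights from the Training Recipe table. The function names, the normalization of the masked L1 term (mean over masked pixels), and the numeric values of the individual terms are all assumptions chosen for illustration; losses.py may differ.

```python
# Weights copied from the Training Recipe table.
WEIGHTS = {
    "adversarial": 1.0,
    "l1": 100.0,
    "feature_matching": 10.0,
    "perceptual": 0.1,
    "style": 250.0,
    "masked_l1": 50.0,
}

def masked_l1(pred, target, mask):
    """Mean absolute error over pixels where mask == 1 (damaged regions)."""
    total = sum(abs(p - t) * m for p, t, m in zip(pred, target, mask))
    count = sum(mask)
    return total / count if count else 0.0

def generator_loss(terms, weights=WEIGHTS):
    """Weighted sum of the individual loss terms (plain floats here)."""
    missing = set(weights) - set(terms)
    if missing:
        raise KeyError(f"missing loss terms: {sorted(missing)}")
    return sum(weights[name] * terms[name] for name in weights)

# Flattened toy "images": only the two masked pixels count toward masked_l1.
pred = [0.0, 0.5, 1.0, 1.0]
target = [0.0, 1.0, 0.0, 1.0]
mask = [0, 1, 1, 0]

# Placeholder values for the other terms, not measured results.
terms = {
    "adversarial": 0.7,
    "l1": 0.05,
    "feature_matching": 0.2,
    "perceptual": 1.5,
    "style": 0.001,
    "masked_l1": masked_l1(pred, target, mask),
}
total = generator_loss(terms)
print(total)
```

With these weights, the masked L1 term dominates the toy total, which illustrates how the λ values steer the generator toward the damaged regions.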
Architecture Details
Generator: U-Net with Dilated Residual Bottleneck
Encoder:
[GatedConv 5→64, 7×7, stride 1]              # Level 0
[GatedConv 64→128, 4×4, stride 2]            # Level 1 (↓2×)
[GatedConv 128→256, 4×4, stride 2]           # Level 2 (↓2×)
Bottleneck:
8× [DilatedResBlock 256→256, dilation=2]     # ~200px receptive field
Decoder:
[ConvTranspose 512→128, 4×4, stride 2]       # Skip from Level 2 (↑2×)
[ConvTranspose 256→64, 4×4, stride 2]        # Skip from Level 1 (↑2×)
[Conv 128→3, 7×7, Tanh]                      # Skip from Level 0
All layers: InstanceNorm + ReLU (encoder: LeakyReLU)
Input: 5 channels (damaged_RGB + stroke_mask + gap_mask)
Output: 3 channels (repaired RGB in [-1, 1])
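The decoder input widths (512, 256, 128) follow from concatenating each decoder input with its encoder skip. The sketch below traces channels and spatial sizes through the listed layers using the standard convolution size formulas. The paddings (3 for 7×7 kernels, 1 for 4×4 kernels) and the layer names are assumptions, picked so that stride-1 layers preserve size and stride-2 layers halve or double it; models.py is the authority on the real values.

```python
def conv_out(size, kernel, stride, padding):
    """Spatial output size of a standard convolution."""
    return (size + 2 * padding - kernel) // stride + 1

def deconv_out(size, kernel, stride, padding):
    """Spatial output size of a transposed convolution (no output padding)."""
    return (size - 1) * stride - 2 * padding + kernel

def trace_generator(size=256):
    """Return [(name, channels, spatial_size), ...] for the listed layers."""
    shapes = [("input", 5, size)]
    e0 = conv_out(size, 7, 1, 3)
    shapes.append(("enc0", 64, e0))
    e1 = conv_out(e0, 4, 2, 1)
    shapes.append(("enc1", 128, e1))
    e2 = conv_out(e1, 4, 2, 1)
    shapes.append(("enc2", 256, e2))
    shapes.append(("bottleneck", 256, e2))   # ResBlocks keep the shape
    # Each decoder layer sees its input concatenated with the matching skip.
    shapes.append(("dec2_in", 256 + 256, e2))
    d2 = deconv_out(e2, 4, 2, 1)
    shapes.append(("dec2", 128, d2))
    shapes.append(("dec1_in", 128 + 128, d2))
    d1 = deconv_out(d2, 4, 2, 1)
    shapes.append(("dec1", 64, d1))
    shapes.append(("dec0_in", 64 + 64, d1))
    out = conv_out(d1, 7, 1, 3)
    shapes.append(("output", 3, out))
    return shapes

for name, channels, spatial in trace_generator(256):
    print(f"{name:10s} {channels:4d} x {spatial} x {spatial}")
```

Under these assumptions a 256×256 input is reduced to 64×64 at the bottleneck and restored to 256×256 at the output, so each skip tensor matches the spatial size of the decoder feature it is concatenated with.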
Discriminator: 70×70 SN-PatchGAN
C64(no norm) → C128 → C256 → C512 → Conv→1
All convs: 4×4, spectral normalized
InstanceNorm on layers 2-4
LeakyReLU(0.2) throughout
Input: 8 channels (damaged + output + masks)
Output: H/16 × W/16 patch predictions
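The "70×70" in the discriminator's name refers to the receptive field of each output prediction. The sketch below computes it for the stride layout of the original pix2pix PatchGAN, which is an assumption here: the strides in models.py may differ, as the H/16 output size above suggests, so treat the numbers as properties of the reference design rather than of this repository.

```python
def receptive_field(layers):
    """Receptive field of a conv stack; layers = [(kernel, stride), ...]."""
    rf, jump = 1, 1          # jump = input-pixel distance between outputs
    for kernel, stride in layers:
        rf += (kernel - 1) * jump
        jump *= stride
    return rf

def output_size(size, layers, padding=1):
    """Spatial size after the stack, assuming the same padding everywhere."""
    for kernel, stride in layers:
        size = (size + 2 * padding - kernel) // stride + 1
    return size

# Stride layout of the original pix2pix 70x70 PatchGAN (assumed here):
# C64, C128, C256 use stride 2; C512 and the final 1-channel conv use stride 1.
PIX2PIX_PATCHGAN = [(4, 2), (4, 2), (4, 2), (4, 1), (4, 1)]

print(receptive_field(PIX2PIX_PATCHGAN))   # 70
print(output_size(256, PIX2PIX_PATCHGAN))  # 30
```

Each output cell therefore judges one overlapping 70×70 patch of the input; in the reference layout a 256×256 image yields a 30×30 grid of such judgments.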
Pathfinding Algorithm
The deterministic Stage 1 pipeline:
- Binarization: Otsu's thresholding (automatic global threshold) + morphological cleanup
- Skeletonization: Zhang-Suen thinning to 1-pixel-wide skeleton
- Thickness Estimation: Distance transform to measure local stroke width
- Endpoint Detection: Find degree-1 nodes (gap openings) via 3×3 convolution kernel
- Direction Analysis: Trace back along skeleton to compute tangent direction at each endpoint
- Endpoint Matching: Score pairs by distance + direction alignment + collinearity; greedy matching
- Gap Bridging:
- A* mode: Cost = distance + direction_continuity + curvature_penalty + ink_proximity
- Bezier mode: Cubic Bezier with control points guided by endpoint tangents
- Stroke Rendering: Variable-width circular brush matching estimated local thickness
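Two of the steps above, endpoint detection and Bezier bridging, can be sketched in plain Python. This is an illustrative reading of the description, not the code in pathfinding.py: the function names, the `pull` factor that places the control points, and the (y, x) coordinate order are assumptions. Endpoint detection is written as an explicit neighbour count, which is what a 3×3 kernel computes.

```python
def find_endpoints(skel):
    """Return (y, x) skeleton pixels with exactly one 8-connected neighbour."""
    h, w = len(skel), len(skel[0])
    endpoints = []
    for y in range(h):
        for x in range(w):
            if not skel[y][x]:
                continue
            neighbours = sum(
                skel[ny][nx]
                for ny in range(max(0, y - 1), min(h, y + 2))
                for nx in range(max(0, x - 1), min(w, x + 2))
                if (ny, nx) != (y, x)
            )
            if neighbours == 1:
                endpoints.append((y, x))
    return endpoints

def bezier_bridge(p0, t0, p3, t3, steps=20, pull=1 / 3):
    """Sample a cubic Bezier from p0 to p3.

    t0 and t3 are unit tangents pointing out of each stroke end into the
    gap; the inner control points sit `pull * gap_length` along them.
    """
    gap = ((p3[0] - p0[0]) ** 2 + (p3[1] - p0[1]) ** 2) ** 0.5
    p1 = (p0[0] + t0[0] * gap * pull, p0[1] + t0[1] * gap * pull)
    p2 = (p3[0] + t3[0] * gap * pull, p3[1] + t3[1] * gap * pull)
    points = []
    for i in range(steps + 1):
        s = i / steps
        a, b = (1 - s) ** 3, 3 * s * (1 - s) ** 2
        c, d = 3 * s * s * (1 - s), s ** 3
        points.append(tuple(
            a * p0[k] + b * p1[k] + c * p2[k] + d * p3[k] for k in range(2)
        ))
    return points

# A horizontal stroke on row 2 with columns 4..7 missing.
skel = [[0] * 12 for _ in range(5)]
for x in list(range(0, 4)) + list(range(8, 12)):
    skel[2][x] = 1

ends = find_endpoints(skel)
# Bridge the two inner endpoints; tangents point into the gap.
bridge = bezier_bridge((2, 3), (0, 1), (2, 8), (0, -1))
```

Note that the outer ends of the stroke are also degree-1 pixels, which is why the pipeline needs a separate matching step to decide which endpoints actually face a gap.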
Project Structure
├── pathfinding.py        # Stage 1: Deterministic gap repair
├── damage_generator.py   # Synthetic training data generation
├── models.py             # cGAN architecture (Generator + Discriminator)
├── losses.py             # Loss functions + metrics
├── dataset.py            # Dataset loader with pathfinding integration
├── train.py              # Training pipeline
├── inference.py          # Inference / repair script
├── requirements.txt      # Dependencies
└── README.md             # This file
Using Your Own Data
Option A: Clean calligraphy images only (recommended)
Place clean calligraphy images in a directory and the system will synthetically damage them:
# Generate paired data from your clean images
python damage_generator.py \
--source_dir /path/to/your/clean/calligraphy/ \
--output_dir data \
--num_train 5000
# Or train with on-the-fly damage
python train.py \
--data_dir data \
--on_the_fly_damage
Option B: Pre-paired damaged/clean images
Organize your data as:
data/
├── train/
│   ├── clean/001.png, 002.png, ...
│   └── damaged/001.png, 002.png, ...
└── val/
    ├── clean/001.png, 002.png, ...
    └── damaged/001.png, 002.png, ...
Filenames must match between clean/ and damaged/.
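Before training on pre-paired data, it can help to confirm that the filenames really do line up. The script below is a small standalone check, not part of the repository; the function name `unmatched_files` is hypothetical, and the `data/train` and `data/val` paths simply follow the layout shown above.

```python
from pathlib import Path

def unmatched_files(split_dir):
    """Return (only_in_clean, only_in_damaged) filename lists for one split."""
    split_dir = Path(split_dir)
    clean = {p.name for p in (split_dir / "clean").iterdir() if p.is_file()}
    damaged = {p.name for p in (split_dir / "damaged").iterdir() if p.is_file()}
    return sorted(clean - damaged), sorted(damaged - clean)

if __name__ == "__main__":
    for split in ("train", "val"):
        split_dir = Path("data") / split
        has_both = (split_dir / "clean").is_dir() and (split_dir / "damaged").is_dir()
        if not has_both:
            print(f"{split}: clean/ or damaged/ not found, skipping")
            continue
        only_clean, only_damaged = unmatched_files(split_dir)
        if only_clean or only_damaged:
            print(f"{split}: missing in damaged/: {only_clean}")
            print(f"{split}: missing in clean/: {only_damaged}")
        else:
            print(f"{split}: all filenames match")
```

Running it from the project root prints one status line per split, or the lists of files that have no partner.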
Tips for Best Results
- More data = better results: Aim for at least 5000 training pairs
- Use your own calligraphy: The model learns styles from training data; train on the style you want to repair
- Pathfinding as baseline: Even without the GAN, Stage 1 gives usable structural repairs
- Monitor training: Check checkpoints/samples/ for visual progress
- Adjust gap distance: --max_gap_distance controls how large a gap the pathfinder will attempt to bridge
- GPU recommended: Training takes ~4-8 hours on a single GPU (RTX 3080 or better)
References
- EdgeConnect: Two-stage edge + completion GAN
- pix2pix: Conditional adversarial image-to-image translation
- DE-GAN: Document enhancement with conditional GAN
- pix2pixHD: High-resolution synthesis with feature matching
- DeepFillv2: Gated convolutions for inpainting
- DiffHDR: Historical document repair with masked perceptual loss
License
MIT