
MidasMap: Automated Immunogold Particle Detection for TEM Synapse Images


The Problem

Neuroscientists use immunogold labeling to visualize receptor proteins at synapses in transmission electron microscopy (TEM) images.

  • 6nm gold beads label AMPA receptors (panAMPA)
  • 12nm gold beads label NR1 (NMDA) receptors
  • 18nm gold beads label vGlut2 (vesicular glutamate transporter)

Manual counting is slow and subjective. Each image takes 30-60 minutes to annotate. With hundreds of synapses per experiment, this becomes a bottleneck.

The Challenge

  • Particles are tiny (4-10 pixels radius) on 2048x2115 images
  • Contrast delta is only 11-39 intensity units on a 0-255 scale
  • Large dark vesicles look similar to gold particles to naive detectors
  • Only 453 labeled particles across 10 training images

Previous Approaches (GoldDigger et al.)

| Approach | Result |
|---|---|
| CenterNet (initial attempt) | "Detection quality remained poor" |
| U-Net heatmap | Macro F1 = 0.005-0.017 |
| GoldDigger/cGAN | "No durable breakthrough" |
| Aggressive filtering | "FP dropped but TP dropped harder" |

Core issues: previous systems failed for three reasons:

  1. Incorrect coordinate conversion (microns treated as normalized values)
  2. Broken loss function (heatmap peaks not exactly 1.0)
  3. Overfitting to fixed training patches

MidasMap Architecture

```
Input: Raw TEM Image (any size)
         |
    [Sliding Window → 512x512 patches]
         |
    ResNet-50 Encoder (pretrained on CEM500K: 500K EM images)
         |
    BiFPN Neck (bidirectional feature pyramid, 2 rounds, 128ch)
         |
    Transposed Conv Decoder → stride-2 output
         |
    +------------------+------------------+
    |                                     |
    Heatmap Head                     Offset Head
    (2ch sigmoid)                    (2ch regression)
    6nm channel                      sub-pixel x,y
    12nm channel                     correction
    |                                     |
    +------------------+------------------+
         |
    Peak Extraction (max-pool NMS)
         |
    Cross-class NMS + Mask Filter
         |
    Output: [(x, y, class, confidence), ...]
```
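The peak-extraction step above (max-pool NMS) can be sketched for a single heatmap channel. This is a minimal illustration, not the project's actual code; `extract_peaks` and its parameters are hypothetical names, and the idea is simply that a pixel survives only if it equals the local maximum of its neighbourhood and clears a confidence threshold:

```python
import numpy as np
from scipy.ndimage import maximum_filter

def extract_peaks(heatmap, threshold=0.3, window=3):
    """Max-pool NMS: keep pixels that are the local maximum of a
    window x window neighbourhood and above the confidence threshold."""
    local_max = maximum_filter(heatmap, size=window, mode="constant")
    peaks = (heatmap == local_max) & (heatmap >= threshold)
    ys, xs = np.nonzero(peaks)
    scores = heatmap[ys, xs]
    return list(zip(xs.tolist(), ys.tolist(), scores.tolist()))

# toy example: one Gaussian bump -> one detection at its centre
hm = np.zeros((8, 8), dtype=np.float32)
hm[3, 4] = 1.0
hm[3, 3] = hm[3, 5] = 0.6
print(extract_peaks(hm))  # -> [(4, 3, 1.0)]
```

In the real pipeline this runs per class channel, and the surviving peaks then pass through cross-class NMS and the mask filter.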

Key Design Decisions

CEM500K Backbone: ResNet-50 pretrained on 500,000 electron microscopy images via self-supervised learning. The backbone already understands EM structures (membranes, vesicles, organelles) before seeing any gold particles. This is why the model reaches F1=0.93 in just 5 epochs.

Stride-2 Output: Standard CenterNet uses stride 4. At stride 4, a 6nm bead (4-6px radius) collapses to 1 pixel — too small to detect reliably. At stride 2, the same bead occupies 2-3 pixels, enough for Gaussian peak detection.

CornerNet Focal Loss: With positive:negative pixel ratio of 1:23,000, standard BCE would learn to predict all zeros. The focal loss uses (1-p)^alpha weighting to focus on hard examples and (1-gt)^beta penalty reduction near peaks.
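A minimal NumPy sketch of the CornerNet-style focal loss described above (the actual `loss.py` implementation is not shown here; `cornernet_focal_loss` is an illustrative name, and alpha=2, beta=4 are the values from the CornerNet paper, assumed rather than confirmed for MidasMap):

```python
import numpy as np

def cornernet_focal_loss(pred, gt, alpha=2.0, beta=4.0, eps=1e-6):
    """CornerNet focal loss on a predicted heatmap.

    pred: sigmoid outputs in (0, 1); gt: Gaussian-splatted targets whose
    particle centres are exactly 1.0. (1-p)^alpha up-weights hard
    positives; (1-gt)^beta down-weights negatives near a peak."""
    pred = np.clip(pred, eps, 1.0 - eps)
    pos = gt == 1.0                      # why Bug 2 below was fatal
    pos_loss = -((1.0 - pred) ** alpha) * np.log(pred) * pos
    neg_loss = -((1.0 - gt) ** beta) * (pred ** alpha) * np.log(1.0 - pred) * (~pos)
    num_pos = max(pos.sum(), 1)          # avoid div-by-zero on all-negative patches
    return (pos_loss.sum() + neg_loss.sum()) / num_pos
```

Normalizing by the positive count rather than the pixel count is what keeps the 1:23,000 imbalance from washing out the signal.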

Raw Image Input: No preprocessing. The CEM500K backbone was trained on raw EM images. Any heavy preprocessing (top-hat, CLAHE) creates a domain gap and hurts performance. The model learns to distinguish particles from vesicles through training data, not handcrafted filters.


Training Strategy

3-Phase Training with Discriminative Learning Rates

| Phase | Epochs | What's Trainable | Learning Rate |
|---|---|---|---|
| 1. Warm-up | 40 | BiFPN + heads only | 1e-3 |
| 2. Deep unfreeze | 40 | + layer3 + layer4 | 1e-5 to 5e-4 |
| 3. Full fine-tune | 60 | All layers | 1e-6 to 2e-4 |

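One common way to realize a discriminative LR range like "1e-5 to 5e-4" is geometric spacing from the deepest unfrozen group to the heads. The exact spacing MidasMap uses is not stated, so this is a sketch under that assumption (`discriminative_lrs` is a hypothetical helper; the resulting dict would typically feed optimizer parameter groups):

```python
def discriminative_lrs(groups, lr_min, lr_max):
    """Assign geometrically spaced learning rates: the deepest group
    gets lr_min, the heads get lr_max, intermediates interpolate."""
    n = len(groups)
    if n == 1:
        return {groups[0]: lr_max}
    ratio = (lr_max / lr_min) ** (1.0 / (n - 1))
    return {g: lr_min * ratio ** i for i, g in enumerate(groups)}

# Phase 2: layer3/layer4 newly unfrozen, LRs spanning 1e-5 -> 5e-4
lrs = discriminative_lrs(["layer3", "layer4", "bifpn", "heads"], 1e-5, 5e-4)
```

The intuition: pretrained deep layers already encode useful EM features, so they move slowly, while the task-specific neck and heads adapt quickly.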
Loss Curve (final model):

```
Phase 1          Phase 2          Phase 3
|                |                |
1.4 |\           |                |
    | \          |                |
1.0 |  \         |                |
    |   ----     |                |
0.8 |       \    |                |
    |        \   |                |
0.6 |         \--+---             |
    |            |   \            |
0.4 |            |    \---        |
    |            |        \-------+---
0.2 |            |                |
    +---+---+----+---+---+----+---+---+--> Epoch
    0   10  20   40  50  60   80  100 140
```

Data Augmentation

  • Random 90-degree rotations (EM is rotation-invariant)
  • Horizontal/vertical flips
  • Conservative brightness/contrast (±8%, preserving the subtle particle signal)
  • Gaussian noise (simulates shot noise)
  • Copy-paste augmentation: real bead crops blended onto training patches
  • 70% hard mining: patches centered on particles, 30% random
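The copy-paste augmentation can be sketched as blending a real bead crop onto a patch with a soft circular mask so no hard edge betrays the paste. The exact blending MidasMap uses is not documented here, so treat `paste_bead` and its alpha value as illustrative assumptions:

```python
import numpy as np

def paste_bead(patch, bead, cx, cy, alpha=0.9):
    """Blend a small square bead crop onto a training patch centred at
    (cx, cy). A circular alpha mask avoids hard square edges that the
    detector could latch onto."""
    h, w = bead.shape
    yy, xx = np.mgrid[0:h, 0:w]
    r = min(h, w) / 2.0
    mask = (((xx - w / 2) ** 2 + (yy - h / 2) ** 2) <= r ** 2).astype(np.float32) * alpha
    y0, x0 = cy - h // 2, cx - w // 2
    region = patch[y0:y0 + h, x0:x0 + w]
    patch[y0:y0 + h, x0:x0 + w] = mask * bead + (1 - mask) * region
    return patch
```

In training, the pasted location would also be added to the patch's ground-truth particle list so the heatmap targets stay consistent.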

Overfitting Prevention

  • Unique patches every epoch: RNG reseeded per sample so the model never sees the same patch twice
  • Early stopping: patience=20 epochs, monitoring validation F1
  • Weight decay: 1e-4 on all parameters

Critical Bugs Found and Fixed

Bug 1: Coordinate Conversion

Problem: CSV files labeled "XY in microns" were assumed to be normalized [0,1] coordinates. They were actual micron values.

Effect: All particle annotations were offset by 50-80 pixels from the real locations. The model was learning to detect particles where none existed.

Fix: Multiply by 1790 px/micron (verified against researcher's color overlay TIFs across 7 synapses).
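The fix is a one-line unit conversion; a minimal sketch (the constant 1790 px/micron comes from the calibration described above, while the function name is illustrative):

```python
PX_PER_MICRON = 1790.0  # calibrated against the researcher's overlay TIFs

def microns_to_pixels(x_um, y_um):
    """CSV coordinates are real micron values, not normalized [0, 1]."""
    return x_um * PX_PER_MICRON, y_um * PX_PER_MICRON

print(microns_to_pixels(0.5, 1.0))  # -> (895.0, 1790.0)
```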

Bug 2: Heatmap Peak Values

Problem: Gaussian peaks were centered at float coordinates, producing peak values of 0.78-0.93 instead of exactly 1.0.

Effect: The CornerNet focal loss uses `pos_mask = (gt == 1.0)` to identify positive pixels. With no pixels at exactly 1.0, the model had zero positive training signal and could not learn at all.

Fix: Center Gaussians at the integer grid point (always produces 1.0). Sub-pixel precision is handled by the offset regression head.
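The fix in sketch form: snap each Gaussian to the nearest integer cell so the peak is exactly 1.0, and keep the discarded fraction as the offset head's regression target (`splat_gaussian` is an illustrative name, not necessarily what `heatmap.py` calls it):

```python
import numpy as np

def splat_gaussian(heatmap, x, y, sigma=2.0):
    """Draw a Gaussian peak centred on the *integer* grid cell so the
    peak value is exactly 1.0, as pos_mask = (gt == 1.0) requires.
    The fractional remainder becomes the offset-head target."""
    cx, cy = int(round(x)), int(round(y))
    h, w = heatmap.shape
    yy, xx = np.mgrid[0:h, 0:w]
    g = np.exp(-((xx - cx) ** 2 + (yy - cy) ** 2) / (2 * sigma ** 2))
    np.maximum(heatmap, g, out=heatmap)   # overlapping particles: keep the max
    offset = (x - cx, y - cy)             # sub-pixel correction target
    return heatmap, offset

hm = np.zeros((16, 16))
hm, off = splat_gaussian(hm, 7.3, 4.8)   # peak value is exactly 1.0
```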

Bug 3: Overfitting on Fixed Patches

Problem: The dataset generated 200 random patches once at initialization. Every epoch replayed the same patches.

Effect: On fast CUDA GPUs, the model memorized all patches in ~17 epochs (loss crashed from 1.6 to 0.002). Validation F1 peaked at 0.66 and degraded.

Fix: Reseed the RNG per `__getitem__` call so every patch is unique.
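A minimal sketch of that fix, assuming a dataset of random crops (class and attribute names are illustrative, not the actual `dataset.py`): seeding a fresh generator from (epoch, index) makes sampling deterministic within an epoch but different across epochs, so no patch is ever replayed.

```python
import numpy as np

class PatchDataset:
    """Reseed the RNG on every __getitem__ call so each epoch samples
    fresh patches instead of replaying a fixed set of 200."""
    def __init__(self, image, patch=512, length=200):
        self.image, self.patch, self.length = image, patch, length
        self.epoch = 0          # bumped externally at the start of each epoch

    def __len__(self):
        return self.length

    def __getitem__(self, idx):
        # unique seed per (epoch, sample) -> never the same patch twice
        rng = np.random.default_rng((self.epoch, idx))
        h, w = self.image.shape
        y = rng.integers(0, h - self.patch)
        x = rng.integers(0, w - self.patch)
        return self.image[y:y + self.patch, x:x + self.patch]
```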


Results

Leave-One-Image-Out Cross-Validation (10 folds, 5 seeds each)

| Fold | Avg F1 | Best F1 | Notes |
|---|---|---|---|
| S27 | 0.990 | 0.994 | |
| S8 | 0.981 | 0.988 | |
| S25 | 0.972 | 0.977 | |
| S29 | 0.956 | 0.966 | |
| S1 | 0.930 | 0.940 | |
| S4 | 0.919 | 0.972 | |
| S22 | 0.907 | 0.938 | |
| S13 | 0.890 | 0.912 | |
| S7 | 0.799 | 1.000 | Only 3 particles (noisy metric) |
| S15 | 0.633 | 0.667 | Only 1 particle (noisy metric) |

Mean F1 = 0.943 (8 folds with sufficient annotations)

Per-class Performance (S1 fold, best threshold)

| Class | Precision | Recall | F1 |
|---|---|---|---|
| 6nm (AMPA) | 0.895 | 1.000 | 0.944 |
| 12nm (NR1) | 0.833 | 1.000 | 0.909 |

100% recall on both classes: every particle is found, and the only errors are false positives.

Generalization to Unseen Images

Tested on 15 completely unseen images from a different imaging session. Detections land correctly on particles with no manual tuning. The model successfully detects both 6nm and 12nm particles on:

  • Wild-type (Wt2) samples
  • Heterozygous (Het1) samples
  • Different synapse regions (D1, E3, S1, S10, S12, S18)

System Components

```
MidasMap/
  config/config.yaml        # All hyperparameters
  src/
    preprocessing.py        # Data loading (10 synapses, 453 particles)
    model.py                # CenterNet: ResNet-50 + BiFPN + heads (24.4M params)
    loss.py                 # CornerNet focal loss + offset regression
    heatmap.py              # GT generation + peak extraction + NMS
    dataset.py              # Patch sampling, augmentation, copy-paste
    postprocess.py          # Mask filter, cross-class NMS
    ensemble.py             # D4 TTA + sliding window inference
    evaluate.py             # Hungarian matching, F1/precision/recall
    visualize.py            # Overlay visualizations
  train.py                  # LOOCV training (--fold, --seed)
  train_final.py            # Final deployable model (all data)
  predict.py                # Inference on new images
  evaluate_loocv.py         # Full evaluation runner
  app.py                    # Gradio web dashboard
  slurm/                    # HPC job scripts
  tests/                    # 36 unit tests
```

Dashboard

MidasMap includes a web-based dashboard (Gradio) for interactive use:

  1. Upload any TEM image (.tif)
  2. Adjust confidence threshold and NMS parameters
  3. View detections overlaid on the image
  4. Inspect per-class heatmaps
  5. Analyze confidence distributions and spatial patterns
  6. Export results as CSV (particle_id, x_px, y_px, x_um, y_um, class, confidence)
```bash
python app.py --checkpoint checkpoints/final/final_model.pth
# Opens at http://localhost:7860
```

Future Directions

  1. Spatial analytics: distance to synaptic cleft, nearest-neighbor analysis, Ripley's K-function
  2. Size regression head: predict actual bead diameter instead of binary classification
  3. 18nm detection: extend to vGlut2 particles (3-class model)
  4. Active learning: flag low-confidence detections for human review
  5. Cross-protocol generalization: fine-tune on cryo-EM or different staining protocols

Technical Summary

  • Model: CenterNet with CEM500K-pretrained ResNet-50, BiFPN neck, stride-2 output
  • Training: 3-phase with discriminative LRs, 140 epochs, 453 particles / 10 images
  • Evaluation: Leave-one-image-out CV, Hungarian matching, F1 = 0.943
  • Inference: Sliding window (512x512, 128px overlap), ~10s per image on GPU
  • Output: Per-particle (x, y, class, confidence) with optional heatmap visualization
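The Hungarian-matching evaluation can be sketched with `scipy.optimize.linear_sum_assignment`: predictions and ground truth are matched to minimize total centre distance, and a match counts as a true positive only within a distance threshold. The matching radius `evaluate.py` actually uses is not stated, so 10 px here is illustrative:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def match_f1(pred, gt, max_dist=10.0):
    """F1 from Hungarian matching of predicted vs ground-truth centres.
    A matched pair is a true positive if it lies within max_dist px."""
    if not pred or not gt:
        return 0.0
    cost = np.linalg.norm(
        np.asarray(pred)[:, None, :] - np.asarray(gt)[None, :, :], axis=-1)
    rows, cols = linear_sum_assignment(cost)      # optimal 1-to-1 matching
    tp = int((cost[rows, cols] <= max_dist).sum())
    precision = tp / len(pred)
    recall = tp / len(gt)
    return 2 * precision * recall / (precision + recall) if tp else 0.0

preds = [(100.0, 100.0), (300.0, 300.0), (50.0, 40.0)]   # one false positive
gts = [(102.0, 101.0), (298.0, 305.0)]
print(match_f1(preds, gts))   # 2 TPs, 1 FP, 0 FNs
```

One-to-one matching is what prevents a single prediction from "claiming" several nearby ground-truth particles, which would inflate recall.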