TubuleSegmentation v1.0.0 — Seminiferous tubule segmentation and morphometry (mouse testis, H&E)

Semantic segmentation model for seminiferous tubules in H&E sections of mouse testis (Mus musculus, CF-1 strain). Segments 3 classes (0 = background, 1 = epithelium, 2 = lumen) and derives calibrated morphometric metrics. Architecture: EfficientNet-B4 encoder + UNet with a dual decoder (segmentation + boundary) and SCSE attention. Input 512×512, ImageNet normalization, scale 0.32 µm/px.
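
The input conventions above (512×512 input, ImageNet normalization, 0.32 µm/px) can be sketched as follows. This is a minimal illustration, not the released preprocessing code: the function names are hypothetical, the image is assumed to be an RGB uint8 array already resized to 512×512, and the mean/std values are the standard ImageNet channel statistics.

```python
import numpy as np

# Standard ImageNet channel statistics (RGB order, for inputs scaled to [0, 1]).
IMAGENET_MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32)
IMAGENET_STD = np.array([0.229, 0.224, 0.225], dtype=np.float32)
UM_PER_PX = 0.32  # scale stated in this model card


def normalize_imagenet(rgb_uint8):
    """Hypothetical helper: HxWx3 uint8 RGB -> 3xHxW float32, ImageNet-normalized."""
    if rgb_uint8.shape != (512, 512, 3):
        raise ValueError("expected a 512x512 RGB image")
    x = rgb_uint8.astype(np.float32) / 255.0
    x = (x - IMAGENET_MEAN) / IMAGENET_STD
    return np.transpose(x, (2, 0, 1))  # channels-first layout


def px_to_um(length_px):
    """Convert a length in pixels to micrometres."""
    return length_px * UM_PER_PX


def px2_to_um2(area_px):
    """Convert an area in pixels to square micrometres."""
    return area_px * UM_PER_PX ** 2
```

Lengths scale with 0.32 and areas with 0.32² = 0.1024, so a mask of 10,000 px corresponds to 1,024 µm².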

This release (v1.0.0) is the result of ~23 internal iterations of architecture and training design.

1. Segmentation performance (validation, n = 49)

Class	IoU	Dice / F1
Background (0)	0.989	0.994 †
Epithelium (1)	0.935	0.966 †
Lumen (2)	0.931	0.964 †
Mean	mIoU 0.952	Dice 0.975 †

Best checkpoint: epoch 47

All metrics are reported on the validation set, never on the training set.
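
For reference, per-class IoU and Dice can be computed from a predicted and a reference label map as sketched below. The function name and the toy arrays are illustrative assumptions, not part of the released evaluation code, and how the values are aggregated over the 49 validation images (per image or pooled) is not stated here. For a single class, Dice = 2·IoU / (1 + IoU), which the per-class rows of the table satisfy (for example 2 × 0.935 / 1.935 ≈ 0.966).

```python
import numpy as np


def per_class_iou_dice(pred, target, num_classes=3):
    """Return {class_id: (iou, dice)} for two integer label maps of equal shape."""
    scores = {}
    for c in range(num_classes):
        p = pred == c
        t = target == c
        inter = np.logical_and(p, t).sum()
        union = np.logical_or(p, t).sum()
        total = p.sum() + t.sum()
        iou = inter / union if union else float("nan")
        dice = 2 * inter / total if total else float("nan")
        scores[c] = (float(iou), float(dice))
    return scores


# Toy 4x4 example: 0 = background, 1 = epithelium, 2 = lumen.
target = np.array([[0, 0, 0, 0],
                   [0, 1, 1, 0],
                   [0, 1, 2, 0],
                   [0, 0, 0, 0]])
pred = target.copy()
pred[0, 1] = 1  # one background pixel mislabelled as epithelium

scores = per_class_iou_dice(pred, target)
```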

2. Agreement with manual measurement (ImageJ, gold standard) — validation, n = 39

Method comparison (predicted vs. manual freehand tracing). Bias = predicted − manual. 95% LoA = Bland-Altman limits of agreement.

Tubule

Metric	Pearson r	CCC (Lin)	Bias	Bias %	95% LoA	MAE
Area (µm²)	0.996	0.78	+4043.8	+11.9%	[2379.6, 5708.0]	4043.8
Major axis (µm)	0.996	0.83	+14.29	+6.2%	[8.71, 19.86]	14.29
Minor axis (µm)	0.995	0.80	+11.71	+6.2%	[8.03, 15.38]	11.71
Max Feret (µm)	0.997	0.85	+13.50	+5.8%	[8.15, 18.85]	13.50
Min Feret (µm)	0.995	0.82	+11.05	+5.9%	[7.22, 14.89]	11.05
Perimeter (µm)	0.996	0.79	+37.73	+5.7%	[25.38, 50.07]	37.73
Aspect ratio	0.999	1.00	−0.0001	+0.4%	[−0.013, 0.013]	0.005
Roundness	0.997	0.99	−0.007	+0.9%	[−0.020, 0.007]	0.007

Lumen

Metric	Pearson r	CCC (Lin)	Bias	Bias %	95% LoA	MAE
Area (µm²)	0.993	0.95	+1080.3	+8.1%	[8.5, 2152.2]	1080.3
Major axis (µm)	0.970	0.83	+12.19	+8.3%	[1.43, 22.96]	12.19
Minor axis (µm)	0.988	0.88	+9.86	+8.7%	[3.59, 16.14]	9.86
Max Feret (µm)	0.974	0.96	+4.51	+3.7%	[−6.51, 15.54]	6.36
Min Feret (µm)	0.984	0.98	+2.70	+3.1%	[−4.88, 10.28]	4.02
Perimeter (µm)	0.926	0.93	−4.06	+6.0%	[−113.88, 105.77]	41.90
Aspect ratio	0.972	0.97	−0.009	+3.0%	[−0.118, 0.101]	0.040
Roundness	0.931	0.80	−0.061	+8.0%	[−0.147, 0.024]	0.062
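
The columns in the two tables above follow standard definitions, sketched below for paired predicted and manual measurements. This is an illustrative implementation with made-up numbers, not the script that produced the tables; the function name is hypothetical, and because the exact definition of the Bias % column is not specified in this card, that column is omitted.

```python
import numpy as np


def agreement_stats(pred, manual):
    """Pearson r, Lin's CCC, bias, 95% limits of agreement and MAE for paired data."""
    pred = np.asarray(pred, dtype=float)
    manual = np.asarray(manual, dtype=float)
    diff = pred - manual                        # bias convention: predicted - manual
    bias = diff.mean()
    sd = diff.std(ddof=1)                       # sample SD of the differences
    loa = (bias - 1.96 * sd, bias + 1.96 * sd)  # 95% Bland-Altman limits
    r = np.corrcoef(pred, manual)[0, 1]         # Pearson correlation
    # Lin's concordance correlation coefficient (population moments).
    cov = np.mean((pred - pred.mean()) * (manual - manual.mean()))
    ccc = 2 * cov / (pred.var() + manual.var() + (pred.mean() - manual.mean()) ** 2)
    mae = np.abs(diff).mean()
    return {"r": r, "ccc": ccc, "bias": bias, "loa": loa, "mae": mae}


# Made-up example: predictions equal the manual values plus a constant offset.
manual = np.array([100.0, 150.0, 200.0, 250.0, 300.0])
pred = manual + 10.0
stats = agreement_stats(pred, manual)
```

With a constant offset, Pearson r stays at 1 while CCC drops below 1, because CCC penalizes the shift in means. This is the pattern seen in the tubule size metrics: high r, lower CCC, positive bias.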

Shape vs. size: in the tubule, the dimensionless shape descriptors (aspect ratio, roundness) reach CCC ≥ 0.99; in the lumen, aspect ratio reaches CCC 0.97 while roundness is lower (CCC 0.80, bias −0.061). For the tubule, the ~12% area bias is therefore an approximately uniform overestimation that preserves shape rather than distorting it; for the lumen, the ~8% area bias comes with a small reduction in roundness. Per-metric Bland-Altman plots are in imagej_validation/.
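
The size-versus-shape argument can be checked with a short calculation. The sketch below scales an ellipse uniformly and recomputes area, aspect ratio, and roundness using the ImageJ-style definitions (aspect ratio = major / minor, roundness = 4·area / (π·major²)); that these are the exact definitions used for the tables is an assumption. The axis lengths and the 6% linear factor are hypothetical values chosen to resemble the tubule rows.

```python
import math


def ellipse_metrics(major, minor):
    """Area, aspect ratio and roundness of an ellipse with the given full axes."""
    area = math.pi * major * minor / 4.0
    aspect_ratio = major / minor
    roundness = 4.0 * area / (math.pi * major ** 2)
    return area, aspect_ratio, roundness


manual = ellipse_metrics(230.0, 190.0)   # hypothetical manual trace, axes in µm
scale = 1.06                             # ~6% uniform linear overestimation
pred = ellipse_metrics(230.0 * scale, 190.0 * scale)

# Area grows with the square of the linear factor: 1.06**2 - 1 = 12.36%.
area_bias_pct = 100.0 * (pred[0] / manual[0] - 1.0)
```

A uniform ~6% increase in linear dimensions gives a ~12% increase in area while leaving aspect ratio and roundness unchanged, which matches the tubule rows (+5.7% to +6.2% on lengths, +11.9% on area).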

3. Data and training

Item	Value
Tissue / species	Mouse testis (Mus musculus), CF-1 strain; seminiferous tubules, H&E
Dataset	`LuGot16/tubules` (HuggingFace)
Train	273 images — 11 animals, 2 slides per animal
Train / Validation	273 / 49 images, random 85/15 split (seed 42)
Split level	Image-level random split (not grouped by animal)
Scale	0.32 µm/px
Input	512×512, ImageNet normalization
Classes	0 = background, 1 = epithelium, 2 = lumen
Augmentation	Macenko (stain normalization) + geometric
Post-processing	largest connected component → morphological closing → hole filling → lumen cleanup
Inference	8× TTA (4 rotations × 2 flips)
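
The 8× TTA row corresponds to the eight symmetries of a square (4 rotations × 2 flip states). A minimal sketch follows, assuming a hypothetical `predict_fn` that maps an H×W×3 image to C×H×W class probabilities. It averages the de-augmented predictions and is not taken from the released inference code.

```python
import numpy as np


def predict_tta(image, predict_fn):
    """Average predictions over 4 rotations x 2 flip states (8 views).

    image: HxWx3 array with H == W.
    predict_fn: hypothetical callable returning a CxHxW array for an HxWx3 input.
    """
    acc = None
    for flip in (False, True):
        for k in range(4):
            view = image[:, ::-1] if flip else image   # optional horizontal flip
            view = np.rot90(view, k, axes=(0, 1))      # rotate by k * 90 degrees
            probs = predict_fn(np.ascontiguousarray(view))
            probs = np.rot90(probs, -k, axes=(1, 2))   # undo the rotation
            if flip:
                probs = probs[:, :, ::-1]              # undo the flip
            acc = probs if acc is None else acc + probs
    return acc / 8.0
```

Each prediction is mapped back to the original orientation before averaging, so the inverse transforms are applied in reverse order (rotation first, then flip).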

4. Robustness, known biases, and scope

Generalization (animals / stain intensities / fixative):

Note on validation: the train/validation split is random at the image level, not grouped by animal — so images from the same animal may appear in both sets. The validation metrics in Sections 1–2 may therefore be optimistic with respect to generalization to fully unseen animals. The strongest evidence for cross-animal generalization is the external test below (129 images from other animals and experiments, with different stains and fixatives).
External test: 129 images from other animals and different experiments (the training set comprises 11 animals across 3 experiments; the external images come from outside that set), with widely varying stain intensities and different fixatives. Satisfactory segmentation on 128/129; the single failure was a patch of extremely thin epithelium left unsegmented in one image. Macenko stain normalization during training contributes to this tolerance to stain and fixation variation.
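
The leakage concern raised in the note on validation can be avoided with a split grouped by animal. The sketch below shows one possible approach in plain Python; it is not what was used for this release (which used an image-level random split), and the record layout and the `animal_id` field are hypothetical.

```python
import random
from collections import defaultdict


def split_by_animal(records, val_fraction=0.15, seed=42):
    """Split records so that no animal contributes images to both sets.

    records: list of dicts with a hypothetical "animal_id" key.
    val_fraction applies to animals, so image counts are only approximate.
    """
    by_animal = defaultdict(list)
    for rec in records:
        by_animal[rec["animal_id"]].append(rec)
    animals = sorted(by_animal)
    random.Random(seed).shuffle(animals)
    n_val = max(1, round(len(animals) * val_fraction))
    val_animals = set(animals[:n_val])
    train = [r for a in animals if a not in val_animals for r in by_animal[a]]
    val = [r for a in sorted(val_animals) for r in by_animal[a]]
    return train, val


# Toy data: 11 animals with 4 images each.
records = [{"animal_id": a, "image": f"animal{a}_img{i}.png"}
           for a in range(11) for i in range(4)]
train, val = split_by_animal(records)
```

With 11 animals, a 15% validation fraction holds out 2 animals, so validation metrics from such a split would reflect performance on animals the model has not seen.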

Known biases:

Systematic area overestimation: ~+12% (tubule) and ~+8% (lumen) vs. manual tracing. The offset is consistent across samples rather than noisy (r ≈ 0.99) and is attributable to the boundary convention: the model mask is slightly more generous than the manual freehand trace. It preserves tubule shape (AR/roundness CCC ≈ 0.99); lumen roundness agrees less well (CCC 0.80). Do not correct with a factor derived from the validation set itself.
Lumen perimeter: good agreement on average (CCC 0.93, bias −4.06 µm) but wide LoA (about ±110 µm around the bias); individual differences vary case by case due to tracing convention.

Validated scope: single-tubule crops at 0.32 µm/px, H&E staining. Not validated on multi-tubule fields or other magnifications.

Attribution

Dataset curated by Lucila Gotfryd (image acquisition, annotation, design of the segmentation and morphometry approach, including the anatomically-motivated containment constraint and the choice to enforce tubular connectivity). Model implementation carried out with AI coding assistance under the author's direction.

license: apache-2.0
pipeline_tag: image-segmentation
