Introduction
This is a U-Net segmentation model as described in chapter 8 of the book "Hands-On Neural Networks with TensorFlow 2.0", together with its training and evaluation procedures. The book proposes a (very) simplified version of the original U-Net architecture and does not use a pre-trained backbone; as a consequence, the model is trained from scratch and yields relatively poor performance. The main objective here is not performance but to study the model architecture: an architecture that produces an output with the same resolution as the input, trained entirely from scratch.
I introduced some small improvements in the model definition, by adding regularisation and batch normalisation, and in the training procedure, by adding a learning rate callback.
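As a rough illustration, here is a minimal sketch of such a regularised convolution block; the function name `conv_block`, the L2 factor and the filter counts are my own assumptions, not the notebook's exact code:

```python
import tensorflow as tf
from tensorflow.keras import layers, regularizers

def conv_block(x, filters, l2_factor=1e-4):
    """Two 3x3 convolutions with L2 weight regularisation and batch normalisation."""
    for _ in range(2):
        x = layers.Conv2D(filters, 3, padding="same",
                          kernel_regularizer=regularizers.l2(l2_factor))(x)
        x = layers.BatchNormalization()(x)
        x = layers.ReLU()(x)
    return x
```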
I share the weights here for anyone interested in studying them.
Disclaimer
This notebook is shared for study purposes and the model produced by the following code should not be used for real applications. The model definition does not use a pre-trained backbone, so the training is done from scratch on a small dataset. Without a pre-trained backbone, this dataset does not provide enough information to train a model with good performance.
Training procedure used
I used the Oxford-IIIT Pet dataset to train the model.
The "UNET" model takes in input color images of size 128x128, and produce in outputs segmentation masks of size 128x128, with 3 classes (background, pet border, pet).
The data pipelines use a batch size of 128 images and are augmented with random horizontal flips. The optimiser is Adam (with default Keras parameters) and the loss is sparse categorical cross-entropy.
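A sketch of how such a pipeline and compilation step can look with `tf.data` and TensorFlow Datasets; the label remapping and the `build_unet` helper come from the sketches above and are assumptions, so the notebook's exact preprocessing may differ:

```python
import tensorflow as tf
import tensorflow_datasets as tfds

BATCH_SIZE = 128
IMG_SIZE = 128

def preprocess(sample):
    # Resize image and mask to 128x128; the TFDS mask labels are {1, 2, 3},
    # shifted here to {0, 1, 2} (the notebook's exact remapping may differ).
    image = tf.image.resize(sample["image"], (IMG_SIZE, IMG_SIZE)) / 255.0
    mask = tf.image.resize(sample["segmentation_mask"], (IMG_SIZE, IMG_SIZE),
                           method="nearest") - 1
    return image, mask

def augment(image, mask):
    # Random horizontal flip applied consistently to image and mask.
    if tf.random.uniform(()) > 0.5:
        image = tf.image.flip_left_right(image)
        mask = tf.image.flip_left_right(mask)
    return image, mask

train_ds = (tfds.load("oxford_iiit_pet", split="train")
            .map(preprocess, num_parallel_calls=tf.data.AUTOTUNE)
            .map(augment, num_parallel_calls=tf.data.AUTOTUNE)
            .batch(BATCH_SIZE)
            .prefetch(tf.data.AUTOTUNE))

model = build_unet()  # hypothetical builder from the sketch above
model.compile(
    optimizer=tf.keras.optimizers.Adam(),  # default Keras parameters
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics=["accuracy"],
)
```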
The model is trained for 100 epochs with a learning rate callback: see the notebook for details.
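The exact schedule lives in the notebook; as one plausible example, a `ReduceLROnPlateau` callback wired into `model.fit` could look like this (`model` and `train_ds` refer to the sketches above):

```python
import tensorflow as tf

# One possible learning rate callback; the actual schedule used in the
# notebook may differ (see the notebook for the exact settings).
lr_callback = tf.keras.callbacks.ReduceLROnPlateau(
    monitor="loss", factor=0.5, patience=5, min_lr=1e-5, verbose=1
)

history = model.fit(
    train_ds,
    epochs=100,
    callbacks=[lr_callback],
)
```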