|
--- |
|
license: afl-3.0 |
|
metrics: |
|
- accuracy |
|
pipeline_tag: image-segmentation |
|
--- |
|
# Model Card for UNet_USC_TIMIT
|
|
|
This U-Net model classifies each pixel of an rtMRI video as either air or tissue, yielding the air-tissue boundaries.
|
|
|
### Model Description |
|
The model uses a U-Net architecture with three decoder branches, each consisting of convolutional and upsampling layers.
|
The encoder consists of convolutional and downsampling layers, followed by a bottleneck layer. |
|
The three decoder branches share the same encoder and bottleneck layers, but have different upsampling and convolutional layers. |
|
Each decoder branch produces a mask for a different class, with the final output being a tensor of shape (batch_size, height, width, n_labels).
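
A minimal Keras sketch of this layout follows; the filter counts, network depth, and 64x64 input size are illustrative assumptions, not the exact configuration of the released weights.

```python
from tensorflow.keras import layers, Model

def conv_block(x, filters):
    # Two 3x3 convolutions, as in a standard U-Net stage.
    x = layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
    return layers.Conv2D(filters, 3, padding="same", activation="relu")(x)

def decoder_branch(bottleneck, skips, filters=(128, 64, 32)):
    # Each branch has its own upsampling and convolutional layers.
    x = bottleneck
    for f, skip in zip(filters, reversed(skips)):
        x = layers.UpSampling2D(2)(x)
        x = layers.Concatenate()([x, skip])
        x = conv_block(x, f)
    # One sigmoid mask per branch.
    return layers.Conv2D(1, 1, activation="sigmoid")(x)

def build_unet(input_shape=(64, 64, 1), n_labels=3):
    inputs = layers.Input(input_shape)
    # Shared encoder: convolution + downsampling, keeping skip connections.
    skips, x = [], inputs
    for f in (32, 64, 128):
        x = conv_block(x, f)
        skips.append(x)
        x = layers.MaxPooling2D(2)(x)
    x = conv_block(x, 256)  # shared bottleneck
    # Three decoder branches that share the encoder and bottleneck,
    # producing one mask per class.
    masks = [decoder_branch(x, skips) for _ in range(n_labels)]
    outputs = layers.Concatenate(axis=-1)(masks)  # (batch, H, W, n_labels)
    return Model(inputs, outputs)
```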
|
|
|
|
|
- **Developed by:** Vinayaka Hegde, during an internship at the Signal Processing Interpretation and Representation (SPIRE) Lab, Indian Institute of Science, Bengaluru
|
- **Model type:** U-Net |
|
- **Language(s) (NLP):** N/A |
|
- **License:** AFL 3.0 (afl-3.0)
|
- **Finetuned from model:** N/A
|
|
|
### Model Sources
|
|
|
- **Repository:** vinster619/UNet_USC_TIMIT |
|
|
|
|
## Uses |
|
This pre-trained U-Net model was trained on videos 342 and 391 from each speaker in the 10-speaker USC-TIMIT corpus (20 videos in total).
|
The model is designed to classify each pixel in an rtMRI video as either air or tissue. |
|
Three distinct masks were used to train the model. |
|
|
|
### Direct Use |
|
|
|
Three binary segmentation masks, and their corresponding contours (the air-tissue boundaries), can be obtained for any rtMRI video in the USC-TIMIT corpus, as illustrated in the sketch below.
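
As an illustration, this hedged sketch extracts boundary contours from the predicted masks with OpenCV; the 0.5 threshold and the helper name are assumptions.

```python
import cv2
import numpy as np

def masks_to_contours(frame, model, threshold=0.5):
    # frame: (H, W) grayscale rtMRI frame, normalized to [0, 1].
    pred = model.predict(frame[None, ..., None])[0]   # (H, W, n_labels)
    contours_per_mask = []
    for i in range(pred.shape[-1]):
        # Binarize each class mask, then trace its outer boundary.
        binary = (pred[..., i] > threshold).astype(np.uint8)
        contours, _ = cv2.findContours(
            binary, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)
        contours_per_mask.append(contours)
    return contours_per_mask
```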
|
|
|
|
|
### Downstream Use [optional] |
|
|
|
This model can be adapted to subjects from other rtMRI datasets by fine-tuning on approximately 10-15 annotated frames of each new subject on which segmentation is to be performed, as sketched below.
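
A hedged sketch of such a fine-tuning run, reusing `build_unet` from the architecture sketch above; the weight filename, data files, learning rate, and epoch count are assumptions.

```python
import numpy as np
import tensorflow as tf

model = build_unet()                      # architecture sketch above
model.load_weights("unet_weights.h5")     # filename is an assumption

# ~10-15 manually annotated frames of the new subject (filenames are
# placeholders): new_frames (N, 64, 64, 1) in [0, 1], new_masks
# (N, 64, 64, 3) binary.
new_frames = np.load("new_subject_frames.npy")
new_masks = np.load("new_subject_masks.npy")

# A small learning rate helps adapt to the new subject without
# forgetting the USC-TIMIT speakers.
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=1e-4),
              loss="binary_crossentropy")
model.fit(new_frames, new_masks, batch_size=8, epochs=10)
```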
|
|
|
### Out-of-Scope Use |
|
|
|
The model performs accurate segmentation **only** on videos from the USC-TIMIT corpus. To segment videos of subjects from other rtMRI datasets, fine-tuning on frames from the new subject is required.
|
|
|
## How to Get Started with the Model |
|
|
|
Run `inference.py` to load the weights uploaded to this repository and obtain an output video with the segmented air-tissue boundaries.
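
For reference, a minimal sketch of what per-frame inference involves is shown below; `inference.py` in this repository is the supported entry point and handles these details. The weight filename, the video filename, and the 64x64 input size are assumptions.

```python
import cv2
import numpy as np

model = build_unet()                       # architecture sketch above
model.load_weights("unet_weights.h5")      # filename is an assumption

cap = cv2.VideoCapture("subject_video.avi")  # placeholder rtMRI video
masks = []
while True:
    ok, frame = cap.read()
    if not ok:
        break
    # Convert to normalized grayscale and match the model input size.
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY).astype(np.float32) / 255.0
    gray = cv2.resize(gray, (64, 64))
    masks.append(model.predict(gray[None, ..., None])[0])
cap.release()
```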
|
|
|
## Training Details |
|
|
|
- **Data:** USC-TIMIT Corpus (https://sail.usc.edu/span/usc-timit/)

- **Training set:** 2 videos per subject from each of the 10 subjects (20 videos)

- **Validation set:** 1 video per subject from each of the 10 subjects (10 videos)

- **Model architecture:** U-Net with a shared encoder and bottleneck and three decoder branches (see Model Description)

- **Optimizer:** Adam

- **Loss function:** Binary cross-entropy

- **Epochs:** 30, with EarlyStopping

- **Batch size:** 8

- **Evaluation metrics:** Pixel classification accuracy, Dice coefficient

- **Validation split:** 1 of every 3 videos per subject (the train_matrix / val_matrix split)

- **Hardware:** NVIDIA GeForce RTX 4060 Laptop GPU
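
A minimal Keras sketch mirroring this configuration is shown below. Optimizer, loss, epochs, and batch size follow the card; the Dice implementation and EarlyStopping patience are assumptions, and `train_masks` / `val_masks` stand in for the ground-truth mask arrays.

```python
import tensorflow as tf
from tensorflow.keras import backend as K

def dice_coefficient(y_true, y_pred, smooth=1.0):
    # Dice = 2|A intersect B| / (|A| + |B|), smoothed for stability.
    y_true_f = K.flatten(y_true)
    y_pred_f = K.flatten(y_pred)
    intersection = K.sum(y_true_f * y_pred_f)
    return (2.0 * intersection + smooth) / (
        K.sum(y_true_f) + K.sum(y_pred_f) + smooth)

model = build_unet()  # architecture sketch above
model.compile(optimizer="adam",
              loss="binary_crossentropy",
              metrics=["accuracy", dice_coefficient])

# Stop early once validation loss plateaus (patience is an assumption).
early_stop = tf.keras.callbacks.EarlyStopping(
    monitor="val_loss", patience=5, restore_best_weights=True)

model.fit(train_matrix, train_masks,
          validation_data=(val_matrix, val_masks),
          epochs=30, batch_size=8, callbacks=[early_stop])
```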
|
|
|
## Model Card Authors |
|
Vinayaka Hegde |
|
|
|
## Model Card Contact |
|
vinayakahegde619@gmail.com |