|
--- |
|
license: afl-3.0 |
|
metrics: |
|
- accuracy |
|
pipeline_tag: image-segmentation |
|
--- |
|
# Model Card for UNet_USC_TIMIT
|
|
|
This U-Net model classifies each pixel of an rtMRI video as either air or tissue, yielding the air-tissue boundaries.
|
|
|
### Model Description |
|
The model uses a U-Net architecture with three decoder branches, each consisting of convolutional and upsampling layers.
|
The encoder consists of convolutional and downsampling layers, followed by a bottleneck layer. |
|
The three decoder branches share the same encoder and bottleneck layers, but have different upsampling and convolutional layers. |
|
Each decoder branch produces a mask for a different class, with the final output being a tensor of shape (batch_size, height, width, n_labels).
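
A minimal Keras sketch of this layout follows; the filter counts, network depth, and 64x64 input size are illustrative assumptions, not the exact configuration of the released weights.

```python
from tensorflow.keras import layers, Model

def conv_block(x, filters):
    # Two 3x3 convolutions, as in a standard U-Net stage.
    x = layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
    return layers.Conv2D(filters, 3, padding="same", activation="relu")(x)

def decoder_branch(bottleneck, skips, filters=(128, 64, 32)):
    # Each branch has its own upsampling and convolutional layers.
    x = bottleneck
    for f, skip in zip(filters, reversed(skips)):
        x = layers.UpSampling2D(2)(x)
        x = layers.Concatenate()([x, skip])
        x = conv_block(x, f)
    # One sigmoid mask per branch.
    return layers.Conv2D(1, 1, activation="sigmoid")(x)

def build_unet(input_shape=(64, 64, 1), n_labels=3):
    inputs = layers.Input(input_shape)
    # Shared encoder: convolution + downsampling, keeping skip connections.
    skips, x = [], inputs
    for f in (32, 64, 128):
        x = conv_block(x, f)
        skips.append(x)
        x = layers.MaxPooling2D(2)(x)
    x = conv_block(x, 256)  # shared bottleneck
    # Three decoder branches that share the encoder and bottleneck,
    # producing one mask per class.
    masks = [decoder_branch(x, skips) for _ in range(n_labels)]
    outputs = layers.Concatenate(axis=-1)(masks)  # (batch, H, W, n_labels)
    return Model(inputs, outputs)
```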
|
|
|
|
|
- **Developed by:** Vinayaka Hegde, during an internship at the Signal Processing Interpretation and Representation (SPIRE) Lab, Indian Institute of Science, Bengaluru
|
- **Model type:** U-Net |
|
- **Language(s) (NLP):** N/A |
|
- **License:** AFL 3.0 (afl-3.0)
|
- **Finetuned from model:** N/A
|
|
|
### Model Sources
|
|
|
- **Repository:** vinster619/UNet_USC_TIMIT |
|
|
|
|
## Uses |
|
This pre-trained U-Net model was trained on videos 342 and 391 from each speaker in the 10-speaker USC-TIMIT corpus (20 videos in total).
|
The model is designed to classify each pixel in an rtMRI video as either air or tissue. |
|
Three distinct masks were used to train the model. |
|
|
|
### Direct Use |
|
|
|
Three binary segmentation masks, and their corresponding contours (the air-tissue boundaries), can be obtained for any rtMRI video in the USC-TIMIT corpus, as illustrated in the sketch below.
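
As an illustration, this hedged sketch extracts boundary contours from the predicted masks with OpenCV; the 0.5 threshold and the helper name are assumptions.

```python
import cv2
import numpy as np

def masks_to_contours(frame, model, threshold=0.5):
    # frame: (H, W) grayscale rtMRI frame, normalized to [0, 1].
    pred = model.predict(frame[None, ..., None])[0]   # (H, W, n_labels)
    contours_per_mask = []
    for i in range(pred.shape[-1]):
        # Binarize each class mask, then trace its outer boundary.
        binary = (pred[..., i] > threshold).astype(np.uint8)
        contours, _ = cv2.findContours(
            binary, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)
        contours_per_mask.append(contours)
    return contours_per_mask
```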
|
|
|
|
|
### Downstream Use [optional] |
|
|
|
This model can be adapted to subjects from other rtMRI datasets by fine-tuning on approximately 10-15 annotated frames of each new subject on which segmentation is to be performed, as sketched below.
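
A hedged sketch of such a fine-tuning run, reusing `build_unet` from the architecture sketch above; the weight filename, data files, learning rate, and epoch count are assumptions.

```python
import numpy as np
import tensorflow as tf

model = build_unet()                      # architecture sketch above
model.load_weights("unet_weights.h5")     # filename is an assumption

# ~10-15 manually annotated frames of the new subject (filenames are
# placeholders): new_frames (N, 64, 64, 1) in [0, 1], new_masks
# (N, 64, 64, 3) binary.
new_frames = np.load("new_subject_frames.npy")
new_masks = np.load("new_subject_masks.npy")

# A small learning rate helps adapt to the new subject without
# forgetting the USC-TIMIT speakers.
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=1e-4),
              loss="binary_crossentropy")
model.fit(new_frames, new_masks, batch_size=8, epochs=10)
```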
|
|
|
### Out-of-Scope Use |
|
|
|
The model performs accurate segmentation **only** on videos from the USC-TIMIT corpus. To segment videos of subjects from other rtMRI datasets, fine-tuning on frames from the new subject is required.
|
|
|
## How to Get Started with the Model |
|
|
|
Run `inference.py` to load the weights uploaded to this repository and obtain an output video with the segmented air-tissue boundaries.
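
For reference, a minimal sketch of what per-frame inference involves is shown below; `inference.py` in this repository is the supported entry point and handles these details. The weight filename, the video filename, and the 64x64 input size are assumptions.

```python
import cv2
import numpy as np

model = build_unet()                       # architecture sketch above
model.load_weights("unet_weights.h5")      # filename is an assumption

cap = cv2.VideoCapture("subject_video.avi")  # placeholder rtMRI video
masks = []
while True:
    ok, frame = cap.read()
    if not ok:
        break
    # Convert to normalized grayscale and match the model input size.
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY).astype(np.float32) / 255.0
    gray = cv2.resize(gray, (64, 64))
    masks.append(model.predict(gray[None, ..., None])[0])
cap.release()
```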
|
|
|
## Training Details |
|
|
|
- **Data:** USC-TIMIT Corpus (https://sail.usc.edu/span/usc-timit/)

- **Training set:** 2 videos per subject from each of the 10 subjects (20 videos)

- **Validation set:** 1 video per subject from each of the 10 subjects (10 videos)

- **Model architecture:** U-Net with a shared encoder and bottleneck and three decoder branches (see Model Description)

- **Optimizer:** Adam

- **Loss function:** Binary cross-entropy

- **Epochs:** 30, with EarlyStopping

- **Batch size:** 8

- **Evaluation metrics:** Pixel classification accuracy, Dice coefficient

- **Validation split:** 1 of every 3 videos per subject (the train_matrix / val_matrix split)

- **Hardware:** NVIDIA GeForce RTX 4060 Laptop GPU
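
A minimal Keras sketch mirroring this configuration is shown below. Optimizer, loss, epochs, and batch size follow the card; the Dice implementation and EarlyStopping patience are assumptions, and `train_masks` / `val_masks` stand in for the ground-truth mask arrays.

```python
import tensorflow as tf
from tensorflow.keras import backend as K

def dice_coefficient(y_true, y_pred, smooth=1.0):
    # Dice = 2|A intersect B| / (|A| + |B|), smoothed for stability.
    y_true_f = K.flatten(y_true)
    y_pred_f = K.flatten(y_pred)
    intersection = K.sum(y_true_f * y_pred_f)
    return (2.0 * intersection + smooth) / (
        K.sum(y_true_f) + K.sum(y_pred_f) + smooth)

model = build_unet()  # architecture sketch above
model.compile(optimizer="adam",
              loss="binary_crossentropy",
              metrics=["accuracy", dice_coefficient])

# Stop early once validation loss plateaus (patience is an assumption).
early_stop = tf.keras.callbacks.EarlyStopping(
    monitor="val_loss", patience=5, restore_best_weights=True)

model.fit(train_matrix, train_masks,
          validation_data=(val_matrix, val_masks),
          epochs=30, batch_size=8, callbacks=[early_stop])
```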
|
|
|
## Model Card Authors |
|
Vinayaka Hegde |
|
|
|
## Model Card Contact |
|
vinayakahegde619@gmail.com |