---
license: mit
library_name: pytorch
tags:
- Medical Vision-Language Pre-Training
- BenchX
---

# MGCA-ResNet50 Checkpoint Model Card

A retrained MGCA-ResNet50 model for benchmarking medical vision-language pre-training methods within the BenchX framework.

## Model Details

- **Model Type**: MGCA-ResNet50
- **Architecture**: ResNet-50 image encoder and BioClinicalBERT text encoder
- **Original Paper**: [Multi-Granularity Cross-modal Alignment for Generalized Medical Visual Representation Learning](https://arxiv.org/abs/2210.06044)
- **Benchmark Paper**: [BenchX: A Unified Benchmark Framework for Medical Vision-Language Pretraining on Chest X-Rays](https://arxiv.org/abs/2410.21969)
- **Benchmark Framework**: https://github.com/yangzhou12/BenchX

## Intended Use

- **Primary Use Cases**:
  - Benchmarking performance for Medical Image Classification
  - Benchmarking performance for Medical Image Segmentation
  - Benchmarking performance for Medical Report Generation

## Pre-Training Data

- **Dataset**:
  - Data source(s): MIMIC-CXR
  - Types of medical images: Frontal chest X-rays
  - Text data type: Associated radiology reports

## Prerequisites

Follow the [installation instructions](https://github.com/yangzhou12/BenchX/blob/release/README.md#installation) to install BenchX.

## Training & Evaluation

### 1. Classification

To fine-tune MGCA-ResNet50 for classification, run this command:

```
python bin/train.py config/classification/<dataset_name>/mgca_resnet50.yml
```

### 2. Segmentation

To fine-tune MGCA-ResNet50 for segmentation, run this command:

```
python mmsegmentation/tools/train.py config/benchmark/<dataset_name>/mgca_resnet50.yml
```

### 3. Report Generation

To fine-tune MGCA-ResNet50 for report generation, run this command:

```
python bin/train.py config/report_generation/<dataset_name>/mgca_resnet50.yml
```

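The three fine-tuning commands above differ only in the entry script and the config directory. As a minimal sketch of that pattern, the hypothetical helper below (`build_train_command` and `TRAIN_ENTRY` are illustrative names, not part of BenchX) assembles the command for a given task and dataset. The script paths and config directories are copied from the commands in this card; `my_dataset` is a placeholder, not a dataset name defined by BenchX.

```python
from pathlib import Path

# Model config file name used throughout this card.
MODEL_CONFIG = "mgca_resnet50.yml"

# Entry script and config directory per task, copied from the commands above.
TRAIN_ENTRY = {
    "classification": ("bin/train.py", "config/classification"),
    "segmentation": ("mmsegmentation/tools/train.py", "config/benchmark"),
    "report_generation": ("bin/train.py", "config/report_generation"),
}


def build_train_command(task: str, dataset_name: str) -> list[str]:
    """Return the fine-tuning command as an argument list (hypothetical helper)."""
    if task not in TRAIN_ENTRY:
        raise ValueError(f"unknown task: {task!r}")
    script, config_root = TRAIN_ENTRY[task]
    config = Path(config_root) / dataset_name / MODEL_CONFIG
    return ["python", script, config.as_posix()]


if __name__ == "__main__":
    # "my_dataset" is a placeholder; substitute a dataset directory that exists in your checkout.
    print(" ".join(build_train_command("classification", "my_dataset")))
```

The returned list can be passed to `subprocess.run(..., check=True)` from the BenchX repository root, which is assumed here to be the working directory the relative paths resolve against.
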
### 4. Evaluation

To evaluate fine-tuned MGCA-ResNet50 models, run:

```
# For classification and report generation
python bin/test.py config/<task_name>/<dataset_name>/mgca_resnet50.yml validator.splits=[test] ckpt_dir=<path_to_checkpoint>

# For segmentation
python mmsegmentation/tools/my_test.py mmsegmentation/config/<dataset_name>/mgca_resnet50.yml <path_to_checkpoint>
```

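The `validator.splits=[test]` override contains square brackets, which some shells (zsh, for example) treat as a glob pattern and may reject when left unquoted. As a sketch of one way around this, the hypothetical helper below (`build_eval_command` is an illustrative name, not part of BenchX) builds the classification and report generation evaluation command as an argument list and prints a shell-quoted form of it. The task name, dataset name, and checkpoint path are placeholders.

```python
import shlex


def build_eval_command(task_name: str, dataset_name: str, ckpt_dir: str) -> list[str]:
    """Return the bin/test.py command as an argument list (hypothetical helper)."""
    config = f"config/{task_name}/{dataset_name}/mgca_resnet50.yml"
    return [
        "python",
        "bin/test.py",
        config,
        "validator.splits=[test]",  # override copied from the command above
        f"ckpt_dir={ckpt_dir}",
    ]


if __name__ == "__main__":
    # Placeholder values; substitute a real dataset name and checkpoint path.
    cmd = build_eval_command("classification", "my_dataset", "path/to/checkpoint")
    # shlex.join quotes the bracketed override so the line can be pasted into a shell.
    print(shlex.join(cmd))
```

Passing the list directly to `subprocess.run(cmd, check=True)` bypasses the shell, so no quoting is needed in that case.
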
## Citations

```bibtex
@article{wang2022multi,
  title={Multi-granularity cross-modal alignment for generalized medical visual representation learning},
  author={Wang, Fuying and Zhou, Yuyin and Wang, Shujun and Vardhanabhuti, Varut and Yu, Lequan},
  journal={Advances in Neural Information Processing Systems},
  volume={35},
  pages={33536--33549},
  year={2022}
}
```

```bibtex
@inproceedings{zhou2024benchx,
  title={BenchX: A Unified Benchmark Framework for Medical Vision-Language Pretraining on Chest X-Rays},
  author={Yang Zhou and Tan Li Hui Faith and Yanyu Xu and Sicong Leng and Xinxing Xu and Yong Liu and Rick Siow Mong Goh},
  booktitle={Proceedings of NeurIPS},
  year={2024}
}
```