Xiaomeng1130 committed on
Commit 8274db5 · verified · 1 Parent(s): 6ecc5ed

Upload 9 files

Files changed (9):
  1. .gitignore +140 -0
  2. LICENSE +21 -0
  3. README.md +166 -12
  4. app.py +146 -0
  5. requirements.txt +2 -1
  6. save_roc.py +153 -0
  7. setup.py +57 -0
  8. stoma_clip.pt +3 -0
  9. vis_all_model_roc.py +71 -0
.gitignore ADDED
@@ -0,0 +1,140 @@
 + # Byte-compiled / optimized / DLL files
 + __pycache__/
 + *.py[cod]
 + *$py.class
 +
 + # C extensions
 + *.so
 +
 + # Distribution / packaging
 + .Python
 + build/
 + develop-eggs/
 + dist/
 + downloads/
 + eggs/
 + .eggs/
 + lib/
 + lib64/
 + parts/
 + sdist/
 + var/
 + wheels/
 + pip-wheel-metadata/
 + share/python-wheels/
 + *.egg-info/
 + .installed.cfg
 + *.egg
 + MANIFEST
 +
 + # PyInstaller
 + # Usually these files are written by a python script from a template
 + # before PyInstaller builds the exe, so as to inject date/other infos into it.
 + *.manifest
 + *.spec
 +
 + # Installer logs
 + pip-log.txt
 + pip-delete-this-directory.txt
 +
 + # Unit test / coverage reports
 + htmlcov/
 + .tox/
 + .nox/
 + .coverage
 + .coverage.*
 + .cache
 + nosetests.xml
 + coverage.xml
 + *.cover
 + *.py,cover
 + .hypothesis/
 + .pytest_cache/
 +
 + # Translations
 + *.mo
 + *.pot
 +
 + # Django stuff:
 + *.log
 + local_settings.py
 + db.sqlite3
 + db.sqlite3-journal
 +
 + # Flask stuff:
 + instance/
 + .webassets-cache
 +
 + # Scrapy stuff:
 + .scrapy
 +
 + # Sphinx documentation
 + docs/_build/
 +
 + # PyBuilder
 + target/
 +
 + # Jupyter Notebook
 + .ipynb_checkpoints
 +
 + # IPython
 + profile_default/
 + ipython_config.py
 +
 + # pyenv
 + .python-version
 +
 + # pipenv
 + # According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control.
 + # However, in case of collaboration, if having platform-specific dependencies or dependencies
 + # having no cross-platform support, pipenv may install dependencies that don't work, or not
 + # install all needed dependencies.
 + #Pipfile.lock
 +
 + # PEP 582; used by e.g. github.com/David-OConnor/pyflow
 + __pypackages__/
 +
 + # Celery stuff
 + celerybeat-schedule
 + celerybeat.pid
 +
 + # SageMath parsed files
 + *.sage.py
 +
 + # Environments
 + .env
 + .venv
 + env/
 + venv/
 + ENV/
 + env.bak/
 + venv.bak/
 +
 + # Spyder project settings
 + .spyderproject
 + .spyproject
 +
 + # Rope project settings
 + .ropeproject
 +
 + # mkdocs documentation
 + /site
 +
 + # mypy
 + .mypy_cache/
 + .dmypy.json
 + dmypy.json
 +
 + # Pyre type checker
 + .pyre/
 +
 +
 + data/
 + microsoft/
 + ckpt/
 + # *.jpg
 + *.png
 + logs/
 + *.json
 + evaluation*/
 + *.ipynb
LICENSE ADDED
@@ -0,0 +1,21 @@
 + MIT License
 +
 + Copyright (c) 2023 Weixiong Lin
 +
 + Permission is hereby granted, free of charge, to any person obtaining a copy
 + of this software and associated documentation files (the "Software"), to deal
 + in the Software without restriction, including without limitation the rights
 + to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
 + copies of the Software, and to permit persons to whom the Software is
 + furnished to do so, subject to the following conditions:
 +
 + The above copyright notice and this permission notice shall be included in all
 + copies or substantial portions of the Software.
 +
 + THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
 + IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
 + FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
 + AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
 + LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
 + OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
 + SOFTWARE.
README.md CHANGED
@@ -1,12 +1,166 @@
 - ---
 - title: Stoma Clip Api
 - emoji: 📊
 - colorFrom: yellow
 - colorTo: green
 - sdk: gradio
 - sdk_version: 5.49.1
 - app_file: app.py
 - pinned: false
 - ---
 -
 - Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
 + # PMC-CLIP
 +
 + [![Quick Start Demo](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1P7uyzK_Mhu1YyMeRrrRY_e3NpkNBOI4L?usp=sharing)
 + [![Dataset and Model](https://img.shields.io/badge/Hugging%20Face-Dataset-green)](https://huggingface.co/datasets/axiong/pmc_oa)
 +
 + The dataset and checkpoint are available at [Huggingface](https://huggingface.co/datasets/axiong/pmc-oa) and [Baidu Cloud](https://pan.baidu.com/s/1mD51oOYbIOqDJSeiPNaCCg) (key: 3iqf).
 +
 + 📢 We provide the extracted image-encoder and text-encoder checkpoints on [Huggingface](https://huggingface.co/datasets/axiong/pmc-oa), together with a quick-start demo showing how to encode image and text inputs. Check this [notebook](https://colab.research.google.com/drive/1P7uyzK_Mhu1YyMeRrrRY_e3NpkNBOI4L?usp=sharing)!
 +
 +
 + - [PMC-CLIP](#pmc-clip)
 +   - [Quick Start Inference](#quick-start-inference)
 +   - [Train and Evaluation](#train-and-evaluation)
 +     - [1. Create Environment](#1-create-environment)
 +     - [2. Prepare Dataset](#2-prepare-dataset)
 +     - [3. Training](#3-training)
 +     - [4. Evaluation](#4-evaluation)
 +   - [Acknowledgement](#acknowledgement)
 +   - [Contribution](#contribution)
 +   - [TODO](#todo)
 +   - [Cite](#cite)
 +
 + ## Quick Start Inference
 +
 + We offer a quick-start demo on how to use the image and text encoders of PMC-CLIP. Check this [notebook](https://colab.research.google.com/drive/1P7uyzK_Mhu1YyMeRrrRY_e3NpkNBOI4L?usp=sharing)!
 +
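 + If you prefer a quick local check instead of the notebook, the sketch below gives the rough shape of loading the released checkpoint and encoding an image. It is only a sketch: the minimal `Args` object and the `encode_image` call are assumptions carried over from the OpenCLIP-style API this repo builds on; the Colab notebook remains the authoritative reference.
 +
 + ```python
 + import torch
 + from PIL import Image
 + import pmc_clip  # installed via `python setup.py develop` (see below)
 +
 + class Args:  # hypothetical minimal config; see training/params.py for the full set of fields
 +     model = "RN50_fusion4"
 +     mlm = True
 +     crop_scale = 0.9
 +     context_length = 77
 +     device = "cuda" if torch.cuda.is_available() else "cpu"
 +
 + args = Args()
 + model, _, preprocess = pmc_clip.create_model_and_transforms(args)
 +
 + # Load the released checkpoint (path is an example); keys may carry a "module." prefix
 + # from DistributedDataParallel, so strip it before loading.
 + ckpt = torch.load("path/to/checkpoint.pt", map_location="cpu")
 + state_dict = {k.replace("module.", "", 1): v for k, v in ckpt["state_dict"].items()}
 + model.load_state_dict(state_dict, strict=False)
 + model.to(args.device).eval()
 +
 + image = preprocess(Image.open("example.jpg")).unsqueeze(0).to(args.device)
 + with torch.no_grad():
 +     image_features = model.encode_image(image)  # assumed OpenCLIP-style entry point
 + ```
 +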
 + ## Train and Evaluation
 +
 + Repo structure:
 + ```bash
 + src/:
 + |--setup.py
 + |--pmc_clip/
 + |  |--loss/
 + |  |--model/: PMC-CLIP model and variants
 + |  |--model_configs/
 + |  |--factory.py: Create model according to configs
 + |  |--transform.py: data augmentation
 + |--training/
 + |  |--main.py
 + |  |--scheduler.py: Learning rate scheduler
 + |  |--train.py
 + |  |--evaluate.py
 + |  |--data.py
 + |  |--params.py
 + docs/: project pages
 + ```
 +
 + ### 1. Create Environment
 +
 + ```bash
 + conda create -n pmc_clip python=3.8
 + conda activate pmc_clip
 +
 + pip install -r requirements.txt
 + # pip install -i https://pypi.tuna.tsinghua.edu.cn/simple -r requirements.txt
 +
 + python setup.py develop  # install pmc_clip in development (editable) mode
 + ```
 +
 + ### 2. Prepare Dataset
 +
 + Download from [Huggingface](https://huggingface.co/datasets/axiong/pmc-oa) or [Baidu Cloud](https://pan.baidu.com/s/1mD51oOYbIOqDJSeiPNaCCg) (key: 3iqf),
 + or follow the [Pipeline of PMC-OA Development](https://github.com/WeixiongLin/Build-PMC-OA) if you want to start from scratch.
 +
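 + If you prefer to script the Huggingface download, `snapshot_download` from `huggingface_hub` is one option. This is only a sketch: the repo id is taken from the badge above, and the target directory is just an example.
 +
 + ```python
 + from huggingface_hub import snapshot_download
 +
 + # Fetch the PMC-OA dataset repo into ./data/pmc_oa (example path).
 + snapshot_download(repo_id="axiong/pmc_oa", repo_type="dataset", local_dir="data/pmc_oa")
 + ```
 +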
 + ### 3. Training
 +
 + Single GPU
 + ```bash
 + python -m training.main \
 +     --dataset-type "csv" --csv-separator "," --save-frequency 5 \
 +     --report-to tensorboard \
 +     --train-data="path/to/train.csv" --val-data="path/to/valid.csv" \
 +     --csv-img-key image --csv-caption-key caption \
 +     --warmup 500 --batch-size=8 --lr=1e-4 --wd=0.1 --epochs=100 --workers=8 \
 +     --model RN50_fusion4 --hugging-face --mlm --crop-scale 0.5
 + ```
 +
 + Multi GPU
 + ```bash
 + CUDA_VISIBLE_DEVICES=0,1 torchrun --nproc_per_node=2 --rdzv_endpoint=$HOST_NODE_ADDR -m training.main \
 +     --dataset-type "csv" --csv-separator "," --save-frequency 5 \
 +     --report-to tensorboard \
 +     --train-data="path/to/train.csv" --val-data="path/to/valid.csv" \
 +     --csv-img-key image --csv-caption-key caption \
 +     --warmup 500 --batch-size=128 --lr=1e-4 --wd=0.1 --epochs=100 --workers=8 \
 +     --model RN50_fusion4 --hugging-face --mlm --crop-scale 0.5
 + ```
 +
 + <div class="third">
 + <img src="docs/resources/train_loss.png" style="height:200px">
 + <img src="docs/resources/val_i2t@1.png" style="height:200px">
 + <img src="docs/resources/val_t2i@1.png" style="height:200px">
 + </div>
 +
 +
 + ### 4. Evaluation
 + Load a checkpoint and evaluate on 2,000 samples from the test set.
 +
 + ```bash
 + python -m training.main \
 +     --dataset-type "csv" --csv-separator "," --report-to tensorboard \
 +     --val-data="path/to/test.csv" \
 +     --csv-img-key image --csv-caption-key caption \
 +     --batch-size=32 --workers=8 \
 +     --model RN50_fusion4 --hugging-face --mlm --crop-scale 0.1 \
 +     --resume /path/to/checkpoint.pt \
 +     --test-2000
 + ```
 +
 + We also provide an automatic way to load model weights from a Huggingface repo.
 +
 + | Model | URL |
 + | --- | --- |
 + | PMC_CLIP:beta | https://huggingface.co/datasets/axiong/pmc_oa_beta/blob/main/checkpoint.pt |
 +
 +
 + Take the PMC_CLIP:beta checkpoint as an example:
 + ```bash
 + python -m training.main \
 +     --dataset-type "csv" --csv-separator "," --report-to tensorboard \
 +     --val-data="path/to/test.csv" \
 +     --csv-img-key image --csv-caption-key caption \
 +     --batch-size=32 --workers=8 \
 +     --model RN50_fusion4 --hugging-face --mlm --crop-scale 0.1 \
 +     --resume "PMC_CLIP:beta" \
 +     --test-2000
 + ```
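 +
 + Alternatively, the same checkpoint can be downloaded explicitly and passed to `--resume` as a local path. A sketch with `huggingface_hub` (the repo id and filename come from the table above):
 +
 + ```python
 + from huggingface_hub import hf_hub_download
 +
 + # Download checkpoint.pt from the dataset repo listed in the table above.
 + ckpt_path = hf_hub_download(repo_id="axiong/pmc_oa_beta", filename="checkpoint.pt", repo_type="dataset")
 + print(ckpt_path)  # pass this path to --resume
 + ```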
 +
 +
 + ## Acknowledgement
 + The code is based on [OpenCLIP](https://github.com/mlfoundations/open_clip) and [M3AE](https://github.com/zhjohnchan/M3AE). We thank the authors for their open-sourced code and encourage users to cite their works when applicable.
 +
 + Note that our code does not support tools such as horovod and wandb that OpenCLIP integrates, but we keep the related code from OpenCLIP for consistency.
 +
 + ## Contribution
 + Please raise an issue if you need help; any contributions are welcome.
 +
 + ## TODO
 +
 + * [ ] Compatibility testing on more environment settings
 + * [ ] Support for horovod, wandb
 +
 +
 + ## Cite
 + ```bibtex
 + @article{lin2023pmc,
 +   title={PMC-CLIP: Contrastive Language-Image Pre-training using Biomedical Documents},
 +   author={Lin, Weixiong and Zhao, Ziheng and Zhang, Xiaoman and Wu, Chaoyi and Zhang, Ya and Wang, Yanfeng and Xie, Weidi},
 +   journal={arXiv preprint arXiv:2303.07240},
 +   year={2023}
 + }
 + ```
 +
 + The paper has been accepted by MICCAI 2023.
 + ```bibtex
 + @inproceedings{lin2023pmc,
 +   title={PMC-CLIP: Contrastive Language-Image Pre-training using Biomedical Documents},
 +   author={Lin, Weixiong and Zhao, Ziheng and Zhang, Xiaoman and Wu, Chaoyi and Zhang, Ya and Wang, Yanfeng and Xie, Weidi},
 +   booktitle={MICCAI},
 +   year={2023}
 + }
 + ```
 +
app.py ADDED
@@ -0,0 +1,146 @@
 + import os
 + import torch
 + import gradio as gr
 + from PIL import Image
 + import numpy as np
 +
 + # ========== 1. Import project modules ==========
 + try:
 +     from stoma_clip import pmc_clip
 +     from stoma_clip.pmc_clip.factory import _rescan_model_configs
 +     from stoma_clip.training.fusion_method import convert_model_to_cls
 +     from stoma_clip.training.dataset.utils import encode_mlm
 + except ImportError as e:
 +     print(f"Error importing Stoma-CLIP modules: {e}")
 +
 + # ========== 2. Model Configuration and Loading ==========
 + LABEL_MAP = {
 +     "Irritant dermatitis": 0, "Allergic contact dermatitis": 1, "Mechanical injury": 2,
 +     "Folliculitis": 3, "Fungal infection": 4, "Skin hyperplasia": 5, "Parastomal varices": 6,
 +     "Urate crystals": 7, "Cancerous metastasis": 8, "Pyoderma gangrenosum": 9, "Normal": 10
 + }
 + REVERSE_LABEL_MAP = {v: k for k, v in LABEL_MAP.items()}
 + NUM_CLASSES = len(LABEL_MAP)
 +
 +
 + class Args:
 +     def __init__(self):
 +         self.model = "RN50_fusion4"
 +         self.pretrained = "stoma_clip.pt"
 +         self.num_classes = NUM_CLASSES
 +         self.mlm = True
 +         self.crop_scale = 0.9
 +         self.context_length = 77
 +         self.device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
 +         print(f"Using device: {self.device}")
 +
 +
 + args = Args()
 + MODEL = None
 + PREPROCESS = None
 + TOKENIZER = None
 +
 +
 + def load_model():
 +     """Load the model once when Gradio starts."""
 +     global MODEL, PREPROCESS, TOKENIZER
 +     if MODEL is not None:
 +         return MODEL, PREPROCESS, TOKENIZER
 +
 +     try:
 +         _rescan_model_configs()
 +         model, _, preprocess = pmc_clip.create_model_and_transforms(args)
 +         model = convert_model_to_cls(model, num_classes=args.num_classes, fusion_method='cross_attention')
 +         model.to(args.device).eval()
 +
 +         state_dict = torch.load(args.pretrained, map_location='cpu')
 +         # Strip the "module." prefix left over from DistributedDataParallel training.
 +         state_dict_clean = {k.replace("module.", "", 1): v for k, v in state_dict['state_dict'].items()}
 +         model.load_state_dict(state_dict_clean)
 +         tokenizer = model.tokenizer
 +
 +         MODEL = model
 +         PREPROCESS = preprocess
 +         TOKENIZER = tokenizer
 +         print("Stoma-CLIP model loaded successfully!")
 +         return MODEL, PREPROCESS, TOKENIZER
 +
 +     except Exception as e:
 +         print(f"Error during model loading: {e}")
 +         MODEL = None
 +         raise RuntimeError(f"Failed to load Stoma-CLIP model: {e}")
 +
 +
 + # ========== 3. Inference Function ==========
 + def predict_stoma_clip(image: Image.Image, caption: str):
 +     if MODEL is None:
 +         return "Model Loading Failed", {}
 +
 +     image = image.convert("RGB")
 +     model, preprocess, tokenizer = MODEL, PREPROCESS, TOKENIZER
 +     device = args.device
 +
 +     image_tensor = preprocess(image).unsqueeze(0).to(device)
 +
 +     mask_token, pad_token = '[MASK]', '[PAD]'
 +     vocab = [v for v in tokenizer.get_vocab().keys() if v not in tokenizer.all_special_tokens]
 +
 +     # ratio=0.0 disables masking; only the tokenized caption is needed for inference.
 +     bert_input, bert_label = encode_mlm(
 +         caption=caption,
 +         vocab=vocab,
 +         mask_token=mask_token,
 +         pad_token=pad_token,
 +         ratio=0.0,
 +         tokenizer=tokenizer,
 +         args=args,
 +     )
 +
 +     with torch.no_grad():
 +         inputs = {"images": image_tensor, "bert_input": bert_input, "bert_label": bert_label}
 +         outputs = model(inputs)
 +         probs = torch.softmax(outputs, dim=1).cpu().numpy()[0]
 +         predicted_class_idx = torch.argmax(outputs, dim=1).item()
 +
 +     predicted_class_name = REVERSE_LABEL_MAP.get(predicted_class_idx, "Unknown")
 +     probability_distribution = {REVERSE_LABEL_MAP[i]: float(p) for i, p in enumerate(probs)}
 +
 +     return predicted_class_name, probability_distribution
 +
 +
 + # ========== 4. Gradio Interface Setup ==========
 + try:
 +     load_model()
 +     print("Model loaded successfully before Gradio startup.")
 + except Exception as e:
 +     print(f"Fatal error: the model failed to load, so the Gradio interface will not work. {e}")
 +
 + image_input = gr.Image(type="pil", label="Upload a stoma image")
 + caption_input = gr.Textbox(label="Enter a stoma description (e.g., Exudate, epidermal breakdown, ...)")
 + predicted_label_output = gr.Textbox(label="Predicted class")
 + prob_output = gr.Label(label="Class probability distribution")
 +
 + # Find example paths for the Gradio demo (note: in the deployed Space, these paths should be relative to the root).
 + try:
 +     example_path_1 = "demo/Irritant_dermatitis.jpg"
 +     example_path_2 = "demo/Folliculitis.jpg"
 +
 +     examples_list = []
 +     if os.path.exists(example_path_1):
 +         examples_list.append(
 +             [example_path_1, "Exudate, epidermal breakdown, irregular erythema, pain, confined to contact areas"])
 +     if os.path.exists(example_path_2):  # add the second example independently of the first
 +         examples_list.append([example_path_2, "Erythema, papules, pustules confined to hair follicles"])
 + except Exception:
 +     examples_list = []
 +
 + iface = gr.Interface(
 +     fn=predict_stoma_clip,
 +     inputs=[image_input, caption_input],
 +     outputs=[predicted_label_output, prob_output],
 +     title="🧪 Stoma-CLIP Classification API Prototype (Gradio)",
 +     description="Upload a stoma image and enter a clinical description; the model predicts the most likely skin-complication class.",
 +     examples=examples_list,
 +     allow_flagging="never"
 + )
 +
 + if __name__ == "__main__":
 +     iface.launch()
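
For reference, calling the deployed Space from Python would look roughly like the sketch below using `gradio_client`. The Space id is a placeholder, and `/predict` is the default endpoint name Gradio assigns to a single `gr.Interface`; both are assumptions to verify against the Space's API page.

```python
from gradio_client import Client, handle_file

client = Client("<user>/<space-name>")  # placeholder: replace with the actual Space id

# Inputs mirror the Interface above: an image and a free-text clinical description.
label, probabilities = client.predict(
    handle_file("demo/Irritant_dermatitis.jpg"),
    "Exudate, epidermal breakdown, irregular erythema, pain, confined to contact areas",
    api_name="/predict",
)
print(label)
print(probabilities)
```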
requirements.txt CHANGED
@@ -1,4 +1,4 @@
 - pytorch >= 1.9.0
 + torch
   torchvision
   transformers
   tqdm
 @@ -10,3 +10,4 @@ braceexpand
   webdataset
   jsonlines
   tensorboard
 + matplotlib
save_roc.py ADDED
@@ -0,0 +1,153 @@
 + import os
 + import torch
 + from torch.utils.data import DataLoader
 + from tqdm import tqdm
 + import numpy as np
 + import matplotlib.pyplot as plt
 + from sklearn.metrics import roc_curve, roc_auc_score
 + from sklearn.preprocessing import label_binarize
 + import json
 + import sys
 + sys.path.append('.')
 + import pmc_clip
 + from training.params import parse_args
 + from training.data import PmcDataset
 + from training.fusion_method import convert_model_to_cls
 +
 +
 + # Label mapping
 + LABEL_MAP = {
 +     "Irritant dermatitis": 0,
 +     "Allergic contact dermatitis": 1,
 +     "Mechanical injury": 2,
 +     "Folliculitis": 3,
 +     "Fungal infection": 4,
 +     "Skin hyperplasia": 5,
 +     "Parastomal varices": 6,
 +     "Urate crystals": 7,
 +     "Cancerous metastasis": 8,
 +     "Pyoderma gangrenosum": 9,
 +     "Normal": 10
 + }
 +
 + REVERSE_LABEL_MAP = {v: k for k, v in LABEL_MAP.items()}
 +
 + def main():
 +     # Create the output directory
 +     output_dir = './evaluation_results_pmc_clip_cat'
 +     if not os.path.exists(output_dir):
 +         os.makedirs(output_dir)
 +
 +     # Select the device
 +     device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
 +     print(f"Using device: {device}")
 +
 +     # Model configuration
 +     model_path = "logs/0321-Stoma-clip-train-cls/2025_03_21-23_45_18-model_RN50_fusion4-lr_1e-05-b_256-j_8-p_amp/checkpoints/epoch_150.pt"
 +     model_name = "RN50_fusion4"
 +     args = parse_args()
 +     args.model = model_name
 +     args.pretrained = model_path
 +     args.device = device
 +     args.mlm = True
 +     args.train_data = "data/single_symptoms_test.jsonl"
 +     args.image_dir = "./data/cleaned_data"
 +     args.csv_img_key = "image"
 +     args.csv_caption_key = "caption"
 +     args.context_length = 77
 +     args.num_classes = len(LABEL_MAP)
 +     args.output_dir = output_dir
 +
 +     # Create the model and preprocessing transform
 +     model, _, preprocess = pmc_clip.create_model_and_transforms(args)
 +     model = convert_model_to_cls(model, num_classes=args.num_classes, fusion_method='concat')
 +
 +     # Load the model weights (strip the "module." prefix left by DistributedDataParallel)
 +     state_dict = torch.load(model_path, map_location='cpu', weights_only=False)
 +
 +     state_dict_real = {}
 +     for k, v in state_dict['state_dict'].items():
 +         state_dict_real[k.replace("module.", "", 1)] = v
 +     print(model.load_state_dict(state_dict_real))
 +     model.to(device=device)
 +
 +     # Prepare the test dataset
 +     dataset = PmcDataset(args,
 +                          input_filename=args.train_data,
 +                          transforms=preprocess,
 +                          is_train=False)
 +
 +     test_loader = DataLoader(dataset, batch_size=32, shuffle=False, num_workers=4)
 +
 +     print(f"Number of test samples: {len(dataset)}")
 +
 +     # Collect predictions
 +     all_preds = []
 +     all_probs = []
 +     all_labels = []
 +
 +     print("Starting evaluation...")
 +     model.eval()
 +     with torch.no_grad():
 +         for batch in tqdm(test_loader):
 +             labels = batch["cls_label"].to(device)
 +
 +             # Forward pass
 +             outputs = model(batch)
 +
 +             # Collect predicted classes and probabilities
 +             probs = torch.softmax(outputs, dim=1)
 +             _, preds = torch.max(outputs, dim=1)
 +
 +             all_preds.extend(preds.cpu().numpy())
 +             all_probs.extend(probs.cpu().numpy())
 +             all_labels.extend(labels.cpu().numpy())
 +
 +     # Convert to numpy arrays
 +     all_preds = np.array(all_preds)
 +     all_probs = np.array(all_probs)
 +     all_labels = np.array(all_labels)
 +
 +     # Compute the overall AUC (micro-averaged, one-vs-rest)
 +     try:
 +         y_true_bin = label_binarize(all_labels, classes=range(args.num_classes))
 +         if args.num_classes == 2:
 +             # label_binarize returns a single column for two classes, so use the raw labels here
 +             overall_fpr, overall_tpr, _ = roc_curve(all_labels, all_probs[:, 1])
 +             overall_auc = roc_auc_score(all_labels, all_probs[:, 1])
 +         else:
 +             overall_fpr, overall_tpr, _ = roc_curve(y_true_bin.ravel(), all_probs.ravel())
 +             overall_auc = roc_auc_score(y_true_bin, all_probs, multi_class='ovr', average='micro')
 +     except Exception as e:
 +         print(f"Error while computing the overall AUC: {e}")
 +         return
 +
 +     # Save the overall ROC curve data
 +     roc_data = {
 +         "fpr": overall_fpr.tolist(),
 +         "tpr": overall_tpr.tolist(),
 +         "auc": overall_auc
 +     }
 +     roc_file = os.path.join(output_dir, "overall_roc_data.json")
 +     with open(roc_file, "w") as f:
 +         json.dump(roc_data, f)
 +     print(f"Overall ROC curve data saved to: {roc_file}")
 +
 +     # Plot the ROC curve
 +     plt.figure(figsize=(8, 6))
 +     plt.plot(overall_fpr, overall_tpr, label=f"Overall (AUC = {overall_auc:.4f})")
 +     plt.plot([0, 1], [0, 1], 'k--', label="Random Guess")
 +     plt.xlim([0.0, 1.0])
 +     plt.ylim([0.0, 1.05])
 +     plt.xlabel('False Positive Rate (1 - Specificity)', fontsize=12)
 +     plt.ylabel('True Positive Rate (Sensitivity)', fontsize=12)
 +     plt.title('Overall ROC Curve', fontsize=14)
 +     plt.legend(loc="lower right", fontsize=10)
 +     plt.grid(alpha=0.3)
 +     plt.tight_layout()
 +     plt.savefig(os.path.join(output_dir, 'overall_roc_curve.png'), dpi=300, bbox_inches='tight')
 +     print(f"Overall ROC curve figure saved to: {os.path.join(output_dir, 'overall_roc_curve.png')}")
 +
 +
 + if __name__ == '__main__':
 +     main()
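
Usage note: `model_path`, `args.train_data`, `args.image_dir`, and `output_dir` are hard-coded above and need to be edited to point at your own checkpoint and test split before running. The script writes `overall_roc_data.json` and `overall_roc_curve.png` into `output_dir`.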
setup.py ADDED
@@ -0,0 +1,57 @@
 + """ Setup
 + """
 + from setuptools import setup, find_packages
 + from codecs import open
 + from os import path
 +
 + here = path.abspath(path.dirname(__file__))
 +
 + # Get the long description from the README file
 + with open(path.join(here, 'README.md'), encoding='utf-8') as f:
 +     long_description = f.read()
 +
 + exec(open('src/pmc_clip/version.py').read())
 + setup(
 +     name='pmc_clip',
 +     version=__version__,
 +     description='PMC-CLIP: Contrastive Language-Image Pre-training using Biomedical Documents',
 +     long_description=long_description,
 +     long_description_content_type='text/markdown',
 +     url='https://github.com/WeixiongLin/PMC-CLIP/',
 +     author='weixiong',
 +     author_email='wx_lin@sjtu.edu.cn',
 +     classifiers=[
 +         # How mature is this project? Common values are
 +         #   3 - Alpha
 +         #   4 - Beta
 +         #   5 - Production/Stable
 +         'Development Status :: 4 - Beta',
 +         'Intended Audience :: Education',
 +         'Intended Audience :: Science/Research',
 +         'License :: OSI Approved :: Apache Software License',
 +         'Programming Language :: Python :: 3.7',
 +         'Programming Language :: Python :: 3.8',
 +         'Programming Language :: Python :: 3.9',
 +         'Programming Language :: Python :: 3.10',
 +         'Topic :: Scientific/Engineering',
 +         'Topic :: Scientific/Engineering :: Artificial Intelligence',
 +         'Topic :: Software Development',
 +         'Topic :: Software Development :: Libraries',
 +         'Topic :: Software Development :: Libraries :: Python Modules',
 +     ],
 +
 +     # Note that this is a string of words separated by whitespace, not a list.
 +     keywords='PMC-CLIP',
 +     package_dir={'': 'src'},
 +     packages=find_packages(where='src', exclude=['training']),
 +     include_package_data=True,
 +     install_requires=[
 +         'torch >= 1.9',
 +         'torchvision',
 +         'transformers <= 4.21.0',
 +         'ftfy',
 +         'regex',
 +         'tqdm',
 +     ],
 +     python_requires='>=3.7',
 + )
stoma_clip.pt ADDED
@@ -0,0 +1,3 @@
 + version https://git-lfs.github.com/spec/v1
 + oid sha256:9b1b2a21c2d8d66669f70cfc9380143d758c7937a45c2c8d06747860013c51ed
 + size 832509306
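
This is a Git LFS pointer file: the ~832 MB checkpoint itself lives in LFS storage, so run `git lfs install` and `git lfs pull` after cloning to materialize `stoma_clip.pt` locally.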
vis_all_model_roc.py ADDED
@@ -0,0 +1,71 @@
 + import os
 + import json
 + import matplotlib.pyplot as plt
 +
 + def load_roc_data(roc_files):
 +     """
 +     Load the ROC data for multiple models.
 +     Args:
 +         roc_files (list): paths to each model's overall_roc_data.json file
 +     Returns:
 +         roc_data_list (list): one ROC-data dict per model
 +     """
 +     roc_data_list = []
 +     for roc_file in roc_files:
 +         with open(roc_file, "r") as f:
 +             roc_data = json.load(f)
 +         roc_data_list.append(roc_data)
 +     return roc_data_list
 +
 + def plot_combined_roc(roc_data_list, model_names, output_path):
 +     """
 +     Plot the ROC curves of multiple models in a single figure.
 +     Args:
 +         roc_data_list (list): one ROC-data dict per model
 +         model_names (list): display name for each model
 +         output_path (str): where to save the combined ROC figure
 +     """
 +     plt.figure(figsize=(10, 8))
 +
 +     for roc_data, model_name in zip(roc_data_list, model_names):
 +         fpr = roc_data["fpr"]
 +         tpr = roc_data["tpr"]
 +         auc_value = roc_data["auc"]
 +         plt.plot(fpr, tpr, label=f"{model_name} (AUC = {auc_value:.4f})")
 +
 +     # Diagonal reference line
 +     plt.plot([0, 1], [0, 1], 'k--', label="Random Guess")
 +
 +     # Figure settings
 +     plt.xlim([0.0, 1.0])
 +     plt.ylim([0.0, 1.05])
 +     plt.xlabel('False Positive Rate (1 - Specificity)', fontsize=12)
 +     plt.ylabel('True Positive Rate (Sensitivity)', fontsize=12)
 +     plt.title('Combined ROC Curves for Multiple Models', fontsize=14)
 +     plt.legend(loc="lower right", fontsize=10)
 +     plt.grid(alpha=0.3)
 +     plt.tight_layout()
 +
 +     # Save the figure
 +     plt.savefig(output_path, dpi=300, bbox_inches='tight')
 +     print(f"Combined ROC figure saved to: {output_path}")
 +
 + def main():
 +     # Directory holding one ROC-data JSON file per model
 +     roc_data_dir = "./roc_result"  # replace with the actual path
 +     output_path = "./combined_roc_curve.png"  # where to save the combined ROC figure
 +
 +     # Collect every model's ROC JSON file (only .json files, sorted for a stable legend order)
 +     roc_files = sorted(os.path.join(roc_data_dir, f)
 +                        for f in os.listdir(roc_data_dir) if f.endswith(".json"))
 +
 +     # Model names (derived from the file names)
 +     model_names = [os.path.basename(f).replace(".json", "") for f in roc_files]
 +
 +     # Load the ROC data for all models
 +     roc_data_list = load_roc_data(roc_files)
 +
 +     # Plot and save the combined ROC curves
 +     plot_combined_roc(roc_data_list, model_names, output_path)
 +
 + if __name__ == "__main__":
 +     main()
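
Usage note: the script expects `./roc_result/` to contain one JSON file per model, for example each run's `overall_roc_data.json` copied to `roc_result/<model_name>.json`. The legend labels are taken from those file names, and the combined figure is written to `./combined_roc_curve.png`.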