raman07 committed on
Commit
f45a75a
·
verified ·
1 Parent(s): c61f290

model card

Files changed (1)
  1. README.md +113 -0
README.md ADDED
@@ -0,0 +1,113 @@
1
+ ---
2
+ library_name: diffusers
3
+ pipeline_tag: text-to-image
4
+ ---
5
+
6
+ ## Model Details
7
+
8
+ ### Model Description
9
+
10
+ This model is fine-tuned from [stable-diffusion-v1-5](https://huggingface.co/runwayml/stable-diffusion-v1-5) on 110,000 image-text pairs from the MIMIC dataset using the Norm-tuning PEFT method. Under this strategy, only the normalization weights in the U-Net are fine-tuned; everything else stays frozen.
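
As a rough illustration of the Norm-tuning idea described above, the sketch below freezes every parameter of a module and then re-enables gradients only for its normalization layers. It uses a small stand-in network rather than the real Stable Diffusion U-Net. The helper name `enable_norm_tuning` and the set of layer types treated as "normalization" are assumptions made for this example, not the authors' training code.

```python
import torch.nn as nn

# Layer types treated as "normalization" here (an assumption for this sketch).
NORM_TYPES = (nn.GroupNorm, nn.LayerNorm)


def enable_norm_tuning(model: nn.Module) -> int:
    """Freeze all parameters, then unfreeze normalization layers only.

    Returns the number of parameters that remain trainable.
    """
    for param in model.parameters():
        param.requires_grad = False
    for module in model.modules():
        if isinstance(module, NORM_TYPES):
            for param in module.parameters():
                param.requires_grad = True
    return sum(p.numel() for p in model.parameters() if p.requires_grad)


# Hypothetical stand-in for a U-Net block: conv -> group norm -> conv.
toy_block = nn.Sequential(
    nn.Conv2d(4, 8, kernel_size=3, padding=1),
    nn.GroupNorm(num_groups=2, num_channels=8),
    nn.Conv2d(8, 4, kernel_size=3, padding=1),
)

trainable = enable_norm_tuning(toy_block)
total = sum(p.numel() for p in toy_block.parameters())
print(f"trainable parameters: {trainable} of {total}")
```

With the real pipeline, the same helper could presumably be applied to `pipe.unet`, and only the parameters left with `requires_grad=True` would be handed to the optimizer.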
11
+
12
+ - **Developed by:** [Raman Dutt](https://twitter.com/RamanDutt4)
13
+ - **Shared by:** [Raman Dutt](https://twitter.com/RamanDutt4)
14
+ - **Model type:** Stable Diffusion fine-tuned using Parameter-Efficient Fine-Tuning
15
+ - **Finetuned from model:** [stable-diffusion-v1-5](https://huggingface.co/runwayml/stable-diffusion-v1-5)
16
+
17
+ ### Model Sources
18
+
19
+
20
+ - **Paper:** [Parameter-Efficient Fine-Tuning for Medical Image Analysis: The Missed Opportunity](https://arxiv.org/abs/2305.08252)
21
+ - **Demo:** [MIMIC-SD-PEFT-Demo](https://huggingface.co/spaces/raman07/MIMIC-SD-Demo-Memory-Optimized?logs=container)
22
+
23
+ ## Direct Use
24
+
25
+ This model can be directly used to generate realistic medical images from text prompts.
26
+
27
+
28
+ ## How to Get Started with the Model
29
+
30
+ ```python
+ import os
+
+ from diffusers import StableDiffusionPipeline
+ from safetensors.torch import load_file
+
+ # Path or Hub ID of the base Stable Diffusion v1.5 checkpoint.
+ # Assumption: the base model linked above; point this at a local folder if you have one.
+ sd_folder_path = "runwayml/stable-diffusion-v1-5"
+
+ pipe = StableDiffusionPipeline.from_pretrained(sd_folder_path, revision="fp16")
+
+ # Path to the fine-tuned U-Net weights downloaded from this repository
+ exp_path = os.path.join('unet', 'diffusion_pytorch_model.safetensors')
+ state_dict = load_file(exp_path)
+
+ # Load the adapted U-Net
+ pipe.unet.load_state_dict(state_dict, strict=False)
+ pipe.to('cuda:0')
+
+ # Generate images with text prompts
+ TEXT_PROMPT = "No acute cardiopulmonary abnormality."
+ GUIDANCE_SCALE = 4
+ INFERENCE_STEPS = 75
+
+ result_image = pipe(
+     prompt=TEXT_PROMPT,
+     height=224,
+     width=224,
+     guidance_scale=GUIDANCE_SCALE,
+     num_inference_steps=INFERENCE_STEPS,
+ )
+
+ # The pipeline output holds a list of PIL images
+ result_pil_image = result_image.images[0]
+ ```
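
The `load_state_dict` call above passes `strict=False`. The sketch below shows what that flag does, using a toy module in place of the U-Net: keys absent from the checkpoint are tolerated and reported rather than raising an error. The toy model and tensor values are made up for the example, and the sketch makes no claim about which keys the checkpoint in this repository actually contains.

```python
import torch
import torch.nn as nn

# Toy model standing in for the U-Net (hypothetical, for illustration only).
model = nn.Sequential(nn.Linear(4, 4), nn.LayerNorm(4))

# A partial checkpoint that holds only the LayerNorm parameters.
partial_state = {
    "1.weight": torch.full((4,), 2.0),
    "1.bias": torch.zeros(4),
}

# strict=False loads the matching keys and reports the rest instead of raising.
result = model.load_state_dict(partial_state, strict=False)

print("missing keys:", result.missing_keys)        # these keep their current values
print("unexpected keys:", result.unexpected_keys)  # keys the model does not know
```

Checking that `unexpected_keys` is empty is a cheap way to catch a mismatched checkpoint, since `strict=False` would otherwise hide it.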
59
+
60
+
61
+ ## Training Details
62
+
63
+ ### Training Data
64
+
65
+ This model has been fine-tuned on 110K image-text pairs from the MIMIC dataset.
66
+
67
+ ### Training Procedure
68
+
69
+ The training procedure has been described in detail in Section 4.3 of this [paper](https://arxiv.org/abs/2305.08252).
70
+
71
+ #### Metrics
72
+
73
+ This model has been evaluated using the Fréchet Inception Distance (FID) on the MIMIC dataset. Lower FID is better.
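
For reference, FID compares two Gaussians fitted to Inception features of real and generated images. The sketch below evaluates the standard closed-form expression with NumPy from feature statistics that are assumed to be already computed. It is a minimal illustration, not the evaluation code behind the reported scores: the function name and the toy statistics are assumptions, and the feature-extraction step is left out.

```python
import numpy as np


def frechet_distance(mu_r, sigma_r, mu_g, sigma_g):
    """Fréchet distance between N(mu_r, sigma_r) and N(mu_g, sigma_g).

    FID = ||mu_r - mu_g||^2 + Tr(sigma_r + sigma_g - 2 * sqrt(sigma_r @ sigma_g))
    """
    mu_r = np.asarray(mu_r, dtype=float)
    mu_g = np.asarray(mu_g, dtype=float)
    sigma_r = np.asarray(sigma_r, dtype=float)
    sigma_g = np.asarray(sigma_g, dtype=float)

    mean_term = np.sum((mu_r - mu_g) ** 2)
    # Tr(sqrt(A)) equals the sum of the square roots of A's eigenvalues. For a
    # product of two covariance matrices these are real and non-negative, so
    # small negative or imaginary parts are treated as numerical noise.
    eigvals = np.linalg.eigvals(sigma_r @ sigma_g)
    trace_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()
    return float(mean_term + np.trace(sigma_r) + np.trace(sigma_g) - 2.0 * trace_sqrt)


# Toy 2-D statistics (made up): identical covariances, means 5 units apart.
identity = np.eye(2)
score_same = frechet_distance([0.0, 0.0], identity, [0.0, 0.0], identity)
score_shifted = frechet_distance([0.0, 0.0], identity, [3.0, 4.0], identity)
print(score_same, score_shifted)
```

Identical distributions give a distance of zero, and with equal covariances the score reduces to the squared distance between the means.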
74
+
75
+ ### Results
76
+
77
+ | Fine-Tuning Strategy | FID Score |
78
+ |------------------------|-----------|
79
+ | Full FT | 58.74 |
80
+ | Attention | 52.41 |
81
+ | Bias | 20.81 |
82
+ | Norm | 29.84 |
83
+ | Bias+Norm+Attention | 35.93 |
84
+ | LoRA | 439.65 |
85
+ | SV-Diff | 23.59 |
86
+ | DiffFit | 42.50 |
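
Since lower FID is better, the table is easier to read once sorted. The snippet below copies the scores from the table above and ranks them. Nothing is recomputed, and the variable names are only for illustration.

```python
# FID scores copied from the results table (lower is better).
fid_scores = {
    "Full FT": 58.74,
    "Attention": 52.41,
    "Bias": 20.81,
    "Norm": 29.84,
    "Bias+Norm+Attention": 35.93,
    "LoRA": 439.65,
    "SV-Diff": 23.59,
    "DiffFit": 42.50,
}

# Sort strategies from best (lowest FID) to worst.
ranking = sorted(fid_scores.items(), key=lambda item: item[1])
for rank, (strategy, fid) in enumerate(ranking, start=1):
    print(f"{rank}. {strategy}: {fid:.2f}")

# Gap between this model's strategy (Norm) and full fine-tuning.
norm_vs_full = fid_scores["Full FT"] - fid_scores["Norm"]
print(f"Norm improves on Full FT by {norm_vs_full:.2f} FID points")
```

In this table, Norm-tuning ranks third, behind Bias and SV-Diff.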
87
+
88
+
89
+ ## Environmental Impact
90
+
91
+ Parameter-Efficient Fine-Tuning potentially causes **less** harm to the environment because it updates far fewer parameters than full fine-tuning, which lowers compute and hardware requirements.
92
+
93
+ ## Citation
94
+
95
+
96
+ **BibTeX:**
97
+
98
+ @article{dutt2023parameter,
99
+ title={Parameter-Efficient Fine-Tuning for Medical Image Analysis: The Missed Opportunity},
100
+ author={Dutt, Raman and Ericsson, Linus and Sanchez, Pedro and Tsaftaris, Sotirios A and Hospedales, Timothy},
101
+ journal={arXiv preprint arXiv:2305.08252},
102
+ year={2023}
103
+ }
104
+
105
+ **APA:**
106
+ Dutt, R., Ericsson, L., Sanchez, P., Tsaftaris, S. A., & Hospedales, T. (2023). Parameter-Efficient Fine-Tuning for Medical Image Analysis: The Missed Opportunity. arXiv preprint arXiv:2305.08252.
107
+
108
+ ## Model Card Authors
109
+
110
+ Raman Dutt
111
+ [Twitter](https://twitter.com/RamanDutt4)
112
+ [LinkedIn](https://www.linkedin.com/in/raman-dutt/)
113
+ [Email](mailto:s2198939@ed.ac.uk)