raman07/SD-finetuned-MIMIC-bias

+---
+library_name: diffusers
+pipeline_tag: text-to-image
+language:
+- en
+---
+## Model Details
+### Model Description
+This model is fine-tuned from [stable-diffusion-v1-5](https://huggingface.co/runwayml/stable-diffusion-v1-5) on 110,000 image-text pairs from the MIMIC dataset using the bias-tuning PEFT method. Under this strategy, only the bias parameters of the U-Net are fine-tuned; all other weights stay frozen.
+- **Developed by:** [Raman Dutt](https://twitter.com/RamanDutt4)
+- **Shared by:** [Raman Dutt](https://twitter.com/RamanDutt4)
+- **Model type:** Stable Diffusion fine-tuned using Parameter-Efficient Fine-Tuning (PEFT)
+- **Finetuned from model:** [stable-diffusion-v1-5](https://huggingface.co/runwayml/stable-diffusion-v1-5)
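
The bias-only strategy described above can be sketched as follows. This is a minimal illustration, not the authors' training code: the helper name `freeze_all_but_bias` is hypothetical, and it assumes that bias tensors can be recognised by a parameter name ending in `bias` (which also matches the biases of normalization layers). It is written against the `named_parameters()` interface of `torch.nn.Module`, so it needs no imports of its own.

```python
def freeze_all_but_bias(model):
    """Mark only bias parameters as trainable and return (trainable, total) counts.

    `model` is expected to behave like a torch.nn.Module, i.e. to expose
    named_parameters() yielding (name, parameter) pairs.
    """
    trainable = 0
    total = 0
    for name, param in model.named_parameters():
        # Assumption: a parameter is a bias if the last part of its name is "bias".
        is_bias = name.split(".")[-1] == "bias"
        # Frozen parameters receive no gradient updates during fine-tuning.
        param.requires_grad = is_bias
        total += param.numel()
        if is_bias:
            trainable += param.numel()
    return trainable, total
```

Given a loaded `StableDiffusionPipeline` named `pipe`, a call might look like `trainable, total = freeze_all_but_bias(pipe.unet)`, after which an optimizer would be built only from the parameters that still have `requires_grad=True`.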
+### Model Sources
+- **Paper:** [Parameter-Efficient Fine-Tuning for Medical Image Analysis: The Missed Opportunity](https://arxiv.org/abs/2305.08252)
+- **Demo:** [MIMIC-SD-PEFT-Demo](https://huggingface.co/spaces/raman07/MIMIC-SD-Demo-Memory-Optimized?logs=container)
+## Direct Use
+This model can be directly used to generate realistic medical images from text prompts.
+## How to Get Started with the Model
+```python
+import os
+
+from diffusers import StableDiffusionPipeline
+from safetensors.torch import load_file
+
+# Assumption: location of the base Stable Diffusion v1.5 weights
+# (a Hub ID or a local folder); replace with your own path if needed.
+sd_folder_path = "runwayml/stable-diffusion-v1-5"
+pipe = StableDiffusionPipeline.from_pretrained(sd_folder_path, revision="fp16")
+
+# Path to the fine-tuned U-Net weights (adjust to where they are stored locally)
+exp_path = os.path.join("unet", "diffusion_pytorch_model.safetensors")
+state_dict = load_file(exp_path)
+
+# Load the adapted U-Net weights into the base pipeline
+pipe.unet.load_state_dict(state_dict, strict=False)
+pipe.to("cuda:0")
+
+# Generate an image from a text prompt
+TEXT_PROMPT = "No acute cardiopulmonary abnormality."
+GUIDANCE_SCALE = 4
+INFERENCE_STEPS = 75
+
+result_image = pipe(
+    prompt=TEXT_PROMPT,
+    height=224,
+    width=224,
+    guidance_scale=GUIDANCE_SCALE,
+    num_inference_steps=INFERENCE_STEPS,
+)
+result_pil_image = result_image.images[0]
+```
+## Training Details
+### Training Data
+This model has been fine-tuned on 110K image-text pairs from the MIMIC dataset.
+### Training Procedure
+The training procedure is described in detail in Section 4.3 of the [paper](https://arxiv.org/abs/2305.08252).
+#### Metrics
+This model has been evaluated using the Fréchet Inception Distance (FID) on the MIMIC dataset (lower is better).
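
For reference, FID is conventionally computed by fitting Gaussians to Inception-network features of real and generated images and measuring the Fréchet distance between the two. The formula below is the standard definition rather than something stated in this model card. The symbols are introduced here for illustration: $\mu_r, \Sigma_r$ are the feature mean and covariance of the real images, and $\mu_g, \Sigma_g$ are the same statistics for the generated images.

$$
\mathrm{FID} = \lVert \mu_r - \mu_g \rVert_2^2 + \operatorname{Tr}\!\left( \Sigma_r + \Sigma_g - 2\left( \Sigma_r \Sigma_g \right)^{1/2} \right)
$$

A value of 0 corresponds to identical feature means and covariances.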
+### Results
+| Fine-Tuning Strategy   | FID Score |
+|------------------------|-----------|
+| Full FT                | 58.74     |
+| Attention              | 52.41     |
+| Bias                   | 20.81     |
+| Norm                   | 29.84     |
+| Bias+Norm+Attention    | 35.93     |
+| LoRA                   | 439.65    |
+| SV-Diff                | 23.59     |
+| DiffFit                | 42.50     |
+## Environmental Impact
+Parameter-Efficient Fine-Tuning potentially causes **less** harm to the environment because it updates far fewer parameters than full fine-tuning, which in turn lowers compute and hardware requirements.
+## Citation
+**BibTeX:**
+@article{dutt2023parameter,
+  title={Parameter-Efficient Fine-Tuning for Medical Image Analysis: The Missed Opportunity},
+  author={Dutt, Raman and Ericsson, Linus and Sanchez, Pedro and Tsaftaris, Sotirios A and Hospedales, Timothy},
+  journal={arXiv preprint arXiv:2305.08252},
+  year={2023}
+}
+**APA:**
+Dutt, R., Ericsson, L., Sanchez, P., Tsaftaris, S. A., & Hospedales, T. (2023). Parameter-Efficient Fine-Tuning for Medical Image Analysis: The Missed Opportunity. arXiv preprint arXiv:2305.08252.
+## Model Card Authors
+Raman Dutt
+[Twitter](https://twitter.com/RamanDutt4)
+[LinkedIn](https://www.linkedin.com/in/raman-dutt/)
+[Email](mailto:s2198939@ed.ac.uk)