changes to model card
README.md
This model can be directly used to generate realistic medical images from text prompts.
## How to Get Started with the Model

```python
import os

from safetensors.torch import load_file
from diffusers.pipelines import StableDiffusionPipeline

# sd_folder_path: path to a local copy of the base Stable Diffusion weights
pipe = StableDiffusionPipeline.from_pretrained(sd_folder_path, revision="fp16")

# Load the adapted U-Net (strict=False tolerates missing/unexpected keys)
exp_path = os.path.join('unet', 'diffusion_pytorch_model.safetensors')
state_dict = load_file(exp_path)
pipe.unet.load_state_dict(state_dict, strict=False)
pipe.to('cuda:0')

# Generate images from text prompts
TEXT_PROMPT = "No acute cardiopulmonary abnormality."
GUIDANCE_SCALE = 4
INFERENCE_STEPS = 75

result_image = pipe(
    prompt=TEXT_PROMPT,
    height=224,
    width=224,
    guidance_scale=GUIDANCE_SCALE,
    num_inference_steps=INFERENCE_STEPS,
)

result_pil_image = result_image["images"][0]
```
### Training Data

This model has been fine-tuned on 110K image-text pairs from the MIMIC dataset.
### Training Procedure

The training procedure is described in detail in Section 4.3 of this [paper](https://arxiv.org/abs/2305.08252).
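For intuition about what a parameter-efficient strategy such as bias-only tuning changes, here is a minimal, illustrative PyTorch sketch (not the authors' actual training code): all parameters are frozen except bias terms, so only a tiny fraction of the model is updated.

```python
import torch.nn as nn

def keep_only_bias_trainable(model: nn.Module):
    """Freeze all parameters except bias terms (bias-only fine-tuning)."""
    for name, param in model.named_parameters():
        param.requires_grad = name.endswith('bias')
    return [p for p in model.parameters() if p.requires_grad]

# Toy stand-in for a large U-Net
model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 16))
trainable = keep_only_bias_trainable(model)

total_params = sum(p.numel() for p in model.parameters())
trainable_params = sum(p.numel() for p in trainable)
```

Only the returned `trainable` parameters would be passed to the optimizer, which is what reduces compute and memory relative to full fine-tuning.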
#### Metrics

This model has been evaluated using the Fréchet inception distance (FID) score.
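FID treats the Inception-v3 feature distributions of real and generated images as Gaussians and measures the Fréchet distance between them. A small NumPy/SciPy sketch of the closed-form distance (illustrative only; an actual evaluation computes the means and covariances from Inception features):

```python
import numpy as np
from scipy import linalg

def frechet_distance(mu1, sigma1, mu2, sigma2):
    """Frechet distance between two Gaussians:
    ||mu1 - mu2||^2 + Tr(sigma1 + sigma2 - 2 * sqrt(sigma1 @ sigma2))."""
    diff = mu1 - mu2
    covmean = linalg.sqrtm(sigma1 @ sigma2)
    if np.iscomplexobj(covmean):
        covmean = covmean.real  # discard tiny imaginary parts from sqrtm
    return diff @ diff + np.trace(sigma1 + sigma2 - 2.0 * covmean)

# Identical distributions give a (numerically) zero distance
mu, sigma = np.zeros(4), np.eye(4)
fid_same = frechet_distance(mu, sigma, mu, sigma)

# Shifting the mean increases the distance by the squared shift
fid_shifted = frechet_distance(mu + 1.0, sigma, mu, sigma)
```

Lower scores indicate that generated images are statistically closer to the real data.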
### Results
| Fine-Tuning Strategy | FID Score |
|----------------------|-----------|
| Full FT              | 58.74     |
| Attention            | 52.41     |
| Bias                 | 20.81     |
| Norm                 | 29.84     |
| Bias+Norm+Attention  | 35.93     |
| LoRA                 | 439.65    |
| SV-Diff              | 23.59     |
| DiffFit              | 42.5      |
## Environmental Impact

Using parameter-efficient fine-tuning potentially causes **less** harm to the environment, since we fine-tune a significantly smaller number of parameters, which in turn requires far less compute and hardware.

## Citation