---
license: mit
---

Prompt2MedImage - Diffusion for Medical Images

Prompt2MedImage is a latent text-to-image diffusion model that has been fine-tuned on medical images from the ROCO dataset.

The weights here are intended to be used with the 🧨Diffusers library.

This model was trained using Amazon SageMaker and the Hugging Face Deep Learning container.
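Training jobs with the Hugging Face Deep Learning Container are typically launched through the SageMaker Python SDK's HuggingFace estimator. The snippet below is only an illustrative sketch: the actual entry point, container versions, instance type, hyperparameters, and S3 paths used for this model are not published, and every name shown here (train_text_to_image.py, the s3:// URI, the base checkpoint) is a placeholder.

from sagemaker.huggingface import HuggingFace

# Illustrative estimator configuration; entry point, versions, and data paths are placeholders
estimator = HuggingFace(
    entry_point="train_text_to_image.py",   # hypothetical fine-tuning script
    source_dir="./scripts",                  # hypothetical source directory
    instance_type="ml.p3.2xlarge",
    instance_count=1,
    role="<sagemaker-execution-role>",
    transformers_version="4.26",
    pytorch_version="1.13",
    py_version="py39",
    hyperparameters={
        "pretrained_model_name_or_path": "CompVis/stable-diffusion-v1-4",  # assumed base checkpoint
        "resolution": 512,
    },
)

# Point the training channel at the prepared image-caption data in S3
estimator.fit({"training": "s3://<bucket>/roco-images/"})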

Model Details

  • Developed by: Nihir Chadderwala
  • Model type: Diffusion-based text-to-medical-image generation model
  • Language: English
  • License: MIT
  • Model Description: This latent text-to-image diffusion model can be used to generate high-quality medical images from text prompts. It uses a fixed, pretrained text encoder (CLIP ViT-L/14), as suggested in the Imagen paper (see the sketch after this list).
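Because the text encoder is the standard frozen CLIP ViT-L/14 model, it can be loaded and inspected on its own. A minimal sketch, assuming the weights are hosted under Nihirc/Prompt2MedImage and follow the usual Stable Diffusion component layout (tokenizer and text_encoder subfolders):

from transformers import CLIPTextModel, CLIPTokenizer

# Load only the frozen CLIP ViT-L/14 text encoder and its tokenizer from the pipeline repo
tokenizer = CLIPTokenizer.from_pretrained("Nihirc/Prompt2MedImage", subfolder="tokenizer")
text_encoder = CLIPTextModel.from_pretrained("Nihirc/Prompt2MedImage", subfolder="text_encoder")

inputs = tokenizer("Chest X-ray showing bilateral pleural effusion.",
                   padding="max_length", truncation=True, return_tensors="pt")
# (1, 77, 768) prompt embeddings that condition the U-Net during denoising
prompt_embeds = text_encoder(inputs.input_ids).last_hidden_state
print(prompt_embeds.shape)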

License

This model is open access and available to all, with a Do What The F*ck You Want To Public License (WTFPL) further specifying rights and usage.

  • You can't use the model to deliberately produce or share illegal or harmful outputs or content.
  • The author claims no rights on the outputs you generate; you are free to use them and are accountable for their use.
  • You may redistribute the weights and use the model commercially and/or as a service.

Run using PyTorch

pip install diffusers transformers

Running the pipeline with the default PNDM scheduler:

import torch
from diffusers import StableDiffusionPipeline

model_id = "Nihirc/Prompt2MedImage"
device = "cuda"

# Load the fine-tuned weights in half precision; fp16 requires a CUDA GPU
pipe = StableDiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float16)
pipe = pipe.to(device)

prompt = "Showing the subtrochanteric fracture in the porotic bone."
image = pipe(prompt).images[0]

image.save("porotic_bone_fracture.png")
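Other Diffusers schedulers can be swapped in place of the default PNDM scheduler. A minimal sketch, assuming the DPMSolverMultistepScheduler works with these weights as it does with standard Stable Diffusion checkpoints:

from diffusers import DPMSolverMultistepScheduler

# Replace the default PNDM scheduler while reusing its configuration
pipe.scheduler = DPMSolverMultistepScheduler.from_config(pipe.scheduler.config)

# DPM-Solver++ typically needs fewer denoising steps for comparable quality
image = pipe(prompt, num_inference_steps=25).images[0]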

Citation

O. Pelka, S. Koitka, J. Rückert, F. Nensa, C.M. Friedrich,
"Radiology Objects in COntext (ROCO): A Multimodal Image Dataset".
MICCAI Workshop on Large-scale Annotation of Biomedical Data and Expert Label Synthesis (LABELS) 2018, September 16, 2018, Granada, Spain. Lecture Notes in Computer Science (LNCS), vol. 11043, pp. 180-189, Springer Cham, 2018.
doi: 10.1007/978-3-030-01364-6_20