---
license: wtfpl
datasets:
- MedIR/roco
language:
- en
pipeline_tag: text-to-image
---
# Prompt2MedImage - Diffusion for Medical Images
Prompt2MedImage is a latent text-to-image diffusion model fine-tuned on medical images from the ROCO dataset.
The weights here are intended to be used with the 🧨 Diffusers library.
This model was trained using Amazon SageMaker and the Hugging Face Deep Learning container.
## Model Details
- **Developed by:** Nihir Chadderwala
- **Model type:** Diffusion-based text-to-medical-image generation model
- **Language:** English
- **License:** wtfpl
- **Model Description:** This latent text-to-image diffusion model generates medical images from text prompts. It uses a fixed, pretrained text encoder ([CLIP ViT-L/14](https://arxiv.org/abs/2103.00020)) as suggested in the [Imagen paper](https://arxiv.org/abs/2205.11487).
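The components named in the description can be checked on a loaded pipeline. The following is a minimal sketch, assuming `pipe` is a Diffusers `StableDiffusionPipeline` loaded from this repository; `describe_components` is a hypothetical helper name, not part of Diffusers.

```python
def describe_components(pipe) -> dict:
    """Map each standard pipeline component to the name of its class.

    Assumption: `pipe` exposes the attributes a Diffusers
    StableDiffusionPipeline has (tokenizer, text_encoder, unet, vae, scheduler).
    """
    names = ("tokenizer", "text_encoder", "unet", "vae", "scheduler")
    return {name: type(getattr(pipe, name)).__name__ for name in names}
```

Calling `describe_components(pipe)` after loading the pipeline as in the "Run using PyTorch" section returns a small dictionary. For a Stable Diffusion-style checkpoint the `text_encoder` entry would be expected to name a CLIP text model, but the exact class names depend on the checkpoint's configuration.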
## License
This model is open access and available to all, with the Do What The F*ck You Want To Public License (WTFPL) further specifying rights and usage.
- You can't use the model to deliberately produce or share illegal or harmful outputs or content.
- The author claims no rights on the outputs you generate; you are free to use them and are accountable for their use.
- You may re-distribute the weights and use the model commercially and/or as a service.
## Run using PyTorch
```bash
pip install diffusers transformers
```
Running the pipeline with the default PNDM scheduler:
```python
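import torch
from diffusers import StableDiffusionPipeline

model_id = "Nihirc/Prompt2MedImage"
device = "cuda"  # the float16 weights loaded below assume a CUDA GPU

# Load the fine-tuned weights in half precision and move the pipeline to the GPU.
pipe = StableDiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float16)
pipe = pipe.to(device)

# Generate one image from a radiology-style caption and save it to disk.
prompt = "Showing the subtrochanteric fracture in the porotic bone."
image = pipe(prompt).images[0]
image.save("porotic_bone_fracture.png")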
```
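The default scheduler can be swapped out and the random seed fixed so that a prompt gives repeatable results. The following is a sketch under assumptions: the choice of `DPMSolverMultistepScheduler`, the 25 inference steps, and the seed value are illustrative and not taken from this model card, and `prompt_to_filename` and `generate` are hypothetical helper names.

```python
import re


def prompt_to_filename(prompt: str, suffix: str = ".png") -> str:
    """Turn a free-text prompt into a filesystem-friendly file name."""
    # Collapse every run of non-alphanumeric characters into one underscore.
    slug = re.sub(r"[^a-z0-9]+", "_", prompt.lower()).strip("_")
    return (slug or "image") + suffix


def generate(prompt: str, seed: int = 0, steps: int = 25, device: str = "cuda"):
    """Generate one image with a fixed seed and a non-default scheduler."""
    # Heavy imports stay local so prompt_to_filename works without them.
    import torch
    from diffusers import DPMSolverMultistepScheduler, StableDiffusionPipeline

    # Half precision is only used on GPU; fall back to float32 elsewhere.
    dtype = torch.float16 if device == "cuda" else torch.float32
    pipe = StableDiffusionPipeline.from_pretrained(
        "Nihirc/Prompt2MedImage", torch_dtype=dtype
    )
    # Assumption: replace the default PNDM scheduler, reusing its configuration.
    pipe.scheduler = DPMSolverMultistepScheduler.from_config(pipe.scheduler.config)
    pipe = pipe.to(device)

    # A seeded generator makes the sampled noise, and so the image, repeatable.
    generator = torch.Generator(device=device).manual_seed(seed)
    return pipe(prompt, num_inference_steps=steps, generator=generator).images[0]
```

A call such as `generate(prompt, seed=42).save(prompt_to_filename(prompt))` would then write the image under a name derived from the prompt. Whether a different scheduler or step count improves results for this model has not been evaluated here.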
## Citation
```
O. Pelka, S. Koitka, J. Rückert, F. Nensa, C.M. Friedrich,
"Radiology Objects in COntext (ROCO): A Multimodal Image Dataset".
MICCAI Workshop on Large-scale Annotation of Biomedical Data and Expert Label Synthesis (LABELS) 2018, September 16, 2018, Granada, Spain. Lecture Notes in Computer Science (LNCS), vol. 11043, pp. 180-189, Springer Cham, 2018.
doi: 10.1007/978-3-030-01364-6_20
```