---
license: mit
---

# Prompt2MedImage - Diffusion for Medical Images

Prompt2MedImage is a latent text-to-image diffusion model fine-tuned on medical images from the ROCO dataset.

The weights here are intended to be used with the 🧨 Diffusers library.

This model was trained using Amazon SageMaker and the Hugging Face Deep Learning container.

## Model Details
- **Developed by:** Nihir Chadderwala
- **Model type:** Diffusion-based text-to-image generation model for medical images
- **Language:** English
- **License:** MIT
- **Model Description:** This latent text-to-image diffusion model generates medical images from text prompts. It uses a fixed, pretrained text encoder ([CLIP ViT-L/14](https://arxiv.org/abs/2103.00020)), as suggested in the [Imagen paper](https://arxiv.org/abs/2205.11487).


## License

This model is open access and available to all, with a Do What The F*ck You Want To Public License further specifying rights and usage.

- You may not use the model to deliberately produce or share illegal or harmful outputs or content.
- The author claims no rights to the outputs you generate; you are free to use them and are accountable for their use.
- You may re-distribute the weights and use the model commercially and/or as a service.


## Run using PyTorch

```bash
pip install torch diffusers transformers
```

Run the pipeline with the default PNDM scheduler:

```python
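import torch
from diffusers import StableDiffusionPipeline

model_id = "Prompt2MedImage"
# Half-precision weights below assume a CUDA-capable GPU.
device = "cuda"

# Load the fine-tuned weights in float16 and move the pipeline to the GPU.
pipe = StableDiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float16)
pipe = pipe.to(device)

# Generate a single image from a radiology-style caption.
prompt = "Showing the subtrochanteric fracture in the porotic bone."
image = pipe(prompt).images[0]

# The pipeline returns PIL images, so the result can be saved directly.
image.save("porotic_bone_fracture.png")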
``` 
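
To generate images for several prompts in one go, the single-prompt call above can be wrapped in a small loop. The following is a minimal sketch, not part of the released model: `prompt_to_filename` and `generate_and_save` are hypothetical helper names, and the `outputs` directory is an assumed default. The helper only relies on `pipe(prompt).images[0]` returning an object with a `save` method, as in the example above.

```python
import re
from pathlib import Path


def prompt_to_filename(prompt, max_words=6):
    """Turn a free-text prompt into a short, filesystem-safe PNG name (hypothetical helper)."""
    words = re.findall(r"[a-z0-9]+", prompt.lower())
    slug = "_".join(words[:max_words]) or "image"
    return slug + ".png"


def generate_and_save(pipe, prompts, out_dir="outputs"):
    """Call `pipe` once per prompt and save the first returned image under `out_dir`."""
    out_path = Path(out_dir)
    out_path.mkdir(parents=True, exist_ok=True)
    saved = []
    for index, prompt in enumerate(prompts):
        image = pipe(prompt).images[0]
        # Prefix with the prompt index so similar prompts do not overwrite each other.
        target = out_path / f"{index:03d}_{prompt_to_filename(prompt)}"
        image.save(target)
        saved.append(target)
    return saved
```

With the `pipe` object loaded as shown earlier, `generate_and_save(pipe, ["Showing the subtrochanteric fracture in the porotic bone."])` would write `outputs/000_showing_the_subtrochanteric_fracture_in_the.png` and return its path.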

## Citation

```
O. Pelka, S. Koitka, J. Rückert, F. Nensa, C.M. Friedrich,
"Radiology Objects in COntext (ROCO): A Multimodal Image Dataset".
MICCAI Workshop on Large-scale Annotation of Biomedical Data and Expert Label Synthesis (LABELS) 2018, September 16, 2018, Granada, Spain. Lecture Notes in Computer Science (LNCS), vol. 11043, pp. 180-189, Springer, Cham, 2018.
doi: 10.1007/978-3-030-01364-6_20
```