	---
	license: mit
	---

	# Prompt2MedImage - Diffusion for Medical Images

	Prompt2MedImage is a latent text-to-image diffusion model fine-tuned on medical images from the ROCO dataset.

	The weights here are intended to be used with the 🧨Diffusers library.

	This model was trained using Amazon SageMaker and the Hugging Face Deep Learning container.

	## Model Details
	- Developed by: Nihir Chadderwala
	- Model type: Diffusion-based text-to-medical-image generation model
	- Language: English
	- License: MIT
	- Model Description: This latent text-to-image diffusion model generates medical images from text prompts. It uses a fixed, pretrained text encoder ([CLIP ViT-L/14](https://arxiv.org/abs/2103.00020)) as suggested in the [Imagen paper](https://arxiv.org/abs/2205.11487).
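
To make "latent" concrete: the diffusion process runs on a compressed representation of the image rather than on pixels. The sketch below computes the shape of that latent tensor, assuming the Stable Diffusion v1 defaults (an 8x VAE downsampling factor and 4 latent channels). This card does not name the base checkpoint, so `vae_scale_factor`, `latent_channels`, and the helper name `latent_shape` are illustrative assumptions.

```python
def latent_shape(height, width, vae_scale_factor=8, latent_channels=4):
    """Return the (channels, height, width) of the latent tensor for one image.

    The defaults are assumed Stable Diffusion v1 values, not values read
    from this model's configuration.
    """
    if height % vae_scale_factor or width % vae_scale_factor:
        raise ValueError("height and width must be multiples of the VAE scale factor")
    return (latent_channels, height // vae_scale_factor, width // vae_scale_factor)


print(latent_shape(512, 512))  # (4, 64, 64)
```

Under these assumptions, a 512x512 image is denoised as a 4x64x64 tensor, which is far cheaper than operating on pixels directly.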


	## License

	This model is open access and available to all, with a Do What the F*ck You Want To Public License further specifying rights and usage.

	- You can't use the model to deliberately produce or share illegal or harmful outputs or content.
	- The author claims no rights on the outputs you generate; you are free to use them and are accountable for their use.
	- You may re-distribute the weights and use the model commercially and/or as a service.


	## Run using PyTorch

	```bash
	pip install diffusers transformers
	```

	Running the pipeline with the default PNDM scheduler:

	```python
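	import torch
	from diffusers import StableDiffusionPipeline

	# Full Hub repository ID (owner/name), as shown on the model page.
	model_id = "Nihirc/Prompt2MedImage"
	device = "cuda"  # half-precision weights are loaded below, so a GPU is expected

	# Load the fine-tuned weights in float16 and move the pipeline to the GPU.
	pipe = StableDiffusionPipeline.from_pretrained(model_id, torch_dtype=torch.float16)
	pipe = pipe.to(device)

	# Generate one image from a radiology-style caption.
	prompt = "Showing the subtrochanteric fracture in the porotic bone."
	image = pipe(prompt).images[0]

	image.save("porotic_bone_fracture.png")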
	```
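
The snippet above assumes a CUDA GPU and produces a different image on each run. A minimal sketch of a more defensive variant follows: it falls back to CPU with float32 when no GPU is available, fixes the random seed for reproducibility, and optionally swaps the default PNDM scheduler for `DPMSolverMultistepScheduler`. The helper names (`pick_device_and_dtype`, `generate`) and the default seed are illustrative choices, not part of the original card.

```python
def pick_device_and_dtype(cuda_available):
    """Return (device, dtype_name); half precision is only used on GPU."""
    return ("cuda", "float16") if cuda_available else ("cpu", "float32")


def generate(prompt, model_id="Nihirc/Prompt2MedImage", seed=0,
             steps=50, guidance_scale=7.5, use_dpm_solver=False):
    """Generate one image with a fixed seed (hypothetical helper)."""
    # Imported lazily so the helper above works without the heavy dependencies.
    import torch
    from diffusers import DPMSolverMultistepScheduler, StableDiffusionPipeline

    device, dtype_name = pick_device_and_dtype(torch.cuda.is_available())
    pipe = StableDiffusionPipeline.from_pretrained(
        model_id, torch_dtype=getattr(torch, dtype_name)
    )
    pipe = pipe.to(device)

    if use_dpm_solver:
        # Reuse the existing scheduler configuration with a different solver.
        pipe.scheduler = DPMSolverMultistepScheduler.from_config(pipe.scheduler.config)

    # A seeded generator makes the sampled noise repeatable on a given device.
    generator = torch.Generator(device=device).manual_seed(seed)
    result = pipe(
        prompt,
        num_inference_steps=steps,
        guidance_scale=guidance_scale,
        generator=generator,
    )
    return result.images[0]
```

Calling `generate("Showing the subtrochanteric fracture in the porotic bone.", seed=42).save("fracture_seed42.png")` should then give the same image on repeated runs with the same device and library versions. CPU inference works but is much slower than GPU inference.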

	## Citation

	```
	O. Pelka, S. Koitka, J. Rückert, F. Nensa, C.M. Friedrich,
	"Radiology Objects in COntext (ROCO): A Multimodal Image Dataset".
	MICCAI Workshop on Large-scale Annotation of Biomedical Data and Expert Label Synthesis (LABELS) 2018, September 16, 2018, Granada, Spain. Lecture Notes in Computer Science (LNCS), vol. 11043, pp. 180-189, Springer Cham, 2018.
	doi: 10.1007/978-3-030-01364-6_20
	```