Apply for community grant: Personal project

#1
by afmck - opened

This project combines the segmentation model facebook/detr-resnet-50-panoptic with a Stable Diffusion model fine-tuned for inpainting, so that masks for the input image can be generated quickly rather than drawn by hand. This is also helpful when a particularly fine-grained mask of an object is needed.
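
For reference, here is a minimal sketch of the mask-extraction step (assuming a recent transformers version; the input path and the `masks_by_label` name are just illustrative):

```python
import torch
from PIL import Image
from transformers import DetrImageProcessor, DetrForSegmentation

processor = DetrImageProcessor.from_pretrained("facebook/detr-resnet-50-panoptic")
model = DetrForSegmentation.from_pretrained("facebook/detr-resnet-50-panoptic")

image = Image.open("input.png").convert("RGB")
inputs = processor(images=image, return_tensors="pt")

with torch.no_grad():
    outputs = model(**inputs)

# Post-process into a single segmentation map plus per-segment metadata.
result = processor.post_process_panoptic_segmentation(
    outputs, target_sizes=[image.size[::-1]]
)[0]
segmentation = result["segmentation"]    # (H, W) tensor of segment ids
segments_info = result["segments_info"]  # list of {"id", "label_id", "score", ...}

# Collect one boolean mask per class name; segments of the same class are merged.
masks_by_label = {}
for seg in segments_info:
    label = model.config.id2label[seg["label_id"]]
    mask = (segmentation == seg["id"]).numpy()
    masks_by_label[label] = mask | masks_by_label.get(label, False)
```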

I think this demo would be helpful to the community: it would speed up fine-tuning experiments and is a relatively involved Gradio example. Although the segmentation model runs quickly on CPU, the diffusion model is too slow to run on CPU.

This demo currently implements:

  • Extracting per-class masks using facebook/detr-resnet-50-panoptic
  • Editing masks by combining different masks, expanding the mask region, denoising, and inverting
  • Running diffusion conditioned on the resulting masked image, the mask, and a text prompt
  • Advanced diffusion settings such as guidance scale and negative prompting (these steps are sketched below)
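
The mask-editing operations and the diffusion call look roughly like the following. This is only a sketch: the inpainting checkpoint runwayml/stable-diffusion-inpainting, the helper names, and the "cat"/"dog" class keys are assumptions, and `masks_by_label` is the dict built in the sketch above.

```python
import numpy as np
import torch
from PIL import Image
from scipy import ndimage
from diffusers import StableDiffusionInpaintPipeline

def combine(*masks):
    """Union of several boolean class masks."""
    return np.logical_or.reduce(masks)

def expand(mask, pixels=8):
    """Grow the mask outward so the inpainted region fully covers the object."""
    return ndimage.binary_dilation(mask, iterations=pixels)

def denoise(mask, iterations=2):
    """Remove small speckles and holes with morphological opening/closing."""
    opened = ndimage.binary_opening(mask, iterations=iterations)
    return ndimage.binary_closing(opened, iterations=iterations)

def invert(mask):
    """Inpaint everything except the selected objects."""
    return ~mask

pipe = StableDiffusionInpaintPipeline.from_pretrained(
    "runwayml/stable-diffusion-inpainting",  # assumed checkpoint, not necessarily the demo's
    torch_dtype=torch.float16,
).to("cuda")

image = Image.open("input.png").convert("RGB").resize((512, 512))
mask = expand(denoise(combine(masks_by_label["cat"], masks_by_label["dog"])))
mask_image = Image.fromarray(mask.astype(np.uint8) * 255).resize((512, 512))

result = pipe(
    prompt="a corgi sitting on a sofa",
    negative_prompt="blurry, low quality",
    image=image,
    mask_image=mask_image,
    guidance_scale=7.5,
).images[0]
```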

What I still need to add:

  • Editing the mask using canvas tools. Currently blocked by this issue.
  • Clicking an object in the image to toggle its mask. See this issue. The current workaround is a CheckboxGroup (sketched below).
  • Queueing and batching.
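
The CheckboxGroup workaround and queueing look roughly like this (assuming Gradio 3.x; the functions are stubs standing in for the segmentation and inpainting code above, not the demo's actual code):

```python
import gradio as gr

def segment(image):
    # ...run DETR here and return the class labels found in the image...
    labels = ["cat", "dog", "sofa"]  # placeholder result
    return gr.CheckboxGroup.update(choices=labels, value=[])

def inpaint(image, selected_labels, prompt):
    # ...combine the selected class masks and run the inpainting pipeline...
    return image  # placeholder

with gr.Blocks() as demo:
    input_image = gr.Image(type="pil", label="Input image")
    classes = gr.CheckboxGroup(choices=[], label="Objects to mask")
    prompt = gr.Textbox(label="Prompt")
    output = gr.Image(label="Result")
    run = gr.Button("Inpaint")

    # Re-populate the checkbox choices whenever a new image is uploaded.
    input_image.change(segment, inputs=input_image, outputs=classes)
    run.click(inpaint, inputs=[input_image, classes, prompt], outputs=output)

demo.queue()  # queue requests so the single GPU is not oversubscribed
demo.launch()
```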

Currently the demo looks like this:
[Screenshot: image.png]

Thanks! Hope you like this demo! 🤗

Hi! Just wanted to bump this.

If this is not in scope for a grant, I will probably switch to a GPU plan myself. I have one question about that: are you charged only for the time someone is actually using the demo, or is it a flat rate per hour? I.e., even if no one uses the demo, is it 24 × $0.60 per day, or is it $0.60 times the number of hours of actual use?

Thanks!
