Apply for community grant: Personal project

#1
by afmck - opened

This project combines the segmentation model facebook/detr-resnet-50-panoptic with a Stable Diffusion model fine-tuned for inpainting, so that masks for the input image can be generated quickly rather than drawn by hand. This is also helpful when a particularly fine-grained mask of an object is needed.
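
For reference, here is a minimal sketch of the mask-extraction step (assuming a recent transformers version; the input path and the `masks_by_label` name are just illustrative):

```python
import torch
from PIL import Image
from transformers import DetrImageProcessor, DetrForSegmentation

processor = DetrImageProcessor.from_pretrained("facebook/detr-resnet-50-panoptic")
model = DetrForSegmentation.from_pretrained("facebook/detr-resnet-50-panoptic")

image = Image.open("input.png").convert("RGB")
inputs = processor(images=image, return_tensors="pt")

with torch.no_grad():
    outputs = model(**inputs)

# Post-process into a single segmentation map plus per-segment metadata.
result = processor.post_process_panoptic_segmentation(
    outputs, target_sizes=[image.size[::-1]]
)[0]
segmentation = result["segmentation"]    # (H, W) tensor of segment ids
segments_info = result["segments_info"]  # list of {"id", "label_id", "score", ...}

# Collect one boolean mask per class name; segments of the same class are merged.
masks_by_label = {}
for seg in segments_info:
    label = model.config.id2label[seg["label_id"]]
    mask = (segmentation == seg["id"]).numpy()
    masks_by_label[label] = mask | masks_by_label.get(label, False)
```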

I think this demo would be helpful to the community: it would speed up fine-tuning experiments and is a relatively involved Gradio example. Although the segmentation model runs quickly on CPU, the diffusion model is too slow to run on CPU.

This demo currently implements:

  • Extracting per-class masks using facebook/detr-resnet-50-panoptic
  • Editing masks by combining different masks, expanding the mask region, denoising, and inverting
  • Running diffusion conditioned on the resulting masked image, the mask, and a text prompt
  • Advanced diffusion settings such as guidance scale and negative prompting (these steps are sketched below)
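
The mask-editing operations and the diffusion call look roughly like the following. This is only a sketch: the inpainting checkpoint runwayml/stable-diffusion-inpainting, the helper names, and the "cat"/"dog" class keys are assumptions, and `masks_by_label` is the dict built in the sketch above.

```python
import numpy as np
import torch
from PIL import Image
from scipy import ndimage
from diffusers import StableDiffusionInpaintPipeline

def combine(*masks):
    """Union of several boolean class masks."""
    return np.logical_or.reduce(masks)

def expand(mask, pixels=8):
    """Grow the mask outward so the inpainted region fully covers the object."""
    return ndimage.binary_dilation(mask, iterations=pixels)

def denoise(mask, iterations=2):
    """Remove small speckles and holes with morphological opening/closing."""
    opened = ndimage.binary_opening(mask, iterations=iterations)
    return ndimage.binary_closing(opened, iterations=iterations)

def invert(mask):
    """Inpaint everything except the selected objects."""
    return ~mask

pipe = StableDiffusionInpaintPipeline.from_pretrained(
    "runwayml/stable-diffusion-inpainting",  # assumed checkpoint, not necessarily the demo's
    torch_dtype=torch.float16,
).to("cuda")

image = Image.open("input.png").convert("RGB").resize((512, 512))
mask = expand(denoise(combine(masks_by_label["cat"], masks_by_label["dog"])))
mask_image = Image.fromarray(mask.astype(np.uint8) * 255).resize((512, 512))

result = pipe(
    prompt="a corgi sitting on a sofa",
    negative_prompt="blurry, low quality",
    image=image,
    mask_image=mask_image,
    guidance_scale=7.5,
).images[0]
```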

What I still need to add:

  • Editing the mask using canvas tools. Currently blocked by this issue.
  • Clicking an object in the image to toggle its mask. See this issue. The current workaround is a CheckboxGroup (sketched below).
  • Queueing and batching.
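
The CheckboxGroup workaround and queueing look roughly like this (assuming Gradio 3.x; the functions are stubs standing in for the segmentation and inpainting code above, not the demo's actual code):

```python
import gradio as gr

def segment(image):
    # ...run DETR here and return the class labels found in the image...
    labels = ["cat", "dog", "sofa"]  # placeholder result
    return gr.CheckboxGroup.update(choices=labels, value=[])

def inpaint(image, selected_labels, prompt):
    # ...combine the selected class masks and run the inpainting pipeline...
    return image  # placeholder

with gr.Blocks() as demo:
    input_image = gr.Image(type="pil", label="Input image")
    classes = gr.CheckboxGroup(choices=[], label="Objects to mask")
    prompt = gr.Textbox(label="Prompt")
    output = gr.Image(label="Result")
    run = gr.Button("Inpaint")

    # Re-populate the checkbox choices whenever a new image is uploaded.
    input_image.change(segment, inputs=input_image, outputs=classes)
    run.click(inpaint, inputs=[input_image, classes, prompt], outputs=output)

demo.queue()  # queue requests so the single GPU is not oversubscribed
demo.launch()
```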

Currently the demo looks like this:
[Screenshot: image.png]

Thanks! Hope you like this demo! 🤗

Hi! Just wanted to bump this.

If this is not in scope for a grant, I will probably switch to a GPU plan myself. I have one question about that: are you charged only for the time someone is actually using the demo, or is it a flat rate per hour? I.e., even if no one uses the demo, is it 24 × $0.60 per day, or is it $0.60 times the number of hours of actual use?

Thanks!
