noni27/Shift_and_Inpaint · Apply for community grant: Personal project (gpu)

Task: Given an image and a text prompt specifying the class of an object, shift that object's location in the image plane by x and y units.

My Solution:

Pass the input image and text prompt through the GroundingDINO model to get a bounding box around the object specified in the text prompt.
Use the bounding box to get a segmentation mask from the SegmentAnything (SAM) model.
Remove the segmented object from the image.
Use the StableDiffusion inpainting model to fill the hole left after removing the object.
Shift the segmented object in the image plane using an OpenCV affine transformation and combine it with the inpainted image.
And voilà, you have shifted the object specified in the text prompt.
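
The last step (shift the object, then paste it onto the inpainted image) can be sketched roughly as below. This is a minimal illustration and not the code from the Space: it uses plain NumPy slicing in place of the OpenCV affine transformation (a pure translation is the same as `cv2.warpAffine` with the matrix `[[1, 0, x], [0, 1, y]]`), and the function names, argument names, and the sign convention (positive x moves right, positive y moves down) are assumptions made for the example.

```python
import numpy as np


def shift_array(arr, dx, dy):
    """Translate arr by dx pixels right and dy pixels down, zero-filling the rest."""
    h, w = arr.shape[:2]
    out = np.zeros_like(arr)
    # Source window whose pixels are still inside the frame after the shift.
    x0, x1 = max(0, -dx), min(w, w - dx)
    y0, y1 = max(0, -dy), min(h, h - dy)
    if x0 >= x1 or y0 >= y1:
        return out  # the shift moves everything out of the frame
    out[y0 + dy:y1 + dy, x0 + dx:x1 + dx] = arr[y0:y1, x0:x1]
    return out


def shift_and_composite(original, inpainted, mask, dx, dy):
    """Paste the masked object from `original` onto `inpainted`, moved by (dx, dy).

    original, inpainted: H x W x 3 arrays (hypothetical inputs for this sketch)
    mask: H x W boolean array, True where the segmented object is
    """
    shifted_pixels = shift_array(original, dx, dy)
    shifted_mask = shift_array(mask, dx, dy).astype(bool)
    result = inpainted.copy()
    # Overwrite only the pixels covered by the shifted object.
    result[shifted_mask] = shifted_pixels[shifted_mask]
    return result


if __name__ == "__main__":
    original = np.zeros((4, 4, 3), dtype=np.uint8)
    original[1, 1] = (255, 0, 0)              # a one-pixel "object"
    mask = np.zeros((4, 4), dtype=bool)
    mask[1, 1] = True
    inpainted = np.full((4, 4, 3), 10, dtype=np.uint8)  # stand-in for the inpainted image
    out = shift_and_composite(original, inpainted, mask, dx=2, dy=1)
    print(out[2, 3], out[1, 1])
```

Any part of the object that would land outside the frame is simply cropped in this sketch.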

Requirement: I need a small GPU to load the GroundingDINO, SAM, and StableDiffusion models and run inference on a single (image, text_prompt) pair at a time.

Regards,
Naman Jaswani