Inpaint Patching

#10 opened by rashidvyro

Hey @OzzyGT! Is there a possibility that I can apply some sort of patching to the input image, i.e. crop the masked region from the original image, inpaint that, and stitch it back into the original image? This would let me compute only the masked area, which is much smaller than the entire image, and should also speed up inference.

Hi, there's no limitation preventing what you describe; it can be done, but sadly I won't have the time to do it anytime soon since this is just a PoC for the guide I made. The next time I iterate over this, it will probably be to do a full outpainting/inpainting version with higher quality (slower).

Also note that this space uses 1024px square images, which is the standard and recommended size for SDXL, and it uses a lightning model with a low step count, with CFG disabled after the second step. If you lower the resolution with the method you're suggesting, it probably won't make that much difference in inference time, but it will impact the quality of the final inpaint.
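For reference, diffusers exposes a callback hook that can do this kind of CFG cutoff. Here's a minimal sketch using the plain SDXL text-to-image pipeline; the checkpoint, step count, and cutoff index are just placeholders, and this isn't necessarily how the space itself wires it:

```python
import torch
from diffusers import StableDiffusionXLPipeline
from diffusers.callbacks import SDXLCFGCutoffCallback

# Placeholder checkpoint; the space uses a lightning-style finetune instead.
pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    torch_dtype=torch.float16,
).to("cuda")

# Turn classifier-free guidance off after step 2; the remaining steps run
# with a single (conditional) batch, which roughly halves their cost.
callback = SDXLCFGCutoffCallback(cutoff_step_index=2)

image = pipe(
    prompt="a photo of a red brick wall",
    num_inference_steps=8,
    guidance_scale=2.0,
    callback_on_step_end=callback,
).images[0]
```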

The only reason I would use what you're suggesting would be to add better quality/details to the generation, like cropping the masked part and upscaling it to 1024px, or to be able to work with higher-resolution source images.

Thanks a ton for getting back to me with such a detailed explanation! I really appreciate you taking the time to break things down.

I totally get that adding patch-based inpainting would take some effort, especially since you’re focusing on the 1024px setup and the current model configurations. Your insights on the balance between speed and quality make a lot of sense.

I do have a few technical questions that might help me work within the current framework:

Handling Input Sizes: When cropping and inpainting smaller regions, should I upscale the patches back to 1024px to match the model’s requirements, or is there a more efficient way to adjust for different input sizes?

Seamless Integration: What’s the best way to stitch the inpainted regions back into the original image without noticeable seams? Would simple blending techniques work, or do you recommend something more advanced?

Performance Tweaks: Are there specific model settings or optimizations in SDXL that I could adjust to better support selective inpainting without slowing down the inference too much?

Thanks again for all your hard work on this PoC. Looking forward to your thoughts!

Handling Input Sizes: When cropping and inpainting smaller regions, should I upscale the patches back to 1024px to match the model’s requirements, or is there a more efficient way to adjust for different input sizes?

The way I'd do it is to make a square from the mask (find the upper-left and lower-right points of the masked region and then expand that box to a square) and upscale it with a model to 1024px; this way you will get the best quality, but it will be a lot slower. A plain resize instead of a model upscale should also work okay. SDXL works best at 1024px, but you can still use lower resolutions or different aspect ratios, so it all depends on the quality you want.
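If it helps, here's a rough sketch of that cropping step with PIL. The function name, the margin, and the plain LANCZOS resize are my own choices; a learned upscaler could replace the resize for better quality:

```python
from PIL import Image

def crop_masked_square(image: Image.Image, mask: Image.Image,
                       target: int = 1024, margin: int = 32):
    """Crop a square region around the mask and resize it to the model's working size.

    `image` is the source photo, `mask` is a white-on-black mask of the area to inpaint.
    Returns the cropped image, the cropped mask, and the box needed to paste the result back.
    """
    # Bounding box of the non-zero (masked) pixels: (left, upper, right, lower).
    left, upper, right, lower = mask.getbbox()

    # Expand the box to a square around its center, plus a little context margin.
    # (If the square is clamped at the image border it may end up slightly non-square.)
    side = max(right - left, lower - upper) + 2 * margin
    cx, cy = (left + right) // 2, (upper + lower) // 2
    box = (
        max(cx - side // 2, 0),
        max(cy - side // 2, 0),
        min(cx + side // 2, image.width),
        min(cy + side // 2, image.height),
    )

    # Crop both image and mask, then upscale to the SDXL-friendly resolution.
    patch = image.crop(box).resize((target, target), Image.LANCZOS)
    patch_mask = mask.crop(box).resize((target, target), Image.NEAREST)
    return patch, patch_mask, box
```

After inpainting the 1024px patch, you resize it back to the box size and paste it into the original, as in the blending sketch further down.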

What’s the best way to stitch the inpainted regions back into the original image without noticeable seams? Would simple blending techniques work, or do you recommend something more advanced?

This technique does the inpainting/outpainting really well; you can't see the seams in almost all images, though I've noticed some types are harder. For example, anime/drawings work a lot better if you switch to a model finetuned for that style, and real grainy photos or images with artifacts/low quality also don't work very well. It also breaks down if you change some of the params or even the scheduler; you will start seeing the seams.

What I would recommend for getting rid of the seams is to use differential diffusion, which allows soft masks, so you can use a gradient mask on the borders and it will merge really well, but this will make everything slower. Traditional blending techniques could work but won't be perfect. You can also sometimes fix the blending with histogram matching when it's just a color difference, like when you use drawings with this model.
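As a rough illustration of the simple blending option (not differential diffusion, which changes the denoising itself), here's a feathered paste-back with PIL; the function name and feather radius are just placeholders. If only the colors are off, something like skimage.exposure.match_histograms applied to the patch first can help:

```python
from PIL import Image, ImageFilter

def paste_with_soft_mask(original: Image.Image, inpainted_patch: Image.Image,
                         patch_mask: Image.Image, box: tuple,
                         feather: int = 8) -> Image.Image:
    """Paste the inpainted patch back into the original image using a blurred mask,
    so the transition around the seam is gradual instead of a hard edge."""
    w, h = box[2] - box[0], box[3] - box[1]

    # Bring the 1024px results back to the size of the original crop.
    patch = inpainted_patch.resize((w, h), Image.LANCZOS)
    soft_mask = patch_mask.resize((w, h), Image.LANCZOS).convert("L")

    # Feather the mask edges; the blur radius controls how wide the blend region is.
    soft_mask = soft_mask.filter(ImageFilter.GaussianBlur(feather))

    result = original.copy()
    result.paste(patch, box[:2], soft_mask)
    return result
```

A wider feather hides the seam better but lets more of the generated pixels bleed into the untouched area, so it's a trade-off you'd tune per image.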

Also, if you read my guide: when I have time I'll do a proper outpainting guide where I will use all of this. For me, this step is just something to generate the prefill of the final image, so this step or space would simply replace something like LaMa or OpenCV for filling the image before the last step, which is the real outpainting.

Performance Tweaks: Are there specific model settings or optimizations in SDXL that I could adjust to better support selective inpainting without slowing down the inference too much?

As far as I know, what I did here is the fastest method of doing this without lowering the quality. I tried other techniques like torch.compile, no CFG, AYS, PAG, etc., and they all either lowered the quality or made inference slower. There's probably something I'm missing that could make it faster, but I don't know what it is right now.
