SD3-ControlNet-Depth

Demo

import torch
from diffusers import StableDiffusion3ControlNetPipeline
from diffusers.models import SD3ControlNetModel
from diffusers.utils import load_image

# load pipeline
controlnet = SD3ControlNetModel.from_pretrained("InstantX/SD3-Controlnet-Depth")
pipe = StableDiffusion3ControlNetPipeline.from_pretrained(
    "stabilityai/stable-diffusion-3-medium-diffusers",
    controlnet=controlnet
)
pipe.to("cuda", torch.float16)

# config
control_image = load_image("https://huggingface.co/InstantX/SD3-Controlnet-Depth/resolve/main/images/depth.jpeg")
prompt = "a panda cub, captured in a close-up, in forest, is perched on a tree trunk. good composition, Photography, the cub's ears, a fluffy black, are tucked behind its head, adding a touch of whimsy to its appearance. a lush tapestry of green leaves in the background. depth of field, National Geographic"
n_prompt = "bad hands, blurry, NSFW, nude, naked, porn, ugly, bad quality, worst quality"

# fixed seed, to reproduce the result in our example
generator = torch.Generator(device="cpu").manual_seed(4000)
image = pipe(
    prompt,
    negative_prompt=n_prompt,
    control_image=control_image,
    controlnet_conditioning_scale=0.5,
    guidance_scale=7.0,
    generator=generator
).images[0]
image.save("image.jpg")

Limitation

The model was trained only at 1024x1024 pixel resolution, so inference performs best at this size; other sizes yield suboptimal results.
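
Since the model works best at 1024x1024, one option is to bring the depth map to that size before passing it to the pipeline. The sketch below is only an illustration, not the preprocessing used by the authors: the helper names `center_square_box` and `prepare_control_image` are hypothetical, and center-cropping to a square is an assumption that discards content at the edges of non-square images.

```python
def center_square_box(width, height):
    """Return the (left, upper, right, lower) box of the largest centered square."""
    side = min(width, height)
    left = (width - side) // 2
    upper = (height - side) // 2
    return (left, upper, left + side, upper + side)


def prepare_control_image(image, size=1024):
    """Center-crop a PIL image to a square, then resize it to size x size.

    Hypothetical helper: the crop-then-resize strategy is an assumption,
    chosen so the depth map is not stretched.
    """
    # Imported here so center_square_box stays dependency-free.
    from PIL import Image

    box = center_square_box(*image.size)
    return image.crop(box).resize((size, size), Image.BICUBIC)
```

In the demo above, this could be applied as `control_image = prepare_control_image(load_image(...))` before calling `pipe`.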
