Intro
ControlNet lets users control the Stable Diffusion image generation process with spatial context. One of the most common examples is a “pose”: a stick-figure drawing that controls the position of a person in the output. But there are many other use cases.
You must have the ControlNet extension installed in Automatic1111.
Encode Input in Base64
Since we need to send an image to our server, we need to encode the image in base64. This example uses JavaScript (Node.js):
const fs = require('fs');
// Read the file into a Buffer, then encode its bytes as a base64 string
const buf = fs.readFileSync('my-file.png');
const myBase64EncodedImage = buf.toString('base64');
API Request
POST /sdapi/v1/txt2img
To use ControlNet, we use the same txt2img endpoint, but we pass in extra params:
{
"prompt": "Sunset in the mountains, lake in front",
// other txt2img params
"alwayson_scripts": {
"controlnet": {
"args": [
{
"input_image": myBase64EncodedImage,
"model": "depth_fp16",
"module": "depth"
}
]
}
}
}
The input_image param should be the base64-encoded string of your input image.
The module param is the name of the preprocessor to use.
The model param should be set to the name of a ControlNet model you have installed. For example, if your model file is named depth_fp16.safetensors, the value should be depth_fp16.
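The request described above can be sketched as follows. This is a minimal illustration, not the official client: the helper names `buildControlNetPayload` and `sendTxt2Img` are hypothetical, the base URL `http://127.0.0.1:7860` is an assumption about where your server listens, and the global `fetch` assumes Node 18 or later.

```javascript
// Hypothetical helper: builds a txt2img request body with one ControlNet unit.
function buildControlNetPayload(prompt, base64Image, model, preprocessor) {
  return {
    prompt,
    // other txt2img params can be added here
    alwayson_scripts: {
      controlnet: {
        args: [{ input_image: base64Image, model, module: preprocessor }],
      },
    },
  };
}

// Hypothetical helper: POSTs the payload to the txt2img endpoint.
// The default baseUrl is an assumption; change it to match your server.
async function sendTxt2Img(payload, baseUrl = 'http://127.0.0.1:7860') {
  const res = await fetch(`${baseUrl}/sdapi/v1/txt2img`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(payload),
  });
  if (!res.ok) {
    throw new Error(`txt2img request failed with status ${res.status}`);
  }
  return res.json();
}

// Placeholder bytes stand in for a real image file here.
const payload = buildControlNetPayload(
  'Sunset in the mountains, lake in front',
  Buffer.from('placeholder image bytes').toString('base64'),
  'depth_fp16',
  'depth'
);
```

Calling `sendTxt2Img(payload)` would then send the request. Keeping payload construction separate from the network call makes it possible to inspect or log the body before sending it.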
Response
See the txt2img guide for response examples
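As a rough sketch of handling the result, the snippet below writes returned images to disk. It assumes the response body contains an `images` array of base64-encoded strings, which this page does not confirm, so check the txt2img guide for the actual shape. The helper name `saveImages` and the output file names are hypothetical.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Assumption: response.images is an array of base64-encoded image strings.
// Hypothetical helper: decodes each image, writes it to outDir, returns the paths.
function saveImages(response, outDir) {
  const images = Array.isArray(response.images) ? response.images : [];
  return images.map((b64, i) => {
    const filePath = path.join(outDir, `output-${i}.png`);
    fs.writeFileSync(filePath, Buffer.from(b64, 'base64'));
    return filePath;
  });
}

// Stand-in response object for illustration only (not real server output).
const fakeResponse = {
  images: [Buffer.from('not a real png').toString('base64')],
};
const outDir = fs.mkdtempSync(path.join(os.tmpdir(), 'controlnet-'));
const savedPaths = saveImages(fakeResponse, outDir);
```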
Full List of Params
Key | Description | Type | Default |
---|---|---|---|
model | Name of the installed model | string | none |
module | Name of the preprocessor (control type) to use | string | none |
input_image | Base64 encoded input image | string | none |
resize_mode | How to handle an input image whose resolution does not match the output image | string | Just Resize |
weight | The weight of the control image | number | 1 |
control_mode | How much impact the control image should have on the generated image | string | Balanced |
pixel_perfect | Enables the Pixel Perfect preprocessor | boolean | |
mask | Base64 encoded mask image | string | none |
lowvram | Whether to compensate for low VRAM | boolean | |
processor_res | The resolution to use for the preprocessor | number | 64 |
threshold_a | The first preprocessor parameter, if applicable | number | 64 |
threshold_b | The second preprocessor parameter, if applicable | number | 64 |
guidance_start | Ratio of generation where ControlNet starts to have an effect | number | 0 |
guidance_end | Ratio of generation where ControlNet stops having an effect | number | 1 |
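One way to work with these params is to keep the table's defaults in a single object and override only what you need. The sketch below does that. The names `CONTROLNET_DEFAULTS` and `makeControlNetUnit` are hypothetical, and the range check on guidance_start and guidance_end is an assumption based on both being described as ratios, not documented server behavior.

```javascript
// Defaults copied from the table above; params with no listed default are omitted.
const CONTROLNET_DEFAULTS = {
  resize_mode: 'Just Resize',
  weight: 1,
  control_mode: 'Balanced',
  processor_res: 64,
  threshold_a: 64,
  threshold_b: 64,
  guidance_start: 0,
  guidance_end: 1,
};

// Hypothetical helper: merges overrides onto the defaults and checks
// that the guidance window is a valid sub-range of [0, 1] (an assumption).
function makeControlNetUnit(overrides) {
  const unit = { ...CONTROLNET_DEFAULTS, ...overrides };
  const { guidance_start: start, guidance_end: end } = unit;
  if (start < 0 || end > 1 || start > end) {
    throw new RangeError(
      'guidance_start and guidance_end must satisfy 0 <= start <= end <= 1'
    );
  }
  return unit;
}

// Example: apply ControlNet only during the first 60% of generation.
const unit = makeControlNetUnit({
  model: 'depth_fp16',
  module: 'depth',
  input_image: '<base64 string>',
  guidance_end: 0.6,
});
```

The resulting `unit` object can be placed in the `args` array of the `controlnet` entry in the request body.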