Deploying the Prebuilt Img2Img Model with Cerebrium: A Step-by-Step Guide

A complete guide to deploying the Cerebrium Img2Img model

Today we're going to dive into the world of image generation using prebuilt models with Cerebrium. This guide walks you through deploying the prebuilt Img2Img model, step by step.

Cerebrium is a machine learning framework that simplifies training, deploying, and monitoring machine learning models with just a few lines of code. Whether you're a seasoned machine learning practitioner or a curious enthusiast, you'll find Cerebrium's ability to deploy models like Img2Img a game-changer.

Background on Prebuilt Img2Img Models

Img2Img models generate new images from an input image, opening up possibilities such as style transfer and guided image editing. The one we're deploying today is Image-to-Image text-guided generation with Stable Diffusion, a prebuilt model offered by Cerebrium. This model takes a text prompt and an initial image to guide the generation of new images. Let's see how we can do this with Cerebrium.

Step-by-Step Deployment Guide

Obtain API Key and Credentials from Cerebrium

Sign up on Cerebrium and get your API key from your dashboard. Treat this key as your passport to the world of Cerebrium, and keep it private: it authenticates every request you make.

Set up the Cerebrium Environment

Install the Cerebrium Python library and set up your environment as per the instructions provided in the Cerebrium documentation. This is your ticket to the Cerebrium universe.
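
The exact setup steps live in the Cerebrium docs, but at its simplest this comes down to a pip install into a recent Python environment:

pip install cerebrium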

Deploy the Prebuilt Img2Img Model

You can deploy the model directly from your dashboard or via the Cerebrium Python framework. If using Python, just specify the identifier "sd-img-2-img" to indicate that you want to deploy the Img2Img model.

Here's a Python snippet to deploy the model:

from cerebrium import deploy, model_type
# Deploy the prebuilt Img2Img model as the flow "my-flow"; returns the inference endpoint.
endpoint = deploy((model_type.PREBUILT, "sd-img-2-img"), "my-flow", "<YOUR_API_KEY>")

Replace <YOUR_API_KEY> with your actual API key.

Prepare Input Data: Prompt and Initial Image

The Img2Img model operates using two primary inputs: a text prompt and an initial image.

  • Prompt: The text prompt essentially guides the image generation process. For example, if your prompt is "a photo of a sunset over the ocean", the model will generate images that, to the best of its ability, visually represent this description. Your choice of words in the prompt will significantly influence the final output.
  • Initial Image: This input is optional, but it can be extremely useful for tasks like style transfer or for anchoring the context of the generated images. If provided, the initial image should be a base64 encoded string of your image (see the snippet after this list). The model uses this image as a starting point and transforms it, guided by the text prompt.
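
If you're starting from an image file on disk, here's a minimal sketch of producing that base64 string (the file name image.png is just a placeholder):

import base64

# Read the initial image from disk and encode it as the base64
# string the Img2Img endpoint expects in its "image" field.
with open("image.png", "rb") as f:
    init_image_b64 = base64.b64encode(f.read()).decode("utf-8")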

Configure Deployment Parameters: Inference Steps, Guidance Scale, Variations

To customize your image generation process, you have several tunable parameters at your disposal:

  • Num_Inference_Steps: The number of steps the model takes during image generation. Results typically improve with more steps, but more steps also increase generation time. This parameter is optional and defaults to 50. Experiment with it to find a balance between output quality and generation speed that works for you.
  • Guidance_Scale: This is an optional parameter, defaulting to 7.5. It influences how strongly the text prompt guides the image generation process. A higher guidance scale will force the generated images to align more closely with the prompt, potentially at the cost of image quality or diversity.
  • Num_Images_Per_Prompt: An optional parameter that defaults to 1. It represents the number of image variations you'd like the model to generate per prompt. If you want to see multiple interpretations of your prompt, this is the parameter to adjust.

There are also a few more optional parameters you could play around with:

  • Negative_Prompt: Here's where you can specify what you don't want in your images. For instance, if you want an image of a landscape without any animals, you can put "animals" in the negative prompt.
  • Seed: If you want to ensure that image generations are the same for identical prompts and images, set this parameter to a fixed integer.
  • File_URL: For larger images, instead of a base64 encoded string, you can provide a URL where Cerebrium can fetch your initial image.
  • Webhook_Endpoint: Specify the URL where Cerebrium can send a response confirming the generated images are ready.

These parameters give you control over the image generation process, allowing you to customize it to your requirements. Remember, experimenting with these values is part of the fun and could lead to surprising and exciting results!
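
Taken together, these parameters form the body of your request. Here's a sketch of how they might combine into a single Python payload dict (init_image_b64 comes from the encoding snippet earlier; the values are purely illustrative):

payload = {
    "prompt": "a photo of a sunset over the ocean",
    "image": init_image_b64,        # optional base64-encoded initial image
    "num_inference_steps": 50,      # more steps: better quality, slower
    "guidance_scale": 7.5,          # higher: sticks closer to the prompt
    "num_images_per_prompt": 1,     # number of variations per prompt
    "negative_prompt": "animals",   # keep animals out of the scene
    "seed": 120,                    # fix for reproducible generations
}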

Make the Inference Request to the Cerebrium API

You can interact with the deployed model by sending a POST request to the endpoint provided at deployment. Make sure to include your API key in the request header for authentication. Here's an example request from the Cerebrium docs:

curl --location --request POST 'https://run.cerebrium.ai/sd-img-2-img-webhook/predict' \
--header 'Authorization: <API_KEY>' \
--header 'Content-Type: application/json' \
--data-raw '{
  "prompt": "a photo of an astronaut riding a horse on mars.",
  "image": "<BASE_64_STRING>",
  "num_inference_steps": 50,
  "guidance_scale": 7.5,
  "num_images_per_prompt": 1,
  "negative_prompt": "",
  "seed": 120,
  "file_url": "https://your-url.com/image.png",
  "webhook_endpoint": "https://your-url.com"
}'

And here's what an example response looks like:

{
  "run_id": "<UUID_STRING>",
  "message": "Successfully generated image",
  "result": ["<BASE_64_STRING>"]
}
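
If you prefer Python to curl, here's a minimal sketch using the requests library, reusing the payload dict from earlier (the output file names are placeholders):

import base64
import requests

headers = {
    "Authorization": "<API_KEY>",
    "Content-Type": "application/json",
}
response = requests.post(
    "https://run.cerebrium.ai/sd-img-2-img-webhook/predict",
    headers=headers,
    json=payload,  # the parameter dict sketched above
)
response.raise_for_status()

# Each entry in "result" is a base64-encoded image: decode and save each one.
for i, b64_image in enumerate(response.json()["result"]):
    with open(f"output_{i}.png", "wb") as f:
        f.write(base64.b64decode(b64_image))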

Monitor Deployment Progress and Retrieve the Generated Images

Upon successful image generation, Cerebrium sends the images to the webhook endpoint you provided, and the API response itself includes the generated images as base64 encoded strings. You can read more about monitoring and enhancing the performance of the Img2Img prebuilt model in our article on monitoring performance.
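
If you set a webhook_endpoint, you'll need a small HTTP server listening at that URL. Here's a minimal, hypothetical Flask sketch that simply acknowledges and logs whatever Cerebrium posts; since the exact webhook payload shape isn't specified here, the handler stays generic:

from flask import Flask, request

app = Flask(__name__)

# Hypothetical receiver for the URL you passed as webhook_endpoint.
@app.route("/", methods=["POST"])
def webhook():
    payload = request.get_json(silent=True)
    print("Received webhook payload:", payload)
    return "ok", 200

if __name__ == "__main__":
    app.run(port=8000)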

Conclusion

We've explored how to deploy a prebuilt Img2Img model using Cerebrium. You've seen, step by step, how to obtain your API key, deploy the model, set the parameters, retrieve the generated images, and monitor performance.

I encourage you to experiment with these deployed models. Explore the realms of image generation and uncover novel applications in your field.
If you need help, the Cerebrium documentation is extensive. You can also check out our other articles for further reading on how to shape your prompts and how to monitor performance.

Note: This article originally appeared on the Cerebrium blog.
