AI in the kitchen

Designing with Stable Diffusion and ControlNet

By Doug Cook, 17 Aug 2023

We recently experimented with ControlNet, an extension to Stable Diffusion, the popular open-source text-to-image tool developed by Stability AI.

ControlNet is a neural network structure that provides control over diffusion models in image and video creation. It addresses the problem of spatial consistency by providing a way to specify which parts of an image should be preserved, giving designers more control over the composition of AI-generated images.

With ControlNet, you can extract things like depth maps, poses, and edge lines from an image to inform new generations, avoiding random compositions or the need to rely on a seed. This process goes by a few different names, but is commonly referred to as annotation or conditioning. From a workflow perspective, it's just a form of preprocessing that takes place before generation.
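To make the preprocessing step concrete, here is a minimal sketch of edge extraction in plain Python. It is an illustration only: it uses a simple Sobel filter with a hypothetical `sobel_edges` helper and a made-up `threshold`, whereas real ControlNet preprocessors typically rely on more robust detectors such as Canny. The tiny `photo` grid stands in for a real image.

```python
from typing import List

Image = List[List[float]]  # grayscale pixels, row-major, values in 0..255


def sobel_edges(image: Image, threshold: float = 128.0) -> List[List[int]]:
    """Return a binary edge map: 1 where the gradient magnitude exceeds threshold."""
    height, width = len(image), len(image[0])
    edges = [[0] * width for _ in range(height)]
    # Border pixels stay 0 because the 3x3 window does not fit there.
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            # Horizontal gradient (responds to vertical edges).
            gx = (
                image[y - 1][x + 1] + 2 * image[y][x + 1] + image[y + 1][x + 1]
                - image[y - 1][x - 1] - 2 * image[y][x - 1] - image[y + 1][x - 1]
            )
            # Vertical gradient (responds to horizontal edges).
            gy = (
                image[y + 1][x - 1] + 2 * image[y + 1][x] + image[y + 1][x + 1]
                - image[y - 1][x - 1] - 2 * image[y - 1][x] - image[y - 1][x + 1]
            )
            magnitude = (gx * gx + gy * gy) ** 0.5
            if magnitude > threshold:
                edges[y][x] = 1
    return edges


# A tiny 5x6 stand-in "photo": dark on the left, bright on the right.
photo = [[0, 0, 0, 255, 255, 255] for _ in range(5)]
control_map = sobel_edges(photo)
for row in control_map:
    print("".join("#" if value else "." for value in row))
```

The printed grid marks the columns that straddle the dark-to-bright boundary as edges. A map like this, computed once from the source photo, is what later constrains the composition of each generated image.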

4-step process: uploading an image, edge detection, adding a text prompt, running it through Stable Diffusion, and getting an output image

To test this approach, we used ControlNet to extract depth and edge lines from a photo of our studio kitchen to create a template, called a control map, for generating new designs.

Combined with a text prompt, we created a variety of different designs and treatments based on our original photo. While each is stylistically different, the architectural elements of our kitchen are maintained between designs.

15 different kitchens
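The workflow above, one control map reused with different text prompts, can be sketched with the Hugging Face diffusers library. This is a sketch under stated assumptions, not a record of the setup behind the designs shown here: the checkpoint IDs are assumptions and may have moved or been renamed, the helper names `build_prompts` and `generate_designs` are hypothetical, and the prompt wording is invented for illustration.

```python
from typing import Iterable, List


def build_prompts(styles: Iterable[str], subject: str = "kitchen interior") -> List[str]:
    """Combine each style with a shared subject to make one prompt per design."""
    return [f"{style} {subject}, photorealistic" for style in styles]


def generate_designs(
    control_map_path: str,
    prompts: List[str],
    controlnet_id: str = "lllyasviel/sd-controlnet-canny",  # assumed checkpoint
    base_model_id: str = "stable-diffusion-v1-5/stable-diffusion-v1-5",  # assumed checkpoint
    seed: int = 0,
):
    """Generate one image per prompt, all conditioned on the same control map."""
    # Heavy dependencies are imported lazily so build_prompts works without them.
    import torch
    from diffusers import ControlNetModel, StableDiffusionControlNetPipeline
    from diffusers.utils import load_image

    device = "cuda" if torch.cuda.is_available() else "cpu"
    dtype = torch.float16 if device == "cuda" else torch.float32

    controlnet = ControlNetModel.from_pretrained(controlnet_id, torch_dtype=dtype)
    pipe = StableDiffusionControlNetPipeline.from_pretrained(
        base_model_id, controlnet=controlnet, torch_dtype=dtype
    ).to(device)

    # The control map is a precomputed edge image of the original photo.
    control_map = load_image(control_map_path)

    images = []
    for prompt in prompts:
        # Re-seeding per prompt keeps the prompt as the only thing that varies.
        generator = torch.Generator(device=device).manual_seed(seed)
        result = pipe(
            prompt, image=control_map, num_inference_steps=20, generator=generator
        )
        images.append(result.images[0])
    return images
```

Calling `generate_designs("control_map.png", build_prompts(["industrial loft", "mid-century modern"]))` would return one image per style, where `control_map.png` is a placeholder file name. Because every call shares the same control map, the layout of the room should carry over while the prompt changes materials and mood.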

In fact, you can create an almost infinite number of designs with a single control map, using basic prompts to guide each exploration. Below are two quick outtakes, along with the prompts we used to achieve each design.


But this is just one example of what's possible with ControlNet. Imagine creating control maps for an entire digital product or domain. At first glance, this might look like a stylesheet for a set of photographic assets. But the use cases actually go far beyond that. Below are just a few:

Interior and architectural design


With increased fidelity and more mature application of materials and finishes, interior designers could quickly conceptualize new interior designs based on an existing structure or even a sketched design.

Product photography and photo shoots

Object selection, pose detection, and more make it easy to experiment with colors, environments, and textures. Product mockups, marketing collateral, and photo shoots could all be achieved without complex photography, rigs, or lighting.

Digital product customization

By customizing a product's appearance, imagery, and content based on individual user preferences, technologies like ControlNet could provide more personalized experiences. Dynamic adjustments to the look and feel could help increase user satisfaction and engagement, fostering a stronger connection between products and their users.

Characters and environments


Imagine a game that can generate different 3D environments, lighting, and caustics, all based on the same characters or level design. In-game customizations, marketing, and branding could all be handled on the fly without having to re-render each design.

Augmented and virtual reality

By mapping digital content onto virtual or real-world environments, ControlNet could make AR/VR experiences more interactive and realistic, expanding the ways in which digital products can be used.

Internationalization and localization

Abstracting common styles from bitmap images could also provide unique opportunities to internationalize or localize individual assets and images in a digital product. Customization based on locale could improve customer response while providing greater personalization.

For more information on ControlNet, see the original research by Lvmin Zhang and Maneesh Agrawala at Stanford University. Their project is also actively maintained on GitHub.

Have an idea or interested in learning more? Feel free to reach out to us on Instagram or Twitter!

Doug Cook

FOUNDER AND PRINCIPAL

Doug is the founder of thirteen23. When he’s not providing strategic creative leadership on our client engagements, he can be found practicing the time-honored art of getting out of the way.
