3. July 2023 By Kevin Pahlke
An introduction to Stable Diffusion
Stable Diffusion AI is an exciting machine learning method used for generating high-quality images. It is based on the idea that a neural network can learn a distribution of images by learning to reverse a process that gradually adds noise to an image. Image generation then starts from noise and follows what is known as a diffusion process, which reduces the noise step by step until the final image is achieved.
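To make the noising idea concrete, here is a minimal sketch in plain Python of the standard forward-noising formula used by diffusion models, applied to a tiny list of numbers standing in for an image. The function names and the `alpha_bar` value are illustrative assumptions, not part of Stable Diffusion's code; a real model works on image data at a much larger scale and uses a trained network to estimate the noise.

```python
import math
import random


def add_noise(pixels, noise, alpha_bar):
    """Blend clean values with Gaussian noise.

    alpha_bar = 1.0 keeps the image untouched; alpha_bar near 0.0 leaves
    almost pure noise.
    """
    keep = math.sqrt(alpha_bar)
    mix = math.sqrt(1.0 - alpha_bar)
    return [keep * p + mix * n for p, n in zip(pixels, noise)]


def remove_noise(noisy, noise_estimate, alpha_bar):
    """Invert add_noise, given an estimate of the noise that was added."""
    keep = math.sqrt(alpha_bar)
    mix = math.sqrt(1.0 - alpha_bar)
    return [(x - mix * n) / keep for x, n in zip(noisy, noise_estimate)]


rng = random.Random(0)  # fixed seed so the sketch is repeatable
image = [0.1, 0.5, 0.9]  # stand-in for real pixel values
noise = [rng.gauss(0.0, 1.0) for _ in image]

noisy = add_noise(image, noise, alpha_bar=0.3)
# With a perfect noise estimate the original values come back. A trained
# network only approximates the noise, which is why generation proceeds
# in many small denoising steps.
restored = remove_noise(noisy, noise, alpha_bar=0.3)
```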
Prompts play a crucial role in this process by giving the network an idea of what it should generate. Prompts are usually short sentences or pictures that serve as input for the neural network. They steer the network towards a particular part of the image distribution it has learned by giving it a target to reach.
In this blog post, I will look at creating prompts for Stable Diffusion AI. I will explain the basics of creating prompts, provide tips on how to create effective prompts and reveal the common mistakes people make when creating prompts.
Using effective prompts can improve the results of Stable Diffusion AI by helping the network to generate more accurate high-quality images.
Images created using Stable Diffusion AI are fully open source and explicitly included in the CC0 1.0 universal public domain dedication. Using the CC0 licence enables authors to make their works available to the public free of charge without third parties being able to restrict or prohibit their use, modification or distribution.
How to create basic prompts
Simple prompts can generate good results, but sometimes it is the details that make an image credible. The language in which the prompt is created can also affect the result of the image. There is a lot to learn in order to create a good prompt. But the most important thing is to describe the subject in as much detail as possible. Ensure that you use powerful keywords to define the style.
A good Stable Diffusion prompt should be as specific as possible. It makes sense to use specific prompts that fully describe the content of the desired image. It is also helpful to mention specific art styles or media. The Stable Diffusion algorithm can also be guided by naming specific artists.
Keyword weighting is an important step in creating a Stable Diffusion prompt. The most important keyword should be mentioned first, then the less important keywords can be listed afterwards. Synonyms can also be used to weight keywords. For example, if you want to create an image of a dog, you can use the term ‘dog’ as the most important keyword and then add synonyms – for example, puppy, pet or four-legged friend. When you first start creating prompts, it is important to learn a set of keywords and their intended effect. This is much like learning vocabulary in a new language.
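The ordering advice above can be sketched as a small helper that puts the main subject first and appends the supporting keywords afterwards. `build_prompt` and its parameters are hypothetical names chosen for this illustration; the only assumption about Stable Diffusion is that it receives the prompt as a comma-separated string.

```python
def build_prompt(subject, synonyms=(), style_keywords=()):
    """Join keywords into one prompt, most important keyword first."""
    parts = [subject, *synonyms, *style_keywords]
    # Drop empty entries and surrounding whitespace before joining.
    cleaned = [part.strip() for part in parts if part and part.strip()]
    return ", ".join(cleaned)


prompt = build_prompt(
    "dog",
    synonyms=["puppy", "pet", "four-legged friend"],
    style_keywords=["photograph", "high quality"],
)
print(prompt)
```

Keeping the subject, synonyms and style keywords in separate arguments makes it easy to swap one group while leaving the others unchanged between attempts.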
One way to create high-quality images is to reuse existing prompts. There are a number of prompt libraries that you can simply copy and try out. The downside is that you may not understand why they produce high-quality images. Existing prompts can be used as a starting point that you can adapt to what you need. The following examples illustrate this point:
The difference between the two prompts is obvious. In the first example, the result might not be what was intended. In the second example, the result is already closer to what was expected. However, the second prompt can be improved to refine the result and produce stunning images.
Examples for prompts
This next section provides an overview of the type of images that can be created with Stable Diffusion AI. The following images were created with the free online version of Stable Diffusion AI.
wildlife photography, fox looking at you, photograph, high quality, wildlife, f 1.8, soft focus, 8k, national geographic, award-winning photograph by Nick Nichols
Archviz (abbreviation for architectural visualisation)
loft, home, steel, stone, interior, octane render, deviantart, cinematic, key art, hyperrealism, sun light, sunrays, 35 mm, 8k, medium-format print
house, small hill, old trees, small lake, 35 mm, realism, octane render, 8k, trending on artstation, unreal engine, hyper detailed, photo-realistic maximum detail, volumetric light, realistic matte painting, hyper photorealistic, trending on artstation, ultra-detailed, realistic
anthro, very cute kid’s film character alien, disney pixar zootopia character concept artwork, 3d concept, detailed fur, high detail iconic character for upcoming film, trending on artstation, character design, 3d artistic render, highly detailed, octane, blender, cartoon, shadows, lighting
close up of a grilled steak, depth of field. bokeh. soft light. by Yasmin Albatoul, Harry Fayt. centred. extremely detailed. Nikon D850, (35mm|50mm|85mm). award winning photography
Portrait of grandmother, photograph, highly detailed face, depth of field, moody light, golden hour, style by Dan Winters, Russell James, Steve McCurry, centred, extremely detailed, Nikon D850, award winning photography
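Long keyword lists like the ones above are easy to get wrong by hand; the house prompt, for instance, contains ‘trending on artstation’ twice. As an optional aid, the following sketch splits a prompt at commas and removes repeated keywords while keeping their order. `dedupe_keywords` is a hypothetical helper, the prompt in the example is a shortened version of the one above, and whether removing a repeated keyword changes the generated image is something to check by experiment.

```python
def dedupe_keywords(prompt):
    """Remove repeated comma-separated keywords, keeping the first one."""
    seen = set()
    kept = []
    for keyword in prompt.split(","):
        keyword = keyword.strip()
        key = keyword.lower()  # compare case-insensitively
        if keyword and key not in seen:
            seen.add(key)
            kept.append(keyword)
    return ", ".join(kept)


house = (
    "house, small hill, old trees, small lake, 35 mm, realism, "
    "octane render, 8k, trending on artstation, unreal engine, "
    "hyper detailed, trending on artstation, ultra-detailed, realistic"
)
cleaned = dedupe_keywords(house)
print(cleaned)
```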
Tips and tricks for creating prompts
You can adjust the following Stable Diffusion settings alongside the prompt (or, in tools that support it, as part of the prompt) to improve the quality of the image:
The classifier-free guidance (CFG) scale is a parameter that controls how strongly the model should follow the given prompt.
- 1 - The prompt is largely ignored.
- 3 - The model is a little more creative with the prompt.
- 7 - There is a good balance between following the prompt and the freedom to be creative.
- 15 - The model follows the prompt a little more closely.
- 30 - The prompt is strictly obeyed.
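For readers who want to see what the scale does numerically: classifier-free guidance is commonly computed by taking the model's noise prediction without the prompt and pushing it towards the prediction with the prompt, multiplied by the scale. The sketch below applies that formula to made-up numbers; `apply_guidance` and the sample values are illustrative, and real implementations operate on large tensors rather than short lists.

```python
def apply_guidance(uncond, cond, scale):
    """Combine two noise predictions using classifier-free guidance.

    uncond: prediction made without the prompt
    cond:   prediction made with the prompt
    scale:  the CFG scale; larger values push further towards `cond`
    """
    return [u + scale * (c - u) for u, c in zip(uncond, cond)]


uncond = [0.0, 1.0]  # made-up numbers for illustration
cond = [1.0, 3.0]

for scale in (1, 3, 7, 15, 30):
    print(scale, apply_guidance(uncond, cond, scale))
```

The further the scale rises above 1, the more the difference between the two predictions is amplified, which is why high values follow the prompt more strictly.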
Increasing the number of steps generally improves quality, but only up to a point. 20 steps with the Euler sampler are usually enough to produce a sharp, high-quality image. The image will still change subtly with more steps, but it will not necessarily be of higher quality.
Recommendation: 20 steps. Increase the number of steps if you think the quality is not good enough.
The sampling methods (samplers) are simply different methods for solving diffusion equations. The following methods can be built into the prompt: Euler a, Euler, LMS, Heun, DPM2, DPM2 a, DPM fast, LMS Karras, DPM2 Karras, DDIM, PLMS and the like.
There are several discussions going on at the moment about which method is suitable for which style. The recommendation right now is to simply learn by doing – just try out different methods and see what happens.
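To illustrate what ‘different methods for solving an equation’ means, the following sketch compares the Euler and Heun methods on a deliberately simple equation, dx/dt = -x, rather than on a real diffusion model. The test equation and the function names are assumptions made for this example; only the two stepping rules correspond to the numerical methods the samplers are named after.

```python
import math


def euler_solve(f, x, t_end, steps):
    """Euler method: one slope evaluation per step."""
    h = t_end / steps
    t = 0.0
    for _ in range(steps):
        x = x + h * f(t, x)
        t += h
    return x


def heun_solve(f, x, t_end, steps):
    """Heun method: average the slope at the start and the predicted end."""
    h = t_end / steps
    t = 0.0
    for _ in range(steps):
        slope_start = f(t, x)
        predicted = x + h * slope_start
        slope_end = f(t + h, predicted)
        x = x + h * 0.5 * (slope_start + slope_end)
        t += h
    return x


def decay(t, x):
    """Toy equation dx/dt = -x, whose exact solution is exp(-t)."""
    return -x


exact = math.exp(-1.0)
euler_result = euler_solve(decay, 1.0, t_end=1.0, steps=20)
heun_result = heun_solve(decay, 1.0, t_end=1.0, steps=20)
print(abs(euler_result - exact), abs(heun_result - exact))
```

With the same number of steps, Heun lands closer to the exact answer but needs two slope evaluations per step. In a diffusion sampler each evaluation is a run of the neural network, so trade-offs of this kind are one reason the methods behave differently in practice.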
Since Stable Diffusion is trained using 512 × 512 images, unexpected problems may occur when using portrait or landscape sizes. Always try to use a square format if you can.
Recommendation: Set the image size to 512 × 512.
The batch size indicates how many images are generated per process. Since the final result is highly dependent on the random factor, it is always a good idea to generate several images at the same time. This will help you get a good feel for what the current prompt you are using can achieve.
Recommendation: Set the batch size to 4 or 8.
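The recommendations in this section can be collected in one place. The sketch below stores them in a small dataclass and converts them into keyword arguments. `GenerationSettings` is a hypothetical name, and the argument names follow the call signature of `StableDiffusionPipeline` in the Hugging Face diffusers library; this assumes you generate images with that library, and other tools use different names. The sampler is left out because diffusers selects it through a scheduler object rather than a call argument.

```python
from dataclasses import dataclass


@dataclass
class GenerationSettings:
    """Recommended starting values from this section."""
    cfg_scale: float = 7.0
    steps: int = 20
    width: int = 512
    height: int = 512
    batch_size: int = 4

    def to_pipeline_kwargs(self, prompt):
        """Map the settings to diffusers-style call arguments."""
        return {
            "prompt": prompt,
            "guidance_scale": self.cfg_scale,
            "num_inference_steps": self.steps,
            "width": self.width,
            "height": self.height,
            "num_images_per_prompt": self.batch_size,
        }


settings = GenerationSettings()
kwargs = settings.to_pipeline_kwargs("wildlife photography, fox looking at you")
# With diffusers installed and a pipeline loaded as `pipe` (not shown here),
# the call would be: images = pipe(**kwargs).images
```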
Putting all this together might give you a prompt that looks like this: ‘batch-size: 4, steps: 20, sampler: Euler a, CFG scale: 7, ultra realistic alien-trooper, concept art, intricate details, highly detailed, photorealistic, octane render, 8k, unreal engine, sharp focus, volumetric lighting unreal engine. art by artgerm and alphonse mucha’
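If your tool expects the settings separately rather than inside the prompt, a combined string like the one above can be split again. The following sketch assumes the settings appear as `name: value` pairs separated by commas and that everything else is prompt text; `split_settings` and the list of recognised names are hypothetical and only cover this example, which uses a shortened version of the prompt.

```python
KNOWN_SETTINGS = {"batch-size", "steps", "sampler", "cfg scale"}


def split_settings(combined):
    """Separate 'name: value' settings from the remaining prompt text."""
    settings = {}
    keywords = []
    for part in combined.split(","):
        part = part.strip()
        name, sep, value = part.partition(":")
        if sep and name.strip().lower() in KNOWN_SETTINGS:
            settings[name.strip().lower()] = value.strip()
        elif part:
            keywords.append(part)
    return settings, ", ".join(keywords)


combined = (
    "batch-size: 4, steps: 20, sampler: Euler a, CFG scale: 7, "
    "ultra realistic alien-trooper, concept art, intricate details"
)
settings, prompt = split_settings(combined)
print(settings)
print(prompt)
```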
Stable Diffusion AI enables you to create stunning images for free and without needing to pay licence fees. A little practice is all it takes to achieve fantastic results.