Stable Diffusion Findings

Here are some things I've found out by simply experimenting with Stable Diffusion. I could easily be wrong about a lot of this, as I only started toying around with it very recently.

CFG Scale

The "classifier-free guidance" (CFG) scale seems to need higher values when doing "image-to-image" than when doing "text-to-image".
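For context, here is a minimal sketch of what the CFG scale is usually described as doing: the model makes one noise prediction without the prompt and one with it, and the scale controls how far the result is pushed towards the prompted one. The function name `apply_cfg` and the toy lists are my own assumptions for illustration; real implementations work on large tensors, not three-element lists.

```python
def apply_cfg(uncond, cond, scale):
    """Blend an unconditional and a prompt-conditioned prediction.

    Hypothetical helper: plain lists stand in for the model's
    noise-prediction tensors.
    """
    # scale = 0 ignores the prompt, scale = 1 uses the prompted
    # prediction as-is, larger values exaggerate the difference.
    return [u + scale * (c - u) for u, c in zip(uncond, cond)]


# Toy values only, not real model output.
uncond = [0.0, 1.0, 2.0]
cond = [1.0, 1.0, 0.0]

for scale in (1.0, 7.5, 15.0):
    print(scale, apply_cfg(uncond, cond, scale))
```

The point is only that a higher scale amplifies whatever the prompt changes. It doesn't explain by itself why "image-to-image" seems to want higher values.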

Image Resolution Issues

Even when doing "image-to-image" you can sometimes see repeated patterns in the generated output. This is most likely because the model was trained on 512x512 pixel images, so content tends to repeat once the output is larger than 512 pixels in either direction. This appears to get slightly worse at high CFG scale values.
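As a rough sanity check before generating, a small helper can flag sizes that go beyond the 512 pixel training size. This is just a sketch: the names `TRAINED_SIZE`, `size_ratio` and `may_repeat` are made up for this example, and the check says nothing about how visible the repetition will actually be.

```python
TRAINED_SIZE = 512  # training resolution mentioned above (assumed square)


def size_ratio(width, height, trained=TRAINED_SIZE):
    """Return how many times larger each side is than the trained size."""
    return width / trained, height / trained


def may_repeat(width, height, trained=TRAINED_SIZE):
    """True if either side is larger than the trained size."""
    return width > trained or height > trained


for w, h in [(512, 512), (1024, 512), (1024, 768)]:
    print(w, h, size_ratio(w, h), may_repeat(w, h))
```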

Random Seed

When creating an image using "text-to-image" and then sending it to "image-to-image", it doesn't seem to matter whether you use the same random seed for both processes or not. The original random seed doesn't seem to be more "true" to the original image than any other random number. This needs further validation though.
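One possible way to validate this would be to measure how far each "image-to-image" result is from the source image, and see whether the original seed really comes out closest. The sketch below assumes the images have already been loaded as flat lists of pixel values of equal length. `mean_abs_diff` and `rank_seeds` are hypothetical helpers, and the numbers are toy placeholders, not measurements.

```python
def mean_abs_diff(a, b):
    """Average absolute per-pixel difference between two images."""
    if len(a) != len(b):
        raise ValueError("images must have the same number of pixels")
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def rank_seeds(source, outputs):
    """Sort seeds from closest to furthest from the source image.

    outputs maps a seed to the pixels of the image generated with it.
    """
    return sorted(outputs, key=lambda seed: mean_abs_diff(source, outputs[seed]))


# Toy placeholder data: seed 1 plays the role of the original seed.
source = [0, 10, 20]
outputs = {1: [2, 12, 22], 2: [1, 11, 21], 3: [5, 15, 25]}

print(rank_seeds(source, outputs))
```

If the original seed doesn't consistently rank first over many prompts and seeds, that would support the observation above. A plain pixel difference is a crude measure though, so a perceptual metric might be a better choice.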

Ethical Concerns

So basically, the "bot" has been looking at a huge number of images. Imagine someone who has visited a lot of art museums and browsed the internet a lot. Because it's a robot, it can reproduce what it has seen with a precision that only very few human beings can match. This also lets it emulate the look and style of many known artists. By using (abusing?) this ability, it's possible for anyone to "steal the soul" of those artists, which is why we probably shouldn't do so.

As always, when a new technology is invented, it's here to stay, and it's up to us humans to figure out what's acceptable behavior and what's not. At the time of writing, the lines are somewhat blurry and it's a controversial and touchy topic. So to anyone toying around with this: Have fun, but remember to think of the bigger picture and the artists affected.

Website by Joachim Michaelis