DIY AI Art: Running Stable Diffusion
Last week, I decided to explore Stable Diffusion models, covering both video and image generation. My experience with the video model wasn’t great: the videos it generated from my images showed incorrect motion or amounted to little more than panning and zooming. It’s possible I didn’t tune the settings enough to get better results. However, I had much better success with the SDXL model, which is designed for generating images.
To use these models effectively, you need a computer with a GPU; otherwise, generating a single image could take several minutes. I started with a server on Google Cloud Platform (GCP). The first step was to increase my GPU quota from zero, which, thankfully, was a quick process.
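For reference, this is roughly how an instance like mine can be launched. It’s a minimal sketch, not the exact command I used: the instance name, zone, machine type, and disk size are placeholders, and I’m assuming a T4 GPU plus one of GCP’s Deep Learning VM image families (check which families are currently available):

```
# Sketch: create a GPU VM from a GCP Deep Learning VM image (adjust name/zone/type/family)
gcloud compute instances create sd-box \
  --zone=us-central1-a \
  --machine-type=n1-standard-8 \
  --accelerator=type=nvidia-tesla-t4,count=1 \
  --maintenance-policy=TERMINATE \
  --image-family=pytorch-latest-gpu \
  --image-project=deeplearning-platform-release \
  --boot-disk-size=200GB \
  --metadata="install-nvidia-driver=True"
```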
Setting up on GCP was straightforward since it offers a prebuilt deep learning image with all the necessary Python libraries installed, which was perfect for my needs. I planned to use ComfyUI, a popular GUI for Stable Diffusion, so after launching the server I ran the following commands to set everything up:
```
# Clone ComfyUI and install its dependencies
git clone https://github.com/comfyanonymous/ComfyUI.git
cd ComfyUI
pip install -r requirements.txt
# Nightly PyTorch build with CUDA 12.1 support
pip install --pre torch torchvision torchaudio --index-url https://download.pytorch.org/whl/nightly/cu121
# Add ComfyUI-Manager, which makes installing custom nodes and models much easier
cd custom_nodes/
git clone https://github.com/ltdrdata/ComfyUI-Manager
cd ..
# Start the server, listening on all interfaces so it's reachable from outside the VM
python main.py --listen 0.0.0.0
```
Additionally, you must download the models and place them in the correct directories, as detailed in this repository: https://github.com/SeargeDP/SeargeSDXL/blob/dev/README.md#direct-downloads. Not all of the models are required; just pick and choose what you need.
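As an example, this is roughly how you’d pull the SDXL base and refiner checkpoints into ComfyUI’s model directory. The two Hugging Face URLs below are the ones I’m aware of; double-check them and the full list in the README linked above before running anything:

```
# From the ComfyUI directory: download SDXL checkpoints into models/checkpoints/
cd models/checkpoints
wget https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0/resolve/main/sd_xl_base_1.0.safetensors
wget https://huggingface.co/stabilityai/stable-diffusion-xl-refiner-1.0/resolve/main/sd_xl_refiner_1.0.safetensors
# LoRAs go in models/loras/, upscalers in models/upscale_models/, and so on
```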
Once everything is set up, you can access ComfyUI by navigating to http://server_public_ip:port (the default port is 8188), where you can start generating images by searching for and dropping workflows onto the UI.
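One thing to remember on GCP: the port has to be reachable from your machine. Here’s a minimal sketch, assuming the default port 8188; the rule name is arbitrary and the source range is a placeholder you should replace with your own IP rather than leaving it open:

```
# Sketch: open port 8188 so the ComfyUI web UI is reachable (restrict source-ranges to your IP)
gcloud compute firewall-rules create allow-comfyui \
  --allow=tcp:8188 \
  --source-ranges=YOUR_IP/32
```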
I also tried running the models on my M1 Mac with ComfyUI, but it was too slow to be practical. However, I discovered a fantastic free app (a rarity for Mac apps) called Draw Things, which runs these models efficiently. The app even offers its own hosted downloads of many models, making setup a breeze.
For anyone navigating the myriad settings and options, I found an invaluable resource in civitai.com. The site is a treasure trove of images, models, and add-ons like LoRAs. Here’s a practical example of how I used it:
- First, I searched for the SDXL model on civitai.com and scrolled down to the Gallery section.
- Clicking on an image I found at https://civitai.com/images/8829848 showed me all the resources and prompts used to create it. Interestingly, this image also used a LoRA, which you can use too.
- To use the same LoRA, click on the LoRA link provided with the image; it takes you to a page where you can copy the LoRA’s download link (for the ComfyUI route, see the sketch after this list).
- In the Draw Things app, go to the LoRA section, click Manage, and import the LoRA using the copied link.
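If you’re on the ComfyUI setup instead of Draw Things, the same copied link can simply be downloaded into ComfyUI’s LoRA folder. A minimal sketch; the URL and output filename below are placeholders for whatever link you copied from Civitai:

```
# From the ComfyUI directory: drop the LoRA where ComfyUI looks for it
cd models/loras
wget -O my_lora.safetensors "https://civitai.com/api/download/models/XXXXXX"
```

After a refresh, the file should then show up in ComfyUI’s Load LoRA node.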
This is what the generated image looked like for me (uploaded here at a reduced resolution):
Although generating high-resolution images or using multiple models can be slow, it’s still rewarding to see your local machine create something unique, especially if you have a spare machine to dedicate to the task, as I do.