Are you interested in creating AI-generated art but don’t know where to start? This comprehensive guide will walk you through setting up and running Stable Diffusion from the command line, perfect for beginners who want to understand the fundamentals before moving to more complex interfaces.

Prerequisites

Before we begin, you’ll need:

  • A computer running Windows, Linux, or macOS
  • Basic familiarity with the command line
  • Either a compatible GPU (NVIDIA with 6GB+ VRAM) or a Thunder Compute account

Setting Up Your Environment

Option 1: Local Setup (With GPU)

  1. Install Git
  2. Install Miniconda or Anaconda (the repo pins Python and its dependencies through a conda environment file)
  3. Install the NVIDIA CUDA Toolkit (if using a local GPU)
# Clone the repository
git clone https://github.com/CompVis/stable-diffusion
cd stable-diffusion

# Create and activate the conda environment
# (the CompVis repo ships environment.yaml rather than a requirements.txt)
conda env create -f environment.yaml
conda activate ldm

Option 2: Using Thunder Compute (No GPU Required)

  1. Create an account at Thunder Compute
  2. Select the “Stable Diffusion CLI” template
  3. Launch your instance

This option lets you skip all driver installation and compatibility issues, getting straight to image generation.

Downloading the Model

For either setup method, you’ll need to download the model weights:

  1. Visit the model page on Hugging Face (for example, CompVis/stable-diffusion-v-1-4-original)
  2. Accept the terms of use
  3. Download the checkpoint file (e.g. sd-v1-4.ckpt)

Rename the checkpoint to model.ckpt and place it in models/ldm/stable-diffusion-v1/ inside your Stable Diffusion installation, creating the directory if it doesn't exist.

Your First Generation

Let’s create your first AI-generated image:

python scripts/txt2img.py --prompt "a beautiful sunset over a calm ocean, hyperrealistic, 4k" --n_samples 1 --n_iter 1

This will create a single image based on your prompt. The output is saved under the outputs/txt2img-samples directory.

Understanding Key Parameters

  • --prompt: Your text description
  • --n_samples: Number of images generated per iteration
  • --n_iter: Number of generation iterations
  • --H and --W: Height and width of the output, in pixels
  • --seed: Set a specific seed for reproducible results
  • --ddim_steps: Number of sampling steps (more steps are slower but often cleaner)
  • --scale: Classifier-free guidance scale (how strongly the prompt is followed)
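To see how these flags compose, here is a small Python sketch that assembles a full invocation and prints it (the prompt and parameter values are arbitrary examples, not recommendations):

```python
import shlex

# Hypothetical parameter values for illustration
params = {
    "--prompt": "a misty forest at dawn, hyperrealistic",
    "--H": "512",
    "--W": "512",
    "--seed": "42",        # fixed seed => reproducible output
    "--n_samples": "1",
    "--n_iter": "1",
}

cmd = ["python", "scripts/txt2img.py"]
for flag, value in params.items():
    cmd.extend([flag, value])

# shlex.join quotes each argument safely for the shell
print(shlex.join(cmd))
```

Paste the printed command into a terminal inside your stable-diffusion checkout to run it.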

Optimizing Your Results

Prompt Engineering

Good prompts are key to getting the results you want. Here’s a basic template:

[subject], [medium], [style], [artist], [additional details], [quality keywords]

Example:

python scripts/txt2img.py --prompt "a majestic dragon, digital art, fantasy style, by Greg Rutkowski, intricate details, epic lighting, 8k, masterpiece" --n_samples 1 --n_iter 1
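The template above can be captured in a small helper. build_prompt below is a hypothetical convenience function (not part of Stable Diffusion) that joins the template fields into a comma-separated prompt and skips any you omit:

```python
def build_prompt(subject, medium=None, style=None, artist=None,
                 details=None, quality=None):
    """Join the prompt-template fields, skipping empty ones."""
    parts = [
        subject,
        medium,
        style,
        f"by {artist}" if artist else None,
        details,
        quality,
    ]
    return ", ".join(p for p in parts if p)

prompt = build_prompt(
    "a majestic dragon",
    medium="digital art",
    style="fantasy style",
    artist="Greg Rutkowski",
    details="intricate details, epic lighting",
    quality="8k, masterpiece",
)
print(prompt)
# -> a majestic dragon, digital art, fantasy style, by Greg Rutkowski,
#    intricate details, epic lighting, 8k, masterpiece
```

Keeping a helper like this around makes it easy to vary one field at a time while you experiment.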

Negative Prompts

Negative prompts tell the model what you don’t want in the image. Note that the original CompVis script does not accept a negative-prompt flag; the feature appears in many forks and newer tools (including Hugging Face’s diffusers library), typically looking like this:

python scripts/txt2img.py --prompt "beautiful portrait of a woman" --negative_prompt "blurry, distorted, low quality" --n_samples 1 --n_iter 1
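Under the hood, a negative prompt typically replaces the empty “unconditional” prompt used in classifier-free guidance, so each denoising step is steered away from it. A minimal numeric sketch of that guidance rule (real implementations apply it to noise-prediction tensors, not single numbers):

```python
def guided_prediction(uncond, cond, scale=7.5):
    # Classifier-free guidance: move from the unconditional (or
    # negative-prompt) prediction toward the conditional one.
    return uncond + scale * (cond - uncond)

# With scale > 1 the result overshoots the conditional prediction,
# pushing generations away from whatever the negative prompt describes.
print(guided_prediction(0.0, 1.0, scale=7.5))
# -> 7.5
```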

Troubleshooting Common Issues

Out of Memory Errors

If you encounter CUDA out of memory errors:

  1. Reduce the image size with --H and --W (the model’s native resolution is 512x512; larger sizes use far more VRAM)
  2. Lower the number of samples
  3. Consider using Thunder Compute for access to higher VRAM GPUs

Slow Generation

  • Update your NVIDIA drivers
  • Close other GPU-intensive applications
  • Make sure half precision is enabled (--precision autocast, the script’s default) rather than --precision full

Next Steps

Once you’re comfortable with the CLI:

  • Experiment with different models
  • Try various sampling methods
  • Look into using LoRA models for style adaptation

Resources

Stay tuned for our next guide on setting up ComfyUI for a more user-friendly image generation experience!