Image Generation Terms
Short definitions that will help you get better results from tools like GetIMG, Midjourney, Stable Diffusion, or Flux AI.
Prompt: The text instruction that guides image generation. Strong prompts include subject, style, lighting, and emotion.
Negative Prompt: A list of things you don’t want the AI to include (e.g., blurry, extra limbs, deformed face, extra fingers).
Seed: A numerical value that controls randomness. Same prompt + same seed = same image.
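A minimal NumPy sketch of why this works (illustrative code, not any particular tool's API — `initial_noise` is a made-up name): the seed fixes the starting noise, and since the rest of the denoising run is deterministic, the final image repeats too.

```python
import numpy as np

def initial_noise(seed, shape=(4, 8, 8)):
    """Draw the starting latent noise from a seeded RNG. Because the
    denoising steps that follow are deterministic, reusing the seed
    (with the same prompt and settings) reproduces the same image."""
    rng = np.random.default_rng(seed)
    return rng.standard_normal(shape)

assert np.array_equal(initial_noise(42), initial_noise(42))      # same seed: same noise
assert not np.array_equal(initial_noise(42), initial_noise(43))  # new seed: new noise
```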
CFG Scale (Classifier-Free Guidance): Controls how strictly the AI sticks to your prompt. Higher CFG = closer to your exact instructions.
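Under the hood, classifier-free guidance blends two noise predictions: one made without the prompt and one made with it. A toy NumPy sketch of the standard formula (the tiny arrays stand in for real noise tensors):

```python
import numpy as np

def cfg_combine(eps_uncond, eps_cond, scale):
    """Classifier-free guidance: start from the unconditional noise
    prediction and push toward the prompt-conditioned one.
    scale = 1 reproduces the conditional prediction; higher values
    follow the prompt more aggressively."""
    return eps_uncond + scale * (eps_cond - eps_uncond)

uncond = np.array([0.0, 0.0])  # prediction ignoring the prompt
cond = np.array([1.0, -1.0])   # prediction given the prompt
print(cfg_combine(uncond, cond, 1.0))  # matches the conditional prediction
print(cfg_combine(uncond, cond, 7.5)) # overshoots toward the prompt
```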
Steps (Sampling Steps): Number of refinement (denoising) cycles during generation. More steps = more detail (but slower).
Diffusion Model: A type of AI model that generates images by gradually denoising a random input. Examples include Stable Diffusion and DALL·E.
Sampler: The algorithm used to generate the image from noise. Examples include Euler, DPM++ 2M, and DDIM. Each produces slightly different image characteristics.
Latent Space: A compressed mathematical representation of image features that the model manipulates during generation.
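The compression is substantial, which is why latent diffusion is fast. Using Stable Diffusion v1's sizes (a 512×512 RGB image maps to a 64×64×4 latent tensor), a quick back-of-the-envelope calculation:

```python
# Stable Diffusion v1 denoises a small latent tensor, not raw pixels.
pixel_values = 512 * 512 * 3   # values in a 512x512 RGB image
latent_values = 64 * 64 * 4    # values in the corresponding latent
compression = pixel_values // latent_values
print(compression)  # the model works on ~48x fewer values per step
```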
Checkpoint: A saved version of a trained model. Different checkpoints produce different image styles or capabilities.
LoRA (Low-Rank Adaptation): A lightweight fine-tuning method that trains a small set of extra parameters instead of the full model. Popular for customizing Stable Diffusion models for specific styles or characters.
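The trick is representing the weight update as a product of two thin matrices. A NumPy sketch with illustrative sizes (768×768 is a typical attention-weight shape; rank 8 is a common LoRA setting):

```python
import numpy as np

d, k, r = 768, 768, 8          # base weight shape and a small LoRA rank

W = np.zeros((d, k))            # frozen base weight (stays untouched)
A = np.random.randn(r, k)       # trainable low-rank factor
B = np.zeros((d, r))            # starts at zero, so the update begins as a no-op
delta = B @ A                   # full-size update built from few parameters

full_params = d * k             # what full fine-tuning would train
lora_params = d * r + r * k     # what LoRA actually trains
print(full_params, lora_params) # ~48x fewer trainable parameters here
```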
VAE (Variational Autoencoder): A component in many diffusion models that compresses and reconstructs image data. Impacts output sharpness and color fidelity.
Inpainting: The process of regenerating or replacing specific areas of an image based on a new prompt. Useful for editing or erasing.
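At its simplest, the final step is a per-pixel blend between the original image and the newly generated content, controlled by a mask. A conceptual NumPy sketch (real inpainting pipelines also condition the generation on the mask, which this omits):

```python
import numpy as np

def composite(original, generated, mask):
    """Blend per pixel: keep the original where mask == 0 and take the
    newly generated content where mask == 1 (the repainted region)."""
    return mask * generated + (1 - mask) * original

original = np.full((4, 4), 10.0)   # existing image
generated = np.full((4, 4), 99.0)  # freshly generated content
mask = np.zeros((4, 4))
mask[1:3, 1:3] = 1.0               # only repaint the centre patch
result = composite(original, generated, mask)
print(result)
```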
Outpainting: Extending an existing image beyond its borders using AI, while preserving its original style and context.
ControlNet: A powerful extension for Stable Diffusion that allows precise control over generation using guides like depth maps, poses, or edge detection.
Tiling: Generating seamless, repeating images that can be used as textures or patterns.
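"Seamless" means the tile's opposite edges meet without a visible jump when the image repeats. A NumPy sketch of a simple seam check, comparing a periodic pattern against a plain gradient (`seam_jump` is an illustrative helper, not a standard function):

```python
import numpy as np

def seam_jump(tile):
    """Largest pixel jump across the wrap boundary when the tile
    repeats. For a seamless texture this stays close to ordinary
    neighbouring-pixel differences inside the tile."""
    h_jump = np.abs(tile[:, -1] - tile[:, 0]).max()  # right edge meets left edge
    v_jump = np.abs(tile[-1, :] - tile[0, :]).max()  # bottom edge meets top edge
    return max(h_jump, v_jump)

x = np.linspace(0, 2 * np.pi, 64, endpoint=False)
seamless = np.sin(x)[:, None] + np.cos(2 * x)[None, :]    # periodic: wraps cleanly
ramp = np.linspace(0, 1, 64)[None, :] * np.ones((64, 64)) # gradient: hard seam

print(seam_jump(seamless))  # small: tiles without a visible edge
print(seam_jump(ramp))      # large: jumps from 1 back to 0 at the seam
```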
Batch Size: The number of images generated per prompt submission. Larger batch sizes increase generation time but offer more variation.
GAN (Generative Adversarial Network): An older, now less commonly used architecture for image generation, consisting of a generator and a discriminator trained against each other. For a detailed explanation of how a GAN pipeline works, see the GAN Pipeline entry in the Video Generation Terms section below.
CLIP Guidance: Uses a CLIP model to evaluate how well the generated image matches the text prompt. Improves coherence between prompt and output.
Upscaler: An AI model that increases the resolution of an image while preserving or reconstructing detail. Often used post-generation for production-quality results.
Aspect Ratio: The width-to-height ratio of the generated image. Customizable to fit social media, posters, wallpapers, etc.
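Many diffusion tools expect both dimensions to be a multiple of 8, because the VAE downscales the image 8× into latent space. A small helper sketch for picking valid dimensions from a target ratio (`dims_for_aspect` is an illustrative name, not a real API):

```python
def round_to(n, m):
    """Round n to the nearest multiple of m, never below m."""
    return max(m, round(n / m) * m)

def dims_for_aspect(aspect_w, aspect_h, short_side=512, multiple=8):
    """Width/height for a target aspect ratio, keeping the short side
    near `short_side` and both sides a multiple of 8 (Stable Diffusion's
    8x latent downscaling requires this)."""
    if aspect_w >= aspect_h:
        h = round_to(short_side, multiple)
        w = round_to(short_side * aspect_w / aspect_h, multiple)
    else:
        w = round_to(short_side, multiple)
        h = round_to(short_side * aspect_h / aspect_w, multiple)
    return w, h

print(dims_for_aspect(16, 9))  # widescreen wallpaper: (912, 512)
print(dims_for_aspect(1, 1))   # square avatar: (512, 512)
```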
Prompt Weighting: The technique of emphasizing parts of a prompt using syntax like ((word)) or [word:strength] to influence the image outcome.
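A toy parser for the parenthesis syntax shows the idea: in the convention popularized by the AUTOMATIC1111 web UI, each pair of parentheses multiplies a term's weight by 1.1. This sketch handles only simple comma-separated terms, not nested or bracketed syntax:

```python
def parse_weights(prompt, boost=1.1):
    """Toy parser for ((word)) emphasis: each matched pair of
    parentheses multiplies that term's weight by `boost`."""
    result = []
    for term in prompt.split(","):
        term = term.strip()
        depth = 0
        while term.startswith("(") and term.endswith(")"):
            term = term[1:-1]   # peel off one parenthesis layer
            depth += 1
        result.append((term, round(boost ** depth, 3)))
    return result

print(parse_weights("a castle, ((dramatic lighting)), (mist)"))
# [('a castle', 1.0), ('dramatic lighting', 1.21), ('mist', 1.1)]
```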