How to Write a Prompt for AI Image Generation

Crafting a powerful prompt for models like Stable Diffusion (SD) and Flux is less about dialing in arcane parameters and more about telling a vivid, concise visual story. A great prompt balances five building blocks (subject, context, style, lighting & composition, and mood/intent), then sharpens the result with iterative refinement or negative prompts. Master these pieces and you can steer the AI toward anything from a cinematic portrait to a whimsical product mock‑up without touching the CFG scale.

Introduction

I spent my first months with SD churning out fuzzy landscapes and bizarre portraits before realizing the issue wasn’t the model—it was my wording. Once I began treating each prompt like a mini art‑director’s brief, the results clicked: clearer faces, richer textures, and scenes that actually matched my imagination. Below is the step‑by‑step framework I now rely on.

1. Why Prompts Matter

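  • Language is the primary control surface. SD and Flux tokenize the prompt and pass it through a text encoder trained on very large sets of image‑text pairs, so clarity and word order matter. The CLIP text encoder used by SD handles at most 77 tokens per pass, so put the most important words early.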
  • Descriptive detail > technical jargon. Instead of “CFG 7, Euler,” describe what you see: “soft dawn light over misty pines.”

2. Anatomy of a Prompt

Part                   | Questions to Answer     | Quick Example
Subject                | Who/what is the focus?  | “Golden retriever puppy”
Context/Action         | Doing what, where?      | “splashes through a forest stream”
Style/Medium           | Visual treatment?       | “watercolor illustration, impressionist”
Lighting & Composition | How is it framed/lit?   | “backlit by low morning sun, rule‑of‑thirds”
Mood/Intent            | Emotional tone?         | “joyful, serene”

(The table adapts the eight‑category breakdown from SD community tutorials, condensed here into five parts.)
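
To make the table concrete, here is a minimal sketch that assembles a prompt from the five parts in table order. The class and function names are hypothetical, chosen only for illustration; nothing here is an SD or Flux API.

```python
from dataclasses import dataclass


@dataclass
class PromptParts:
    """The five building blocks from the table (field names are illustrative)."""
    subject: str
    context: str
    style: str
    lighting: str
    mood: str


def build_prompt(parts: PromptParts) -> str:
    """Join the non-empty blocks: subject and action first, then the modifiers."""
    lead = " ".join(p.strip() for p in (parts.subject, parts.context) if p.strip())
    modifiers = [p.strip() for p in (parts.style, parts.lighting, parts.mood) if p.strip()]
    return ", ".join([lead] + modifiers) if lead else ", ".join(modifiers)


example = PromptParts(
    subject="Golden retriever puppy",
    context="splashes through a forest stream",
    style="watercolor illustration, impressionist",
    lighting="backlit by low morning sun, rule-of-thirds",
    mood="joyful, serene",
)
print(build_prompt(example))
# Golden retriever puppy splashes through a forest stream, watercolor illustration,
# impressionist, backlit by low morning sun, rule-of-thirds, joyful, serene
# (printed as a single line)
```

Keeping the parts separate makes it easy to swap one block, say the style, while holding the rest of the scene constant.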

Negative Prompts

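List what you don’t want in the negative prompt, e.g., “text, watermark, extra limbs,” to filter artifacts or unwanted motifs. In SD the negative prompt is a separate input, so name the unwanted items themselves rather than writing “no text”: the model still sees the word “text,” and in the main prompt that can pull the image toward it. Some Flux setups ignore or do not expose a negative prompt; there, describe the alternative you do want instead (“clean, unmarked background”).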
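
Because the negative prompt is usually a separate field, unwanted items are typically listed there without a leading “no.” A minimal sketch of that cleanup step follows; the helper name is hypothetical and not part of any SD tool.

```python
def to_negative_prompt(unwanted: list[str]) -> str:
    """Turn a wish list like 'no watermarks' into a comma-separated negative prompt."""
    cleaned: list[str] = []
    for item in unwanted:
        term = item.strip().rstrip(".")
        # Drop a leading "no " / "without ": the negative field already means "avoid this".
        for prefix in ("no ", "without "):
            if term.lower().startswith(prefix):
                term = term[len(prefix):].strip()
                break
        # Skip empty entries and case-insensitive duplicates.
        if term and term.lower() not in [c.lower() for c in cleaned]:
            cleaned.append(term)
    return ", ".join(cleaned)


print(to_negative_prompt(["no text", "no watermarks", "No extra limbs", "text"]))
# text, watermarks, extra limbs
```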

Hierarchical & Narrative Layers

For longer stories, arrange ideas from broad to narrow: first scene, then focal object, then fine details. This mirrors research on hierarchical prompt learning for diffusion models.
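
The broad-to-narrow ordering can be sketched as a small helper that emits one sentence per layer. The function name and the “Details:” label are assumptions made for this example, not a convention either model requires.

```python
def layered_prompt(scene: str, focal_object: str, details: list[str]) -> str:
    """Order ideas broad to narrow: scene, then focal object, then fine details."""
    sentences = [scene, focal_object]
    if details:
        sentences.append("Details: " + ", ".join(details))
    # Normalize each layer to end with exactly one period and skip empty layers.
    return " ".join(s.strip().rstrip(".") + "." for s in sentences if s.strip())


print(layered_prompt(
    scene="Vast alpine valley at dawn",
    focal_object="A lone wooden cabin stands beside a still glacial lake",
    details=["lantern light in the window", "snow-dusted peaks reflected in the water"],
))
# Vast alpine valley at dawn. A lone wooden cabin stands beside a still glacial lake.
# Details: lantern light in the window, snow-dusted peaks reflected in the water.
# (printed as a single line)
```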

3. Core Descriptors & Adjectives

Style & Medium

Curate art‑movement or camera keywords: “cubist,” “ink wash,” “isometric pixel art.” Community cheat sheets list hundreds of artist‑inspired styles if you get stuck.

Lighting & Mood

Swap dry numbers for cinematic phrases: “neon rim‑light,” “golden‑hour glow,” “god‑rays piercing fog.”

Lens & Composition

Even without camera parameters you can evoke them: “telephoto‑like compression,” “wide‑angle intimacy,” or simply “framed on the golden spiral.”

Texture & Detail

Words such as “velvety,” “grainy,” or “micro‑textured leather” hint at surface quality. Reddit prompt swaps often inspire fresh adjectives.

4. Category‑by‑Category Examples

Below, each sample contains only descriptive text—no parameters.

Portrait (Cinematic)

Prompt: “Close‑up portrait of an elderly jazz trumpeter, deep laugh lines, eyes closed mid‑solo, dramatic rim‑light against a smoky backstage curtain, rich Rembrandt shadows, mood of nostalgic triumph.”

Why it works: clear subject, action, lighting style, and mood; well suited to portrait‑tuned SD checkpoints.

Landscape (Epic Fantasy)

Prompt: “Vast alpine valley at dawn, snow‑dusted peaks reflecting in a still glacial lake, lone wooden cabin with lantern light in window, painterly brushstrokes reminiscent of Romantic oil works, hushed awe.”

The narrative framing echoes Adobe’s scenic prompt tutorial.

Product Shot (E‑commerce)

Prompt: “Minimalist overhead shot of a matte‑black smartwatch on marble slab, diffused softbox lighting, subtle shadow for depth, high‑contrast yet clean, evokes premium tech aesthetic.”

Why it works: strong composition keywords. Pair it with a negative prompt: “reflections, logos.”

Concept Art (Sci‑Fi)

Prompt: “Colossal orbital shipyard above a crimson gas giant, clusters of half‑built starships lit by sparking welding drones, cinematic teal‑orange palette, impression of scale and industry.”

Storytelling focus aids Flux, which thrives on fuller sentences.

Street/Documentary

Prompt: “Night market alleyway in Taipei, rain‑slick cobblestones reflecting neon signage, vendors grilling skewers, steam rising, candid reportage feel, handheld framing, vibrant yet gritty.”

Adding “handheld” hints at camera behavior without specifying focal lengths.

Illustration (Children’s Book)

Prompt: “Whimsical forest creatures having a midnight tea party under mushroom lanterns, pastel watercolor style, gentle glow, playful and comforting mood suitable for a bedtime storybook.”

Negative prompt: “harsh shadows, realistic gore.”

5. Advanced Techniques

  • Storytelling Prompts: Build a sentence sequence—setting ➜ protagonist ➜ conflict ➜ atmosphere—for richer cohesion.
  • Textual Inversion Tokens: Train a custom word that encapsulates a brand mascot or unique style, then drop it into any prompt for consistent results.
  • Style Cheat Sheets: Keep a personal list of go‑to adjectives and artist tags, updated from community repositories.
  • Iterative Refinement: Generate → note mismatches → add clarifiers or negatives → regenerate, a loop endorsed by most SD prompt guides.
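
The refinement loop in the last bullet can be sketched as plain bookkeeping. This is a minimal illustration, assuming the review step is done by eye; `PromptState` and `refine` are hypothetical names, and no image generation happens in this code.

```python
from dataclasses import dataclass, field


@dataclass
class PromptState:
    prompt: str
    negative: list[str] = field(default_factory=list)
    history: list[str] = field(default_factory=list)  # earlier prompts, oldest first


def refine(state: PromptState, clarifiers=(), negatives=()) -> PromptState:
    """One pass of the loop: record the old prompt, then add clarifiers and negatives."""
    new_prompt = ", ".join([state.prompt, *clarifiers]) if clarifiers else state.prompt
    new_negative = state.negative + [n for n in negatives if n not in state.negative]
    return PromptState(new_prompt, new_negative, state.history + [state.prompt])


state = PromptState("Close-up portrait of an elderly jazz trumpeter, dramatic rim-light")
# Round 1 review: background too plain, hands looked distorted.
state = refine(state, clarifiers=["smoky backstage curtain"], negatives=["distorted hands"])
# Round 2 review: a watermark appeared.
state = refine(state, negatives=["watermark"])

print(state.prompt)
# Close-up portrait of an elderly jazz trumpeter, dramatic rim-light, smoky backstage curtain
print(", ".join(state.negative))
# distorted hands, watermark
```

Keeping the history makes it easy to roll back when a clarifier makes things worse.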

6. Common Pitfalls

  1. Underspecifying the subject leads to generic or chaotic outputs.
  2. Contradictory adjectives (e.g., “noir pastel”) send the model conflicting signals, so it tends to blend or ignore them.
  3. Overstuffing artist names can trigger unwanted style blending or copyright filters.
  4. Ignoring negatives allows stray watermarks or distorted anatomy.
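
The four pitfalls above lend themselves to a quick pre-flight check. The sketch below is a rough heuristic, not a real linter: the conflicting-word list, the four-word minimum, and the two-artist limit are all arbitrary assumptions for illustration.

```python
import re

# Illustrative list only; extend it with your own vocabulary.
CONFLICTING_PAIRS = [("noir", "pastel"), ("minimalist", "ornate"), ("photorealistic", "cartoon")]
MIN_DISTINCT_WORDS = 4  # arbitrary threshold for "underspecified"
MAX_ARTIST_TAGS = 2     # arbitrary threshold for "overstuffed"


def lint_prompt(prompt: str, negative: str = "", artist_tags=()) -> list[str]:
    """Return warnings for the four pitfalls; an empty list means nothing was flagged."""
    warnings = []
    words = set(re.findall(r"[a-z]+", prompt.lower()))
    if len(words) < MIN_DISTINCT_WORDS:
        warnings.append("subject may be underspecified")
    for a, b in CONFLICTING_PAIRS:
        if a in words and b in words:
            warnings.append(f"possibly contradictory: {a} + {b}")
    # artist_tags is a caller-supplied list of names to look for in the prompt.
    used = [t for t in artist_tags if t.lower() in prompt.lower()]
    if len(used) > MAX_ARTIST_TAGS:
        warnings.append(f"{len(used)} artist tags may blend styles")
    if not negative.strip():
        warnings.append("no negative prompt set")
    return warnings


print(lint_prompt("noir pastel cat"))
# ['subject may be underspecified', 'possibly contradictory: noir + pastel', 'no negative prompt set']
```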

7. Prompt‑Building Checklist

  1. Subject & Action
  2. Context & Setting
  3. Style/Medium Keywords
  4. Lighting & Composition Cues
  5. Mood/Tone Words
  6. Optional Lens or Texture Descriptors
  7. Negative Prompts
  8. One‑sentence narrative for Flux
  9. Iterate, review, refine