How to write AI image prompts - From basic to pro [2024]

Learn how to write prompts for AI image generators like Midjourney, Stable Diffusion, DALL-E, Flux, and LetsEnhance.io. Refine your skills in prompt engineering to create stunning AI art and make the most of text-to-image apps.

Takeaways

Use natural language to paint a clear mental picture (e.g., "A curious red fox exploring a misty autumn forest at dawn" instead of "Fox, forest, autumn").
Include crucial elements: Subject, environment, lighting, colors, mood, and composition (e.g., "A majestic Bengal tiger with vibrant orange fur, stalking through a lush tropical rainforest dappled with sunlight").
Leverage platform-specific features like aspect ratios in Midjourney (--ar 16:9) or negative prompts in Stable Diffusion.
Experiment with prompt length: Test short (5-10 words), medium (up to 50 words), and long (50+ words) prompts to find what works best for each AI tool.
Upscale and enhance: Use tools like LetsEnhance.io to improve resolution and quality post-generation, especially for large prints or detailed edits.

How to write AI art prompts - LetsTalk (AI podcast)

#1 Start from the basic prompt structure

AI image generators work best with clear, structured prompts. This simple framework helps create detailed, rich outputs by guiding the AI effectively:

Subject: The main focus of the image.
Description: Context and details about the subject.
Style/Aesthetic: Artistic approach and visual framing.

These elements combine into a full prompt. For example: "The Batmobile stuck in Los Angeles traffic impressionist painting wide shot".

LetsEnhance-generated images. Prompt: The Batmobile stuck in Los Angeles traffic impressionist painting wide shot
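The three-part structure above can be sketched as a small helper. This is only an illustration: the function name `build_prompt` and its parameters are hypothetical and do not belong to any image generator's API.

```python
def build_prompt(subject: str, description: str = "", style: str = "") -> str:
    """Join subject, description, and style into one prompt string.

    Empty parts are skipped, so a bare subject still yields a valid prompt.
    """
    parts = [subject, description, style]
    return " ".join(part.strip() for part in parts if part.strip())


prompt = build_prompt(
    subject="The Batmobile",
    description="stuck in Los Angeles traffic",
    style="impressionist painting wide shot",
)
print(prompt)
```

Keeping the three parts separate makes it easy to swap only the style or only the description while holding the rest of the prompt constant.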

Subject: Who and What

Start your prompt with the main subject, typically a noun. While a single subject can generate a general image, more descriptive prompts usually yield results closer to your vision.

💡 Pro tip: Avoid using abstract concepts (love, hate, justice, infinity, joy) as subjects. Use concrete nouns (human, cup, dog, planet, headphones) as the subject of your prompt for more accurate results.

Description: What They Are Doing, Where, and How

Enhance your prompt with adjectives to add depth and complexity. Include details that answer:

What is happening?
What is the subject doing?
How is the subject doing this?
What's happening around the subject?

The background is crucial, so don't neglect it in your description.

Example: "Raccoon reading" vs. "Professional photo of raccoon reading a book in a library photo, close shot"

The more detailed prompt resulted in a clearer, more complex and realistic image.

Another example demonstrates how additional descriptors can lead to a significantly more complex and detailed rendering.

"Finch" vs. "Yellow finch perched on a cherry blossom branch, spring background, soft lighting"

The first image of a finch shows that AI can render a generic but accurate image even without additional descriptors. The rendering on the right, however, shows how an AI image generator can work with more elements to produce a significantly more complex visual.

Aesthetic and Style: How It Looks

Complete your prompt with words that dictate the overall aesthetic and style:

Medium: "photo", "oil painting", "fresco", "3D rendering"
Art movements: impressionist, gothic, steampunk, etc.
Artist influences: Include famous names to blend their styles
Framing: "close up", "medium shot" to specify angle and distance

Example: "Handheld computer device" vs. "Handheld computer device, vaporwave aesthetic, product photography". Rendered using Lexica art.

The second prompt resulted in a more defined visual style with neon colors, showcasing the impact of style-specific keywords.

Another example:

The "wide shot" keyword significantly impacts image composition. Adding this phrase can change the entire perspective of your generated image.

This structure balances specific details with room for AI creativity. It helps the AI generate images that match your idea while still using its own capabilities.

Using this basic structure is the first step in writing effective prompts. It prepares you for more advanced techniques in prompt writing.

#2 Keep it natural and descriptive

Write your prompts in plain, conversational language. Imagine you're describing the image to someone who can't see it.

Good enough: "Fox, forest, autumn, misty, sunlight, 8k, best quality"

Better: "A curious red fox exploring a misty autumn forest at dawn. Golden sunlight filters through colorful leaves, casting dappled shadows on the forest floor. The fox's fur is slightly damp from the morning dew, and its breath is visible in the cool air."

#3 Experiment with length and structure

While some AIs work well with longer prompts, others perform better with concise descriptions. Test different lengths and structures to find what works best for each platform.

1) Short prompt (5-10 words): "Cyberpunk cityscape, neon lights, flying cars"

2) Medium prompt (up to 50 words): "A bustling cyberpunk metropolis at night. Towering skyscrapers adorned with holographic advertisements. Neon signs in vibrant blues and pinks illuminate crowded streets. Flying cars weave between buildings."

3) Long prompt (50+ words): "A sprawling cyberpunk cityscape stretches as far as the eye can see. Massive skyscrapers, their surfaces a patchwork of screens and holograms, pierce the smog-filled sky. At street level, a sea of people navigate neon-lit marketplaces and food stalls. Hovering advertisements cast their glow on the rain-slicked streets below. Flying cars and delivery drones zip between buildings, leaving trails of light in their wake. In the foreground, a group of diverse characters in futuristic attire stand on a rooftop, overlooking the chaotic beauty of the city."

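When testing lengths across tools, it can help to label each prompt by word count. The sketch below assumes the buckets from the takeaways (short up to 10 words, medium up to 50, long beyond that). The function name is hypothetical, and treating exactly 50 words as "medium" is an arbitrary choice.

```python
def classify_prompt_length(prompt: str) -> str:
    """Bucket a prompt by its whitespace-separated word count."""
    word_count = len(prompt.split())
    if word_count <= 10:
        return "short"
    if word_count <= 50:
        return "medium"
    return "long"


print(classify_prompt_length("Cyberpunk cityscape, neon lights, flying cars"))
```

Word count is only a rough proxy: models measure prompts in tokens, and a tokenizer may split one word into several tokens.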

Experiment with advanced prompt elements

Include key elements like the subject, environment, lighting, colors, mood, and composition. The more specific you are, the better the AI can interpret your vision.

Alternative, more advanced prompt elements

Subject: "A majestic Bengal tiger with vibrant orange fur and black stripes"
Environment: "In a lush tropical rainforest with towering trees and dense undergrowth"
Lighting: "Dappled sunlight filtering through the canopy, creating a play of light and shadow"
Colors: "Rich greens of the foliage contrasting with the tiger's orange coat"
Mood: "A sense of tension and anticipation as the tiger stalks its prey"
Composition: "The tiger is positioned in the lower left third of the frame, its gaze directed towards the right"

Basic prompt: "Tiger" vs. advanced prompt: "majestic bengal tiger stalking through a lush tropical rainforest. Dappled sunlight filtering through the canopy, creating a sense of tension and anticipation. Tiger is in lower left, gazing towards the right."
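One way to make sure no element is forgotten is to hold the six elements in a small structure and check which are still empty. This is a minimal sketch: the `PromptElements` class and its methods are hypothetical names introduced here for illustration.

```python
from dataclasses import dataclass, fields


@dataclass
class PromptElements:
    subject: str
    environment: str = ""
    lighting: str = ""
    colors: str = ""
    mood: str = ""
    composition: str = ""

    def missing(self) -> list[str]:
        """Names of the elements that are still empty."""
        return [f.name for f in fields(self) if not getattr(self, f.name).strip()]

    def to_prompt(self) -> str:
        """Join the filled-in elements as sentences, in declaration order."""
        filled = [getattr(self, f.name).strip() for f in fields(self)]
        return " ".join(text.rstrip(".") + "." for text in filled if text)


tiger = PromptElements(
    subject="A majestic Bengal tiger with vibrant orange fur and black stripes",
    environment="In a lush tropical rainforest with towering trees and dense undergrowth",
    lighting="Dappled sunlight filtering through the canopy",
)
print(tiger.missing())
print(tiger.to_prompt())
```

Here `missing()` reports the colors, mood, and composition fields, which is a reminder to describe them before generating.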

Understand and leverage token weighting

Some platforms allow you to assign more importance to certain words or phrases in your prompt.

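Midjourney example: Use double colons (::) to split a prompt into parts, and add a number to weight a part relative to the others (e.g., "blue sky::2 green field" makes "blue sky" twice as important as "green field")
Stable Diffusion example: In interfaces that support attention syntax, such as the AUTOMATIC1111 web UI, use parentheses to increase weight (e.g., "(blue sky)" for a slight boost or "(blue sky:1.5)" for an explicit 1.5x weight)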
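The two weighting notations can be produced with simple string formatting. A minimal sketch, assuming the `text::weight` form for Midjourney and the `(text:weight)` form for Stable Diffusion front ends that support it. Both helper names are hypothetical.

```python
def midjourney_weighted(parts: dict[str, float]) -> str:
    """Format weighted parts as a Midjourney-style multi-prompt."""
    return " ".join(f"{text}::{weight:g}" for text, weight in parts.items())


def sd_weighted(text: str, weight: float) -> str:
    """Wrap one phrase in the '(text:weight)' emphasis notation."""
    return f"({text}:{weight:g})"


print(midjourney_weighted({"blue sky": 2, "green field": 1}))  # blue sky::2 green field::1
print(sd_weighted("blue sky", 1.5))                            # (blue sky:1.5)
```

Weights are relative, so it is usually enough to raise one or two phrases rather than weighting every part of the prompt.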

#4 Focus on what you want, not what you don't

Instead of listing what you don't want in the image, focus on describing what you do want. Most AI models respond better to positive instructions.

Exception: Some models, like Stable Diffusion, allow for "negative prompts" to explicitly exclude certain elements. Use these sparingly and strategically.
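To keep negative terms out of the main description, they can be passed as a separate field. The sketch below only builds the arguments and calls no model. `prompt` and `negative_prompt` are the argument names used by the Stable Diffusion pipelines in Hugging Face's diffusers library; the helper `generation_kwargs` itself is hypothetical.

```python
from typing import Optional


def generation_kwargs(prompt: str, negative_terms: Optional[list[str]] = None) -> dict[str, str]:
    """Build keyword arguments with an optional, de-duplicated negative prompt."""
    kwargs = {"prompt": prompt}
    # dict.fromkeys removes repeated terms while preserving their order.
    terms = list(dict.fromkeys(t.strip() for t in (negative_terms or []) if t.strip()))
    if terms:
        kwargs["negative_prompt"] = ", ".join(terms)
    return kwargs


print(generation_kwargs(
    "A curious red fox exploring a misty autumn forest at dawn",
    ["blurry", "low quality", "blurry"],
))
```

With a loaded pipeline, the result could then be passed along as `pipe(**kwargs)`, assuming that pipeline accepts a negative prompt.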

#5 Use reference images and style modifiers

Many AI image generators allow you to upload reference images or use style modifiers to guide the output.

Example: "Create an image in the style of [Artist Name] or similar to [link to reference image]"

Popular style modifiers: "oil painting", "watercolor", "digital art", "photorealistic", "anime style", "low poly", "impressionist"

LetsEnhance: AI image generation and enhancement

LetsEnhance.io offers a powerful combination of image generation and enhancement capabilities, making it an excellent choice for creating high-quality, detailed images.

Key features:

Advanced text-to-image generation
Industry-leading upscaling technology (up to 16x or 512 megapixels)
Visual prompt builder for easier prompt creation

Tips for effective LetsEnhance.io prompts:

Leverage natural language: Write descriptive sentences rather than keyword lists.
Utilize prompt length flexibility: LetsEnhance.io can handle both concise (5-10 words) and detailed (50+ words) prompts effectively.
Be specific about details: Describe the subject, environment, lighting, and mood in detail.
Specify colors explicitly: Don't assume the AI will infer color information.
Avoid technical camera jargon: Terms like "shallow depth of field" or specific f-stops are often ignored.
Use the visual prompt builder: This tool simplifies the process of creating complex prompts.

Example prompt: "A serene landscape featuring a crystal-clear mountain lake at sunrise. The water reflects the pink and orange sky like a mirror. In the foreground, a majestic pine tree stands tall, its branches framing the view. Snow-capped peaks rise in the distance, their edges softened by a light morning mist. A pair of deer drink from the lake's edge, creating gentle ripples on the otherwise still surface."

Midjourney

Midjourney v6 and later versions excel at creating highly detailed and realistic images. They respond well to natural language descriptions and specific artistic styles.

Tips for effective Midjourney v6.1 prompts:

Use conversational, naturally flowing sentences to describe your desired image.
Include specific details about materials, ethnicity, age, clothing, colors, textures, shapes, hairstyles, and emotions.
Incorporate cinematic and photographic terms for more realistic outputs (e.g., "movie still", "cinematic lighting", "depth-of-field").
Experiment with aspect ratios using the --ar flag (e.g., "--ar 16:9" for widescreen).
Combine multiple concepts or characters in a single scene for more dynamic images.
Use the --v flag followed by a version number (e.g., "--v 6.1") to choose which Midjourney model version renders your prompt.

Advanced Midjourney techniques:

Use --seed [number] to recreate or iterate on specific generations
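Employ --stylize or --s to control how strongly Midjourney's default aesthetic is applied: lower values follow your prompt more closely, while higher values produce more stylized results (e.g., --s 50 for closer prompt adherence; the default is 100)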
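Since Midjourney parameters are plain text appended to the end of the prompt, they can be assembled with a small helper. A minimal sketch using only the flags mentioned in this section. The function name `with_midjourney_params` is hypothetical, and no validation of the values is attempted.

```python
from typing import Optional


def with_midjourney_params(
    prompt: str,
    aspect_ratio: Optional[str] = None,
    seed: Optional[int] = None,
    stylize: Optional[int] = None,
    version: Optional[str] = None,
) -> str:
    """Append only the parameters that were provided, after the prompt text."""
    flags = []
    if aspect_ratio is not None:
        flags.append(f"--ar {aspect_ratio}")
    if seed is not None:
        flags.append(f"--seed {seed}")
    if stylize is not None:
        flags.append(f"--stylize {stylize}")
    if version is not None:
        flags.append(f"--v {version}")
    return " ".join([prompt.strip(), *flags])


print(with_midjourney_params("A medieval knight in a forest", aspect_ratio="16:9", version="6.1"))
# A medieval knight in a forest --ar 16:9 --v 6.1
```

Fixing the seed while changing one other value at a time makes it easier to see what each parameter contributes.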

Midjourney example prompt: "A medieval knight in a forest. The knight is tall, wearing shiny silver armor with intricate engravings, a blue cape flowing behind him, holding a gleaming sword. He has a determined expression, short brown hair peeking from his helmet. The forest is dense with towering trees, the ground covered in lush green moss, dappled sunlight filtering through the leaves. It's early morning with a hint of mist in the air. Painted in a realistic style with vibrant colors. --ar 16:9 --v 6.1"

Stable Diffusion (SDXL and SD3)

Stable Diffusion models, particularly SDXL and SD3, are highly versatile and can handle a wide range of prompting styles, from concise to extremely detailed.

Tips for effective Stable Diffusion prompts:

Use natural language descriptions with specific details about the scene, style, and mood.
Experiment with longer, more descriptive prompts, as newer models can handle complex instructions.
Include information about lighting, camera angles, and artistic styles for more control over the output.
Try using negative prompts to specify what you don't want in the image.
Incorporate creative and conceptual ideas, as these models excel at interpreting unique concepts.

Advanced Stable Diffusion techniques:

Use attention weighting with () and [] to emphasize or de-emphasize elements
Employ prompt scheduling to change the prompt partway through generation
Experiment with different samplers (e.g., Euler a, DPM++ 2M Karras) for varied results
Utilize ControlNet for more precise control over composition and pose

Example SDXL prompt: "A majestic lion standing on a cliff overlooking a savanna at sunset, cinematic lighting, ultra-detailed fur, 8K resolution. The lion's mane is flowing in the warm breeze, its eyes fixed on the horizon. The sky is a breathtaking array of oranges, pinks, and purples, with a few scattered clouds catching the last rays of sunlight. In the distance, a herd of elephants can be seen making their way to a watering hole."

Example SD 3 prompt: "Create an ominous expressionist-style painting depicting an ancient abandoned temple on Pluto, with the moon Charon looming menacingly overhead. The temple, characterized by crumbling columns and archaic architecture, is set against a stark, otherworldly landscape of icy plains and dark skies. The temple should be positioned in such a way that its ruins frame the moon Charon, emphasizing its eerie and dominating presence. The scene is bathed in an ethereal light, casting long shadows and creating a dramatic, haunting atmosphere. The color palette should consist of deep blues, grays, and whites, enhancing the chilling, alien feel of the setting."

DALL-E and ChatGPT

DALL-E (and image generation through ChatGPT) works well with straightforward, detailed descriptions.

Tips for effective DALL-E prompts:

Use clear, concise language without relying on specific styling keywords.
Break down complex scenes into separate elements.
Be specific about composition, perspective, and style.
Leverage DALL-E's strengths:

Excellent understanding of spatial relationships and complex scenes
Strong ability to generate text within images
Good at following specific style instructions without needing predefined style keywords

DALL-E 3 specific tips:

Utilize its improved understanding of complex prompts and multi-step instructions
Take advantage of its enhanced ability to generate coherent text within images
Experiment with more abstract and conceptual prompts, as DALL-E 3 has improved interpretation of metaphorical language

Example prompt: "Create an image of a cozy bookstore interior. Show tall wooden bookshelves lining the walls, filled with colorful books of various sizes. In the foreground, place a comfortable leather armchair next to a small round table with a steaming cup of coffee and an open book. Warm, soft lighting from antique brass lamps illuminates the scene, creating a welcoming atmosphere. A cat is curled up on a window seat, looking out at a rainy street. Style the image like a digital illustration with a warm color palette, emphasizing the contrast between the cozy interior and the gloomy weather outside."

Flux: New open-source AI image generator

Flux is a versatile, open-source AI image generator that excels at interpreting both traditional keyword-style prompts and more detailed natural language descriptions.

Tips for effective Flux prompts:

Experiment with both keyword-style prompts and natural language descriptions to find what works best for your needs.
Take advantage of Flux's ability to handle longer prompts (up to around 500 tokens) for more detailed control over the output.
Include specific details about subjects, environments, lighting, composition, and mood in your prompts.
Consider using technical details like camera settings if desired for a more photorealistic output.
Experiment with different levels of detail in your prompts to find the sweet spot for your specific ideas.
Use AI tools like ChatGPT or Copilot to expand shorter prompts into more detailed descriptions if needed.

Do's:

Write detailed, descriptive sentences about what you want in the image
Include specifics about subjects, environments, lighting, composition, etc.
Describe the mood and atmosphere you want to convey
Use AI tools to expand short prompts if needed
Experiment with different levels of detail to find what works best

Don'ts:

Don't rely solely on comma-separated keyword lists (although Flux can handle these)
Avoid overly technical or jargon-heavy language that may confuse the model
Don't include unnecessary keywords that may not contribute to the desired outcome
Avoid extremely long prompts over 500 tokens

Example Flux prompts and image results:

Simple keyword-style prompt: "A photo of a bulldog, on a beach, with palm trees, at sunset"

Photograph of a Bulldog running on tropical beach at golden hour, leaving footprints in the soft sand. In the background, tall palm trees sway gently, casting dappled shadows on the sand. The sun is setting, casting a warm, golden hue over the entire scene, with a vibrant orange and pink sky.

Elaborate scene description: "A stunning photograph of a playful bulldog on a pristine tropical beach. The backdrop of the scene features swaying palm trees and a beautiful sunset, with the sky painted in hues of pink and orange. The bulldog, wearing a colorful bandana, is digging into the soft sand, leaving a small hole behind. The image evokes a feeling of relaxation and blissful escape from the daily grind."

Detailed prompt with technical specifications: "Ultra High Resolution Photo of a majestic elven princess standing in the midst of a sun-kissed woodland. She exudes an ethereal grace, dressed in a gown made of delicate leaves, flowers, and vines, while the warm sunlight filters through the trees, casting a golden light on her. The camera used for this shot is a Sony Alpha 7 III with a zoom lens, and the settings are ISO 320, shutter speed 1/1000 and a medium depth of field. The photo is edited in a natural and bright style, with vibrant colors that showcase the natural beauty of the forest."

Prompt with specific styling and camera details: "Ultra High Resolution photo of a 12-year-old boy wearing a blue jumpsuit flying a kite on a tropical beach. The shot is influenced by the style of renowned National Geographic photographer, Jimmy Chin. The image is captured with a Nikon D850 and a Wide Angle lens, using ISO 200, a fast shutter speed of 1/1000 and a shallow depth of field. The photo is edited with a natural and vibrant color style."

Remember, Flux's flexibility allows you to experiment with different prompting styles. You can start with simpler prompts and gradually add more details to achieve the desired results. The key is to find a balance between providing enough information to guide the AI and leaving room for the model's creative interpretation.

Improving resolution and quality

While many AI models have limitations on output size, you can use upscaling tools to increase resolution and enhance quality:

Generate your image using your preferred AI tool.
Use LetsEnhance.io's built-in upscalers to increase resolution up to 16x or 512 megapixels.
Choose the appropriate upscaler based on your image type:

Smart Enhance for photos
Digital Art for illustrations and paintings
Image with Text for preserving small details
Old Photo for early digital or noisy images
Magic for creative enhancements guided by text descriptions

Alternative upscaling methods:

Topaz Gigapixel AI: Powerful AI-driven upscaling with excellent detail preservation
waifu2x: Free, open-source upscaler optimized for anime-style art
Cupscale: GUI for various AI upscaling models, allowing for easy comparison and customization

Post-processing techniques:

Use tools like Adobe Photoshop or GIMP for final touch-ups and adjustments
Experiment with AI-powered editing tools like Remini for facial enhancement or Topaz DeNoise AI for noise reduction
Consider manual retouching for critical areas that AI upscalers may not handle perfectly

Ethical considerations and best practices

Understand the potential biases in AI-generated imagery and work to counteract them in your prompts.
Be mindful of copyright and intellectual property issues when referencing specific artists or styles.
Consider the environmental impact of AI image generation and use resources responsibly.
Be transparent about the use of AI-generated images in your work, especially in professional contexts.
Stay informed about the evolving legal and ethical landscape surrounding AI-generated content.

Continuous learning and improvement

Join online communities dedicated to AI art generation (e.g., Reddit's r/midjourney, Discord servers for various tools).
Follow AI art creators on social media platforms like Instagram and Twitter for inspiration and tips.
Participate in challenges and prompt-sharing initiatives to expand your skills.
Keep a prompt journal to track your experiments and successes.
Stay updated on new features and model releases for your preferred AI image generation tools.
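A prompt journal can be as simple as a text file with one JSON record per line. The sketch below is one possible layout, not a prescribed format: the field names, the 1-5 rating scale, and both function names are assumptions made for illustration.

```python
import json
import tempfile
from datetime import datetime, timezone
from pathlib import Path
from typing import Optional


def log_prompt(journal: Path, tool: str, prompt: str,
               notes: str = "", rating: Optional[int] = None) -> None:
    """Append one experiment to a JSON Lines journal file."""
    entry = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "tool": tool,
        "prompt": prompt,
        "notes": notes,
        "rating": rating,
    }
    with journal.open("a", encoding="utf-8") as f:
        f.write(json.dumps(entry) + "\n")


def best_prompts(journal: Path, min_rating: int = 4) -> list[dict]:
    """Return logged entries rated at or above min_rating (unrated ones are skipped)."""
    with journal.open(encoding="utf-8") as f:
        entries = [json.loads(line) for line in f if line.strip()]
    return [e for e in entries if (e["rating"] or 0) >= min_rating]


# Demo in a temporary directory so nothing is left behind on disk.
with tempfile.TemporaryDirectory() as tmp:
    journal = Path(tmp) / "prompt_journal.jsonl"
    log_prompt(journal, "Midjourney", "A medieval knight in a forest --ar 16:9", rating=5)
    log_prompt(journal, "Flux", "A photo of a bulldog, on a beach", notes="too generic", rating=2)
    for entry in best_prompts(journal):
        print(entry["tool"], "|", entry["prompt"])
```

For real use, point `journal` at a permanent file so that entries accumulate across sessions.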

How LetsEnhance Image Generator Bypasses a Common Problem

Many AI generators have resolution caps to manage processing power. LetsEnhance integrates upscaling AI with image generation AI to bypass this limitation.

Resolution caps for the biggest AI image generation platforms

After creation, you can choose between the native 1024x1024px image or upscale it 4x to 4096x4096px without quality loss. This high-resolution output exceeds what most generators offer, even with premium packages. On top of that, you can enlarge the AI art even further, up to 512MP, using our built-in upscalers.
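The arithmetic behind these numbers is easy to check. The sketch below assumes a megapixel is one million pixels and treats the 16x and 512MP figures as two limits that both apply. That reading, along with the constant and function names, is an assumption made for illustration rather than a documented rule.

```python
import math

MAX_FACTOR = 16        # assumed cap on the upscale factor ("up to 16x")
MAX_MEGAPIXELS = 512   # assumed cap on output size ("512 megapixels")


def megapixels(width: int, height: int) -> float:
    """Pixel count expressed in millions of pixels."""
    return width * height / 1_000_000


def upscale(width: int, height: int, factor: int) -> tuple[int, int]:
    """Output dimensions after scaling both sides by the same factor."""
    return width * factor, height * factor


def max_upscale_factor(width: int, height: int) -> int:
    """Largest whole-number factor that respects both assumed caps."""
    by_area = math.floor(math.sqrt(MAX_MEGAPIXELS * 1_000_000 / (width * height)))
    return min(MAX_FACTOR, by_area)


w, h = upscale(1024, 1024, 4)
print(w, h, round(megapixels(w, h), 1))  # 4096 4096 16.8
print(max_upscale_factor(1024, 1024))    # 16
```

A 4x upscale multiplies the pixel count by 16, which is why a 1024x1024px image (about 1MP) becomes roughly 16.8MP at 4096x4096px.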

Generating and upscaling AI art with LetsEnhance

This is a great option for those who work with AI image generators and want high-quality images with a flexible pricing model.

FAQ

Q: What is the difference between text and image-reference prompts?

A: AI image generation uses two main prompt types:

Text Prompts: Describe the desired image using words or sentences. Different platforms may produce varying results from identical text prompts.

Caption: The Let’s Enhance Image Generator uses text prompts to render images.

The highlighted textbox is an example text prompt used to render images with AI using the Let’s Enhance Image Generator. Depending on the rendering platform you’re using, identical text prompts will have varying results.

Image Prompts: Upload reference images for the AI to use. This can be more effective than text for certain tasks.

Caption: Original: Girl with a Pearl Earring by Johannes Vermeer. Outpainting by: August Kamp

Example: DALL-E's Outpainting can extend existing images, like continuing the "Girl with a Pearl Earring" painting beyond its original borders.

A Mix of Both: Some platforms allow combining text and image prompts for more precise results.

Understanding these prompt types and their applications can significantly improve your AI image generation outcomes.

Q: What can I do if the AI never renders what I was expecting?

A: As we pointed out in one of our short video tutorials, it's best to take text prompts one step at a time. Start with the subject alone and see how it renders. Next, add a descriptor; if the renderings are not clean or comprehensible, try a different descriptor with a similar meaning. Repeat until the output comes as close as possible to the result you were looking for.

Q: What makes a good AI art prompt?

A: A good AI art prompt is specific, descriptive, and includes details about the subject, environment, lighting, colors, and mood. Use natural language and focus on painting a clear mental picture. For example: "A vibrant sunset over a bustling cityscape, with warm orange and pink hues reflecting off glass skyscrapers. Street lights are just beginning to flicker on, and people hurry home from work, creating long shadows on the sidewalks."

Q: How specific should AI image prompts be?

A: AI image prompts should be as specific as possible without becoming overly complex. Include key details about the scene, style, and mood you want to convey. However, different AI models have varying capabilities, so experiment with prompt length and detail to find the sweet spot for each platform.

Q: Can you use negative prompts in AI image generation?

A: Yes, some AI image generators, like Stable Diffusion, allow for negative prompts. These tell the AI what to avoid in the image. For example, you might use a negative prompt like "blurry, low quality, distorted proportions" to encourage higher quality outputs. However, it's generally more effective to focus on positive descriptions of what you want to see in the image.

Q: How do you describe art styles in AI prompts?

A: To describe art styles in AI prompts, use clear, well-known terms and combine them with descriptive language. For example:

"In the style of Van Gogh's 'Starry Night', with swirling brushstrokes and vibrant colors"
"Minimalist watercolor painting with delicate, translucent washes of pastel colors"
"Bold, graphic art deco poster design with geometric shapes and metallic gold accents"

You can also reference specific artists, art movements, or time periods to guide the AI's style interpretation.

Q: What are the best practices for AI image prompt writing?

A: Some best practices for writing effective AI image prompts include:

Use clear, descriptive language
Be specific about details (subject, setting, lighting, colors, mood)
Experiment with prompt length to find what works best for each AI tool
Use style modifiers and artistic references when appropriate
Leverage platform-specific features (like token weighting in Midjourney)
Iterate and refine your prompts based on the results
Keep a prompt journal to track successful techniques

Q: How do different AI image generators interpret prompts?

A: Different AI image generators have unique strengths and interpret prompts in various ways:

Midjourney excels at artistic and stylized images, responding well to style keywords and weighting
DALL-E is good at understanding complex scenes and generating coherent text within images
Stable Diffusion is versatile and supports advanced techniques like LoRA and ControlNet
LetsEnhance.io combines generation with powerful upscaling, handling both concise and detailed prompts effectively

Experiment with each platform to understand its unique characteristics and adapt your prompting style accordingly.

Q: What elements should be included in an AI art prompt?

A: A comprehensive AI art prompt should include:

Subject: The main focus of the image
Environment: The setting or background
Lighting: Quality, direction, and color of light
Colors: Specific color palette or important color elements
Mood/Atmosphere: The emotional tone of the image
Composition: How elements are arranged in the frame
Style: Artistic style or technique (e.g., "oil painting", "photorealistic")
Details: Any specific elements or features you want to include

Q: How can you improve your AI art prompt results?

A: To improve your AI art prompt results:

Study successful prompts from other users
Use more specific and descriptive language
Experiment with prompt length and structure
Leverage platform-specific features (e.g., weighted terms, aspect ratios)
Use seed values to iterate on promising results
Combine AI generation with post-processing and upscaling techniques
Practice regularly and keep track of what works best for different types of images

Q: How do you use keywords effectively in AI image prompts?

A: To use keywords effectively in AI image prompts:

Place important keywords near the beginning of the prompt
Use specific, descriptive words rather than vague terms
Employ token weighting (e.g., "blue sky::2" in Midjourney) for crucial elements
Combine keywords with natural language descriptions
Use style keywords to guide the overall aesthetic (e.g., "impressionist", "cyberpunk")
Include action words to convey movement or energy
Experiment with synonym variations to find the most effective terms

Remember, mastering AI art prompts is an ongoing process. Keep experimenting, learning from the community, and refining your techniques to create increasingly impressive AI-generated images.