How is GPT Image 1.5 different from 1 and 2?

Compared with GPT Image 1, version 1.5 adds improved photorealism, face preservation during editing, multi-image input (up to 4 images), the `input_fidelity` parameter, and reliable text rendering. Compared with GPT Image 2, it lags on text rendering in non-Latin scripts (CJK, Cyrillic, Arabic), lacks thinking mode, and accepts fewer reference images (up to 4 versus up to 16). For most tasks 1.5 is a stable middle ground.

When should input_fidelity="high" vs "low" be used?

Use high to preserve composition and identity during editing: face-preserving edits, precise background swaps, clothing changes that keep the pose. Use low for creative freedom: reimagining, style transfer, variation generation. Default to high and switch to low when radical changes are desired.

How does multi-image input work?

Pass up to 4 images and reference each by index: «Image 1: ...», «Image 2: ...». Describe how they interact: «apply Image 2's style to Image 1», «put the bird from Image 1 on the elephant in Image 2». This enables style transfer, compositing, and complex edits that compare images. The key is explicit references: without them the model cannot tell which image plays which role.

What prompt element order does OpenAI recommend?

For GPT Image 1.5 the recommended order is: [Background/Scene] → [Subject] → [Key details] → [Constraints], with the use case stated up front («Product shot for...», «Infographic for...»). This differs from GPT Image 1, where the subject came first. For complex prompts, short bulleted segments work better than one long paragraph.

How do you make infographics and diagrams?

Specify the audience («for students», «for executives») and the type («timeline», «labeled diagram», «funnel chart»). Put exact text labels in quotes and name a concrete font, color palette, and layout. Always set `quality="high"`, because small text tends to break at medium. GPT Image 1.5 is particularly strong at structured visuals.

Does the model support transparent background?

Yes, via the `background` parameter, which accepts `transparent`, `opaque`, or `auto`. Use `transparent` for stickers, icons, and assets. The prompt can additionally state «transparent background», but the parameter is what guarantees a clean alpha mask. Typical pattern: «cute cartoon knight sticker, thick lines, white outline» + `background="transparent"`.
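The sticker pattern above can be sketched as a small settings builder. This is a minimal illustration, not an official client: the function name `generation_settings`, the dictionary layout, and the model identifier string are assumptions, and only the `quality` and `background` values come from this guide.

```python
# Allowed values as described in this guide; the structure itself is hypothetical.
ALLOWED_VALUES = {
    "quality": {"low", "medium", "high"},
    "background": {"transparent", "opaque", "auto"},
}


def generation_settings(prompt: str, quality: str = "high", background: str = "auto") -> dict:
    """Validate the two parameters and return a plain settings dictionary."""
    for name, value in (("quality", quality), ("background", background)):
        if value not in ALLOWED_VALUES[name]:
            raise ValueError(
                f"{name} must be one of {sorted(ALLOWED_VALUES[name])}, got {value!r}"
            )
    return {
        "model": "gpt-image-1.5",  # assumed model identifier, not confirmed by this guide
        "prompt": prompt,
        "quality": quality,
        "background": background,
    }


# Sticker request: the transparency comes from the parameter, not from prompt wording.
sticker = generation_settings(
    "cute cartoon knight sticker, thick lines, white outline",
    background="transparent",
)
print(sticker)
```

A dictionary like this could then be handed to whichever client library is in use. Validating early keeps a typo such as `background="transparant"` from silently producing an opaque image.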

Does Opten support GPT Image 1.5?

Yes, the Opten extension auto-detects GPT Image 1.5 and scores prompts against the structure outlined above: it checks for the recommended order (background → subject → details), a stated use case, concrete camera terms, quotes for text, and the absence of Stable Diffusion syntax. One click delivers a rewrite in the correct structure.

GPT Image 1.5: how to write prompts the model actually understands

OpenAI · Updated: May 19, 2026

GPT Image 1.5 is OpenAI's image model with improved photorealism, identity preservation during editing, and multi-image input. It supports resolutions up to 1536×1024, transparent background, three quality tiers, an input_fidelity parameter (high/low), and up to 4 images per request. Optimal prompt length is up to 500 words.

What's new in GPT Image 1.5

Version 1.5 brought ten concrete upgrades. The first six: improved photorealism with natural lighting and accurate materials, a flexible quality/speed balance (low quality already beats GPT Image 1's visual quality), face and identity preservation during editing, reliable text rendering, support for complex structured visuals (infographics, diagrams), and precise style control via minimal prompting.

The remaining four: strong real-world knowledge, improved composition preservation during edits, more accurate lighting, and higher detail on fine elements.

input_fidelity parameter (high/low) for edit control
Multi-image input — up to 4 images per request
Face and identity preservation during editing
Background: transparent / opaque / auto
Prompt up to ~4000 tokens, optimal up to 500 words

Prompt structure

OpenAI's recommended order: [Background/Scene] → [Subject] → [Key details] → [Constraints/Exclusions]. This differs from GPT Image 1, where the subject came first.

Also include the use case — «Product shot for an e-commerce listing», «Infographic for a student audience», «UI mockup showing a mobile app screen». This sets the mode and polish level.

For complex requests use short bulleted segments or line breaks instead of one long paragraph. A layered structure (subject, environment, lighting, style, technical parameters) yields clean and predictable output.
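The layered structure described above can be sketched as a small prompt builder. This is an illustration under assumptions: the `PromptParts` container, the `build_prompt` function, the segment labels, and the example constraints are hypothetical; only the ordering (use case, background, subject, key details, constraints) comes from this guide.

```python
from dataclasses import dataclass, field


@dataclass
class PromptParts:
    """Hypothetical container for the layered prompt segments."""
    use_case: str
    background: str
    subject: str
    details: list[str] = field(default_factory=list)
    constraints: list[str] = field(default_factory=list)


def build_prompt(parts: PromptParts) -> str:
    """Join segments as short lines: use case, background, subject, details, constraints."""
    lines = [
        parts.use_case.strip(),
        f"Background/Scene: {parts.background.strip()}",
        f"Subject: {parts.subject.strip()}",
    ]
    if parts.details:
        lines.append("Key details:")
        lines.extend(f"- {d.strip()}" for d in parts.details)
    if parts.constraints:
        lines.append("Constraints:")
        lines.extend(f"- {c.strip()}" for c in parts.constraints)
    return "\n".join(lines)


example = build_prompt(PromptParts(
    use_case="Product shot for an e-commerce listing.",
    background="minimalist white surface, soft gradient lighting from the upper left",
    subject="premium wireless headphones, matte black with brushed steel accents",
    details=["soft shadows beneath", "slight reflection on the surface"],
    constraints=["no text", "no watermark"],  # illustrative constraints only
))
print(example)
```

Keeping each layer in its own field makes it harder to forget the use case or to slip the subject ahead of the scene.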

Multi-image input and editing

Multi-image is one of 1.5's key features. Reference each image by index: «Image 1: product photo with the watch on a white surface. Image 2: style reference, dark moody studio lighting. Apply Image 2's style to Image 1». For compositing: «put the bird from Image 1 on the elephant in Image 2».
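The indexed-reference pattern can be generated rather than typed by hand. The sketch below assumes a hypothetical helper, `multi_image_prompt`, that numbers the image descriptions in upload order and enforces the 4-image limit stated in this guide.

```python
MAX_IMAGES = 4  # per-request limit stated in this guide


def multi_image_prompt(descriptions: list[str], instruction: str) -> str:
    """Prefix each description with its 1-based index, then append the instruction."""
    if not 1 <= len(descriptions) <= MAX_IMAGES:
        raise ValueError(f"expected 1 to {MAX_IMAGES} images, got {len(descriptions)}")
    refs = [
        f"Image {i}: {d.strip().rstrip('.')}."
        for i, d in enumerate(descriptions, start=1)
    ]
    return " ".join(refs + [instruction.strip()])


prompt = multi_image_prompt(
    [
        "product photo with the watch on a white surface",
        "style reference, dark moody studio lighting",
    ],
    "Apply Image 2's style to Image 1.",
)
print(prompt)
```

The descriptions must be listed in the same order in which the images are attached to the request, otherwise the indices point at the wrong files.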

For editing use the edit endpoint with input_fidelity. High fidelity preserves composition and identity (use for face-preserving edits); low allows creative freedom (style transfer, reimagining). State explicitly: «Change only X» + «keep everything else the same». On iterations repeat the preserve list — otherwise the model drifts.
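One way to keep the preserve list from being dropped between iterations is to generate every edit prompt from the same list. The helper below is a hypothetical sketch: `edit_prompt`, the `PRESERVE` list, and the `edit_settings` dictionary are illustrative names, and only the «Change only X» + «keep everything else the same» wording and the `input_fidelity` value come from this guide.

```python
# Elements that should survive every edit unless they are the edit target.
PRESERVE = [
    "facial features", "expression", "pose",
    "hair color", "clothing", "lighting", "background",
]


def edit_prompt(target: str, change: str, preserve: list[str]) -> str:
    """Build a single-change edit prompt that repeats the full preserve list."""
    kept = [item for item in preserve if item != target]
    return (
        f"Change only the {target}: {change}. "
        f"Keep everything else the same: {', '.join(kept)}."
    )


# One element per iteration; the preserve list is restated each time.
steps = [
    ("hair color", "deep auburn"),
    ("background", "plain dark studio backdrop"),
]
prompts = [edit_prompt(target, change, PRESERVE) for target, change in steps]

# Hypothetical settings sent alongside each edit prompt.
edit_settings = {"input_fidelity": "high"}

for p in prompts:
    print(p)
```

Each prompt removes only its own target from the preserve list, so the second step still asks the model to keep the hair color produced by the first.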

Text and structured visuals

Put exact text in quotes or CAPS: `"SUMMER SALE 50% OFF"`. Specify typography: font style, size, color, placement. Spell brand names and rare words letter by letter: `S-T-A-R-B-U-C-K-S`. For text-heavy infographics, set `quality="high"`.
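The quoting and letter-by-letter conventions are easy to automate. The two helpers below are hypothetical names for a minimal sketch; the exact phrasing of the resulting instruction is an assumption, not a format the model requires.

```python
def spell_out(word: str) -> str:
    """Return the word as hyphen-separated capitals, e.g. for brand names."""
    return "-".join(ch.upper() for ch in word if not ch.isspace())


def text_instruction(text: str, typography: str, spell: bool = False) -> str:
    """Quote the exact text, attach typography, and optionally add the spelling."""
    instruction = f'Text: "{text}", {typography}.'
    if spell:
        instruction += f" Spelled {spell_out(text)}."
    return instruction


banner = text_instruction(
    "SUMMER SALE 50% OFF",
    "bold sans-serif, white, centered at the top",
)
logo = text_instruction("Starbucks", "green serif, small, bottom right", spell=True)
print(banner)
print(logo)
```

Spelling is kept optional because it mainly helps with brand names and rare words; ordinary phrases usually only need the quotes.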

GPT Image 1.5 is especially strong on structured visuals: infographics, diagrams, multi-panel compositions, explanatory illustrations. Specify audience («for students», «for executives») and type («timeline», «labeled diagram», «funnel chart») — the model picks detail level and text density accordingly.

Common mistakes

1. Ignoring API parameters
`quality`, `background`, `input_fidelity`, and `n` (the number of images) affect output as much as the prompt text. Requesting an infographic with small text at `quality="medium"` tends to produce blurry labels. Requesting a sticker without `background="transparent"` typically yields an opaque, often white, background.
2. Stable Diffusion syntax
Weights like `(word:1.5)`, comma-separated tags `1girl, masterpiece, best quality`, embeddings, LoRA references — GPT Image 1.5 works with natural language, not tags. These constructions are ignored or degrade output. Write coherent sentences.
3. Overloading iterations
«Change hair, background, clothing, add glasses, make it cinematic» — the model tries to do everything at once and loses identity. Change one element at a time, repeating the preserve list at each step. GPT Image 1.5 is especially good at iterative work precisely because of face preservation.
4. Missing use case
«Make an infographic» — the model doesn't know the polish level or density. «Educational infographic for students explaining...» or «Pitch-deck slide for executives showing...» sets the mode. Use case influences style, font size, illustration density as much as the main subject.
5. Quality boosters «8K, ultra HD, masterpiece»
Generic quality praise is nearly useless. Concrete terms (lens, lighting direction, depth of field) work significantly better, and API parameters such as `quality="high"` give real control over final sharpness, unlike words in the prompt.
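Mistakes 2 and 5 are mechanical enough to catch before a prompt is sent. The checker below is a rough sketch under assumptions: `lint_prompt` is a hypothetical helper, the `<lora:...>` pattern reflects the tag style common in Stable Diffusion web UIs rather than anything stated in this guide, and the booster list is only a sample.

```python
import re

# (word:1.5) style attention weights.
WEIGHT_RE = re.compile(r"\([^()]+:\d+(?:\.\d+)?\)")
# <lora:name:weight> style references (assumed common web-UI syntax).
LORA_RE = re.compile(r"<lora:[^>]+>", re.IGNORECASE)
# Sample of generic quality boosters; extend as needed.
BOOSTERS = ("8k", "ultra hd", "masterpiece", "best quality")


def lint_prompt(prompt: str) -> list[str]:
    """Return warnings for Stable Diffusion syntax and generic quality boosters."""
    warnings = []
    lowered = prompt.lower()
    if WEIGHT_RE.search(prompt):
        warnings.append("weight syntax like (word:1.5)")
    if LORA_RE.search(prompt):
        warnings.append("LoRA reference")
    found = [b for b in BOOSTERS if re.search(rf"\b{re.escape(b)}\b", lowered)]
    if found:
        warnings.append(f"generic quality boosters: {', '.join(found)}")
    return warnings


print(lint_prompt("1girl, masterpiece, best quality, (red hair:1.5), 8K"))
print(lint_prompt("Professional studio photography, soft lighting from the upper left"))
```

A check like this only flags surface patterns. It cannot tell whether a prompt states a use case or overloads a single edit, so those mistakes still need a human read.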

Before / after examples

Example 1

Before

beautiful product photo

After

Product shot for an e-commerce listing. Minimalist white surface with soft gradient lighting from the upper left. A premium wireless headphone, matte black with brushed steel accents, rests on the surface with soft shadows beneath and a slight reflection. Professional studio photography, sharp focus, neutral cool color balance, quality="high".

Use case stated («e-commerce listing»), bg → subject → details order, concrete lighting and surface, explicit `quality="high"`.

Example 2

Before

infographic about the water cycle

After

Educational infographic for students explaining the water cycle. Clean white background with five labeled stages: "Evaporation", "Condensation", "Precipitation", "Collection", "Transpiration". Use bold sans-serif font for stage labels, soft blue color palette for water, warm yellow for sun. Connecting arrows between stages. Top title (centered): "The Water Cycle". quality="high".

Audience («for students»), exact text labels in quotes, concrete typography and palette, `quality="high"` for small text.

Example 3

Before

Change her hair color and the background and add glasses and make it cinematic

After

Image 1: portrait photo. Change only the hair color to deep auburn. Keep the same facial features, expression, pose, glasses or lack thereof, and clothing unchanged. Maintain identical lighting and background. input_fidelity="high".

Multiple edits in one prompt confuse the model. One precise edit with an explicit preserve list and `input_fidelity="high"` preserves identity.
