Can I use Gen-4 without an input image?

No, Gen-4 is image-to-video only: the input image is mandatory, and the architecture has no T2V mode. If you need text-to-video, use Runway Gen-4.5, which supports native T2V as well as I2V. Gen-4 shines at animating still photos and product references where the visual is already nailed down.

What's the difference between Gen-4 and Gen-4 Turbo?

Turbo costs 5 credits/sec versus 12 for full Gen-4, renders faster, and is slightly weaker on complex physical detail. Use Turbo for prototyping and trying camera moves, then render the final pass on Gen-4. The prompt structure is identical for both versions — only credits and final quality differ.

How long should a Gen-4 prompt be?

Sweet spot is 10–30 words. Runway officially says «Clarity matters more than structure» — a short, sharp prompt often beats a long one stuffed with qualifiers. A 5–10 second clip simply can't fit complex blocking, so extra detail either gets ignored or causes camera drift.

Does Gen-4 support negative prompts?

No, negative prompts are not supported — that's a documented limitation. Constructs like «no rain», «without text», «avoid blur» can backfire: the model sees the keywords and sometimes generates exactly what you wanted excluded. Describe the desired state positively — «clear sky», «empty signage», «sharp focus».

What resolution and duration are available?

Native resolution is 720p with optional upscale to 4K at the final step. Duration is fixed at 5 or 10 seconds — no intermediate values. For longer narratives you need to stitch multiple generations or switch to Gen-4.5, which offers flexible 2–10 second duration and timestamp syntax for sequential beats.
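
As a rough illustration of the stitching approach, the sketch below splits a target runtime into clips of the two durations Gen-4 offers. The function name `plan_clips` is hypothetical and not part of any Runway tool; it only does the arithmetic, and the actual stitching still happens in your editor.

```python
def plan_clips(target_seconds: int) -> list[int]:
    """Split a target runtime into 10 s and 5 s clips (hypothetical helper)."""
    if target_seconds <= 0:
        raise ValueError("target_seconds must be positive")
    # Round up to the next multiple of 5, since 5 s is the shortest clip.
    rounded = -(-target_seconds // 5) * 5
    tens, remainder = divmod(rounded, 10)
    # Prefer 10 s clips and finish with a single 5 s clip when needed.
    return [10] * tens + ([5] if remainder else [])


print(plan_clips(23))  # [10, 10, 5]
```

A 23-second sequence therefore needs three generations, with roughly two seconds trimmed in the edit.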

Can I change outfits or objects on the reference via prompt?

No, Gen-4 does not edit the reference visual — it's not an edit model. Commands like «change the dress to blue» or «add a hat» are either ignored or break consistency. To swap objects in the reference, run an image edit first (GPT Image 2 or Flux Kontext) and feed the edited image into Gen-4 as the input.

Does Opten support Runway Gen-4?

Yes, the Opten extension auto-detects Runway inside runwayml.com and scores prompts against the structure specific to Gen-4: it checks for movement and a camera directive, the absence of negative constructions and reference descriptions, and prompt length in the optimal 10–30 word range. One click gives you a rewrite in the right I2V structure.


Runway Gen-4: how to write prompts the model actually understands

Runway · Updated: May 19, 2026

Runway Gen-4 is an image-to-video model from Runway with native 720p output (upscalable to 4K) and a fixed duration of 5 or 10 seconds. Generation cannot run without an input image: Gen-4 is I2V-only. The prompt describes only movement and camera; the visual is locked by the reference. Negative prompts and JSON formatting are not supported.

What Runway Gen-4 does well

Gen-4 is a dedicated I2V model: it always needs an input image, and you don't have to describe the scene — that's already fixed in the frame. Strengths are cinematic camera moves and animating still photos with subtle physical detail (hair in a breeze, fabric folds, small gestures).

Gen-4 Turbo is the lighter tier at 5 credits/sec instead of 12. Use it for prototyping and quick iteration, then finalize on full Gen-4. Turbo tolerates slightly less detailed prompts.

Image-to-Video only — no reference, no generation
720p native, upscale to 4K at the final step
Duration 5 or 10 seconds (fixed choices)
12 credits/sec (Gen-4) or 5 credits/sec (Gen-4 Turbo)
No support for negative prompts or JSON formatting

Prompt structure

Because the visual is already set by the image, the prompt describes movement only. The base formula is [Camera movement]: [subject] [action]. [Additional motion details].
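
The base formula can be sketched as a small string builder. This is only an illustration: `build_prompt` and its parameter names are hypothetical, and the result is plain text that you would paste into Runway yourself.

```python
def build_prompt(camera: str, subject: str, action: str, details: str = "") -> str:
    """Assemble '[Camera movement]: [subject] [action]. [Additional motion details].'"""
    # Strip stray whitespace and trailing periods so the punctuation stays consistent.
    prompt = f"{camera.strip()}: {subject.strip()} {action.strip().rstrip('.')}."
    if details.strip():
        prompt += f" {details.strip().rstrip('.')}."
    return prompt


print(build_prompt(
    "Slow dolly in",
    "the woman",
    "tilts her head and smiles softly",
    "Subtle hair movement in the breeze",
))
# Slow dolly in: the woman tilts her head and smiles softly. Subtle hair movement in the breeze.
```

Keeping the camera move in its own slot makes it harder to slip a second move into the same prompt.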

Optimal length is 10–30 words. A short prompt (10–15 words) often beats a long one — Runway officially says: «Clarity matters more than structure». No greetings, explanations, JSON, or commands like «add rain».

Active verbs in present tense: «walks», «pulls back», «rotates slowly». One clear camera move beats several simultaneous ones — Gen-4 struggles with zoom + pan + orbit combinations in a single scene.

Camera vocabulary

Gen-4 handles the standard cinematic lexicon well, most likely because these terms are common in its training data. Basic moves: dolly in/out, truck left/right, pan left/right, tilt up/down. Advanced: crane shot, arc shot, whip pan, crash zoom, push-in, pull-out. Camera style: handheld, steadicam, gimbal, smooth tracking, static.

Set one main move plus an optional speed modifier — «slowly», «suddenly», «gradually». This controls pacing without overloading the model.
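
One way to enforce the one-main-move rule is to scan a prompt for the move names listed above. The sketch below is an assumption-laden illustration: the list only contains the terms from this guide, matching is plain phrase lookup, and `find_camera_moves` is a hypothetical helper, not a Runway feature.

```python
import re

# Camera moves named in this guide; camera styles and speed modifiers are left out on purpose.
CAMERA_MOVES = [
    "dolly in", "dolly out", "truck left", "truck right",
    "pan left", "pan right", "tilt up", "tilt down",
    "crane shot", "arc shot", "whip pan", "crash zoom",
    "push in", "pull out",
]


def find_camera_moves(prompt: str) -> list[str]:
    """Return the known camera moves mentioned in the prompt (hypothetical helper)."""
    # Lowercase and treat hyphens like spaces so 'dolly-in' matches 'dolly in'.
    text = re.sub(r"[-\s]+", " ", prompt.lower())
    return [m for m in CAMERA_MOVES if re.search(rf"\b{re.escape(m)}\b", text)]


print(find_camera_moves("Slow dolly-in toward the subject. Smooth tracking."))
# ['dolly in']
print(find_camera_moves("Dolly in, then pan left and tilt up"))
# ['dolly in', 'pan left', 'tilt up']
```

A result with more than one entry is a hint to cut the prompt down to a single move.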

Turbo vs Gen-4: when to use which

Turbo costs 5 credits/sec and renders faster — ideal for trying camera moves, exploring variations, A/B-testing ideas. Full Gen-4 is the final render once the movement and timing are confirmed.

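Practical pipeline: 3–5 iterations on Turbo (50 credits per 10-second clip, so 150–250 credits in total), then one final render on Gen-4 (120 credits). Each Turbo iteration costs about 2.4× less than a Gen-4 render of the same length, and the whole pipeline comes out roughly 1.8–2× cheaper than running every iteration on the full model. For production campaigns with dozens of clips the budget difference adds up fast.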
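
Using the per-second rates quoted above, the budget can be worked out directly. The sketch assumes 10-second clips and compares the staged pipeline with running the same number of attempts entirely on Gen-4; the model keys and function names are hypothetical labels, not Runway API identifiers.

```python
# Credits per second as quoted in this guide.
CREDITS_PER_SECOND = {"gen4": 12, "gen4_turbo": 5}


def clip_cost(model: str, seconds: int = 10) -> int:
    """Credits for one clip of the given length."""
    return CREDITS_PER_SECOND[model] * seconds


def pipeline_cost(turbo_iterations: int, seconds: int = 10) -> int:
    """Turbo drafts plus one final Gen-4 render."""
    return turbo_iterations * clip_cost("gen4_turbo", seconds) + clip_cost("gen4", seconds)


def all_gen4_cost(iterations: int, seconds: int = 10) -> int:
    """The same drafts plus the final render, all on full Gen-4 (assumed baseline)."""
    return (iterations + 1) * clip_cost("gen4", seconds)


for n in (3, 5):
    staged, direct = pipeline_cost(n), all_gen4_cost(n)
    print(n, staged, direct, round(direct / staged, 2))
# 3 270 480 1.78
# 5 370 720 1.95
```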

Common mistakes

1. Running without an input image
Gen-4 is an I2V-only model, so generation is impossible without a reference. This is not a bug, and there is no workaround: the architecture has no T2V mode. If you need text-to-video on Runway, use Gen-4.5. Always confirm there's an image attached in Generation Settings before launching.
2. Describing the scene instead of the movement
A prompt like «a woman in a red dress in a park, sunset, beautiful» is useless because that information is already in the reference. The prompt should start with a movement verb or a camera move type. The scene is locked by the image; your prompt is the instruction to the camera operator for what to shoot next.
3. Negative prompts
«No clouds», «no blur», «without text» in Gen-4 can produce exactly what you tried to exclude — the model sees «clouds», «blur», «text» as tokens and sometimes generates them. Describe what you want positively: instead of «no fast motion» write «slow, deliberate movement».
4. Multiple camera moves at once
«Pan left while zooming in and rotating» comes out as undirected camera drift in Gen-4. Pick one main move (dolly in OR pan OR orbit) plus an optional speed modifier. Five to ten seconds is not enough screen time for complex blocking — the model can't fit it cleanly.
5. JSON formatting and command-style prompts
Structures like `{"camera": "dolly", "subject": "woman"}` or commands like «add rain», «remove the hat» are ignored by Runway — it's not a command-driven model. Write in natural language, full sentences: «Light rain begins to fall as the camera slowly pulls back.»
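
Several of the mistakes above can be caught with a simple text check before spending credits. The following is a minimal sketch, not how Runway or any extension actually scores prompts: `lint_prompt`, its word lists, and its issue labels are all hypothetical, and the patterns only cover the constructs mentioned in this guide.

```python
import re

# Negative constructs and command verbs taken from the examples in this guide.
NEGATIVE_PATTERN = re.compile(r"\b(no|without|avoid)\b", re.IGNORECASE)
COMMAND_PATTERN = re.compile(r"^\s*(add|remove|change|make)\b", re.IGNORECASE)


def lint_prompt(prompt: str) -> list[str]:
    """Return labels for likely problems in a Gen-4 prompt (hypothetical helper)."""
    issues = []
    word_count = len(prompt.split())
    if not 10 <= word_count <= 30:
        issues.append("length")    # outside the 10-30 word range
    if NEGATIVE_PATTERN.search(prompt):
        issues.append("negative")  # rephrase as a positive description
    if prompt.strip().startswith("{"):
        issues.append("json")      # write natural language instead
    if COMMAND_PATTERN.search(prompt):
        issues.append("command")   # describe the motion, do not issue edit commands
    return issues


print(lint_prompt("bring this portrait to life, add emotion, no background blur"))
# ['negative']
print(lint_prompt('{"camera": "dolly", "subject": "woman"}'))
# ['length', 'json']
```

The check cannot tell whether an image is attached or whether the prompt describes the scene instead of the movement, so those two mistakes still need a manual look.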

Before / after examples

Example 1

Before

make a nice video with this photo where the woman in red is in the park smiling, add some motion

After

Slow dolly-in toward the subject. The woman gently tilts her head and smiles softly. Subtle hair movement in the breeze. Smooth tracking, cinematic pacing.

The old version describes the reference (dress, park); the new one describes only movement and camera. Active verbs in present tense, one camera move, soft physical detail.

Example 2

Before

make a dynamic product video from different angles

After

Slow orbital arc shot around the product, 180-degree sweep. Subtle product highlights catch the light as the camera moves. Smooth, stable steadicam motion.

A concrete camera move (orbital arc, 180°) instead of vague «different angles». Stabilization is specified — this yields a clean commercial render instead of jittery output.

Example 3

Before

bring this portrait to life, add emotion, no background blur

After

Slight head turn to the left. The subject blinks once, then breaks into a soft smile. Static camera, subject and background held in sharp focus.

Removed the negative «no background blur», which doesn't work in Gen-4, and replaced it with the positive «held in sharp focus». Micro-gestures (blink, smile) are a strength of I2V.

Related models

Google Veo 3.1 (incl. Veo 3.1 Fast and Veo 3.1 Fast Relax)

Google Veo 3

Google Veo (General)
