Luma Ray 2: how to write prompts the model actually understands
Luma Ray 2 is Luma's large-scale video model in Dream Machine, trained directly on video data. It understands natural motion, realistic lighting, and physically correct interactions. Clips run 5 or 10 seconds, natively at 720p or 1080p, with upscaling to 4K. The main upgrades over Ray 1.6: the slow-motion problem is fixed, clips run natively up to 10 seconds, and Camera Tags are available in the interface.
What's new in Ray 2
Ray 2 is the step up from Ray 1.6 to realistic motion. Key improvements: realistic textures, smooth camera work, dynamic scenes, native duration up to 10 seconds at 720p, strong comprehension of text instructions, and video input support. The slow-motion problem characteristic of Ray 1.6 is gone: motion now plays at real-time speed.
Ray 2 Flash is the accelerated version: 3x faster, 3x cheaper. Same capabilities (T2V, I2V, audio, control), with the focus on speed. Good for iteration and A/B tests; for final renders use standard Ray 2.
What's not yet in Ray 2 (available in Ray 3): Character Reference for character consistency, Draft Mode for fast iteration, 16-bit HDR. If you need these, switch to Ray 3.
- 5 or 10 seconds per run
- 720p native, 1080p, upscale to 4K
- Camera Tags in the Dream Machine interface
- Video input support (modify/V2V)
- Ray 2 Flash — 3x faster and cheaper
Prompt structure
Ray 2 formula: [Subject] + [Mid-Action Verb] + [Setting/Environment] + [Secondary Motion/Consequences] + [Camera Movement] + [Lighting/Mood].
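The formula can be read as an ordered list of slots. Below is a minimal Python sketch of that idea. The `PromptParts` class and the `build_prompt` helper are hypothetical names invented for illustration, not part of any Luma tooling; the sketch only joins the filled slots in formula order.

```python
from dataclasses import dataclass


@dataclass
class PromptParts:
    """Slots of the formula, in order. All field names are illustrative."""
    subject: str
    action: str                 # mid-action verb, present continuous
    setting: str = ""
    secondary_motion: str = ""
    camera: str = ""
    lighting: str = ""


def build_prompt(parts: PromptParts) -> str:
    """Join the filled slots in formula order into one comma-separated prompt."""
    lead = f"{parts.subject} {parts.action}".strip()
    if parts.setting:
        lead = f"{lead} {parts.setting}"
    tail = [parts.secondary_motion, parts.camera, parts.lighting]
    # Skip empty slots so the prompt never contains dangling commas.
    return ", ".join([lead] + [t for t in tail if t])


prompt = build_prompt(PromptParts(
    subject="Espresso",
    action="pouring",
    setting="into a white ceramic cup",
    secondary_motion="steam rising, liquid swirling",
    camera="macro close-up",
    lighting="warm morning light",
))
print(prompt)
```

Keeping the slots separate makes it easy to vary one element (for example the camera movement) between runs while leaving the rest of the prompt unchanged.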
Ray 2 is a «positive only» model: negative prompts are counter-productive. Describe only what you want to see. Use the present continuous («running», «pouring», «spinning»), not «begins to run», «will spin», or «starts pouring». The model works with the present continuous and doesn't understand temporal sequence.
Optimal length is around 100 words focused on the action. Under 15 words, the model fills in too much on its own. Over 200 words, it is overloaded with detail. A focused prompt such as «espresso pouring into a white ceramic cup, steam rising, liquid swirling, macro close-up, warm morning light» works better than a long technical brief.
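The length thresholds above are easy to check before submitting a prompt. The following is a rough sketch, assuming words are counted by splitting on whitespace; `check_length` is a hypothetical helper, and the limits of 15 and 200 are taken from the paragraph above.

```python
def check_length(prompt: str, low: int = 15, high: int = 200) -> str:
    """Classify a prompt by its whitespace-separated word count."""
    n = len(prompt.split())
    if n < low:
        return "too short"   # the model fills in too much
    if n > high:
        return "too long"    # detail overload
    return "ok"


print(check_length("espresso pouring, hyper-realistic, stunning"))
```

The word count is only a proxy: a prompt inside the range can still be unfocused, so treat the result as a first filter rather than a quality score.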
Camera Tags and video input
Camera Tags are specialized camera-movement tags in the Dream Machine interface, available only for Ray 2 and Ray 2 Flash. Make sure Ray 2 is selected in settings, otherwise the tags are unavailable. You can combine multiple tags for complex movements (e.g. push in + tilt up).
Important: Camera Tags complement the textual camera description in the prompt; they don't replace it. Write the specific movement in text («camera dollies forward», «slow pan right», «aerial descending shot») and use tags for fine-tuning.
Ray 2 can accept video as input — this unlocks Modify (V2V) and Extend modes for existing clips. Describe the end state, not commands: «cyberpunk neon city at night, rain-slicked streets» works; «change the sky to blue» doesn't.
Secondary motion and keyframes
Secondary motion is what makes Ray 2 feel alive: wind in hair, raised dust, reflections on wet asphalt, fabric movement, ripples on water, steam over a drink, sakura petals in the air. Without these details even a moving subject looks static.
Examples: «dress billowing outward», «hair flowing in the wind», «dust particles catching golden hour sunlight», «steam rising from espresso», «ripples spreading across the pond», «city lights glowing in background, slightly out of focus».
Image-to-Video with keyframes: Start frame is required, End frame is optional. With keyframes the aspect ratio is taken from the image automatically. Describe ONLY what CHANGES between frames — don't re-describe static elements. Detailed descriptions of static parts throw the model off.
Common mistakes
1. Forbidden words in the prompt
«Vibrant», «whimsical», «hyper-realistic», «beautiful», «amazing», «stunning» degrade Ray 2 quality. Replace with concrete descriptions: «warm golden light», «soft pastel palette», «sharp focused detail», «cinematic film grain». These anchors give the model visual information, not empty qualifiers.
2. Temporal phrases and future tense
Phrases such as «begins to», «starts to», «will», and «then» imply a sequence of events, and Ray 2 doesn't understand temporal sequence. Use the present continuous (running, pouring, spinning) for the main action. For sequential shots, use separate prompts with extend; don't try to fit a storyline into one prompt.
3. Negative prompts
Instructions such as «no text», «without people», or «remove clouds» work against Ray 2, because it is a «positive only» model. Replace them with a positive description: «clear blue sky» instead of «no clouds», «empty room» instead of «without people», «clean wall» instead of «no graffiti». The model works with what's there, not what isn't.
4. Re-describing static elements with keyframes
In image-to-video, don't re-describe what's already in the frames: the Start and End frames give the model visual context. Describe only the CHANGES: «hair flowing in the wind», «camera dollies forward», «light shifts from cool to warm». Detailed descriptions of static elements throw the model off.
5. Conflicting camera moves
Asking for «camera zooms in, pans left and orbits» all at once makes the model attempt everything simultaneously, which produces a chaotic, unstable result. Pick one main movement, or split the movements across sequential clips via extend. Camera Tags in the UI can be combined, but in text it's better to stick to one move type.
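Several of the mistakes above (empty qualifiers, temporal phrases, negative wording, and stacked camera moves) can be caught with simple keyword matching before a prompt is submitted. The sketch below is one possible way to do that. `lint_prompt` and the word lists are hypothetical: the first three lists are copied from the text above, and the camera-move keyword map is an assumption made for illustration, not an official Camera Tags list.

```python
import re

# Word lists copied from the mistakes above; extend them as needed.
EMPTY_QUALIFIERS = ["vibrant", "whimsical", "hyper-realistic",
                    "beautiful", "amazing", "stunning"]
TEMPORAL_PHRASES = ["begins to", "starts to", "will", "then"]
NEGATIVE_MARKERS = ["no", "without", "remove"]

# Assumed keyword map for camera move types (illustrative only).
CAMERA_MOVES = {
    "zoom": ["zoom", "zooms", "zooming"],
    "pan": ["pan", "pans", "panning"],
    "orbit": ["orbit", "orbits", "orbiting"],
    "dolly": ["dolly", "dollies", "dollying"],
}


def _find(terms, text):
    """Return the terms that occur in text as whole words or phrases."""
    return [t for t in terms
            if re.search(rf"(?<![\w-]){re.escape(t)}(?![\w-])", text)]


def lint_prompt(prompt: str) -> dict:
    """Report which of the common mistakes a prompt appears to contain."""
    text = prompt.lower()
    moves = [name for name, words in CAMERA_MOVES.items() if _find(words, text)]
    return {
        "empty_qualifiers": _find(EMPTY_QUALIFIERS, text),
        "temporal_phrases": _find(TEMPORAL_PHRASES, text),
        "negative_markers": _find(NEGATIVE_MARKERS, text),
        # Only flag camera moves when more than one move type is requested.
        "conflicting_camera_moves": moves if len(moves) > 1 else [],
    }


report = lint_prompt("golden retriever in field, no other animals, no people")
print(report["negative_markers"])
```

Keyword matching is deliberately crude: it will flag harmless uses of words like «then» or «no», and it cannot detect static elements re-described alongside keyframes. Treat each hit as a prompt to re-read the wording, not as a hard error.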
Before / after examples
Example 1
Before
espresso pouring, hyper-realistic, stunning
After
Espresso pouring into a white ceramic cup, steam rising, liquid swirling and forming crema, macro close-up, warm morning light from the left, shallow depth of field.
«Hyper-realistic» and «stunning» are forbidden words that degrade Ray 2 output quality. The fix: concrete secondary motion (steam rising, liquid swirling, forming crema), macro framing, and explicit light direction (from the left).
Example 2
Before
dancer spins on rooftop, will be cinematic
After
Dancer spinning on a rooftop at sunset, dress billowing outward, hair flowing, city lights glowing in background, camera orbiting slowly, golden hour cinematic light.
«Will be cinematic» is a temporal phrase that Ray 2 ignores. The fix: present continuous «spinning» instead of «spins», concrete secondary motion (dress billowing, hair flowing, city lights glowing), explicit camera (orbiting slowly), and explicit light (golden hour cinematic).
Example 3
Before
golden retriever in field, no other animals, no people
After
A golden retriever running through an empty wheat field at sunset, ears flapping in the wind, dust particles catching golden hour sunlight, camera tracking alongside at the dog's level, warm cinematic light.
«No other animals, no people» is a negative prompt, which works against a «positive only» model like Ray 2. The fix: a positive «empty wheat field» directly in the scene description. This gives the model a clear picture: an empty field and one subject.