MiniMax Hailuo 02: how to write prompts the model actually understands

MiniMax

MiniMax Hailuo 02 is the predecessor of Hailuo 2.3, still relevant for its unique FL2V (First-and-Last-Frame-to-Video) mode and strong physics on extreme motion. Prompts are written as director's notes; bracket camera syntax `[Push in]` is supported. English is preferred; optimal length 40-60 words.

What Hailuo 02 does

Hailuo 02 is MiniMax's older video model, but it is not «outdated»: it has two strengths that the newer 2.3 lacks.

First — FL2V (First-and-Last-Frame-to-Video) mode: the model takes TWO frames (start and end) and generates a smooth transition between them. Indispensable for morphing, seasonal transformations (summer → winter), state changes of an object.

Second — extreme physics: gymnastics, parkour, acrobatics, complex physical motion. On scenes like that, 02 delivers more realistic dynamics than 2.3. Plus 512P support for rapid prototyping. For everything else — standard T2V and I2V — pick 2.3.

  • FL2V — unique first-and-last-frame mode
  • Extreme physics: gymnastics, parkour, acrobatics
  • Resolutions: 512P, 768P (default), 1080P
  • Duration: 6s or 10s (at 512P/768P); 6s at 1080P
  • Bracket camera syntax `[Push in]`, `[Tracking shot]`, up to 3 combined commands
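
The resolution and duration limits in the list above can be checked before a job is submitted. The following is a minimal sketch in Python; the names `ALLOWED_DURATIONS` and `check_settings` are illustrative assumptions and are not part of any MiniMax API.

```python
# Allowed clip durations in seconds per resolution, copied from the list above.
ALLOWED_DURATIONS = {
    "512P": (6, 10),
    "768P": (6, 10),
    "1080P": (6,),
}
DEFAULT_RESOLUTION = "768P"


def check_settings(duration_s: int, resolution: str = DEFAULT_RESOLUTION) -> None:
    """Raise ValueError if the resolution/duration pair is not in the list above."""
    if resolution not in ALLOWED_DURATIONS:
        raise ValueError(f"unsupported resolution: {resolution}")
    if duration_s not in ALLOWED_DURATIONS[resolution]:
        raise ValueError(f"{duration_s}s is not available at {resolution}")


check_settings(10, "512P")  # quick draft: passes silently
check_settings(6, "1080P")  # final render: passes silently
```

A 10-second clip at 1080P would raise `ValueError` here, since the list above only gives 6 seconds for that resolution.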

Prompt structure

Style matches Hailuo 2.3 — director's notes in natural language, not tags. Optimal length 40-60 words, max 2000 characters.

Formula: [Camera + motion] + [Subject + description] + [Action in present tense] + [Style and atmosphere] + [Emotional markers].

Example: «[Push in] A young woman in a flowing red dress spins gracefully on a moonlit terrace, her hair catching the breeze. Cinematic, dreamlike atmosphere, soft warm rim light, serene emotional tone.» The main verb is in the present tense («spins»), and the leading «[Push in]» is a bracket camera command that the model reads as a camera instruction, not as scene text.
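
The formula can be turned into a small helper that assembles the parts in the recommended order. This is only a sketch: `build_prompt` and its parameter names are hypothetical, and the only rule it enforces is the 2000-character limit mentioned above.

```python
MAX_CHARS = 2000  # character limit stated in the text above


def build_prompt(camera: str, subject: str, action: str, style: str, emotion: str) -> str:
    """Assemble a prompt as: [camera] subject action. style, emotion."""
    prompt = f"[{camera}] {subject} {action}. {style}, {emotion}."
    if len(prompt) > MAX_CHARS:
        raise ValueError(f"prompt is {len(prompt)} characters, limit is {MAX_CHARS}")
    return prompt


prompt = build_prompt(
    camera="Push in",
    subject="A young woman in a flowing red dress",
    action="spins gracefully on a moonlit terrace, her hair catching the breeze",
    style="Cinematic, dreamlike atmosphere, soft warm rim light",
    emotion="serene emotional tone",
)
print(prompt)
```

With these inputs the helper reproduces the example prompt above word for word.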

FL2V — the unique mode

The headline feature of Hailuo 02. It takes two frames: first = the starting state of the scene, last = the ending state. The model generates a smooth transition. This is a different prompting style — not a scene description, but a description of the TRANSITION process.

Good FL2V prompt: «The flower gradually blooms, petals slowly unfurling outward, camera holding steady on a close-up.» Bad — describing the contents of the first or last frame (they're already set by images). Specify the transition character: smooth, abrupt, gradual. Specify camera behavior during the transition. If FL2V is enabled in settings but the second frame is missing — that's a critical error; the model can't generate.
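
The two-frame requirement is easy to verify before generation starts. The sketch below assumes a hypothetical `Fl2vJob` container; it does not mirror any real MiniMax request type, and the field names are made up for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Fl2vJob:
    """Hypothetical container for an FL2V request (not a MiniMax API type)."""
    prompt: str                        # describes the transition, not the frames
    first_frame: Optional[str] = None  # path or URL of the start image
    last_frame: Optional[str] = None   # path or URL of the end image


def missing_frames(job: Fl2vJob) -> list[str]:
    """Return the names of the frames that are not set; an empty list means ready."""
    missing = []
    if not job.first_frame:
        missing.append("first_frame")
    if not job.last_frame:
        missing.append("last_frame")
    return missing


job = Fl2vJob(
    prompt="The flower gradually blooms, petals slowly unfurling outward, "
           "camera holding steady on a close-up.",
    first_frame="bud.png",
)
print(missing_frames(job))
```

Here the job has only a first frame, so the check reports the last frame as missing and the job should not be submitted.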

Bracket Camera Syntax

Hailuo 02 supports the same syntax as 2.3 — precise cinematic control through square brackets. Core commands: `[Truck left]`, `[Truck right]` (horizontal trucking); `[Pan left]`, `[Pan right]` (panning); `[Push in]`, `[Pull out]` (in/out); `[Pedestal up]`, `[Pedestal down]` (camera height); `[Tilt up]`, `[Tilt down]` (tilt); `[Zoom in]`, `[Zoom out]` (zoom); `[Shake]`; `[Tracking shot]`; `[Static shot]`.

Combination: `[Pan left,Pedestal up]` — up to 3 simultaneous commands. Sequential: «...[Push in], then...[Pull out].» This is a model feature, not a formatting error — bracket syntax activates direct camera control.
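
The combination rule can be expressed as a short helper that builds a bracket from the commands listed above and rejects more than three at once. The function name `camera_bracket` is an illustrative assumption; the command list and the comma-without-space format are taken from the text above.

```python
# The 15 camera commands listed above.
CAMERA_COMMANDS = {
    "Truck left", "Truck right", "Pan left", "Pan right",
    "Push in", "Pull out", "Pedestal up", "Pedestal down",
    "Tilt up", "Tilt down", "Zoom in", "Zoom out",
    "Shake", "Tracking shot", "Static shot",
}
MAX_COMBINED = 3  # up to 3 simultaneous commands


def camera_bracket(*commands: str) -> str:
    """Join 1 to 3 known camera commands into one bracket, e.g. [Pan left,Pedestal up]."""
    if not 1 <= len(commands) <= MAX_COMBINED:
        raise ValueError(f"expected 1 to {MAX_COMBINED} commands, got {len(commands)}")
    unknown = [c for c in commands if c not in CAMERA_COMMANDS]
    if unknown:
        raise ValueError(f"unknown camera commands: {unknown}")
    return "[" + ",".join(commands) + "]"


print(camera_bracket("Pan left", "Pedestal up"))
```

Sequential moves stay in the prompt text itself, joined with «then», so the helper is called once per bracket.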

Common mistakes

  1. Tag-based prompts instead of sentences

    «cyberpunk, rain, neon, 4k» — Hailuo 02 was trained on narrative descriptions. Tag soup yields generic results with unpredictable dynamics. Write director's notes: «[Push in] Neon-lit Tokyo street, heavy rain falling on wet asphalt, lone figure walking through reflections.»

  2. Quality boosters like «8k masterpiece»

    «ultra-detailed, 8k, masterpiece, best quality» cause excessive saturation and contrast in the final video. Quality is determined by scene, motion, and camera specificity — not magic tokens. On Hailuo 02 quality spam especially breaks motion physics.

  3. Describing first/last frame contents in FL2V

    If FL2V is on, the first and last frames are defined by images — don't describe them. The prompt must describe the TRANSITION PROCESS between them: motion character, camera behavior, tempo. Restating frame contents wastes tokens and confuses the model.

  4. FL2V without a second reference

    FL2V requires TWO images — first and last frame. If FL2V is selected in settings but only one or no image is loaded, that's a critical error and the model can't generate the transition. Before using FL2V, make sure both references are uploaded.

  5. Using 02 when 2.3 is the right choice

    Hailuo 02 is the older model. If the task is standard (T2V or I2V without FL2V, without extreme physics), 2.3 is better: newer, more precise, with a cheaper Fast version. 02 only makes sense for FL2V, sports physics, or quick 512P tests. For most tasks — 2.3 is the right call.
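
The last mistake comes down to a simple decision rule, sketched below in Python. The function `pick_model` and its flag names are hypothetical; the rule itself only restates the three cases given above.

```python
def pick_model(
    needs_fl2v: bool = False,
    extreme_physics: bool = False,
    prototype_512p: bool = False,
) -> str:
    """Return "Hailuo 02" only for the three cases named above, else "Hailuo 2.3"."""
    if needs_fl2v or extreme_physics or prototype_512p:
        return "Hailuo 02"
    return "Hailuo 2.3"


print(pick_model())                      # standard T2V or I2V task
print(pick_model(needs_fl2v=True))       # first-and-last-frame transition
print(pick_model(extreme_physics=True))  # gymnastics, parkour, acrobatics
```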

Before / after examples

Example 1

Before

beautiful sunset turns into night

After

[FL2V mode, frame 1: golden sunset over the ocean; frame 2: deep blue night with stars]. The sky gradually transitions from warm golden tones to deep indigo, sun slowly sinking below the horizon, first stars beginning to twinkle. Camera holds steady on the wide horizon. Smooth, gradual atmospheric shift, peaceful contemplative mood.

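An FL2V prompt describes the PROCESS of transition, not the frames (they're set by images). The bracketed note at the start only records which two images are uploaded; it is not a camera command and does not belong in the prompt text. Transition character (gradual, smooth), camera behavior (holds steady), and emotional tone are explicit.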

Example 2

Before

gymnast does a flip

After

[Tracking shot] A young female gymnast in a white leotard performs a backflip on a sunlit gymnastics floor, body fully extended mid-air, sharp focus on her arched form. Realistic physics, smooth body mechanics, dynamic energy. Sports broadcast aesthetic, tense and energetic emotional tone.

Extreme physics is Hailuo 02's strength. The `[Tracking shot]` bracket keeps the camera on the motion. Present-tense verb, explicit physical markers (body fully extended, arched form).

Example 3

Before

cat jumps onto the table

After

[Static shot] A ginger cat crouches on the kitchen floor, tail flicking, then leaps gracefully onto the wooden countertop, landing softly. Natural daylight from the window, calm domestic atmosphere, slight cinematic tension during the leap.

Static camera for a predictable shot, concrete verbs (crouches, flicking, leaps, landing), landing physics described (softly). Not tag soup like «cat, jump, kitchen, 4K.»
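
Several of the problems fixed in the rewrites above can be caught mechanically. The sketch below is a rough linter under stated assumptions: the tag-soup test (three or more comma-separated fragments of at most two words each) is a crude heuristic invented for illustration, the booster list contains only the four terms named earlier, and `lint_prompt` is a hypothetical name.

```python
import re

# Quality boosters named in the "Common mistakes" section; extend as needed.
QUALITY_BOOSTERS = ("ultra-detailed", "8k", "masterpiece", "best quality")


def lint_prompt(prompt: str) -> list[str]:
    """Return a list of warnings; an empty list means no issue was detected."""
    warnings = []

    # Heuristic (assumption): many short comma-separated fragments look like tags.
    fragments = [f.strip() for f in prompt.split(",") if f.strip()]
    if len(fragments) >= 3 and all(len(f.split()) <= 2 for f in fragments):
        warnings.append("tag soup: write full sentences instead of tags")

    lowered = prompt.lower()
    found = [b for b in QUALITY_BOOSTERS if re.search(rf"\b{re.escape(b)}\b", lowered)]
    if found:
        warnings.append("quality boosters: " + ", ".join(found))

    words = len(prompt.split())
    if not 40 <= words <= 60:
        warnings.append(f"length: {words} words, target is 40-60")

    if not re.search(r"\[[^\]]+\]", prompt):
        warnings.append("camera: no bracket camera command found")

    return warnings


for warning in lint_prompt("cyberpunk, rain, neon, 4k"):
    print(warning)
```

The checks are deliberately independent, so a single prompt can trigger several warnings at once.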

Frequently asked

How is Hailuo 02 different from Hailuo 2.3?
Hailuo 2.3 is the newer model, generally better for standard T2V/I2V tasks: more precise micro-expressions, more diverse art styles, and a cheap Fast version. Hailuo 02 is older but retains two unique aces: FL2V mode (first + last frame) and strong physics on extreme motion (gymnastics, parkour). Plus 512P support for rapid testing.
What is FL2V and when should I use it?
First-and-Last-Frame-to-Video — a mode where the model accepts two frames (start and end) and generates a smooth transition between them. Indispensable for morphing (summer → winter, flower before/after blooming, object transformation), controlled seasonal transitions, and state changes. The prompt describes the TRANSITION process, not the frames themselves — they're already set by images.
How does bracket camera syntax work?
It's a MiniMax feature — square brackets activate direct camera control. 15 commands available: `[Push in]`, `[Pull out]`, `[Pan left/right]`, `[Truck left/right]`, `[Pedestal up/down]`, `[Tilt up/down]`, `[Zoom in/out]`, `[Shake]`, `[Tracking shot]`, `[Static shot]`. Combination: `[Pan left,Pedestal up]`, up to 3 at once. Sequencing via «then»: «[Push in], then [Pull out].»
What's the optimal prompt length?
40-60 words, max 2000 characters. Too short and the output turns generic, because the model fills in the gaps itself. Too long and the model is overloaded, which can deform faces and objects. Target 40-60 words for T2V and FL2V; I2V prompts can be shorter (they describe only motion, not contents).
When should I pick 02 over 2.3?
Three scenarios: (1) you need FL2V — only in 02; (2) extreme physics — gymnastics, parkour, acrobatics, flips — 02 delivers more realistic dynamics; (3) rapid prototyping at 512P — 2.3 starts at 768P, on 02 you can iterate 10 variants at 512P, then move to 1080P for the final. In all other cases — 2.3.
Is Russian supported in prompts?
MiniMax supports multilingual input, but recommends English for international users and Chinese, the model's native training language. Russian technically works but yields less predictable results on complex scenes. Recommendation: write the bulk of the prompt in English, including the bracket camera commands.
Does Opten support Hailuo 02?
Yes, the Opten extension auto-detects MiniMax Hailuo 02 and scores prompts against the structure above: it checks for a concrete subject, present-tense verbs, bracket camera commands, optimal 40-60 word length, and absence of quality boosters. For FL2V — it checks both references are present and that the prompt describes the TRANSITION, not the frames. One click gives you a rewrite in the correct structure.

Ready to write MiniMax Hailuo 02 prompts in one click?

  • Auto-detects the model inside its native interface
  • Scores every line of your prompt
  • One-click rewrite into the correct structure
Install extension (Chrome / Yandex Browser)

Pro — $2.99/month or ₽199/month · cancel anytime

Stop Guessing. Generate On The First Try.

Install Opten in 30 seconds and score your next prompt.

Opten is a Chrome extension that scores AI prompts for the specific model. Supports 60+ image and video models — Midjourney, GPT Image 2, Kling, Sora, Nano Banana, Flux — and rewrites them in one click inside the Syntx, Higgsfield, and Freepik interfaces. From $2.99/month.

© 2026 Opten · IE Nikolai Shupletsov · Tax ID 306389672