# Video Nodes

Video Nodes generate short animated clips directly from text prompts and optional reference images.\
They are ideal for creating **cinematic previews**, **marketing videos**, **quick scene animations**, or **visual storytelling elements** within Atlas workflows.

This page covers six nodes: two text-to-video generators (a full-featured version and a simplified fast-generation version), plus Video Edit, Video Extend, Reference to Video, and Lipsync.

## When to use video nodes

Reach for video nodes when a static image isn't enough and a fully rigged animation pipeline is overkill. Common use cases:

* **Marketing and trailers.** Generate cinematic clips from concept art for pitch decks, store listings, social media, or community announcements.
* **Cutscene prototyping.** Roughly visualize a cinematic before committing animator time to a polished version.
* **Animated moodboards.** Turn a single style reference into a short looping clip that conveys mood and motion direction to the team.
* **NPC and character animation prototyping.** Use Lipsync to put a generated voice on a static character portrait for dialogue review, or Reference to Video to test motion choreography against a character reference.
* **Video edits and continuations.** Use Video Edit to restyle existing footage and Video Extend to grow short clips into longer sequences without re-rendering.

### Text + Image → Video <a href="#text--image-to-video" id="text--image-to-video"></a>

Generates a video from a **text prompt**, an optional **continuation prompt**, and up to **three reference images**.

<figure><img src="/files/GB8qJAEHfJi9t9vtSUUc" alt="" width="563"><figcaption></figcaption></figure>

#### Inputs <a href="#inputs" id="inputs"></a>

* **Prompt** — main instruction for video content
* **Continuation Prompt** (optional) — describes how the animation should evolve
* **Reference Images** — up to **3** images to control style, subject, or composition

#### Video Settings <a href="#video-settings" id="video-settings"></a>

* **Duration:** 4, 6, 8, 16, 24, 32, or 40 seconds
* **Resolution:** 720p or 1080p
* **Aspect Ratio:** Landscape or Portrait

#### Output <a href="#output" id="output"></a>

* A rendered video clip in the chosen format

This node is suited for more controlled, style-specific video generation, especially when reference images are important.
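
If you pre-check settings in a script before building a workflow, the options above can be captured in a small validation sketch. The class and function names here are hypothetical and are not part of any Atlas API; only the allowed values come from this page.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Allowed values copied from the settings listed on this page.
ALLOWED_DURATIONS_S = {4, 6, 8, 16, 24, 32, 40}
ALLOWED_RESOLUTIONS = {"720p", "1080p"}
ALLOWED_ASPECT_RATIOS = {"Landscape", "Portrait"}
MAX_REFERENCE_IMAGES = 3


@dataclass
class TextImageToVideoSettings:
    """Hypothetical container for the node's inputs; not an Atlas API type."""
    prompt: str
    duration_s: int
    resolution: str
    aspect_ratio: str
    continuation_prompt: Optional[str] = None
    reference_images: List[str] = field(default_factory=list)


def validate(settings: TextImageToVideoSettings) -> List[str]:
    """Return a list of problems; an empty list means the settings fit the page."""
    problems = []
    if settings.duration_s not in ALLOWED_DURATIONS_S:
        problems.append(f"duration {settings.duration_s}s is not offered")
    if settings.resolution not in ALLOWED_RESOLUTIONS:
        problems.append(f"resolution {settings.resolution} is not offered")
    if settings.aspect_ratio not in ALLOWED_ASPECT_RATIOS:
        problems.append(f"aspect ratio {settings.aspect_ratio} is not offered")
    if len(settings.reference_images) > MAX_REFERENCE_IMAGES:
        problems.append(f"at most {MAX_REFERENCE_IMAGES} reference images are accepted")
    return problems
```

Returning a list of problems rather than raising on the first one makes it easier to report every mismatch at once.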

#### Example Use Case <a href="#example-usecase" id="example-usecase"></a>

### Simple Text + Image → Video <a href="#text--image-to-video-simple" id="text--image-to-video-simple"></a>

A streamlined version optimized for **fast, lightweight video generation**.

<figure><img src="/files/WQkvS0VC8zFlMRa2W1uG" alt="" width="563"><figcaption></figcaption></figure>

#### Inputs <a href="#inputs-1" id="inputs-1"></a>

* **Prompt** — primary description
* **Input Image** (optional) — style or subject reference

#### Video Settings <a href="#video-settings-1" id="video-settings-1"></a>

* **Duration:** 5, 10, or 12 seconds
* **Resolution:** 420p, 720p, or 1080p
* **Aspect Ratio:** Landscape, Portrait, Standard, or Square
* **Fixed Camera Position:** enable or disable
* **Seed:** control variation (`-1` = random)

#### Output <a href="#output-1" id="output-1"></a>

* A quick-rendered video clip

This version is ideal for rapid prototyping or generating simple animated assets for marketing or social media.
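
The `-1` seed convention above can be illustrated with a short sketch. The function name and the upper bound are assumptions made for illustration; this page only documents that `-1` means random.

```python
import random
from typing import Optional

RANDOM_SEED_SENTINEL = -1   # documented on this page: -1 = random
MAX_SEED = 2**31 - 1        # assumption: the real upper bound is not documented


def resolve_seed(seed: int, rng: Optional[random.Random] = None) -> int:
    """Turn the -1 sentinel into a concrete seed; pass other values through."""
    if seed == RANDOM_SEED_SENTINEL:
        rng = rng or random.Random()
        return rng.randint(0, MAX_SEED)
    return seed


# Record the concrete value so a run can be repeated later with the same seed.
used_seed = resolve_seed(-1)
print(f"seed used for this run: {used_seed}")
```

Resolving the sentinel yourself and logging the result is one way to keep a record of which seed produced a clip you want to revisit.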

<figure><img src="/files/q3vBp5lPr6iIqwGBRaDE" alt="" width="540"><figcaption></figcaption></figure>

#### Example Use Case <a href="#example-usecase-1" id="example-usecase-1"></a>

* **End Image** input — provides control over both the start and end frames of the generated video clip.

<figure><img src="/files/zJVkDPIo0lp9NILB0O0V" alt="" width="563"><figcaption></figcaption></figure>

### Use Cases <a href="#use-cases" id="use-cases"></a>

* Marketing videos from a single concept image
* Animated moodboards
* Scene previews for game or environment design
* Quick animations for pitch decks or client presentations
* Stylized loops for social media

Video Nodes provide a fast way to bring static concepts to life using text prompts and reference imagery.

### Video Edit

Transforms an existing video by applying a new creative direction or environment using a reference image and text prompt.

<figure><img src="/files/LovEP1B8IZZzRazmKovt" alt="" width="563"><figcaption></figcaption></figure>

**Inputs**

* **Source Video** — the original video clip to transform
* **Reference Image** — visual guide for the target style, environment, or look
* **Prompt** — text description of the desired edit

**Parameters**

* **Backend selector** — choose motion-path generation method (some backends use prompt-based motion, others use reference-driven paths)
* **Seed** — control variation (`-1` = random)

**Output**

* Edited video clip matching the reference image style and prompt direction

**Useful for:** changing the setting or atmosphere of placeholder footage, creating environmental variations of cutscenes, or adapting generic vehicle or character animations into themed game contexts (expedition tours, combat zones, fantasy landscapes).

**Limits and options**

* Accepts input videos from 3 to 60 seconds in length
* Supports up to 5 reference images
* Output resolution selectable as 720p or 1080p
* Audio handling mode: automatic or original (preserve source audio)
* Some backends support instruction-based edits with an optional style reference image
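
The Video Edit limits above can be checked before a clip is queued. This is a minimal sketch: the function name is hypothetical, and it assumes the 3 to 60 second range is inclusive, which this page does not state explicitly.

```python
from typing import List

EDIT_MIN_SOURCE_S = 3          # from this page
EDIT_MAX_SOURCE_S = 60         # from this page; inclusive bound is an assumption
EDIT_MAX_REFERENCE_IMAGES = 5
EDIT_RESOLUTIONS = ("720p", "1080p")
EDIT_AUDIO_MODES = ("automatic", "original")


def check_video_edit_inputs(source_seconds: float,
                            reference_image_count: int,
                            resolution: str,
                            audio_mode: str) -> List[str]:
    """Return a list of problems; an empty list means the inputs fit the limits."""
    problems = []
    if not EDIT_MIN_SOURCE_S <= source_seconds <= EDIT_MAX_SOURCE_S:
        problems.append("source video must be 3 to 60 seconds long")
    if reference_image_count > EDIT_MAX_REFERENCE_IMAGES:
        problems.append("at most 5 reference images are supported")
    if resolution not in EDIT_RESOLUTIONS:
        problems.append("resolution must be 720p or 1080p")
    if audio_mode not in EDIT_AUDIO_MODES:
        problems.append("audio mode must be automatic or original")
    return problems
```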

### Video Extend

Extends an existing video clip forward in time by generating additional frames based on a text prompt and the final frames of the input.

<figure><img src="/files/zTA2J36h0Yzo0Xe5KN2G" alt="" width="563"><figcaption></figcaption></figure>

**Inputs**

* **Source Video** — the video clip to continue
* **Prompt** — text description guiding the extended footage
* **Backend selector** — choose generation engine (different backends produce varying motion styles and continuation approaches)

**Parameters**

* **Duration** — length of the extension (available durations depend on the selected backend)
* **Resolution** — output resolution (options vary by backend)
* **Seed** — control variation (`-1` = random)

**Output**

* Extended video clip appended to the original

**Useful for:** creating longer cutscene sequences from short generated clips, looping environmental footage, or prototyping extended NPC actions and vehicle animations without re-rendering the entire scene.

### Reference to Video

Generates video content by combining multiple reference inputs—images, video clips, and audio—with a text prompt to produce a cohesive animated result.

<figure><img src="/files/Ru5aNel1obDsDPSG4SQk" alt="" width="563"><figcaption></figcaption></figure>

**Inputs**

* **Prompt** — text description guiding the generation
* **Reference Images** — up to 9 still images for style, character, or environment guidance
* **Reference Videos** — up to 3 video clips (e.g., motion choreography, background loops, camera movement)
* **Reference Audio** — up to 3 audio clips to influence pacing, rhythm, or mood

**Parameters**

* **Seed** — control variation (`-1` = random)

**Output**

* Generated video clip synthesizing all provided references

<figure><img src="/files/6yYA0VzmO3ZM4YSk15GT" alt="" width="563"><figcaption></figcaption></figure>

**Useful for:** creating NPC dance sequences synced to in-game music, generating character performances driven by reference choreography, or producing cutscene animations that blend concept art, motion samples, and soundtrack cues.
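
The per-type reference limits listed under Inputs can be expressed as a small lookup. The dictionary keys and function name below are hypothetical; only the counts (9 images, 3 videos, 3 audio clips) come from this page.

```python
from typing import Dict, List

# Maximum number of references per type, as listed under Inputs.
REFERENCE_LIMITS = {"images": 9, "videos": 3, "audio": 3}


def over_limit(references: Dict[str, List[str]]) -> Dict[str, int]:
    """Return {kind: number_of_extra_items} for each kind above its limit."""
    excess = {}
    for kind, limit in REFERENCE_LIMITS.items():
        count = len(references.get(kind, []))
        if count > limit:
            excess[kind] = count - limit
    return excess
```

An empty result means every reference type is within its limit; otherwise the values show how many items to drop from each type.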

### Lipsync

Synchronizes a character's mouth movements to match an audio track, producing a video of the character speaking the provided dialogue or narration.

<figure><img src="/files/TQ8n8turlAGYcjV9tfOO" alt="" width="563"><figcaption></figcaption></figure>

**Inputs**

* **Character Image** — static portrait or character reference
* **Audio Track** — voice recording or generated speech (often from a Text-to-Speech node)
* **Video Clip** (optional, backend-dependent) — some backends accept a video input instead of a static image

**Parameters**

* **Backend selector** — four motion-synthesis methods; each exposes different tuning options for facial animation style and sync accuracy
* **Seed** — control variation (`-1` = random)

**Output**

* Video clip of the character speaking in sync with the audio

**Useful for:** generating dialogue sequences for NPCs, prototyping cutscene performances, animating narration for tutorial characters, or creating spoken variations of in-game quest briefings.

<figure><img src="/files/8sEmQzURf9JKxkMt8VHB" alt="" width="563"><figcaption></figcaption></figure>

## Common pitfalls

* **Treating video like a hero-asset deliverable.** Generated video is best for prototyping, marketing, and pre-production storyboarding. For final in-game cinematics, expect to use the output as reference for traditional animation rather than as the shipped asset.
* **Mismatched aspect ratio across reference images.** When using multiple reference images with Text + Image → Video, dramatically different aspect ratios produce inconsistent framing. Pre-crop references to a consistent shape via 2D Post Processing.
* **Expecting backends to share durations and resolutions.** Different backends in Video Extend and Reference to Video support different duration and resolution options. Don't assume parameter parity across backends; check the dropdown for each.
* **Forgetting that Lipsync needs clean audio.** Background noise, music behind dialogue, or low-quality recordings produce visibly worse lip movement. Generate or clean the audio (via Audio Nodes) before feeding Lipsync.
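
For the aspect-ratio pitfall above, the arithmetic behind a consistent center crop is short. This sketch only computes the crop rectangle; it is not how 2D Post Processing is implemented, and the function name is hypothetical.

```python
from typing import Tuple


def center_crop_box(width: int, height: int,
                    target_w: int, target_h: int) -> Tuple[int, int, int, int]:
    """Return (left, top, right, bottom) of the largest centered crop
    whose aspect ratio is target_w:target_h."""
    # Compare width/height with target_w/target_h via cross-multiplication
    # so the comparison stays in integers.
    if width * target_h > height * target_w:
        # Image is too wide: keep the full height and trim the sides.
        new_w = height * target_w // target_h
        new_h = height
    else:
        # Image is too tall (or already matches): keep the full width.
        new_w = width
        new_h = width * target_h // target_w
    left = (width - new_w) // 2
    top = (height - new_h) // 2
    return (left, top, left + new_w, top + new_h)


# A 1920x1080 reference cropped to a square keeps the middle 1080 pixels.
print(center_crop_box(1920, 1080, 1, 1))  # (420, 0, 1500, 1080)
```

The tuple order matches the `box` argument of Pillow's `Image.crop`, so the same rectangle can be applied to every reference before upload.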

## Related nodes

* [Input Nodes](/atlas-ai-studio-overview/node-index/input-nodes.md) — Input Image and Input Images supply reference content for video generation; Input Video supplies clips for Video Edit and Video Extend.
* [Image Nodes](/atlas-ai-studio-overview/node-index/image-nodes.md) — generate or refine reference frames before producing video.
* [Audio Nodes](/atlas-ai-studio-overview/node-index/audio-nodes.md) — generate voice tracks for Lipsync and music tracks for Reference to Video.
* [Utility Nodes](/atlas-ai-studio-overview/node-index/utility-nodes.md) — combine multiple references, prompts, or extracted document content into video-ready inputs.

## Frequently asked questions

**What's the difference between Text + Image → Video and Simple Text + Image → Video?**

The full version takes more parameters (continuation prompt, multiple reference images, longer durations) and produces more controlled output. The Simple version is optimized for fast iteration with fewer knobs. Use Simple for rapid prototyping and the full node for final-quality moments.

**How long can the generated videos be?**

It depends on the node. Text + Image → Video supports durations from 4 to 40 seconds. Simple Text + Image → Video supports 5, 10, or 12 seconds. Video Extend appends additional duration to an existing clip, with the available lengths depending on the selected backend. For longer sequences, chain Video Extend calls.
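
To estimate how many chained Video Extend passes a target length needs, the arithmetic is a ceiling division. The function name is hypothetical, and the 5-second extension in the example is only a placeholder because the actual extension lengths depend on the selected backend.

```python
import math


def extend_passes_needed(current_s: float, target_s: float, extension_s: float) -> int:
    """Number of Video Extend passes needed to reach at least target_s seconds."""
    if extension_s <= 0:
        raise ValueError("extension_s must be positive")
    remaining = target_s - current_s
    if remaining <= 0:
        return 0  # the clip is already long enough
    return math.ceil(remaining / extension_s)


# An 8-second clip extended in 5-second steps needs 5 passes to reach 30 seconds.
print(extend_passes_needed(8, 30, 5))  # 5
```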

**Can I generate video with sound?**

Video generation nodes produce visual output without baked-in audio. For audio, generate separately via [Audio Nodes](/atlas-ai-studio-overview/node-index/audio-nodes.md) and combine in post-production, or use Lipsync to put generated dialogue on a character portrait directly. When editing existing footage, Video Edit's original audio mode preserves the source clip's audio.

**Why does my video look stylistically different from my reference image?**

Some video generation backends prioritize motion fidelity over strict style adherence. If style consistency matters, use Reference to Video (which accepts multiple references) or the full Text + Image → Video node rather than the Simple version.

**Can I use video output as a real-time engine asset?**

Generally no. Video outputs are pre-rendered clips, not interactive content. Use them for cinematics, marketing material, in-engine playback (UI screens, in-world displays), or as reference for traditional animation. For real-time character animation, use [Mesh Nodes](/atlas-ai-studio-overview/node-index/mesh-nodes.md) rigging workflows instead.

**What's the recommended workflow for animated NPC dialogue?**

Generate the speech audio via [Audio Nodes](/atlas-ai-studio-overview/node-index/audio-nodes.md) Text-to-Speech, then feed it into Lipsync along with a character portrait. The Lipsync node syncs mouth movement to the audio. For broader body animation, use the rigging and animation pipeline in [Mesh Nodes](/atlas-ai-studio-overview/node-index/mesh-nodes.md).
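
The ordering in this workflow can be pictured as a small dependency graph. This is only an illustration of which step feeds which: the dictionary format and the "Dialogue Text" label are hypothetical, and Atlas workflows are wired in the editor rather than in code.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Each key lists the steps it depends on.
workflow = {
    "Text-to-Speech": {"Dialogue Text"},
    "Lipsync": {"Text-to-Speech", "Character Image"},
}

# static_order() yields every step after all of its dependencies.
order = list(TopologicalSorter(workflow).static_order())
print(order)
```

Lipsync always comes last because it needs both the generated audio and the character portrait.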

