# Utility Nodes

Utility Nodes provide general-purpose functions that support all Atlas workflows.\
They are not tied to image or mesh generation; instead, they handle documents, text processing, arrays, and lightweight LLM operations.

Utility Nodes are essential for orchestrating complex workflows, preparing inputs, and structuring data for generation nodes.

### Document Nodes <a href="#document-nodes" id="document-nodes"></a>

#### Extract Images From Document <a href="#extract-document-images" id="extract-document-images"></a>

This node processes an uploaded **PDF document** and extracts all images embedded within it.

* Input: **PDF file** (e.g., a Game Design Document)
* Output: **Image Array**
* Each extracted image is returned as an element in the array.

You can use the resulting array with:

* **Find Images by Description** node to pull specific images
* **Image Generation nodes** that accept image arrays
* **Break Images Array** to isolate specific images

#### Extract Text From Document <a href="#extract-document-text" id="extract-document-text"></a>

Extracts text content from a **PDF file** and separates it into three categories:

* **Visual Descriptions** — descriptions of scenes, objects, characters
* **Relevant Other Text** — supporting information that may assist generation
* **Filtered / Irrelevant Text** — removed noise, metadata, or non-useful content

This separation is useful when you want to feed only contextually relevant text into your generation nodes or LLM prompts.

### Split Character Sheet

Detects and extracts individual views—such as front, side, and back—from a single character design sheet.

* Automatically parses a design sheet to isolate different character poses into separate image outputs.
* Works best with horizontal, vertical, or grid layouts where poses are clearly separated by empty space.
* Outputs: Specific views for Front, Right, Left, and Back, plus an array of any "Other" detected angles (like 3/4 views).
* Ideal for: Preparing character sheets for the Multi-View -> 3D node or organizing concept art.
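Conceptually, this kind of whitespace-based pose detection can be thought of as a projection scan: the sheet is searched for runs of non-blank pixel columns (or rows), and each run becomes a separate crop. The node's actual detection logic is not documented; the sketch below is only a minimal illustration of the idea for a horizontal layout, where the function name and the near-white `blank` threshold are assumptions:

```python
import numpy as np

def split_horizontal(sheet: np.ndarray, blank: int = 250) -> list:
    """Split a grayscale sheet (H x W array) wherever a full column is near-white."""
    occupied = (sheet < blank).any(axis=0)   # True for columns containing content
    parts, start = [], None
    for x, filled in enumerate(occupied):
        if filled and start is None:
            start = x                        # a content run begins
        elif not filled and start is not None:
            parts.append(sheet[:, start:x])  # run ended: crop it out
            start = None
    if start is not None:
        parts.append(sheet[:, start:])       # run extends to the right edge
    return parts
```

This is also why clearly separated poses work best: two poses that touch or overlap produce a single content run and come out as one crop.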

<figure><img src="https://3654894688-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FR7boiMixMhR4q36Ns33Y%2Fuploads%2FEF7zOY1HtsrEegdzIpgb%2Fimage.png?alt=media&#x26;token=cbabcd87-e533-4cb6-b5e2-ef82a22f1542" alt="" width="563"><figcaption></figcaption></figure>

### Array Management Nodes <a href="#array-management-nodes" id="array-management-nodes"></a>

#### Break Images Array <a href="#break-images-array" id="break-images-array"></a>

Takes an **image array** and outputs up to **four images separately**.

* Change **Start Index** to control which images are extracted
* Useful when only specific images from a PDF or batch are needed
* Ideal for directing selected images to multimodal or generation nodes
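In effect, the node slides a fixed-size window over the array. A minimal Python sketch of this behavior (the function name and the use of `None` for empty output slots are assumptions based on the description above):

```python
def break_images_array(images, start_index=0, count=4):
    """Return `count` separate outputs from `images`, starting at `start_index`.
    Slots past the end of the array come back as None."""
    window = images[start_index:start_index + count]
    return window + [None] * (count - len(window))
```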

<figure><img src="https://3654894688-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FR7boiMixMhR4q36Ns33Y%2Fuploads%2FvoneLzoE5frpwwVZZGSq%2Fimage.png?alt=media&#x26;token=48c947a4-b8bf-416c-969a-13dfe272f2ad" alt="" width="563"><figcaption></figcaption></figure>

#### Create Images List <a href="#create-images-array" id="create-images-array"></a>

Collects up to **8 individual images** and groups them into a **single array output**.

Use this when:

* Preparing multi-image inputs for generation nodes
* Organizing design references into one structured output
* Combining image sets extracted from different parts of the workflow
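The behavior amounts to collecting connected inputs into one list while skipping empty slots. A minimal sketch (the function name and the skip-`None` handling of unconnected inputs are assumptions):

```python
def create_images_list(*slots):
    """Collect up to 8 image inputs into one array, skipping unconnected (None) slots."""
    if len(slots) > 8:
        raise ValueError("Create Images List accepts at most 8 inputs")
    return [img for img in slots if img is not None]
```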

<figure><img src="https://3654894688-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FR7boiMixMhR4q36Ns33Y%2Fuploads%2FCL7cCsYyGYBRaldqFld7%2Fimage.png?alt=media&#x26;token=022b3680-73a3-41a0-8ab9-65e0499a9571" alt="" width="563"><figcaption></figcaption></figure>

#### Concatenate Image Arrays <a href="#concatenate-images-array" id="concatenate-images-array"></a>

Combines up to **4 image arrays** into one unified array.

* Ensures multiple sources of images appear in a single list
* Output is a single array containing all items in order
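This is ordinary list concatenation in slot order. A minimal sketch (the function name and the treatment of unconnected inputs as empty arrays are assumptions):

```python
def concatenate_image_arrays(*arrays):
    """Merge up to 4 image arrays into one, preserving input order."""
    if len(arrays) > 4:
        raise ValueError("Concatenate Image Arrays accepts at most 4 arrays")
    combined = []
    for arr in arrays:
        combined.extend(arr or [])   # treat an unconnected input as empty
    return combined
```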

<figure><img src="https://3654894688-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FR7boiMixMhR4q36Ns33Y%2Fuploads%2FIFhtqxRLhZkBUSCAvi1v%2Fimage.png?alt=media&#x26;token=2ba7d076-a4c8-41e7-8fd9-94066cd3b041" alt="" width="563"><figcaption></figcaption></figure>

### Text & LLM Nodes <a href="#text--llm-nodes" id="text--llm-nodes"></a>

#### Combine Text <a href="#text-concatenate" id="text-concatenate"></a>

Merges multiple text inputs into one unified string.

* Helps combine text blocks before sending them into generation nodes
* Useful when you want to enforce a **fixed prefix or order**\
  (e.g., system constraints + extracted text → final generation prompt)

Typical use case:

* Merge a “generation instruction” with extracted document text
* Feed the combined result into an image or 3D generation node
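The fixed-prefix behavior can be sketched as a simple ordered join: because inputs are merged in slot order, a system constraint placed in the first slot always precedes the extracted text. The function name and the newline separator below are assumptions for illustration:

```python
def combine_text(*parts, separator="\n"):
    """Join text inputs in slot order, dropping empty ones, so a fixed
    prefix (e.g. system constraints) always precedes the extracted text."""
    return separator.join(p for p in parts if p)
```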

<figure><img src="https://3654894688-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FR7boiMixMhR4q36Ns33Y%2Fuploads%2F1kG03UzfBIPQyScCyXDY%2Fimage.png?alt=media&#x26;token=8cf56f9e-2ab4-4395-8d4b-7841c037aed4" alt="" width="563"><figcaption></figcaption></figure>

#### Text Generation (LLM) <a href="#simple-llm-call" id="simple-llm-call"></a>

A simple LLM call for controlled text generation.

* Inputs:
  * **System Prompt** — defines the agent’s role and behavior
  * **Text Input** — any context or content to transform
* Output: **Single text result**

<figure><img src="https://3654894688-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FR7boiMixMhR4q36Ns33Y%2Fuploads%2F7kb0zKul20y0FAMwBHhX%2Fimage.png?alt=media&#x26;token=f4b7126e-43fe-4aa8-bfa4-43acb0042e07" alt="" width="563"><figcaption></figcaption></figure>

Example use case:

You can input a concept art image and use **Describe Image** to extract the list of characters it contains. A **Text Generation (LLM)** node can then be instructed to return only the first character from that list. You may add an additional text input that defines how the multimodal node should isolate the selected character for 3D modeling. Using **Combine Text**, you merge the LLM-generated character description with your isolation instruction and feed the combined text directly into the multimodal node. This creates a controlled, image-conditioned prompt for generating a clean extraction of a single character, ready for modeling.

Another example is providing an image of a full scene and using **Describe Image** to obtain a structured description of its spatial layout. You can then instruct the **Text Generation (LLM)** to act as a spatial-logic prompt generator that transforms these descriptions into a precise prompt for generating a 2D top-down plan of the same scene. By passing this prompt into a multimodal node, you obtain a clean plan abstraction. With this method, any input image produces a custom, image-specific plan prompt through the combined use of Describe Image, LLM processing, and Combine Text.

### Simple Text Render

Generates a 2D image from a text string with automatic height adjustment to fit the content.

* Allows for basic typography control, including font selection, size, line spacing, and color customization.
* The output image width is fixed by the user, while the height scales dynamically based on how much text is provided.
* Inputs: Input Text, Font Name (Arial/Nunito), Image Width, Font Size, and Background/Font Colors.
* Ideal for: Creating labels, UI elements, or adding textual descriptions directly into your image processing pipeline.
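The fixed-width, dynamic-height behavior can be sketched with Pillow: wrap the text to the target width, then size the canvas from the resulting line count. This is only an illustration of the idea, not the node's implementation; the function name, the default bitmap font, and the rough characters-per-line estimate are all assumptions:

```python
import textwrap
from PIL import Image, ImageDraw, ImageFont

def render_text(text, width=400, font_size=16, fg="black", bg="white"):
    """Render `text` onto an image of fixed width; height grows with the content.
    Uses Pillow's default font; a real renderer would measure glyph widths."""
    font = ImageFont.load_default()
    wrap_at = max(1, width // max(1, font_size // 2))  # rough chars-per-line estimate
    lines = textwrap.wrap(text, wrap_at) or [""]
    line_height = font_size + 4
    img = Image.new("RGB", (width, line_height * len(lines) + 8), bg)
    draw = ImageDraw.Draw(img)
    for i, line in enumerate(lines):
        draw.text((4, 4 + i * line_height), line, fill=fg, font=font)
    return img
```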

<figure><img src="https://3654894688-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FR7boiMixMhR4q36Ns33Y%2Fuploads%2FvYsus7Yza5AaCsBnypqq%2Fimage.png?alt=media&#x26;token=fc5664c7-af5b-4c60-8e01-c83723b0f483" alt="" width="563"><figcaption></figcaption></figure>

### Concatenate Images

Combines multiple images into a single image file by joining them side-by-side or top-to-bottom.

* In Horizontal mode, all images are automatically scaled to match the height of the first image in the array.
* In Vertical mode, all images are scaled to match the width of the first image in the array to maintain a uniform column.
* Inputs: Input Images (Array), Direction (Horizontal/Vertical).
* Ideal for: Creating comparison grids, before-and-after montages, or reassembling split character sheets.
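The scaling rule described above — every image matched to the first image's height (horizontal) or width (vertical) before joining — can be sketched with Pillow. The function name is an assumption; the node's own resampling settings are not documented:

```python
from PIL import Image

def concat_images(images, direction="Horizontal"):
    """Join images in one row or column, rescaling each to match the
    first image's height (Horizontal) or width (Vertical)."""
    if direction == "Horizontal":
        h = images[0].height
        scaled = [im.resize((round(im.width * h / im.height), h)) for im in images]
        out = Image.new("RGB", (sum(im.width for im in scaled), h))
        x = 0
        for im in scaled:
            out.paste(im, (x, 0))
            x += im.width
    else:
        w = images[0].width
        scaled = [im.resize((w, round(im.height * w / im.width))) for im in images]
        out = Image.new("RGB", (w, sum(im.height for im in scaled)))
        y = 0
        for im in scaled:
            out.paste(im, (0, y))
            y += im.height
    return out
```

Note that rescaling preserves each image's aspect ratio, so a taller source image becomes proportionally narrower in horizontal mode.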

<figure><img src="https://3654894688-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FR7boiMixMhR4q36Ns33Y%2Fuploads%2FjF8MRUIlhg7FlKo7SP8c%2Fimage.png?alt=media&#x26;token=3e7a713b-0187-4250-aaba-30f210ec4b85" alt="" width="563"><figcaption></figcaption></figure>

<figure><img src="https://3654894688-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FR7boiMixMhR4q36Ns33Y%2Fuploads%2FS1wlqMofNmSc4eAN1iHZ%2Fimage.png?alt=media&#x26;token=54698f38-949e-422a-be20-d55efac8fbbd" alt="" width="563"><figcaption></figcaption></figure>

Utility Nodes enable these kinds of modular logic chains: extracting structured information, refining or transforming it, and recombining it to produce workflows that adapt automatically to the input image while remaining deterministic and reusable across the Atlas platform.
