How Our AI Agent Actually Works: Intent, Modes, and Automatic Model Selection
AI Agents

June 6, 2026 · 7 min read

Go behind the scenes of the AI Agent: how it reads your intent, picks the right mode and model, enhances your prompt, and chains multi-step plans for you.

From a Sentence to a Finished Image

The AI Agent lets you skip the menus. You describe what you want in plain language and it handles model choice, settings, and execution. Here is what actually happens under the hood between your message and the finished result.

Step 1: Understanding Your Intent

The agent reads your message together with its context (any images you uploaded and the conversation so far) to work out what you are really asking for, not just the literal words.

Step 2: Detecting the Mode

Next it picks the right mode: create a brand-new image, transform or edit an existing one, upscale, generate a video, or animate a still. If it thinks you want to switch modes, it tells you first so you are never surprised.
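To make the idea concrete, here is a toy sketch of mode detection. It is an illustration, not the agent's actual logic: the keyword cues, the `Mode` names, and the `detect_mode` and `switch_notice` helpers are all assumptions made for this example, and a real agent would most likely rely on a language model, not on keyword matching.

```python
from enum import Enum
from typing import Optional


class Mode(Enum):
    CREATE = "create"
    EDIT = "edit"
    UPSCALE = "upscale"
    VIDEO = "video"
    ANIMATE = "animate"


# Hypothetical keyword cues, checked in order; the first usable match wins.
_CUES = [
    (Mode.UPSCALE, ("upscale", "higher resolution", "enlarge")),
    (Mode.ANIMATE, ("animate", "bring to life")),
    (Mode.VIDEO, ("video", "clip")),
    (Mode.EDIT, ("edit", "change", "replace", "remove")),
]

# Modes that only make sense when the user has supplied an image.
_NEEDS_IMAGE = {Mode.UPSCALE, Mode.ANIMATE, Mode.EDIT}


def detect_mode(message: str, has_image: bool) -> Mode:
    """Guess the mode from the message text and whether an image is attached."""
    text = message.lower()
    for mode, cues in _CUES:
        if any(cue in text for cue in cues):
            if mode in _NEEDS_IMAGE and not has_image:
                continue  # cue matched, but there is nothing to work on
            return mode
    return Mode.CREATE  # default: make a brand-new image


def switch_notice(current: Mode, detected: Mode) -> Optional[str]:
    """Return a message to show the user before switching modes, if any."""
    if detected is current:
        return None
    return f"Switching from {current.value} to {detected.value} mode."
```

For example, `detect_mode("Upscale this photo", has_image=True)` yields `Mode.UPSCALE`, and `switch_notice` produces the heads-up text that would be shown before the switch happens.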

Step 3: Choosing the Model

This is where the agent earns its keep. It routes your request to the model best suited for the job:

  • Text inside the image, like posters or signage: GPT Image 2 or Flux 2 Flex
  • Highest quality or a hero shot: Flux 2 Max or Imagen 4 Ultra
  • Speed and high volume: Flux 2 Klein or Nano Banana
  • Real-world or current subjects: Flux 2 Max with grounding web search
  • Editing or combining references: a Flux 2 Edit model, up to eight images
  • Video from a still: Veo, Kling, or Seedance image-to-video
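The list above is essentially a routing table, and a minimal sketch of one might look like this. The model names come from the list; the category keys, the order of candidates, and the fall-back-when-unavailable behavior are assumptions for illustration, since the list does not say how the agent picks between the alternatives.

```python
from typing import Dict, FrozenSet, List

# Hypothetical category keys mapped to the models named in the list above.
# "Flux 2 Edit" stands in for whichever Flux 2 Edit model is used.
ROUTES: Dict[str, List[str]] = {
    "text_in_image": ["GPT Image 2", "Flux 2 Flex"],
    "hero_quality": ["Flux 2 Max", "Imagen 4 Ultra"],
    "fast_volume": ["Flux 2 Klein", "Nano Banana"],
    "real_world": ["Flux 2 Max"],  # used with grounding web search
    "edit_references": ["Flux 2 Edit"],
    "video_from_still": ["Veo", "Kling", "Seedance"],
}


def choose_model(need: str, unavailable: FrozenSet[str] = frozenset()) -> str:
    """Return the first listed candidate for `need` that is not unavailable."""
    for model in ROUTES[need]:
        if model not in unavailable:
            return model
    raise LookupError(f"no available model for {need!r}")
```

With this table, `choose_model("text_in_image")` returns `"GPT Image 2"`, and passing `unavailable=frozenset({"GPT Image 2"})` falls through to `"Flux 2 Flex"`.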

Step 4: Enhancing Your Prompt

Before generating, the agent automatically translates non-English prompts to English and enriches them with style, lighting, composition, and mood detail. A three-word request becomes a precise brief the model can actually follow.
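As a rough picture of the enrichment half of this step, the sketch below appends style, lighting, composition, and mood detail to a short prompt. The `Brief` fields, their default values, and the `enhance` function are invented for this example; the real agent presumably chooses these details from context, and the translation step is left out entirely.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Brief:
    # Hypothetical defaults; a real agent would pick these per request.
    style: str = "photorealistic"
    lighting: str = "soft natural light"
    composition: str = "centered subject, shallow depth of field"
    mood: str = "calm"


def enhance(prompt: str, brief: Brief = Brief()) -> str:
    """Turn a short prompt into a fuller one by appending the brief's details."""
    subject = prompt.strip().rstrip(".")
    details = [brief.style, brief.lighting, brief.composition, f"{brief.mood} mood"]
    return ", ".join([subject, *details])


print(enhance("red fox portrait"))
# red fox portrait, photorealistic, soft natural light, centered subject, shallow depth of field, calm mood
```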

Step 5: Multi-Step Plans

For bigger requests the agent builds a plan and executes it step by step (for example, generate an image and then upscale it, or produce a set of variations), showing progress as it goes instead of making you run each step yourself.
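A plan like "generate, then upscale" can be pictured as a list of steps where each step's output feeds the next. The sketch below assumes that shape; `run_plan`, `generate`, and `upscale` are stand-ins written for this example, and the two step functions just build strings, where real steps would call a model.

```python
from typing import Callable, List, Tuple

# A step is a label shown to the user plus a function applied to the
# previous step's result.
Step = Tuple[str, Callable[[str], str]]


def run_plan(steps: List[Step], start: str,
             report: Callable[[str], None] = print) -> str:
    """Run steps in order, reporting progress before each one."""
    result = start
    total = len(steps)
    for index, (label, action) in enumerate(steps, start=1):
        report(f"[{index}/{total}] {label}")
        result = action(result)
    return result


# Stand-in step functions for illustration only.
def generate(prompt: str) -> str:
    return f"image({prompt})"


def upscale(image: str) -> str:
    return f"upscaled({image})"


plan: List[Step] = [("Generate image", generate), ("Upscale result", upscale)]
```

Calling `run_plan(plan, "red fox")` prints `[1/2] Generate image` and `[2/2] Upscale result`, then returns `"upscaled(image(red fox))"`.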

Multi-Reference: Consistency Across Scenes

When you need the same character in different scenes, a person dropped into a new background, or a product placed into a mockup, the agent passes multiple reference images to a Flux 2 edit model and keeps the key elements consistent.
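One way to picture a multi-reference request is as a prompt bundled with a small, capped set of reference images. The `EditRequest` shape, the `build_edit_request` helper, and the `"Flux 2 Edit"` model label are hypothetical; only the limit of eight references comes from the model list in Step 3.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple

MAX_REFERENCES = 8  # "up to eight images", per the model list in Step 3


@dataclass(frozen=True)
class EditRequest:
    model: str
    prompt: str
    references: Tuple[str, ...]


def build_edit_request(prompt: str, references: Sequence[str],
                       model: str = "Flux 2 Edit") -> EditRequest:
    """Bundle a prompt with its reference images, enforcing the cap."""
    if not references:
        raise ValueError("at least one reference image is required")
    if len(references) > MAX_REFERENCES:
        raise ValueError(f"at most {MAX_REFERENCES} reference images are allowed")
    return EditRequest(model=model, prompt=prompt, references=tuple(references))
```

For a product mockup, this might be called as `build_edit_request("place the bottle on the shelf", ["bottle.png", "shelf.png"])`, with both file names being placeholders.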

You Stay in Control

Every decision is transparent. You can see which model the agent chose, override it whenever you like, or step into the Studio for full manual control. The agent is there to remove busywork, not to take away your choices.

See it in action

Chat with the AI Agent
AI agent, how the AI agent works, automatic model selection, prompt enhancement, multi-reference, AI workflow, AI assistant