AI Models - Masonry | Nano Banana, FLUX, GPT Image, Imagen & More

Nano Banana 2

Preview of Gemini 3.1 Flash image generation optimized for price-performance balance with text-to-image and image mixing (supports up to 14 input images).

Text to Image, Remix, Inpaint, Outpaint, Style Transfer

Seedance 1.5 Pro

ByteDance's latest and most powerful video model yet.

Text to Video, Image to Video

GPT Image 2

OpenAI's GPT Image 2 with native reasoning, up to 4K output, and multi-image consistency across a batch.

Text to Image, Inpaint

Nano Banana

Fast Gemini 2.5 Flash image variant for text-to-image generation and image mixing (supports up to 3 input images).

Text to Image, Remix, Inpaint, Outpaint, Style Transfer

Nano Banana Pro

Preview of Gemini 3 Pro image generation for text-to-image and image mixing (supports up to 14 input images).

Text to Image, Remix, Inpaint, Outpaint, Style Transfer

FLUX.2 Dev

Developer-focused FLUX.2 variant with lower latency and go_fast toggle.

Text to Image

FLUX 1.1 Pro

Professional FLUX 1.1 model with enhanced quality and capabilities.

Text to Image

FLUX.2 Pro

Professional FLUX.2 model with higher quality, multi-image conditioning, and up to 4MP outputs.

Text to Image

Ideogram V3 Quality

The highest-quality Ideogram v3 model. Ideogram v3 creates images with stunning realism, creative designs, and consistent styles.

Text to Image

Seedance 1 Lite

ByteDance's Seedance 1 Lite model for cost-effective prompt- or image-conditioned video generation.

Text to Video, Image to Video

Kling 3.0

Kling 3.0 is an advanced image-to-video AI model featuring extended duration support (3-15 seconds), start/end frame control for precise scene transitions, native audio generation in Chinese and English, and multi-prompt capabilities for creating multi-shot videos.

Text to Video, Image to Video

FLUX Kontext Max

Advanced FLUX model for image generation and editing with reference image support for context and composition guidance.

Text to Image, Style Transfer

FLUX.2 Flex

Flexible FLUX.2 variant optimized for creative exploration with tunable steps and guidance.

Text to Image

FLUX.2 Klein 4B Base

Un-distilled FLUX.2 Klein 4B base model optimized for fine-tuning and multi-reference workflows.

Text to Image, Remix, Style Transfer

FLUX.2 Klein 9B Base

Un-distilled FLUX.2 Klein foundation model for flexible text-to-image and multi-reference workflows.

Text to Image, Remix, Style Transfer

GPT Image 1.5

OpenAI's GPT Image 1.5 model for image generation and edits (supports up to 10 input images).

Text to Image, Remix

Grok Imagine Image

xAI Grok Imagine text-to-image generation with aspect ratio and 1k/2k resolution controls.

Text to Image

Grok Imagine Image Edit

xAI Grok Imagine image editing: edit up to 3 reference images with a text prompt, aspect ratio, and 1k/2k resolution controls.

Remix

Grok Imagine Video 1.5

xAI Grok Imagine 1.5 image-to-video: animate a source image with a text prompt at 480p or 720p.

Image to Video

Ideogram V3 Turbo

Turbo is the fastest and cheapest Ideogram v3 model. Ideogram v3 creates images with stunning realism, creative designs, and consistent styles.

Text to Image

Ideogram V4

Ideogram's latest text-to-image model. Best-in-class text rendering for posters, logos, and signage, with fine detail and strong creative control.

Text to Image

Imagen 4

Google's flagship Imagen 4 model for high-quality image generation with improved text rendering.

Text to Image

Kling 2.1

Kwaivgi's Kling v2.1 standard mode producing 720p 24fps video from a prompt and reference frame.

Image to Video

Kling 3.0 Standard

Kling 3.0 Standard is an advanced image-to-video AI model featuring extended duration support (3-15 seconds), start/end frame control for precise scene transitions, native audio generation in Chinese and English, and multi-prompt capabilities for creating multi-shot videos.

Text to Video, Image to Video

Kling O1

Kling O1 first-frame-to-last-frame video generator with dual keyframe support for precise motion control and transitions.

Image to Video

Kling O1 Reference (Character Lock)

Kuaishou's Kling O1 reference-to-video: lock a character's identity from multiple reference images (visual DNA) and generate a clip from a prompt, via fal.ai.

Image to Video

Kling Pro 2.1

Kwaivgi's Kling v2.1 pro mode offering 1080p 24fps output with optional end-frame guidance.

Image to Video

Kling v1 Camera Director

Kuaishou's Kling v1 text-to-video with pre-baked cinematic camera templates (dolly/crane, orbit, pan, tilt, roll, zoom) via fal.ai. Camera control is a Kling v1-era feature; newer Kling tiers omit it.

Text to Video

Kling v1.5 Motion Brush

Kuaishou's Kling v1.5 Pro image-to-video with motion-brush trajectory pathing — paint per-region motion paths over a start image (dynamic masks + a static hold region) via fal.ai.

Image to Video

Kling v2.5 Turbo Pro

Kwaivgi's Kling v2.5 Turbo Pro model for prompt-based or image-guided video generation.

Text to Video, Image to Video

Kling v2.6 Pro

Kling v2.6 Pro Image-to-Video model with improved visual quality, motion consistency, and native audio generation support.

Image to Video

Masonry Magic Layers

Decomposes a single image into multiple editable RGBA layers (foreground, background, text, and individual elements) in one pass, so each piece can be moved and edited independently on the canvas.

Remix

Minimax Hailuo 02

Minimax's Hailuo 02 standard tier supporting 512p and 1080p output.

Text to Video, Image to Video

Qwen Image

High-quality text-to-image model from Qwen with support for multiple canvas dimensions and LoRA weights.

Text to Image

Qwen Image Edit Plus

Qwen's enhanced image editing model supporting multi-image conditioning and rich prompt controls.

Remix

Seedance 1 Pro

ByteDance's Seedance 1 Pro model via BytePlus ModelArk API with multi-shot narrative capabilities and cinematic aesthetics.

Text to Video, Image to Video

Seedance 2.0

ByteDance's Seedance 2.0 model via fal.ai for cinematic text-to-video and image-to-video generation with native audio.

Text to Video, Image to Video

Seedance 2.0 Fast

ByteDance's Seedance 2.0 fast endpoints via fal.ai, optimized for lower latency and cost.

Text to Video, Image to Video

SeedDream 4

ByteDance's SeedDream 4 model for high-quality text-to-image and image-to-image generation with support for up to 4K resolution.

Text to Image, Remix, Style Transfer

SeedDream 4.5

ByteDance's SeedDream 4.5 model for high-quality text-to-image and image-to-image generation with improved spatial understanding and world knowledge, supporting up to 4K resolution.

Text to Image, Remix, Style Transfer

Veo 3

Google DeepMind's Veo 3 text-to-video model delivered through Vertex AI.

Text to Video

Veo 3 Fast

Veo 3 Fast delivers rapid text-to-video renders optimized for iteration via Vertex AI.

Text to Video, Image to Video

Veo 3.1

Preview release of Veo 3.1 supporting enhanced text-to-video and image-to-video generation on Vertex AI.

Text to Video, Image to Video

Veo 3.1 Fast

Veo 3.1 Fast Preview delivers rapid preview renders for text-to-video and image-to-video via Vertex AI.

Text to Video, Image to Video

Veo 3.1 Lite Preview

Veo 3.1 Lite Preview offers lightweight, cost-efficient text-to-video and image-to-video generation via Vertex AI.

Text to Video, Image to Video

WAN 2.5 (Image-to-Video)

WAN Video 2.5 image-to-video generation with 5–10s clips at 480p/720p/1080p.

Image to Video
