Claude Code cannot generate images on its own. It is a text coding agent, so when you ask it for a hero image, an OG card, or a placeholder asset, it can write the markup that references the file but it cannot create the file. The usual workaround is to stop, open a separate image tool, generate something, download it, drag it into your repo, and pick up where you left off. That context switch is small but it happens constantly, and it pulls you out of the flow that made the agent useful in the first place.
The fix is to give the agent a command it can run itself. The Masonry CLI generates images and video from the terminal across 50+ models, which means Claude Code (or any coding agent that can run shell commands) can produce a real asset mid-session and keep going. Here is the whole setup.
Quick answer
# install
npx @masonryai/cli
# connect your account (opens a link)
masonry login
# generate an image
masonry image "a minimalist mountain logo, flat vector" --output logo.png
# generate a video
masonry video "slow dolly over a misty forest at dawn" --aspect 16:9
That is the core of it. Two commands, masonry image and masonry video, plus flags for the model, aspect ratio, and output path. Everything below is detail and the agent workflow.
Why run image generation from the terminal
A web UI is great for browsing and exploring. A CLI is better the moment generation becomes part of a repeatable workflow, because a command can be copied, edited, scripted, and version controlled. For developers that shows up in obvious places: generating a cover image while you write the blog post, filling a UI with realistic placeholder assets instead of gray boxes, batching OG images for a set of pages, or producing a quick product video for a landing section.
The bigger shift is the agent angle. When the tool is a shell command, an agent that already has shell access can call it without any special integration. You ask Claude Code to build a feature, it scaffolds the page, and when it needs a hero image it runs masonry image and references the result, all in one pass. No tab switch, no copy paste, no breaking the agent's momentum.
Install and connect
Install and run it with npx, or install it globally:
npx @masonryai/cli
# or
npm install -g @masonryai/cli
Then connect your Masonry account. masonry login opens a link in your browser to authorize the CLI, and after that the credentials are stored locally so you do not pass keys around on the command line.
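Before a script or an agent relies on the CLI, it can help to confirm the command is actually on the PATH. The following is a minimal sketch: the function name require_masonry is hypothetical, and it only uses the standard shell builtin command -v plus the install command shown above.

```shell
# Hypothetical helper: fail early with a hint if the CLI is not installed.
require_masonry() {
  if command -v masonry >/dev/null 2>&1; then
    return 0
  fi
  echo "masonry not found; install it with: npm install -g @masonryai/cli" >&2
  return 1
}
```

A typical use would be require_masonry && masonry login at the top of a setup script.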
The two commands
Image generation takes a prompt and saves a file:
masonry image "neon cyberpunk street at night"
# → Using model: imagen-4.0-generate-001
# → Generating image...
# ✓ Saved to cyberpunk-street.png
Video works the same way:
masonry video "ocean waves at golden hour"
# → Using model: veo-3.1-generate-preview
# → Generating video...
# ✓ Saved to ocean-waves.mp4
You can pin a specific model, set the aspect ratio and dimensions, and choose the output path:
masonry image "studio product shot of a frosted glass bottle" --model flux-2-pro --aspect 1:1 --output bottle.png
And you can animate an existing image instead of starting from text, which is how you turn a static render into a short motion clip:
masonry video --image ./bottle.png
# → Input: ./bottle.png
# → Using model: kling-v2-6-pro-i2v
# ✓ Saved to bottle-animated.mp4
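The two steps chain naturally: generate a still, then animate that same file. Here is a rough sketch using only the flags shown above (--output for the image, --image for the video). The function name still_then_clip is hypothetical, and the sketch assumes masonry exits with a non-zero status when a generation fails, which is not confirmed here.

```shell
# Hypothetical helper: generate a still, then animate that same file.
# Assumes masonry returns a non-zero exit status on failure.
still_then_clip() {
  prompt=$1
  still=$2
  # Stop here if the image step fails, so no video call is made.
  masonry image "$prompt" --output "$still" || return 1
  masonry video --image "$still"
}
```

For example: still_then_clip "studio product shot of a frosted glass bottle" ./bottle.png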
Run masonry --help for the full flag list.
Using it from Claude Code
This is the part that matters for an agent workflow. Because masonry is a normal command, you do not need a plugin. In a Claude Code session you can just ask for the asset in plain language, and the agent runs the command for you:
"Generate a 16:9 hero image of a dark control room with glowing dashboards and save it to public/hero.png, then reference it in the landing section."
Claude Code runs masonry image "..." --aspect 16:9 --output public/hero.png, the file lands in your repo, and it wires it into the component in the same turn. The asset and the code stay in sync because they were produced together.
For a more permanent setup, wrap the common calls in a project skill or a slash command so the agent reaches for image generation automatically whenever a task needs a visual, rather than you spelling out the command each time.
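One way to sketch the slash-command route is a small installer that writes a prompt file into the project. This assumes Claude Code picks up project-level custom commands from Markdown files in .claude/commands/ and substitutes $ARGUMENTS with whatever follows the command name; check that against the version you run. The function name install_gen_image_command and the file name gen-image.md are hypothetical.

```shell
# Hypothetical installer: write a project-level slash command for Claude Code.
# Assumes custom commands live in .claude/commands/ as Markdown prompt files,
# with $ARGUMENTS standing in for the text typed after the command name.
install_gen_image_command() {
  root=${1:-.}
  mkdir -p "$root/.claude/commands" || return 1
  cat > "$root/.claude/commands/gen-image.md" <<'EOF'
Generate an image asset for this project with the Masonry CLI.

Request: $ARGUMENTS

1. Pick an output path under public/ unless the request names one.
2. Run: masonry image "<prompt>" --aspect <ratio> --output <path>
3. Reference the saved file in the relevant component.
EOF
}
```

Run install_gen_image_command . once at the repo root. If the assumption above holds, a request such as /gen-image 16:9 hero of a dark control room would then expand into this prompt.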
Why 50+ models instead of one
Most image CLIs are wired to a single model. Masonry exposes a catalog (Veo 3, FLUX, Imagen 4, GPT Image, Nano Banana, Kling, and more) behind the same two commands, and that matters because no single model is best at everything. One is stronger at legible text in a marketing mockup, another at photoreal product lighting, another at fast cheap iterations while you are still exploring. Swapping is a flag, not a new tool, so you can match the model to the task without rebuilding your workflow.
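Because the model is just a flag, comparing candidates for a task can be a short loop. The sketch below renders one prompt with each model ID the caller passes and names each output file after its model. The function name compare_models is hypothetical, and the model IDs in the usage line are the ones that appear in the examples above, not a verified list.

```shell
# Hypothetical helper: render one prompt with several models for comparison.
# Model IDs are supplied by the caller; each output file is named after its model.
compare_models() {
  prompt=$1
  outdir=$2
  shift 2
  mkdir -p "$outdir" || return 1
  for model in "$@"; do
    # Report a failure but keep going, so one bad model ID does not stop the run.
    masonry image "$prompt" --model "$model" --output "$outdir/$model.png" ||
      echo "generation failed for $model" >&2
  done
}
```

For example: compare_models "studio product shot of a frosted glass bottle" ./compare flux-2-pro imagen-4.0-generate-001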
Honest notes
A few things worth knowing before you wire this into a pipeline:
- It needs an account and credits. This is a hosted generation service, not an unlimited free local model, so each generation draws down your balance. For most developer use (a handful of assets per project) that is a non-issue, but if you plan to batch thousands of images, check the cost first.
- Generation is a network call, so it is not instant and it needs connectivity. Video in particular takes longer than images.
- Treat generated marketing or product imagery the way you would any AI image: check it before you ship it, especially anything with text or a real product in it.
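The first two notes suggest a defensive wrapper for pipelines: skip assets that already exist so reruns do not spend credits, and retry a failed network call a limited number of times. This is a sketch under assumptions. The function name generate_once and the RETRY_DELAY variable are hypothetical, and it assumes masonry exits non-zero on failure. How failed calls are billed is not covered here, so keep the attempt count small.

```shell
# Hypothetical wrapper: skip existing assets and retry failed calls.
# Assumes masonry exits with a non-zero status when generation fails.
generate_once() {
  prompt=$1
  out=$2
  attempts=${3:-3}
  # A non-empty file at the target path means there is nothing to do.
  if [ -s "$out" ]; then
    echo "skip: $out already exists" >&2
    return 0
  fi
  n=1
  while [ "$n" -le "$attempts" ]; do
    if masonry image "$prompt" --output "$out"; then
      return 0
    fi
    echo "attempt $n of $attempts failed for $out" >&2
    n=$((n + 1))
    # Pause between attempts, but not after the last one.
    [ "$n" -le "$attempts" ] && sleep "${RETRY_DELAY:-5}"
  done
  return 1
}
```

For example: generate_once "a minimalist mountain logo, flat vector" logo.png 3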
The bottom line
Claude Code is a strong coding agent that simply lacks an image model. The Masonry CLI fills that gap with two commands and no integration work, so the agent can generate images and video the same way it runs tests or installs a package, inside the terminal, in the same session, across whichever model fits the job. If you have ever broken your flow to go make an image by hand, this is the part you can stop doing.
Install it with npx @masonryai/cli and try one generation. See the full command reference for everything else.
