Claude Code cannot generate images on its own. It is a text coding agent, so it can reference an image file but cannot create one. The fix is to give it a tool it can call, and there are three kinds: command-line tools, open-source skills, and MCP servers. They differ mostly on cost, model selection, and how much setup they need. Here is an honest comparison so you can pick by the job.
Quick answer: which tool for which job
- You want the widest model selection and business-grade output: the Masonry CLI. 50+ image and video models behind two commands, plus a Claude Code skill so the agent reaches for it automatically. Paid (account and credits).
- You want free and you are already on Vercel: Vercel AI CLI. ai image and ai video with a companion Claude Code skill, and every Vercel team including the free Hobby plan gets monthly AI Gateway credits.
- You want free and open-source, with your own API key: community skills and MCP servers like claude-image-gen (Gemini) or an Azure OpenAI / FLUX MCP server. You self-manage keys and cost.
- You want an in-conversation creative suite (generate, upscale, remove background): an MCP server like Pixa that bundles several media operations.
The tools
Command-line tools
Masonry CLI. Generates images and video from the terminal across 50+ models (Nano Banana 2, GPT Image 2, FLUX.2, Seedream, Imagen, Ideogram for images; Veo 3.1, Kling, Seedance for video). You switch models with a flag, and masonry skill install adds a Claude Code skill so the agent calls it on its own. Best when you want to match the model to the job (legible packaging text, photoreal product lighting, product-locked vs cinematic video) rather than being tied to one model. It is built for business creative such as product photography and ads. Paid: it is a hosted service that needs an account and credits. See the setup guide.
Vercel AI CLI. A thin command-line wrapper over the Vercel AI Gateway with ai text, ai image, and ai video, plus a companion Claude Code skill that lets the agent call it in plain language. It defaults to GPT Image 2 for images and Seedance for video. The draw is cost and ecosystem fit: every Vercel team, including the free Hobby tier, gets monthly AI Gateway credits, so it is close to free to start if you already deploy on Vercel.
Open-source skills and MCP servers
These are the free, bring-your-own-key options, and they are why GitHub is full of "image generation in Claude Code" repos.
- Community skills such as claude-image-gen wire a single provider (often Google Gemini) into Claude Code as a skill, sometimes with an MCP server in the same package. Free to run; you supply the API key.
- MCP servers such as an Azure OpenAI / FLUX server generate and insert image assets during UI coding, saving files into your repo. Good if you want a structured tool-call integration and are comfortable configuring an MCP server.
- Creative-suite MCP servers such as Pixa go beyond generation to upscaling, background removal, and object erasure, all inside the Claude conversation.
The tradeoff across all of these is breadth and maintenance: most are wired to one or a few models, and you own the setup and the keys.
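To make "you own the setup and the keys" concrete, here is a minimal sketch of registering a bring-your-own-key MCP server with Claude Code through a project-level .mcp.json file. The server name, launch command, package name, and environment variable are hypothetical placeholders, not the real values for any server named in this article; the server's own README is the place to find those. The sketch also assumes your Claude Code version expands ${VAR} references in that file, which keeps the key itself out of the repo.

```python
import json


def build_mcp_config(server_name: str, command: str, args: list[str], key_var: str) -> dict:
    """Build the dict that would be saved as .mcp.json in the project root."""
    return {
        "mcpServers": {
            server_name: {
                "command": command,  # executable that starts the server
                "args": args,        # arguments passed to that executable
                # Reference the key by name only; assumes ${VAR} expansion is supported.
                "env": {key_var: "${" + key_var + "}"},
            }
        }
    }


# Hypothetical values, for illustration only.
config = build_mcp_config(
    server_name="image-gen",
    command="npx",
    args=["-y", "example-image-mcp-server"],
    key_var="IMAGE_API_KEY",
)
print(json.dumps(config, indent=2))
```

The printed JSON is what you would commit; the key stays in your shell environment, so rotating it or tracking spend remains your job.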
Honest comparison
| Tool | Type | Models | Cost | Best for |
|---|---|---|---|---|
| Masonry CLI | CLI + skill | 50+ image and video | Paid (account + credits) | Most model choice, business output |
| Vercel AI CLI | CLI + skill | Gateway selection (2 defaults) | Free tier credits, then usage | Free start, Vercel ecosystem |
| claude-image-gen (OSS) | Skill + MCP | Gemini (single provider) | Free, bring your own key | Free, simple, one provider |
| Azure OpenAI / FLUX MCP | MCP server | gpt-image-1, FLUX 1.1 Pro | Free, bring your own key | Structured MCP integration |
| Pixa (and similar) | MCP server | Multiple, plus edit tools | Varies | In-conversation creative suite |
How to choose
The deciding question is not "which is best" but "what do you need more of."
- Cost first: open-source skills or Vercel AI CLI's free credits.
- Model selection first: the Masonry CLI, by a wide margin (50+ versus one or a few).
- Business product and ad output: the Masonry CLI, which is built for that and pairs the model breadth with a workflow for turning a product photo into studio, lifestyle, and ad shots.
- Editing operations in the same flow: a creative-suite MCP server.
If you are picking models for product work specifically, the image model roundup and the video model roundup cover which model wins which job.
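The list in this section amounts to a lookup from priority to tool category. A minimal sketch follows; the recommendations come straight from the bullets above, while the function name and the priority keys are made up for illustration.

```python
# Priority -> recommended options, taken from the "How to choose" list.
# The key names are illustrative labels, not anything the tools define.
RECOMMENDATIONS = {
    "cost": ["open-source skill or MCP server", "Vercel AI CLI"],
    "model_selection": ["Masonry CLI"],
    "business_output": ["Masonry CLI"],
    "editing": ["creative-suite MCP server"],
}


def pick_tools(priority: str) -> list[str]:
    """Return the options this article suggests for a given priority."""
    try:
        return RECOMMENDATIONS[priority]
    except KeyError:
        valid = ", ".join(sorted(RECOMMENDATIONS))
        raise ValueError(f"unknown priority {priority!r}; expected one of: {valid}") from None


print(pick_tools("cost"))
```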
FAQ
Can Claude Code generate images on its own?
No. Claude Code is a text coding agent. It can reference an image file but cannot create one. You add image generation with a tool it can call: a command-line tool, an open-source skill, or an MCP server.
What is the best free way to generate images in Claude Code?
For free, look at the open-source skills and MCP servers (you bring your own API key) or Vercel AI CLI, which includes monthly AI Gateway credits on its free tier. These are the lowest-cost paths.
What gives the most model choice?
The Masonry CLI exposes 50+ image and video models behind two commands, more than the single-model or few-model alternatives, and installs a Claude Code skill so the agent calls it automatically.
Do I need an MCP server, or is a CLI enough?
A CLI is enough. Because a CLI is a normal shell command, any agent with shell access can call it with no special integration. MCP servers are an alternative for tools that expose structured tool calls; both work.
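To illustrate why shell access is enough, here is a minimal sketch of running a command-line tool as a subprocess and reading its output, which is roughly what an agent does when it shells out. The command is a stand-in (the Python interpreter printing a line) so the sketch runs anywhere; a real image CLI's arguments would go in the same list, and this article does not specify them.

```python
import subprocess
import sys


def run_cli(argv: list[str], timeout: float = 120.0) -> str:
    """Run a command-line tool and return its standard output as text."""
    result = subprocess.run(
        argv,
        capture_output=True,  # collect stdout and stderr instead of printing them
        text=True,            # decode bytes to str
        timeout=timeout,      # give up if the tool hangs
        check=True,           # raise CalledProcessError on a non-zero exit code
    )
    return result.stdout.strip()


# Stand-in command: substitute the argument list of whichever CLI you installed.
output = run_cli([sys.executable, "-c", "print('image saved to out.png')"])
print(output)
```

Nothing here is specific to image generation, which is the point: the integration is an argument list and an exit code.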
The bottom line
Claude Code does not generate images, but adding the capability is a small decision tree. Want free? Start with an open-source skill or Vercel AI CLI. Want the most models and business-grade product output? Use the Masonry CLI. Either way, your agent can produce an asset mid-task and keep coding instead of switching to a separate app.


