A Large Language Model can drive the Unreal Editor as a teammate, not a script runner.
First-class chat, first-class terminal, first-class tools, first-class wire — engineered for low cost, low latency, and the multi-LLM reality of 2026.
WeiMCP is an editor plugin for Unreal Engine 5.7. It runs an in-process MCP (Model Context Protocol) server inside the editor and exposes 1,200+ first-class tools across 23 categories, covering authoring, materials, animation, world building, GAS, Niagara, sequencer, foliage, landscape, world partition, automation, AI/EQS, source control, and more.
An AI assistant — Claude, GPT, Gemini, Grok, a local model — connects over standard JSON-RPC, sees the catalog, and starts working. Inside the editor itself you also get a chat panel, an embedded Claude Code Terminal, a job queue, and a status bar that reports live state. It's the editor with an AI seat at the table, not a chat window glued to the side.
The wire stack and the tool runtime are where the engineering effort went. Brief modes, deflate compression, schema trim, HTTP 304/ETag, an auto-summarizer, a result stash, and a split-mode worker pattern push per-session cost 3.18× below a naive MCP server and 2.04× below the typical Python-wrapper pattern, while keeping warm latency around 46 ms.
Everything below ships in the plugin. No external service, no companion process for the basics, no vendor lock — the LLM provider is swappable at runtime.
Authoring (actors, components, blueprints, materials, animations, sequences), pipeline (import, datatables, source control), world (landscape, foliage, world partition, streaming), systems (GAS, Niagara, AI/EQS, behavior trees, state trees), editor utilities (widgets, EUWs, MVVM, automation). Every tool runs inside an editor undo transaction.
Talk to the active LLM right in the editor — no Alt-Tab to a browser, no copy-paste loop. Conversation context is local, history persists per project, and the panel can hand off prompts directly into the tool layer.
The Claude Code CLI runs inside the editor as a real terminal tab. The same agent that knows your codebase can now also drive the editor through the MCP socket — one assistant, two surfaces, zero context switch.
Anthropic (Haiku / Sonnet / Opus), OpenAI (GPT-5.4), Google (Gemini 3.1 Pro/Flash), xAI (Grok 4), DeepSeek (V3.2), and local Ollama all work. Swap providers at runtime — no recompile, no re-install. AnthropicCLI mode routes through a Claude Max subscription for a flat-rate ceiling.
The job queue panel runs tool sequences on a timer, on save, on package, or on custom editor events. Drop in a recipe — “every Monday rebuild HLODs and post a thumbnail diff” — and let it execute unattended. Composes any of the 1,200+ tools.
The MCP server walks an offset window if its preferred port is in use, then publishes the bound port to the editor status bar. Two open projects on one machine, both with WeiMCP active, both addressable — the way the editor itself handles multi-instance.
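The offset walk can be sketched in a few lines. This is an illustrative model, not the plugin's C++ implementation; the function name and the 10-port window are assumptions:

```python
import socket

def bind_with_offset(preferred: int, max_offset: int = 10):
    """Walk preferred, preferred+1, ... until a port binds; return (socket, port)."""
    for offset in range(max_offset):
        port = preferred + offset
        s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        try:
            s.bind(("127.0.0.1", port))
            s.listen()
            return s, port          # the bound port is what gets published to the status bar
        except OSError:
            s.close()               # likely held by another editor instance; try the next one
    raise RuntimeError(f"no free port in [{preferred}, {preferred + max_offset})")
```

A second editor instance calling the same routine simply lands one slot higher, which is why both projects stay addressable.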
Tools advertise their risk class (read / write / mutate-asset / mutate-source-control). The plugin can block, prompt, or auto-allow based on user policy. Source-control pushes, project-wide renames, and asset deletes default to prompt-on-call.
Streamable HTTP, JSON-RPC 2.0, the published MCP 2.0 wire format. Anything that speaks MCP works: the Anthropic SDK, the Claude Code CLI, OpenAI's tool-use, Continue.dev, custom agents. POST /mcp is all there is.
The initialize response carries the project name, root path, and engine version. No more “am I talking to GASP or to Meatbags?” — the agent knows on connect, and can route accordingly when several projects are open.
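The handshake is plain JSON-RPC 2.0. The sketch below shows the shape; the field names inside serverInfo (projectName, projectRoot, engineVersion) are illustrative, not the plugin's exact schema:

```python
import json

# Hypothetical MCP initialize exchange over POST /mcp.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "initialize",
    "params": {"protocolVersion": "2.0", "clientInfo": {"name": "my-agent"}},
}

response = {
    "jsonrpc": "2.0",
    "id": 1,
    "result": {
        "serverInfo": {
            "name": "WeiMCP",
            "projectName": "Meatbags",
            "projectRoot": "D:/Projects/Meatbags",
            "engineVersion": "5.7",
        }
    },
}

# The agent routes on project identity straight from the handshake.
info = response["result"]["serverInfo"]
print(f"connected to {info['projectName']} (UE {info['engineVersion']})")
```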
Bound port, active provider, request count, last error, queue depth — all visible in the editor status bar. Click through for a session log of every tool call, ETag hit, and rate-limit decision.
Heavy responses are stashed by content hash. The agent can ask for the next page or the verbose follow-up without paying the original execution cost a second time.
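A content-hash stash is small to model. This is a sketch under assumptions: the hash truncation, TTL, and handle format are illustrative, not the plugin's internals:

```python
import hashlib
import time

class ResultStash:
    """Hold heavy tool results by content hash for a short window (sketch)."""

    def __init__(self, ttl_seconds: float = 300.0):
        self.ttl = ttl_seconds
        self.entries: dict[str, tuple[float, str]] = {}

    def put(self, payload: str) -> str:
        handle = hashlib.sha256(payload.encode()).hexdigest()[:16]
        self.entries[handle] = (time.monotonic(), payload)
        return handle            # returned to the agent in place of the full body

    def get(self, handle: str):
        entry = self.entries.get(handle)
        if entry is None:
            return None
        stored_at, payload = entry
        if time.monotonic() - stored_at > self.ttl:
            del self.entries[handle]   # expired: the agent must re-run the tool
            return None
        return payload
```

Identical payloads hash to the same handle, so a repeated heavy result costs nothing extra to stash.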
Module type is UncookedOnly. The MCP server, the chat panel, the terminal, the job queue — none of it is in the packaged game. WeiMCP is a workshop tool, not a runtime dependency.
The reason WeiMCP costs less to run than a naive MCP server isn’t one big trick — it’s seven small disciplines compounding through the request stack. Each was measured against the live tools/list payload and a representative session.
tools/list → 16,317 bytes deflated, transparent to the client
| Mechanism | What it does | Where the savings come from |
|---|---|---|
| Brief mode default | Every list-style tool returns a compact summary first (counts, names, paths). The agent asks for verbose only when it actually needs the detail. | Cuts typical response payload by 60-95%, which directly cuts input-token cost on the next agent turn (cached-in or not). |
| Auto-summarizer | When a tool has to return a long list anyway (every actor in a level, every material in a project), the runtime collapses it into a structured digest with sample rows. | 63.5× reduction on the heavy lists. The agent gets shape, distribution, and exemplars — usually enough to decide the next call. |
| Deflate wire compression | HTTP-level Content-Encoding: deflate on every response over a threshold. Negotiated transparently. | 76.3% on the worst-case tools/list (68,820 → 16,317 bytes). Bandwidth cost matters for hosted editors and remote-dev rigs. |
| HTTP 304 / ETag | tools/list ships a strong ETag. If the catalog hasn’t changed since last fetch, the server returns 304 with no body. | Zero-byte response on every reconnect after the first. Spec-compliant clients (and most modern MCP clients) skip the 16 KB re-download entirely. |
| Schema-emitter trim | Tool input schemas omit empty properties:{} and required:[] blocks at emit time. | Shaves ~6% off the catalog payload — small per-tool, real across 1,200+ tools. |
| Result stash | Heavy responses are content-hashed and held server-side for a short window. | The agent can ask for “show me page 2” or “the full version of that” without re-running the tool, which is often the most expensive part. |
| Split-mode worker | Brief responses run inline on the worker thread; only verbose runs marshal to the game thread for full editor-API access. | The cheap path stays cheap and never blocks the editor frame. Verbose gets the full editor surface when it actually needs it. |
| 16 KiB cap on tool output | Hard ceiling on any single tool result. Over the cap, the runtime trims and inserts a stash handle. | Predictable upper bound on input-token spend per turn — the price-tag never surprises you. |
Token cost compounds across a workday. A studio running an AI-pair-programming workflow makes thousands of MCP calls per developer per week. Cutting payload at every layer of the stack — not just at the model — is how you keep the bill in the same band as the engineer’s coffee budget instead of their salary.
Live measurements from a current build, talking to a running UE 5.7 editor over loopback. Numbers cover both cold-start and warm-cache, so you see the first-hit cost as well as the steady-state floor.
Throughput is bounded by the editor’s game-thread rate, which is exactly the right place for it to be bounded. The server itself can handle hundreds of concurrent brief calls per second; verbose calls queue against the editor frame so they never starve other editor work.
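The split-mode dispatch described above can be modeled with a worker thread and a game-thread job queue. This is an illustrative Python model of the pattern, not the plugin's C++ code; all names are assumptions:

```python
import queue
import threading

game_thread_jobs: queue.Queue = queue.Queue()   # drained once per editor frame

def handle_tool_call(name: str, verbose: bool, run_brief, run_verbose):
    """Brief answers run inline on the worker; verbose marshals to the game thread."""
    if not verbose:
        return run_brief(name)          # cheap path: never touches the editor frame
    done = threading.Event()
    result = {}
    def job():
        result["value"] = run_verbose(name)   # full editor-API surface lives here
        done.set()
    game_thread_jobs.put(job)
    done.wait()                          # the worker blocks; the editor frame does not
    return result["value"]
```

The key property is in the queue: verbose work is paced by the frame rather than by incoming request volume, so a burst of heavy calls degrades into a queue, not a hitch.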
All numbers below come from four progressively deeper cost analyses, each run against the same plugin and the same editor, with current-as-of-2026 provider pricing. References at the bottom of the section.
| Provider | Model | Input $/MTok | Output $/MTok | WeiMCP / session | Naive MCP / session | Savings |
|---|---|---|---|---|---|---|
| Anthropic | Haiku 4.5 | $1.00 | $5.00 | $0.0416 | $0.132 | 3.17× |
| Anthropic | Sonnet 4.6 | $3.00 | $15.00 | $0.1247 | $0.397 | 3.18× |
| Anthropic | Opus 4.7 | $5.00 | $25.00 | $0.2078 | $0.661 | 3.18× |
| OpenAI | GPT-5.5 | $5.00 | $30.00 | $0.2478 | $0.700 | 2.82× |
| Google | Gemini 3.1 Pro | $2.00 | $12.00 | $0.0931 | $0.272 | 2.92× |
| Google | Gemini 3.1 Flash | $0.30 | $2.50 | $0.0181 | $0.054 | 2.98× |
| xAI | Grok 4 | $3.00 | $15.00 | $0.1374 | $0.397 | 2.89× |
| DeepSeek | V3.2 | $0.14 | $1.10 | $0.0083 | $0.024 | 2.89× |
| Anthropic | Claude Max sub | flat $200/mo subscription | ~hundreds of sessions included | ~$0.00 | n/a | unbounded |
Per-session figures assume warm prompt-cache (10% effective input rate) — the discount the WeiMCP wire stack is specifically tuned to land. GPT-5.5 cache discount is comparable to Anthropic’s; output-token cost is what compresses the savings ratio relative to Sonnet.
| Workflow | Sessions / year | Sonnet baseline | Opus baseline | Haiku baseline |
|---|---|---|---|---|
| Solo developer | ~30,000 | $8,208 | $10,476 | $6,084 |
| Pro / small team (~5 seats) | ~150,000 | $50,396 | $62,340 | $39,720 |
| Automation / agentic farm | ~2,000,000 | $697,975 | $921,400 | $475,200 |
Take Sonnet 4.6 as the realistic mid-line model. WeiMCP costs $0.1247 per session warm; a naive Python-wrapper MCP costs $0.397 — a savings of $0.272 per session. That is the only number doing work below.
| Profile | Sessions / day | Daily savings | Weekly savings | Annualized |
|---|---|---|---|---|
| Solo developer, light AI use | ~30 | $8.16 | $57 | $2,975 |
| Solo developer, daily AI pair-programmer | ~80 | $21.76 | $152 | $8,208 |
| Five-seat studio, normal AI workflow | ~400 | $108.80 | $762 | $50,396 |
| Automation farm, 24/7 agentic work | ~5,500 | $1,496 | $10,470 | $697,975 |
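The arithmetic in the table above is nothing more than the per-session delta scaled by volume, using the Sonnet 4.6 figures already quoted:

```python
# Figures from the tables above: Sonnet 4.6, warm prompt-cache.
weimcp_session = 0.1247
naive_session = 0.397
delta = round(naive_session - weimcp_session, 3)   # $0.272 saved per session

profiles = {
    "solo developer, light AI use": 30,
    "solo developer, daily AI pair-programmer": 80,
    "five-seat studio, normal AI workflow": 400,
}
for name, sessions_per_day in profiles.items():
    daily = delta * sessions_per_day
    print(f"{name}: ${daily:.2f}/day, ${daily * 7:.0f}/week")
```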
Most paid editor plugins in this category sell in the $400-$800 band. Match WeiMCP’s feature surface against any plugin in that band, and the token savings alone retire the purchase price inside the first month for a busy solo developer, inside the first week for a five-seat studio, and inside the first day for an automation farm. That is before you count the latency, the embedded terminal, the chat panel, the job queue, or the multi-LLM swap — all of which a Python-wrapper MCP would not give you at any price.
Four payback analyses were produced across the development of WeiMCP. Each measured the same plugin against the same editor; what changed is the framing question.
Anything that speaks MCP works without modification — the Anthropic SDK, Claude Code CLI, OpenAI’s tool-use, Continue.dev, custom agents, the upcoming Linux Foundation MCP runtimes. There is no proprietary client, no SaaS dependency, no phone-home.