MAESTROBOT
MaestroBot Design
Link to implementation repo.
MaestroBot is a local Linux user-scoped agent service built from three real subsystems:
- Maestro as the programmable agent-loop and state-machine substrate
- Myria as Global Persistent Memory and the durable event substrate
- Go as the host runtime, scheduler, control plane, and tool host
The core design goal is to run a long-lived local agent that behaves like one durable channel-based worker rather than a stateless chat wrapper.
Table of Contents
- 1. Core model
- 2. Main responsibilities
- 3. Architectural split
- 4. Deployment shape
- 5. Storage layout
- 6. Maestro program model
- 7. Channel runtime model
- 8. Scheduling and residency
- 9. Prompt and action model
- 10. Tool model
- 11. Workspace and VFS model
- 12. Memory integration
- 13. MCP integration
- 14. Control plane
- 15. Failure model
- 16. Current implementation notes
1. Core model
MaestroBot is channel-centric.
The primary runtime unit is the channel. Each channel owns:
- one logical runtime context
- one persistent workspace
- one pending message queue
- one scheduler priority
- one paging identity
- one retained subagent set
The system is meant to preserve work and context across idle periods, restarts, and paging boundaries.
An idle channel is not runnable merely because it has a persisted paging snapshot. The host only reactivates a channel when there is real work: pending inbound input, a due wake condition, or an explicit operator wake.
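The reactivation rule above can be sketched as a small predicate. This is a minimal illustration, not the implementation: the Channel struct and its field names are hypothetical, and only the three wake reasons come from the text.

```go
package main

import (
	"fmt"
	"time"
)

// Channel is a hypothetical, trimmed-down view of one channel's scheduler state.
type Channel struct {
	PendingMessages int       // queued inbound messages
	WakeAt          time.Time // zero value means no wake is scheduled
	OperatorWake    bool      // an explicit operator wake was requested
	HasSnapshot     bool      // a persisted paging snapshot exists
}

// Runnable reports whether the host should reactivate the channel at time now.
// HasSnapshot is deliberately not consulted: a snapshot alone is not work.
func Runnable(c Channel, now time.Time) bool {
	if c.PendingMessages > 0 || c.OperatorWake {
		return true
	}
	return !c.WakeAt.IsZero() && !c.WakeAt.After(now)
}

func main() {
	now := time.Now()
	fmt.Println(Runnable(Channel{HasSnapshot: true}, now))             // false
	fmt.Println(Runnable(Channel{WakeAt: now.Add(-time.Minute)}, now)) // true
	fmt.Println(Runnable(Channel{WakeAt: now.Add(time.Hour)}, now))    // false
}
```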
2. Main responsibilities
MaestroBot is responsible for:
- normalizing local frontend input into channel messages
- scheduling channel work
- retaining and paging channel state
- exposing and dispatching tools
- supervising Myria
- exposing a local control plane over Unix IPC
- managing external MCP servers
MaestroBot is not responsible for:
- owning long-term semantic memory itself
- storing provider state outside its own root
- acting as a remote multi-user network service
3. Architectural split
There are three logical layers.
3.1 Frontend
Responsibilities:
- normalize platform input to internal messages
- deliver outbound runtime messages to the platform
- emit canonical events to Myria through the host runtime
Built-in frontends currently include:
- local CLI
- Telegram Bot API
Outbound frontend design is intentionally layered:
- the runtime owns canonical authored Markdown
- the frontend lowers that Markdown into a platform-safe intermediate form
- the frontend renders and delivers the final platform payload
For Telegram, the lowering path is:
- Markdown input
- Markdown AST
- normalized Telegram-safe IR
- Telegram renderer
The Telegram renderer prefers text + entities, falls back to HTML
parse mode, and finally falls back to plain text chunking.
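The renderer's fallback order can be sketched as a loop over delivery modes. This is an assumption-laden sketch: renderMode, sendFunc, deliver, and rejecting are hypothetical names, and the real renderer produces a different payload for each mode rather than passing the same text through.

```go
package main

import (
	"errors"
	"fmt"
)

// renderMode names one delivery strategy. The names are illustrative only.
type renderMode string

const (
	modeEntities renderMode = "text+entities"
	modeHTML     renderMode = "html"
	modePlain    renderMode = "plain-chunks"
)

// sendFunc is a hypothetical delivery hook that returns an error when the
// platform rejects the payload produced for the given mode.
type sendFunc func(mode renderMode, markdown string) error

// deliver tries each mode in preference order and returns the first one that
// succeeds. If every mode fails, it returns all collected errors.
func deliver(send sendFunc, markdown string) (renderMode, error) {
	var errs []error
	for _, mode := range []renderMode{modeEntities, modeHTML, modePlain} {
		if err := send(mode, markdown); err != nil {
			errs = append(errs, fmt.Errorf("%s: %w", mode, err))
			continue
		}
		return mode, nil
	}
	return "", errors.Join(errs...)
}

// rejecting returns a fake sender that fails for the listed modes.
func rejecting(modes ...renderMode) sendFunc {
	return func(mode renderMode, _ string) error {
		for _, m := range modes {
			if m == mode {
				return errors.New("rejected by platform")
			}
		}
		return nil
	}
}

func main() {
	mode, err := deliver(rejecting(modeEntities), "**hello**")
	fmt.Println(mode, err) // html <nil>
}
```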
During an active Telegram-backed channel run, the frontend also emits
sendChatAction(action="typing") until the run sends a reply or
finalizes.
3.2 Runtime
Responsibilities:
- own per-channel queues
- own channel scheduling
- own paging and residency
- own tool dispatch
- own provider calls
- own subagent lifecycle
- act as the kernel for schema execution
3.3 Memory
Responsibilities:
- per-channel local transcript and working context
- global durable event log
- cross-channel and long-horizon retrieval
- snapshot/index build workflows
Each channel is a complete local agent instance. It owns its conversation transcript, structured working-memory notebook, current-run history, workspace, queues, and sleep/wake state. The host derives the recent local context from the channel transcript before prompting.
Myria is supervised as Global Persistent Memory. It receives canonical events from the host and provides auxiliary retrieval when the local channel transcript and notebook are insufficient, stale, or too compressed.
4. Deployment shape
MaestroBot is one binary.
Normal invocation modes:
- maestrobot --daemon: runs the daemon directly in the foreground.
- maestrobot --daemon --debug: runs the daemon with frontend-visible gateway telemetry.
- maestrobot ...: runs the local control CLI.
- maestrobot daemon start|stop|status|logs: manages a systemd --user service.
Myria is launched by the daemon as a subprocess.
HTTP is not the main control-plane surface. The control path is Unix socket IPC.
Gateway debug mode is operator telemetry. It may send compact tool-call, sanitized-argument, tool-result, prompt-compaction, cumulative-token, and sleep/wake messages through frontends, but those messages are not channel transcript entries and are not appended to Myria as conversation events.
5. Storage layout
Default root:
~/.maestrobot
The root contains at least:
- config.yaml
- runtime.yaml
- state.json
- SOUL.md
- maestro/
- myria/
- workspaces/
- users/
- paging/
- transcripts/
- logs/
5.1 config.yaml
Static operator-authored config.
Contains:
- provider definitions
- concrete model presets
- runtime preset references
- Myria configuration references
- frontend configuration
- external MCP server configuration
- onboarding notice and archive policy
5.2 runtime.yaml
Mutable desired runtime config.
Contains:
- stable internal users
- internal user display names
This file is editable by the user and by the control CLI.
5.3 state.json
Mutable host-owned state.
Contains:
- channels
- unknown identities
- external account to internal user mappings
- retained subagents
- tool server state
- next-id counters
Unknown accounts are not allowed to enter the agent loop. They receive a deterministic onboarding notice from the runtime/frontend gate up to a configured cap, then are ignored until attached to an internal user.
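A minimal sketch of that gate, assuming a per-account notice counter. Gate, Decision, and the account and user ID formats are hypothetical. The source specifies only the behavior: known accounts pass, unknown accounts get a capped number of notices, then silence.

```go
package main

import "fmt"

// Decision is what the gate does with one inbound message.
type Decision string

const (
	Admit  Decision = "admit"  // known account: message enters the agent loop
	Notice Decision = "notice" // unknown account: send the onboarding notice
	Ignore Decision = "ignore" // unknown account past the cap: drop silently
)

// Gate is a hypothetical in-memory version of the identity gate.
type Gate struct {
	noticeCap int               // configured maximum notices per unknown account
	known     map[string]string // external account -> internal user ID
	sent      map[string]int    // notices already sent per unknown account
}

func NewGate(noticeCap int) *Gate {
	return &Gate{noticeCap: noticeCap, known: map[string]string{}, sent: map[string]int{}}
}

// Attach maps an external account to an internal user.
func (g *Gate) Attach(account, userID string) { g.known[account] = userID }

// Check decides deterministically, without consulting the model.
func (g *Gate) Check(account string) Decision {
	if _, ok := g.known[account]; ok {
		return Admit
	}
	if g.sent[account] < g.noticeCap {
		g.sent[account]++
		return Notice
	}
	return Ignore
}

func main() {
	g := NewGate(2)
	for i := 0; i < 3; i++ {
		fmt.Println(g.Check("telegram:42")) // notice, notice, ignore
	}
	g.Attach("telegram:42", "user-1")
	fmt.Println(g.Check("telegram:42")) // admit
}
```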
5.4 maestro/
User-editable Maestro agent source and compiled artifact.
This is part of the runtime root on purpose. Users are expected to customize the agent behavior here without patching the repo.
5.5 workspaces/
workspaces/<channel-id>/
Each channel gets one persistent host directory.
5.6 users/
users/<user-id>/profile.md
Holds concise durable user-profile context. Profiles are bounded by config, injected into the prompt for known channel participants, and updated through a host-validated tool. They should contain stable preferences, constraints, communication style, and durable facts only.
Prompt rendering keeps stable identity, operating rules, mode instructions, and durable user profiles ahead of volatile channel state. The OpenRouter adapter sends that stable prefix as a separate system message and places timestamp, current message, run events, and tool observations in the dynamic user message. This preserves Maestro-owned prompt text while giving provider-side prompt caches a stable prefix.
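The split can be illustrated with a two-message layout. This is a sketch under assumptions: Message and BuildMessages are hypothetical names, the section strings are placeholders, and the role/content shape follows the common chat-completions convention rather than the adapter's actual types.

```go
package main

import (
	"fmt"
	"strings"
)

// Message is a minimal chat message in the usual role/content shape.
type Message struct {
	Role    string
	Content string
}

// BuildMessages keeps the stable sections in the system message and the
// volatile sections in the user message, so the system message stays
// byte-identical between turns as long as the stable inputs do not change.
func BuildMessages(stable, dynamic []string) []Message {
	return []Message{
		{Role: "system", Content: strings.Join(stable, "\n\n")},
		{Role: "user", Content: strings.Join(dynamic, "\n\n")},
	}
}

func main() {
	stable := []string{"<identity>", "<operating rules>", "<mode instructions>", "<user profiles>"}
	turn1 := BuildMessages(stable, []string{"timestamp: t1", "message: hi"})
	turn2 := BuildMessages(stable, []string{"timestamp: t2", "message: any update?"})
	fmt.Println(turn1[0].Content == turn2[0].Content) // true
	fmt.Println(turn1[1].Content == turn2[1].Content) // false
}
```

Anything that changes every turn, such as a timestamp, would invalidate a cached prefix if it appeared in the system message, which is why it sits in the dynamic part.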
5.7 bin/
Installer-managed runtime binaries.
This includes at least:
- bin/maestroc
- bin/myria
6. Maestro program model
MaestroBot uses Maestro as the programmable agent-loop substrate.
The runtime root contains:
- maestro/channel_loop.mstr
- maestro/subagent_loop.mstr
- maestro/agent_loop.mstro
The repo ships default templates for the .mstr files, but the runtime
does not execute those repo copies directly.
Instead:
- maestrobot init seeds root/maestro/
- the installer places root/bin/maestroc and root/bin/myria
- the daemon compiles root/maestro/*.mstr into a valid artifact
- channel and subagent execution run from that compiled root artifact
This makes the Maestro layer an operator-facing customization surface.
The Go runtime still owns:
- persistence
- IPC
- provider access
- tool execution
- Myria supervision
But the full agent loop, prompt composition, and explicit state sequencing live in the Maestro program.
The Maestro program also declares the host-provided surface it wants for each state. A state requests:
- valid state transitions
- tool policy, such as @plan, @workspace, @myria, or an exact tool name
- context sections, such as agent_memory, recent_channel_context, or last_tool_observation
The Go host treats those declarations as schema-owned policy. It expands
tool groups against the currently available runtime catalog, rejects
unknown required tools, skips explicitly optional tools written as
?tool.name, and injects only the requested context sections. The host
still owns provider calls, validation, tool execution, persistence, and
frontends.
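The expansion rules for tool policy can be sketched as follows. This is a hedged illustration: Expand, the group table, and the tool names in the example are hypothetical, and rejecting an unknown group is an assumption the text does not state. Only the three entry forms (@group, exact name, ?optional) come from the source.

```go
package main

import (
	"fmt"
	"strings"
)

// Expand resolves a state's tool policy against the runtime catalog.
// Group members are filtered by availability, optional tools are skipped
// when absent, and a missing required tool is an error.
func Expand(policy []string, groups map[string][]string, catalog map[string]bool) ([]string, error) {
	var out []string
	seen := map[string]bool{}
	add := func(name string) {
		if !seen[name] {
			seen[name] = true
			out = append(out, name)
		}
	}
	for _, entry := range policy {
		switch {
		case strings.HasPrefix(entry, "@"):
			members, ok := groups[entry]
			if !ok {
				// Assumption: an undeclared group is treated as a schema error.
				return nil, fmt.Errorf("unknown tool group %q", entry)
			}
			for _, name := range members {
				if catalog[name] {
					add(name)
				}
			}
		case strings.HasPrefix(entry, "?"):
			if name := strings.TrimPrefix(entry, "?"); catalog[name] {
				add(name)
			}
		default:
			if !catalog[entry] {
				return nil, fmt.Errorf("required tool %q is not available", entry)
			}
			add(entry)
		}
	}
	return out, nil
}

func main() {
	// Tool names below are made up for the example.
	groups := map[string][]string{"@workspace": {"internal.vfs.read", "internal.vfs.write"}}
	catalog := map[string]bool{"internal.vfs.read": true, "myria.search": true}
	tools, err := Expand([]string{"@workspace", "?internal.browser.open", "myria.search"}, groups, catalog)
	fmt.Println(tools, err) // [internal.vfs.read myria.search] <nil>
}
```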
7. Channel runtime model
Per-channel state is one of:
- current
- active
- idle
Semantics:
- current: the foreground channel
- active: runnable or resumable work exists
- idle: sleeping, waiting for new work or a wake condition
Wake conditions are explicit runtime events. The host distinguishes:
- user-message: a new human inbound message
- automatic-wake: a scheduled wake created by prior channel finalization
- runtime-wake: an explicit host/operator wake without new human input
The active snapshot retains wake metadata, including the wake source, scheduled wake self-note, next wake time, and the consecutive automatic wake count. A scheduled wake self-note is a private note from the current channel agent to its future run. It should explain what the future agent should remember, inspect, decide, or avoid. Finalization always schedules a future wake. If the only plan is to check whether the user replied, the agent should schedule a long wake and tell its future self not to message the user if nothing changed.
One channel run is one contiguous execution segment from the last finalized boundary to the next finalized boundary.
That same boundary is used by daemon trace streaming.
8. Scheduling and residency
The scheduler is channel-based, not message-based.
Properties:
- one pending queue per channel
- channel-level priority
- priority aging over time
- bounded worker concurrency across channels
- one run at a time per channel
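A minimal sketch of channel selection with priority aging, assuming that a higher number means higher priority and that one priority point is gained per aging interval waited. Pending, PickNext, and the aging formula are illustrative assumptions. The source states only that aging exists and that a channel runs one run at a time.

```go
package main

import (
	"fmt"
	"time"
)

// Pending is a hypothetical scheduler entry for one channel with queued work.
type Pending struct {
	ID         string
	Priority   int       // base channel priority
	EnqueuedAt time.Time // when the oldest pending work arrived
	Running    bool      // a run is already in progress for this channel
}

// effective adds one priority point per full aging step spent waiting.
func effective(p Pending, now time.Time, step time.Duration) int {
	if step <= 0 {
		return p.Priority
	}
	return p.Priority + int(now.Sub(p.EnqueuedAt)/step)
}

// PickNext returns the waiting channel with the highest effective priority.
// Channels that are already running are skipped. Ties keep the earlier entry.
func PickNext(chs []Pending, now time.Time, step time.Duration) (Pending, bool) {
	var best Pending
	found := false
	for _, c := range chs {
		if c.Running {
			continue
		}
		if !found || effective(c, now, step) > effective(best, now, step) {
			best, found = c, true
		}
	}
	return best, found
}

func main() {
	now := time.Now()
	chs := []Pending{
		{ID: "fresh-high", Priority: 5, EnqueuedAt: now},
		{ID: "old-low", Priority: 1, EnqueuedAt: now.Add(-10 * time.Minute)},
		{ID: "busy", Priority: 100, EnqueuedAt: now, Running: true},
	}
	next, ok := PickNext(chs, now, time.Minute)
	fmt.Println(next.ID, ok) // old-low true
}
```

Aging keeps a low-priority channel from starving: after ten one-minute steps its effective priority (11) overtakes the fresh channel's (5).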
state.json is the authoritative scheduler record. It owns pending
queues, channel state, pause state, priority, wake_at, and
timeout_at. Paging snapshots are execution-context records used to
resume or reconstruct a run; they are not the source of truth for queue
ownership or wake scheduling.
Residency is host-owned:
- channel snapshots are host data, not opaque Maestro VM dumps
- paging happens only at safe boundaries
- LRU-style eviction is used when resident contexts exceed the limit
- one Maestro run executes until it reaches idle, then the host may page or reschedule the channel later
On startup, the host reconciles state.json with the paging snapshots
before starting workers. Stale current channels left by a previous
process are downgraded to active if their snapshot has unfinished work,
or to idle with a scheduled fallback wake if no work remains.
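The downgrade rule can be sketched as a pure function over one channel record. Record, Reconcile, and the fallback parameter are hypothetical. How the host decides that a snapshot has unfinished work is not described in the source, so it is passed in as a boolean.

```go
package main

import (
	"fmt"
	"time"
)

type State string

const (
	Current State = "current"
	Active  State = "active"
	Idle    State = "idle"
)

// Record is a hypothetical slice of one channel's entry in state.json.
type Record struct {
	State  State
	WakeAt time.Time
}

// Reconcile downgrades a stale "current" channel at startup. Records in any
// other state are returned unchanged.
func Reconcile(r Record, unfinished bool, now time.Time, fallback time.Duration) Record {
	if r.State != Current {
		return r
	}
	if unfinished {
		r.State = Active // the snapshot still has work to resume
		return r
	}
	r.State = Idle
	r.WakeAt = now.Add(fallback) // scheduled fallback wake
	return r
}

func main() {
	now := time.Now()
	fmt.Println(Reconcile(Record{State: Current}, true, now, time.Hour).State)  // active
	fmt.Println(Reconcile(Record{State: Current}, false, now, time.Hour).State) // idle
}
```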
9. Prompt and action model
The agent prompt is not platform-specific.
The runtime and Maestro state machine may ask the agent to emit a user-visible Markdown message, but they do not ask it to emit Telegram-specific entities or HTML.
That platform lowering happens only at the frontend boundary.
Prompt construction is schema-owned, but identity is not.
SOUL.md is the exclusive source for the agent’s identity, name, voice,
and standing preferences. The active Maestro program decides where that
identity text is injected, but the program itself should remain
identity-neutral. Runtime context should provide facts and constraints,
not persona.
Each prompt includes:
- identity text loaded from SOUL.md
- operating and current-mode prompt text embedded directly in the active Maestro state
- channel metadata
- participant context
- current message
- bounded local working memory
- valid next transitions
- recent current-run events
- loop warnings derived from repeated action patterns
- relevant queue summaries, with follow-up content hidden until consumed
The host runtime supplies those structured sections as data, but the
prompt layout and wording are embedded in maestro/channel_loop.mstr
and maestro/subagent_loop.mstr.
The local working-memory layer is intentionally modeled as a structured operational notebook rather than a truthful event log. It is designed to hold the agent’s current understanding in compact sections such as user profile, channel facts, active goal, plan, open loops, workspace state, and handoff notes.
The host also retains a shorter-lived current-run-capability view inside the current channel snapshot. That view is inferred from the tools exposed during the current run and exists so the schema can answer capability questions from the broader current run rather than only the current state’s narrowed tool mask.
Each channel is modeled as a complete local agent. The local transcript is the canonical recent conversation source, and the snapshot carries a bounded current-run event stream: user messages, assistant messages, tool calls, tool results, state transitions, wake events, and context updates. The prompt is rendered from the transcript-derived recent context plus the current-run stream, not from a single last-action slot.
Myria remains Global Persistent Memory and the truthful durable event substrate. The agent should prefer the local notebook and recent local transcript context first, and consult Myria only for older, cross-channel, uncertain, stale, or too-compressed recall.
Grounding rules sit above that memory split:
- successful tool results are factual observations
- failed tool results are also factual observations about what did not work
- user-visible replies must not invent successful filesystem, shell, browser, web, or memory results that are not supported by successful tool execution
Each agent step is bounded to exactly one tool/action selection.
Maestro owns the channel run loop itself:
- prepare: activates or resumes one channel run
- plan: performs bounded work on the current channel
- finalize-send: emits the user-visible reply
- idle: asks the model to choose timeout, sleep, wake self-note, and priority bookkeeping, then signals the host that the run has reached a safe waiting boundary
The host validates and dispatches the selected tool, but the loop shape and state transitions are encoded in the schema.
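Transition validation over those four states could look like the sketch below. The transition table is an assumption made for illustration; the real edges are declared in the Maestro program, and Advance is a hypothetical helper.

```go
package main

import "fmt"

// transitions is an assumed edge table over the four documented states.
var transitions = map[string][]string{
	"prepare":       {"plan"},
	"plan":          {"plan", "finalize-send", "idle"},
	"finalize-send": {"idle"},
	"idle":          {},
}

// Advance returns the new state if the requested transition is declared,
// otherwise it keeps the current state and reports an error.
func Advance(from, to string) (string, error) {
	allowed, ok := transitions[from]
	if !ok {
		return from, fmt.Errorf("unknown state %q", from)
	}
	for _, next := range allowed {
		if next == to {
			return to, nil
		}
	}
	return from, fmt.Errorf("transition %s -> %s is not declared", from, to)
}

func main() {
	state := "prepare"
	for _, want := range []string{"plan", "plan", "finalize-send", "idle"} {
		next, err := Advance(state, want)
		if err != nil {
			fmt.Println("rejected:", err)
			return
		}
		state = next
	}
	fmt.Println(state) // idle
}
```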
The schema controls which tools are visible in each step through a small policy language. Current built-in groups are:
- @plan
- @finalize-send
- @idle
- @core
- @workspace
- @web
- @image
- @browser
- @subagents
- @myria
- @external
Exact tool names may be requested directly. Prefix a tool name with ?
when the schema can use it if available but should continue if that tool
is absent in the current runtime state.
10. Tool model
The tool plane is unified.
10.1 Built-in runtime tools
internal.* tools include:
- queue control
- channel finalization
- VFS reads/writes/search
- user-note writes
- shell execution and interactive process sessions
- browser automation
- web fetch/search
- image metadata and OCR
- subagent control
10.2 Myria tools
myria.* is query-only from the agent’s perspective.
The agent does not append to Myria directly.
Myria tools expose global persistent memory. They are not the normal path for remembering the immediately preceding same-channel exchange.
10.3 External MCP tools
External MCP servers are registered into the same tool plane and exposed alongside built-ins.
11. Workspace and VFS model
Each channel has a host directory, but tools see a mounted VFS view.
Mount layout:
- /: writable channel workspace
- /.host-path/<n>: read-only mounted host PATH directories
The host runtime owns lease enforcement so overlapping agent activity cannot corrupt the workspace.
Long-running process sessions and browser sessions retain their lease ownership while active.
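The mount layout above can be sketched as a path resolver. This is illustrative only: Target and Resolve are hypothetical, treating <n> as a zero-based index into a list of host directories is an assumption, and the sketch handles only lexical ".." escapes, not symlinks or leases.

```go
package main

import (
	"fmt"
	"path"
	"path/filepath"
	"strconv"
	"strings"
)

// Target is the host location a VFS path maps to.
type Target struct {
	HostPath string
	ReadOnly bool
}

// Resolve maps a VFS path to a host path. Cleaning the path against "/" first
// collapses any ".." segments, so the result cannot climb above its mount.
func Resolve(workspace string, hostPaths []string, vpath string) (Target, error) {
	clean := path.Clean("/" + vpath)
	const mount = "/.host-path"
	if clean == mount {
		return Target{}, fmt.Errorf("%s needs a mount index", mount)
	}
	if strings.HasPrefix(clean, mount+"/") {
		rest := strings.TrimPrefix(clean, mount+"/")
		index, sub, _ := strings.Cut(rest, "/")
		n, err := strconv.Atoi(index)
		if err != nil || n < 0 || n >= len(hostPaths) {
			return Target{}, fmt.Errorf("no host mount %q", index)
		}
		return Target{HostPath: filepath.Join(hostPaths[n], filepath.FromSlash(sub)), ReadOnly: true}, nil
	}
	return Target{HostPath: filepath.Join(workspace, filepath.FromSlash(clean)), ReadOnly: false}, nil
}

func main() {
	hostPaths := []string{"/usr/bin"}
	t, err := Resolve("/root/workspaces/ch-1", hostPaths, "../../etc/passwd")
	fmt.Println(t.HostPath, t.ReadOnly, err) // /root/workspaces/ch-1/etc/passwd false <nil>
	t, err = Resolve("/root/workspaces/ch-1", hostPaths, "/.host-path/0/ls")
	fmt.Println(t.HostPath, t.ReadOnly, err) // /usr/bin/ls true <nil>
}
```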
12. Memory integration
Myria is supervised as a subprocess.
Current default:
- file-backed SQLite for convenience in local deployment and testing
Fuller backend:
- PostgreSQL remains the richer Myria storage mode
The generated Myria config is derived from MaestroBot config.yaml.
In the current MaestroBot version, SQLite is the generated convenience
path.
13. MCP integration
MaestroBot supports MCP in three ways:
- built-in Myria over stdio
- external local MCP servers over stdio
- external remote MCP servers over:
  - Streamable HTTP
  - legacy HTTP+SSE
The control plane supports:
- explicit stdio registration
- explicit remote registration
- manifest discovery
- manifest import
- enable/disable
- inspect
- removal
Discovery currently recognizes:
- mcp.json
- .mcp.json
- mcp-server.json
- claude_desktop_config.json
Compatibility probing is real: the daemon connects to the server,
performs initialize, sends notifications/initialized, and probes
tools/list.
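The three probe messages can be sketched as JSON-RPC 2.0 payloads. The sketch only builds the messages. The helper names, the client name and version, and the protocolVersion value are assumptions, and a real client waits for the initialize response before sending the notification.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// rpcMessage is a JSON-RPC 2.0 request or notification.
type rpcMessage struct {
	JSONRPC string `json:"jsonrpc"`
	ID      *int   `json:"id,omitempty"` // omitted for notifications
	Method  string `json:"method"`
	Params  any    `json:"params,omitempty"`
}

func request(id int, method string, params any) rpcMessage {
	return rpcMessage{JSONRPC: "2.0", ID: &id, Method: method, Params: params}
}

func notification(method string) rpcMessage {
	return rpcMessage{JSONRPC: "2.0", Method: method}
}

// ProbeSequence returns the probe messages in send order, one JSON document
// per element, suitable for a newline-delimited stdio transport.
func ProbeSequence(clientName, clientVersion string) ([]string, error) {
	msgs := []rpcMessage{
		request(1, "initialize", map[string]any{
			"protocolVersion": "2024-11-05", // assumed; the daemon's version is not stated
			"capabilities":    map[string]any{},
			"clientInfo":      map[string]string{"name": clientName, "version": clientVersion},
		}),
		notification("notifications/initialized"),
		request(2, "tools/list", nil),
	}
	lines := make([]string, 0, len(msgs))
	for _, m := range msgs {
		b, err := json.Marshal(m)
		if err != nil {
			return nil, err
		}
		lines = append(lines, string(b))
	}
	return lines, nil
}

func main() {
	lines, err := ProbeSequence("probe-client", "0.0.0")
	if err != nil {
		fmt.Println("marshal failed:", err)
		return
	}
	for _, l := range lines {
		fmt.Println(l)
	}
}
```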
14. Control plane
The control plane runs over a Unix socket.
It covers:
- root bootstrap
- daemon lifecycle
- channel management
- chat injection
- workspace inspection
- identity association
- runtime pause/resume
- MCP management
- model/provider preflight tests
maestrobot chat --verbose tails daemon-originated trace logs rather
than inventing client-side logging.
15. Failure model
Important failure behaviors:
- daemon start performs model/provider preflight first
- runtime provider failures are logged verbosely by the daemon
- resident execution contexts are snapshotted before unhealthy shutdown
- tool server failures remain isolated from the rest of the runtime
- paging snapshots and workspaces survive daemon restart
16. Current implementation notes
The first version is real and usable, but still intentionally local and host-centric.
Important present constraints:
- Linux only
- systemd --user assumed
- one daemon binary and one local root
- browser tooling depends on Playwright Chromium
- image OCR depends on local tesseract
The core design, though, is now stable:
- user-editable Maestro programs in the runtime root
- durable host-managed channel runtime
- supervised Myria
- unified built-in and MCP tool plane