Page
Memory Architecture
Each agent has four memory tiers. Session-derived memory (tiers 1-3) is automated by choird. Knowledge (tier 4) is explicitly managed by the agent or operator.
Overview
| Tier | Name | Location | Lifecycle | Vectorized | Access |
|---|---|---|---|---|---|
| 1 | Working memory | choir-agent process | Current session | No (in-memory) | Own session only |
| 2 | Mid-term memory | Postgres | Last N sessions (default 10) | Yes | Any agent (read), own agent (write) |
| 3 | Long-term memory | Postgres | Older sessions | Summaries only | Any agent (read), own agent (write) |
| 4 | Knowledge | Postgres | Persistent, agent-managed | Yes | Any agent (read), own agent (write) |
Access Control
| Operation | Scope |
|---|---|
| Write working memory | Own session only (automatic via arbiter) |
| Read working memory | Own session only |
| Write mid-term / long-term | Own agent only (choird-automated) |
| Read mid-term / long-term | Any agent’s (cross-agent reads allowed) |
| Write knowledge | Own agent only |
| Read knowledge | Any agent’s |
Cross-agent reads go through choird’s EXECUTE_HOST_TOOL handler, which uses the admin Postgres connection to query across schemas. Agents never directly access another agent’s schema.
Tier 1: Working Memory
Lives in the choir-agent process. Per-lane (edge and core each maintain their own view).
A. Event Window
The last N committed events (full payloads with hash references), injected directly into the LLM prompt. N is configurable (default ~50 events). When the window fills, oldest events roll off into the reference summary.
Compactable content (eligible for rolling off): UserMsg events
(except the most recent), LLM outputs (ModelOutput), and tool outputs
(ToolResultCommitted).
Never compacted (always present in full):
- System prompt (lane instructions + skill summaries)
- Identity files (
USER.md+SOUL.mdfor edge;SOUL-CORE.mdfor core) - The most recent
UserMsgevent (the message being responded to). Older user messages ARE compactable. CoreJobStart(core lane only)
B. Per-Lane Reference Summary
A mutable structured document summarizing everything that has rolled off the event window. Updated via LLM-generated structured deltas. Edge and core maintain separate summaries.
{
"summary": "...",
"facts": [
{ "key": "...", "value": "...", "source_event": "ev-hash-123" }
],
"referenced_sessions": ["session-abc"],
"referenced_events": ["ev-hash-001", "ev-hash-002"]
}
Updates use a structured delta format to prevent total corruption:
{
"memory_delta": {
"mode": "append | overwrite",
"summary_update": "...",
"add_references": ["hash"],
"remove_references": ["hash"],
"add_fact": { "..." },
"remove_fact": { "..." }
}
}
LLM Prompt Structure
Each lane’s prompt is assembled as a sequence of chat messages. Tool
schemas are NOT in the prompt — they are passed via the OpenAI-compatible
tools API parameter.
Edge lane:
[system: edge behavioral instructions + skill summaries]
[system: USER.md] -- user identity (NEVER compacted)
[system: SOUL.md] -- edge personality (NEVER compacted)
[system: reference summary] -- materialized summary of compacted events
[event window messages] -- user/assistant/tool messages from recent events
Core lane:
[system: core behavioral instructions + skill summaries]
[system: SOUL-CORE.md] -- core personality (NEVER compacted, no USER.md)
[system: reference summary] -- materialized summary of compacted events
[event window messages] -- user/assistant/tool messages from recent events
[system: CoreJobStart] -- task briefing from edge (inserted by core)
Skill summaries are generated from the name and description fields
in each loaded SkillSpec JSON file. They are appended to the system
prompt so the LLM knows which skills are available.
Event window messages are reconstructed in OpenAI message format:
UserMsgevents becomerole: "user"messagesModelOutputevents becomerole: "assistant"messages (includingtool_callsif the model requested tools)ToolResultCommittedevents becomerole: "tool"messages with matchingtool_call_idInjectedInstructionevents becomerole: "system"messages with[INJECTED]prefix
Compaction Triggers
Compaction updates the reference summary by folding the oldest compactable events out of the event window.
| Trigger | Description |
|---|---|
| Automatic | When compactable content exceeds a configurable threshold (default: 80% of context window minus non-compactable content). Measured after each LLM response or tool result. |
| Manual (choirctl) | choirctl session compact <session-id> |
| Manual (gateway) | /compact command |
| Manual (agent) | choir.memory.compact tool |
Compaction runs asynchronously within the lane. The updated reference summary is swapped in atomically when complete. If the compaction LLM call fails, the event window retains its current contents and compaction retries on the next trigger.
Persistence
choird snapshots both lanes’ working memory (reference summaries + event window boundaries) via heartbeat replication. On crash recovery, reference summaries are restored from the snapshot; the event window is rebuilt from the session_events tail.
Tier 2: Mid-Term Memory
Stored in Postgres. Contains the last N sessions (configurable, default 10) with full events chunked and vectorized.
When a session ends (graceful stop), choird:
- Chunks the session events into logical blocks (by tool sequence, skill phase, or fixed size).
- Embeds each chunk via the embedding pipeline.
- Stores in
memory_documentswithtier = 'mid_term'.
Raw session_events rows are retained alongside chunks (redundant but preserves full granularity for replay and audit).
Searchable via choir.memory.query with store: "session", mode semantic or text.
Tier 3: Long-Term Memory
Stored in Postgres. Contains sessions that have aged out of mid-term. Only summaries are vectorized; full event detail is retained but not indexed for vector search.
When a session ages out of mid-term (session count exceeds N):
- The agent’s session summary (generated during graceful shutdown via LLM call) is chunked into logical partitions.
- Summary chunks are embedded and stored as
tier = 'long_term_summary'(vectorized, searchable). - Mid-term event chunks are re-marked as
tier = 'long_term_detail'– kept but vectors are dropped from the HNSW index. - Raw
session_eventsrows remain unchanged.
Default semantic search (via store: "session") hits mid-term chunks + long-term summaries. To drill into a specific long-term session’s full events, the agent uses choir.memory.query with mode session_detail and an explicit session_id.
Tier 4: Knowledge
Stored in Postgres. Separate from session-derived memory. Not automated by choird. Explicitly managed by the agent via choir.memory.upsert or by the operator via choirctl.
Stores persistent facts, user preferences, domain notes, reference material, project context – anything not tied to a specific session.
Supports insert, update-by-key (optional dedup key), and delete. Vectorized and searchable.
Memory Tool Surface
| Tool | store |
Modes | Notes |
|---|---|---|---|
choir.memory.query |
working |
keyword, hash reference | Current session in-memory log. Own agent only. |
choir.memory.query |
session |
semantic, text, session_detail |
Mid-term + long-term summaries. session_detail requires session_id. Cross-agent reads allowed via target_agent. |
choir.memory.query |
knowledge |
semantic, text |
Knowledge base. Cross-agent reads allowed via target_agent. |
choir.memory.upsert |
knowledge |
insert, update-by-key, delete | Own agent’s knowledge store only. |
choir.memory.compact |
working |
(triggers compaction) | Forces reference summary update for the calling lane. |
Session Shutdown Pipeline
During graceful stop (choirctl agent stop / agent update):
- Agent completes current safe point.
- Agent generates session summary (LLM call, structured output). Timeout: 30 seconds (configurable); on timeout, summary is skipped.
- Agent includes summary in final heartbeat payload (if generated).
- Agent flushes all unreplicated events to choird.
- choird persists everything to Postgres.
- choird chunks and embeds session events into
memory_documents(mid-term). - If mid-term session count exceeds N, the oldest session is promoted to long-term (summary chunks vectorized, event chunks retain text but drop vectors).
Embedding Pipeline
Integrated into choird (not a separate service):
| Property | Value |
|---|---|
| Batch size | Up to 32 chunks per API call |
| Queue flush | On heartbeat tick or when queue reaches batch size |
| Retry | Exponential backoff with jitter, max 3 retries (1s/2s/4s base) |
| Graceful degradation | On final failure, store chunk without vector and log warning. Never block tool execution on embedding failure. |
| Fallback search | Chunks without vectors are still searchable via full-text search (tsvector/GIN index) |
The embedding model is configured in config.json (default: text-embedding-3-small, 1536 dimensions). Swapping models requires a re-embedding migration but no code change.
Postgres Schema
Each agent gets its own Postgres schema (choir_<agent_id>) with:
memory_documents– chunked session events and long-term summaries (tsvector + GIN indexed)memory_embeddings– vector embeddings (HNSW indexed, pgvector)knowledge_documents– agent-managed persistent knowledgeknowledge_embeddings– knowledge vector embeddings
Control plane tables live in the shared choir_control schema. See DESIGN.md section 23 for full schema definitions.