Memory Architecture | Peisong's Lighthouse

Each agent has four memory tiers. Session-derived memory (tiers 1-3) is automated by choird. Knowledge (tier 4) is explicitly managed by the agent or operator.

Overview

Tier	Name	Location	Lifecycle	Vectorized	Access
1	Working memory	choir-agent process	Current session	No (in-memory)	Own session only
2	Mid-term memory	Postgres	Last N sessions (default 10)	Yes	Any agent (read), own agent (write)
3	Long-term memory	Postgres	Older sessions	Summaries only	Any agent (read), own agent (write)
4	Knowledge	Postgres	Persistent, agent-managed	Yes	Any agent (read), own agent (write)

Access Control

Operation	Scope
Write working memory	Own session only (automatic via arbiter)
Read working memory	Own session only
Write mid-term / long-term	Own agent only (choird-automated)
Read mid-term / long-term	Any agent’s (cross-agent reads allowed)
Write knowledge	Own agent only
Read knowledge	Any agent’s

Cross-agent reads go through choird’s EXECUTE_HOST_TOOL handler, which uses the admin Postgres connection to query across schemas. Agents never directly access another agent’s schema.

Tier 1: Working Memory

Lives in the choir-agent process. Per-lane (edge and core each maintain their own view).

A. Event Window

The last N committed events (full payloads with hash references), injected directly into the LLM prompt. N is configurable (default ~50 events). When the window fills, oldest events roll off into the reference summary.

Compactable content (eligible for rolling off): UserMsg events (except the most recent), LLM outputs (ModelOutput), and tool outputs (ToolResultCommitted).

Never compacted (always present in full):

System prompt (lane instructions + skill summaries)
Identity files (USER.md + SOUL.md for edge; SOUL-CORE.md for core)
The most recent UserMsg event (the message being responded to). Older user messages ARE compactable.
CoreJobStart (core lane only)

B. Per-Lane Reference Summary

A mutable structured document summarizing everything that has rolled off the event window. Updated via LLM-generated structured deltas. Edge and core maintain separate summaries.

{
  "summary": "...",
  "facts": [
    { "key": "...", "value": "...", "source_event": "ev-hash-123" }
  ],
  "referenced_sessions": ["session-abc"],
  "referenced_events": ["ev-hash-001", "ev-hash-002"]
}

Updates use a structured delta format to prevent total corruption:

{
  "memory_delta": {
    "mode": "append | overwrite",
    "summary_update": "...",
    "add_references": ["hash"],
    "remove_references": ["hash"],
    "add_fact": { "..." },
    "remove_fact": { "..." }
  }
}

LLM Prompt Structure

Each lane’s prompt is assembled as a sequence of chat messages. Tool schemas are NOT in the prompt — they are passed via the OpenAI-compatible tools API parameter.

Edge lane:

[system: edge behavioral instructions + skill summaries]
[system: USER.md]            -- user identity (NEVER compacted)
[system: SOUL.md]            -- edge personality (NEVER compacted)
[system: reference summary]  -- materialized summary of compacted events
[event window messages]      -- user/assistant/tool messages from recent events

Core lane:

[system: core behavioral instructions + skill summaries]
[system: SOUL-CORE.md]      -- core personality (NEVER compacted, no USER.md)
[system: reference summary]  -- materialized summary of compacted events
[event window messages]      -- user/assistant/tool messages from recent events
[system: CoreJobStart]       -- task briefing from edge (inserted by core)

Skill summaries are generated from the name and description fields in each loaded SkillSpec JSON file. They are appended to the system prompt so the LLM knows which skills are available.

Event window messages are reconstructed in OpenAI message format:

UserMsg events become role: "user" messages
ModelOutput events become role: "assistant" messages (including tool_calls if the model requested tools)
ToolResultCommitted events become role: "tool" messages with matching tool_call_id
InjectedInstruction events become role: "system" messages with [INJECTED] prefix

Compaction Triggers

Compaction updates the reference summary by folding the oldest compactable events out of the event window.

Trigger	Description
Automatic	When compactable content exceeds a configurable threshold (default: 80% of context window minus non-compactable content). Measured after each LLM response or tool result.
Manual (choirctl)	`choirctl session compact <session-id>`
Manual (gateway)	`/compact` command
Manual (agent)	`choir.memory.compact` tool

Compaction runs asynchronously within the lane. The updated reference summary is swapped in atomically when complete. If the compaction LLM call fails, the event window retains its current contents and compaction retries on the next trigger.

Persistence

choird snapshots both lanes’ working memory (reference summaries + event window boundaries) via heartbeat replication. On crash recovery, reference summaries are restored from the snapshot; the event window is rebuilt from the session_events tail.

Tier 2: Mid-Term Memory

Stored in Postgres. Contains the last N sessions (configurable, default 10) with full events chunked and vectorized.

When a session ends (graceful stop), choird:

Chunks the session events into logical blocks (by tool sequence, skill phase, or fixed size).
Embeds each chunk via the embedding pipeline.
Stores in memory_documents with tier = 'mid_term'.

Raw session_events rows are retained alongside chunks (redundant but preserves full granularity for replay and audit).

Searchable via choir.memory.query with store: "session", mode semantic or text.

Tier 3: Long-Term Memory

Stored in Postgres. Contains sessions that have aged out of mid-term. Only summaries are vectorized; full event detail is retained but not indexed for vector search.

When a session ages out of mid-term (session count exceeds N):

The agent’s session summary (generated during graceful shutdown via LLM call) is chunked into logical partitions.
Summary chunks are embedded and stored as tier = 'long_term_summary' (vectorized, searchable).
Mid-term event chunks are re-marked as tier = 'long_term_detail' – kept but vectors are dropped from the HNSW index.
Raw session_events rows remain unchanged.

Default semantic search (via store: "session") hits mid-term chunks + long-term summaries. To drill into a specific long-term session’s full events, the agent uses choir.memory.query with mode session_detail and an explicit session_id.

Tier 4: Knowledge

Stored in Postgres. Separate from session-derived memory. Not automated by choird. Explicitly managed by the agent via choir.memory.upsert or by the operator via choirctl.

Stores persistent facts, user preferences, domain notes, reference material, project context – anything not tied to a specific session.

Supports insert, update-by-key (optional dedup key), and delete. Vectorized and searchable.

Memory Tool Surface

Tool	`store`	Modes	Notes
`choir.memory.query`	`working`	keyword, hash reference	Current session in-memory log. Own agent only.
`choir.memory.query`	`session`	`semantic`, `text`, `session_detail`	Mid-term + long-term summaries. `session_detail` requires `session_id`. Cross-agent reads allowed via `target_agent`.
`choir.memory.query`	`knowledge`	`semantic`, `text`	Knowledge base. Cross-agent reads allowed via `target_agent`.
`choir.memory.upsert`	`knowledge`	insert, update-by-key, delete	Own agent’s knowledge store only.
`choir.memory.compact`	`working`	(triggers compaction)	Forces reference summary update for the calling lane.

Session Shutdown Pipeline

During graceful stop (choirctl agent stop / agent update):

Agent completes current safe point.
Agent generates session summary (LLM call, structured output). Timeout: 30 seconds (configurable); on timeout, summary is skipped.
Agent includes summary in final heartbeat payload (if generated).
Agent flushes all unreplicated events to choird.
choird persists everything to Postgres.
choird chunks and embeds session events into memory_documents (mid-term).
If mid-term session count exceeds N, the oldest session is promoted to long-term (summary chunks vectorized, event chunks retain text but drop vectors).

Embedding Pipeline

Integrated into choird (not a separate service):

Property	Value
Batch size	Up to 32 chunks per API call
Queue flush	On heartbeat tick or when queue reaches batch size
Retry	Exponential backoff with jitter, max 3 retries (1s/2s/4s base)
Graceful degradation	On final failure, store chunk without vector and log warning. Never block tool execution on embedding failure.
Fallback search	Chunks without vectors are still searchable via full-text search (tsvector/GIN index)

The embedding model is configured in config.json (default: text-embedding-3-small, 1536 dimensions). Swapping models requires a re-embedding migration but no code change.

Postgres Schema

Each agent gets its own Postgres schema (choir_<agent_id>) with:

memory_documents – chunked session events and long-term summaries (tsvector + GIN indexed)
memory_embeddings – vector embeddings (HNSW indexed, pgvector)
knowledge_documents – agent-managed persistent knowledge
knowledge_embeddings – knowledge vector embeddings

Control plane tables live in the shared choir_control schema. See DESIGN.md section 23 for full schema definitions.