Memory System

MateClaw provides a complete multi-layer memory system covering in-session short-term context management, event-driven automatic memory extraction, file-persisted workspace memory, and scheduled memory consolidation.

Architecture Overview

User sends message


┌─────────────────────────────┐
│  Short-Term Memory           │
│  ConversationWindowManager   │  In-session context window management
│  (mate_message table)        │
└─────────────────────────────┘

     │ Publishes ConversationCompletedEvent after conversation

┌─────────────────────────────┐
│  Memory Extraction (async)   │
│  MemorySummarizationService  │  Event-driven with cooldown & per-agent locking
│  PostConversationMemory      │
│  Listener                    │
└─────────────────────────────┘

     │ LLM analyzes conversation → JSON response

┌─────────────────────────────┐
│  Workspace Files             │
│  PROFILE.md  (user profile)  │  Full replace
│  MEMORY.md   (core memory)   │  Full replace
│  memory/YYYY-MM-DD.md        │  Append
└─────────────────────────────┘

     │ Periodically scans daily notes, extracts patterns
┌─────────────────────────────┐
│  Memory Consolidation        │
│  (CronJob-driven)            │
│  MemoryEmergenceService      │  Default: daily at 2:00 AM
└─────────────────────────────┘

Short-Term Memory

Short-term memory is the conversation history within the current session, stored in the mate_message table and managed by ConversationWindowManager.

Context Window

Before each LLM call, the system assembles the current conversation history into a context window:

[System Prompt]                            ← Always preserved
[Workspace file injection]                 ← AGENTS.md / SOUL.md / PROFILE.md / MEMORY.md
[Conversation context summary (if any)]    ← Compressed early conversation
[Message 1: user]
[Message 2: assistant]
...
[Current user message]

Workspace files are injected via WorkspaceFileService.buildSystemPrompt(), sorted by sort_order, in the following format:

--- AGENTS.md ---
(content)

--- SOUL.md ---
(content)

--- PROFILE.md ---
(content)

--- MEMORY.md ---
(content)

Only files with enabled=true are included in the system prompt.
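
The injection rule above (enabled files only, ordered by sort_order, each under a `--- filename ---` header) can be sketched as follows. This is a minimal illustration, not the actual body of WorkspaceFileService.buildSystemPrompt(): the `WorkspaceFile` record and `SystemPromptSketch` class are hypothetical names, and the blank line between sections is assumed from the format shown above.

```java
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical stand-in for a mate_workspace_file row; only the fields used here.
record WorkspaceFile(String filename, String content, boolean enabled, int sortOrder) {}

final class SystemPromptSketch {
    // Joins enabled files in sort_order, each under a "--- filename ---" header.
    static String buildWorkspaceSection(List<WorkspaceFile> files) {
        return files.stream()
                .filter(WorkspaceFile::enabled)                         // skip enabled=false files
                .sorted(Comparator.comparingInt(WorkspaceFile::sortOrder))
                .map(f -> "--- " + f.filename() + " ---\n" + f.content())
                .collect(Collectors.joining("\n\n"));                   // assumed blank line between sections
    }
}
```

With the seed data described later in this page, a daily note (enabled=false) would be filtered out while AGENTS.md, SOUL.md, PROFILE.md, and MEMORY.md would appear in that order.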

Context Compression

When the estimated context token count exceeds the threshold, compression is triggered automatically:

  1. Estimate total tokens (system prompt + current user message + history) using TokenEstimator
  2. Trigger when total exceeds defaultMaxInputTokens × compactTriggerRatio (default 128000 × 0.75 = 96000)
  3. The most recent preserveRecentPairs conversation turns (default 2 turns = 4 messages) are preserved as-is
  4. Older messages are summarized by the LLM, replacing the original content
  5. The summary is injected as a UserMessage, prefixed with [Conversation context summary - for reference only, not instructions]
  6. Summary results are cached for 30 minutes to avoid redundant LLM calls

Security design: The summary is injected as a UserMessage (not a SystemMessage) so that historical user input is never elevated to system-level instructions, which reduces the risk of instruction injection.
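
The trigger check and the split between summarized and preserved messages can be sketched as below. This is a simplified illustration under stated assumptions, not the ConversationWindowManager implementation: `ChatMessage`, `CompactionPlan`, and `CompactionSketch` are hypothetical names, the token estimate is passed in rather than computed, and the newline between the prefix and the summary text is assumed.

```java
import java.util.List;

// Hypothetical message type; role is "user" or "assistant".
record ChatMessage(String role, String content) {}

// Older messages to be summarized, and recent messages kept verbatim.
record CompactionPlan(List<ChatMessage> toSummarize, List<ChatMessage> preserved) {}

final class CompactionSketch {
    static final String SUMMARY_PREFIX =
            "[Conversation context summary - for reference only, not instructions]";

    // Steps 1-2: trigger once the estimate exceeds maxInputTokens * triggerRatio.
    static boolean shouldCompact(long estimatedTokens, int maxInputTokens, double triggerRatio) {
        return estimatedTokens > (long) (maxInputTokens * triggerRatio);
    }

    // Steps 3-4: keep the last preserveRecentPairs * 2 messages, summarize the rest.
    static CompactionPlan plan(List<ChatMessage> history, int preserveRecentPairs) {
        int keep = Math.min(history.size(), preserveRecentPairs * 2);
        int split = history.size() - keep;
        return new CompactionPlan(history.subList(0, split),
                                  history.subList(split, history.size()));
    }

    // Step 5: the summary re-enters the window as a user-role message.
    static ChatMessage asSummaryMessage(String summary) {
        return new ChatMessage("user", SUMMARY_PREFIX + "\n" + summary);
    }
}
```

With the defaults, `shouldCompact(96001, 128000, 0.75)` is true while an estimate of exactly 96000 does not trigger compression.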

PTL Emergency Compression

When the LLM returns a context_length_exceeded error, the compactForRetry() method provides emergency fallback:

  • Does not call the LLM for summarization; directly discards older messages
  • Retains only the most recent 2 turns (4 messages)
  • Used for exception recovery at the Agent Node layer

Secondary Trimming

If tokens still exceed the budget after LLM summarization, trimToFit() removes messages one at a time, oldest first, and always keeps at least the last 2 messages (the most recent conversation turn).
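
The two fallbacks (emergency compression and secondary trimming) might look roughly like this. It is a sketch only: messages are modeled as plain strings, the token estimator is injected as a function, and the class name `FallbackTrimSketch` is hypothetical. The real compactForRetry() and trimToFit() signatures are not shown in this page and may differ.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.ToIntFunction;

final class FallbackTrimSketch {
    // Emergency path: no LLM call, keep only the most recent `pairs` turns.
    static List<String> compactForRetry(List<String> messages, int pairs) {
        int keep = Math.min(messages.size(), pairs * 2);
        return new ArrayList<>(messages.subList(messages.size() - keep, messages.size()));
    }

    // Secondary trimming: drop the oldest message until the budget fits,
    // never going below the last two messages.
    static List<String> trimToFit(List<String> messages, int tokenBudget,
                                  ToIntFunction<String> estimator) {
        List<String> result = new ArrayList<>(messages);
        while (result.size() > 2 && total(result, estimator) > tokenBudget) {
            result.remove(0); // oldest first
        }
        return result;
    }

    private static int total(List<String> messages, ToIntFunction<String> estimator) {
        return messages.stream().mapToInt(estimator).sum();
    }
}
```

Note that trimToFit() can still return a window over budget when the last two messages alone exceed it; how the real implementation handles that case is not described here.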

Short-Term Memory Configuration

yaml
mate:
  agent:
    conversation:
      window:
        default-max-input-tokens: 128000   # Maximum input token count
        compact-trigger-ratio: 0.75        # Compression trigger ratio
        preserve-recent-pairs: 2           # Recent conversation turns to preserve
        summary-max-tokens: 300            # Max tokens for context summary output

Config prefix: mate.agent.conversation.window, mapped to ConversationWindowProperties.

Automatic Memory Extraction

After a conversation completes, the system asynchronously extracts memorable information and writes it to workspace files. The entire process does not block the user response.

Trigger Flow

  1. After a conversation completes, ChatController or ChannelMessageRouter publishes a ConversationCompletedEvent
  2. PostConversationMemoryListener handles the event (@Async + @EventListener), executing in a separate thread
  3. The listener performs pre-checks:
    • Whether autoSummarizeEnabled is turned on
    • Whether the conversation was triggered by a CronJob: when triggerSource == "cron" and skipCronConversations is true, the conversation is skipped to avoid recursion
    • Whether the message count meets minMessagesForSummarize (default 4)
    • Whether the last user message length meets minUserMessageLength (default 10)
  4. After passing all checks, calls MemorySummarizationService.analyzeAndUpdateMemory()
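
The pre-checks in step 3 amount to a single predicate over the event and the configuration. The sketch below assumes that "length" means character count and that a null user message fails the check; `ExtractionSettings` and `PreCheckSketch` are hypothetical names, and the event record is repeated here only to keep the example self-contained.

```java
// Mirrors the ConversationCompletedEvent record documented on this page.
record ConversationCompletedEvent(Long agentId, String conversationId, String userMessage,
                                  String assistantReply, int messageCount, String triggerSource) {}

// Hypothetical holder for the mate.memory settings used by the checks.
record ExtractionSettings(boolean autoSummarizeEnabled, boolean skipCronConversations,
                          int minMessagesForSummarize, int minUserMessageLength) {}

final class PreCheckSketch {
    // Returns true only if every pre-check passes.
    static boolean shouldExtract(ConversationCompletedEvent e, ExtractionSettings s) {
        if (!s.autoSummarizeEnabled()) return false;
        // Cron-triggered conversations are skipped to avoid recursion.
        if (s.skipCronConversations() && "cron".equals(e.triggerSource())) return false;
        if (e.messageCount() < s.minMessagesForSummarize()) return false;
        String user = e.userMessage();
        // Assumption: length is measured in characters; null counts as too short.
        return user != null && user.length() >= s.minUserMessageLength();
    }
}
```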

ConversationCompletedEvent

java
public record ConversationCompletedEvent(
    Long agentId,
    String conversationId,
    String userMessage,       // Last user message
    String assistantReply,    // Agent's final response
    int messageCount,         // Total message count in current session
    String triggerSource      // Trigger source: "web" / "channel" / "cron"
) {}

Concurrency Control

MemorySummarizationService uses two layers of protection:

  • Cooldown: The same Agent will not trigger extraction again within cooldownMinutes (default 5 minutes). Last execution times are tracked via ConcurrentHashMap<Long, Instant>
  • Per-Agent Lock: Each Agent has an independent ReentrantLock. Uses tryLock() for non-blocking acquisition — if an extraction task is already running, the new request is skipped
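
A minimal sketch of the two layers is shown below. It is not the MemorySummarizationService code: the class name `ExtractionGuardSketch` is hypothetical, the current time is passed in for testability, and two details are assumptions (the cooldown is checked while holding the lock, and the timestamp is recorded before the task runs).

```java
import java.time.Duration;
import java.time.Instant;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantLock;

final class ExtractionGuardSketch {
    private final ConcurrentHashMap<Long, Instant> lastRun = new ConcurrentHashMap<>();
    private final ConcurrentHashMap<Long, ReentrantLock> locks = new ConcurrentHashMap<>();
    private final Duration cooldown;

    ExtractionGuardSketch(Duration cooldown) {
        this.cooldown = cooldown;
    }

    // Runs the task unless the agent is already extracting or still cooling down.
    // Returns true if the task ran.
    boolean tryRun(long agentId, Instant now, Runnable task) {
        ReentrantLock lock = locks.computeIfAbsent(agentId, id -> new ReentrantLock());
        if (!lock.tryLock()) {
            return false; // another extraction for this agent is in progress
        }
        try {
            Instant last = lastRun.get(agentId);
            if (last != null && now.isBefore(last.plus(cooldown))) {
                return false; // still within the cooldown window
            }
            lastRun.put(agentId, now); // assumption: recorded before the task runs
            task.run();
            return true;
        } finally {
            lock.unlock();
        }
    }
}
```

Because tryLock() never blocks, a burst of ConversationCompletedEvent instances for the same Agent results in at most one running extraction, and the rest are dropped rather than queued.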

LLM Analysis Process

  1. Load conversation messages from the mate_message table (re-checks message count threshold)
  2. Read current PROFILE.md, MEMORY.md, and today's daily note content
  3. Build a conversation transcript:
    • Up to maxTranscriptMessages (default 30) messages
    • Each message is truncated to 2000 characters, with a ... [truncated] marker appended
    • Only user/assistant roles; tool and system messages are skipped
    • Formatted as User: xxx / Assistant: xxx
  4. Call the LLM using memory/summarize-system and memory/summarize-user prompt templates
  5. Use the system default model (ModelConfigService.getDefaultModel()) for the call
  6. Parse the JSON response (automatically handles markdown code block wrapping)
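
Step 3 (transcript building) can be illustrated as follows. The sketch assumes that, when the conversation holds more than maxTranscriptMessages entries, the most recent ones are kept, and that lines are separated by a newline; neither detail is stated above. `StoredMessage` and `TranscriptSketch` are hypothetical names.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical view of a mate_message row: role plus text content.
record StoredMessage(String role, String content) {}

final class TranscriptSketch {
    static final int MAX_CHARS = 2000;
    static final String MARKER = "... [truncated]";

    static String build(List<StoredMessage> messages, int maxTranscriptMessages) {
        // Only user/assistant roles; tool and system messages are skipped.
        List<StoredMessage> dialogue = messages.stream()
                .filter(m -> "user".equals(m.role()) || "assistant".equals(m.role()))
                .collect(Collectors.toList());
        // Assumption: when over the limit, the most recent messages are kept.
        int from = Math.max(0, dialogue.size() - maxTranscriptMessages);
        return dialogue.subList(from, dialogue.size()).stream()
                .map(m -> label(m.role()) + ": " + truncate(m.content()))
                .collect(Collectors.joining("\n"));
    }

    private static String label(String role) {
        return "user".equals(role) ? "User" : "Assistant";
    }

    private static String truncate(String text) {
        return text.length() <= MAX_CHARS ? text : text.substring(0, MAX_CHARS) + MARKER;
    }
}
```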

LLM Response Format

The LLM returns a JSON object with the following fields:

| Field | Type | Description |
| --- | --- | --- |
| should_update | boolean | Whether memory needs updating |
| reason | string | Reason for updating or not updating |
| daily_entry | string | Content to write to today's daily note |
| memory_update | string | Full new content for MEMORY.md |
| profile_update | string | Full new content for PROFILE.md |
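
Since models often wrap JSON in a markdown code block, the response has to be unwrapped before it is parsed. One possible way to do that is sketched below; the class and method names are hypothetical, and the actual parser may use a different strategy. The fence marker is built with repeat() purely to keep three literal backticks out of this example.

```java
final class JsonFenceSketch {
    // Three backticks, the markdown code fence marker.
    private static final String FENCE = "`".repeat(3);

    // Returns the payload without a surrounding code fence, if one is present.
    static String stripCodeFence(String raw) {
        String text = raw.strip();
        if (!text.startsWith(FENCE)) {
            return text; // already bare JSON
        }
        int firstNewline = text.indexOf('\n');
        if (firstNewline < 0) {
            return text; // malformed fence, leave untouched
        }
        // Drop the opening fence line (which may carry a language tag such as "json").
        String body = text.substring(firstNewline + 1);
        int closing = body.lastIndexOf(FENCE);
        if (closing >= 0) {
            body = body.substring(0, closing);
        }
        return body.strip();
    }
}
```

The returned string can then be handed to whichever JSON library the application uses to read the fields listed above.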

File Write Rules

  • PROFILE.md — Full replace, only written when profile_update is non-empty
  • MEMORY.md — Full replace, only written when memory_update is non-empty
  • memory/YYYY-MM-DD.md — Append mode, new content appended to the end; if the file does not exist, it is created with a date heading (# 2026-04-03)
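
The write rules can be expressed against an in-memory map from filename to content, as in the sketch below. It makes several assumptions that are not confirmed above: nothing is written when should_update is false, blank strings are treated as empty, the daily note is only touched when daily_entry is non-empty, and entries are separated by a blank line. `MemoryUpdate` and `WriteRulesSketch` are hypothetical names.

```java
import java.time.LocalDate;
import java.util.Map;

// Hypothetical Java view of the parsed LLM response.
record MemoryUpdate(boolean shouldUpdate, String reason, String dailyEntry,
                    String memoryUpdate, String profileUpdate) {}

final class WriteRulesSketch {
    // Applies the rules to a filename -> content map standing in for mate_workspace_file.
    static void apply(Map<String, String> files, MemoryUpdate u, LocalDate today) {
        if (!u.shouldUpdate()) return;                       // assumption: gate on should_update
        if (notBlank(u.profileUpdate())) files.put("PROFILE.md", u.profileUpdate()); // full replace
        if (notBlank(u.memoryUpdate())) files.put("MEMORY.md", u.memoryUpdate());    // full replace
        if (notBlank(u.dailyEntry())) {
            String name = "memory/" + today + ".md";         // LocalDate prints as YYYY-MM-DD
            String existing = files.get(name);
            // A new daily note starts with a date heading; otherwise append to the end.
            String base = existing != null ? existing : "# " + today + "\n";
            files.put(name, base + "\n" + u.dailyEntry() + "\n");
        }
    }

    private static boolean notBlank(String s) {
        return s != null && !s.isBlank();
    }
}
```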

Workspace Memory Files

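Each Agent maintains its own set of workspace files, stored as rows in the mate_workspace_file database table. The filename column holds the relative path, which gives the following logical layout: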

workspace/{agentId}/
├── AGENTS.md          # Agent behavior guide (memory system usage, tool references)
├── SOUL.md            # Core identity and personality definition
├── PROFILE.md         # User profile (interests, preferences, background)
├── MEMORY.md          # Core memory (important long-term information)
└── memory/
    ├── 2026-03-30.md  # Daily notes
    ├── 2026-03-31.md
    └── 2026-04-01.md

Database Table Schema

sql
CREATE TABLE mate_workspace_file (
    id          BIGINT PRIMARY KEY,
    agent_id    BIGINT NOT NULL,
    filename    VARCHAR(256) NOT NULL,    -- Relative path: PROFILE.md, memory/2026-04-01.md
    content     CLOB,                     -- Markdown content
    file_size   BIGINT DEFAULT 0,         -- Bytes
    enabled     BOOLEAN DEFAULT FALSE,    -- Include in system prompt?
    sort_order  INT DEFAULT 0,            -- Order for system prompt injection
    create_time DATETIME NOT NULL,
    update_time DATETIME NOT NULL,
    deleted     INT DEFAULT 0             -- Logical delete
);

AGENTS.md

The Agent's behavior guide file, describing memory system usage, file purposes, writing principles, and available tool references. Created by seed data with enabled=true, sort_order=0.

SOUL.md

The Agent's core identity and personality definition file, containing self-awareness, evolution guidance, and privacy/boundary principles. Created by seed data with enabled=true, sort_order=1.

PROFILE.md

Stores the Agent's long-term knowledge about the user, such as name, occupation, tech stack, and communication preferences. Updated by the LLM during memory extraction when changes are detected. Written using full replacement. Created by seed data with enabled=true, sort_order=2.

MEMORY.md

Stores cross-session core memory summaries, such as project context, important decisions, and to-do items. Updated both by the memory extraction service and the memory consolidation service. Created by seed data with enabled=true, sort_order=3.

Daily Note Files

Conversation highlights archived by date (memory/YYYY-MM-DD.md). Uses append mode — multiple conversations within a single day are all appended to the same file. These files serve as input for the memory consolidation service. Created with enabled=false, not automatically injected into the system prompt.

Memory Consolidation

Memory consolidation (Emergence) periodically scans recent daily notes, extracts recurring patterns and important information, and merges the results into MEMORY.md.

Trigger Methods

  • Automatic: Driven by a CronJob entry in the mate_cron_job table. The default cron expression is 0 2 * * * (daily at 2:00 AM). CronJob seed data is automatically created during system initialization, one per Agent
  • Manual: Call POST /api/v1/memory/{agentId}/emergence

Consolidation Flow

  1. Check whether emergenceEnabled is turned on
  2. List all memory/*.md files for the Agent, sorted by filename in reverse order, taking the most recent emergenceDayRange days (default 7)
  3. Read the content of those daily notes (formatted as ### memory/YYYY-MM-DD.md + content), along with the existing MEMORY.md
  4. Call the LLM using memory/emergence-system and memory/emergence-user prompt templates
  5. The LLM returns a JSON object with should_update, reason, and memory_content fields
  6. If should_update is true, MEMORY.md is fully replaced with memory_content
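
Steps 2 and 3 (selecting and formatting the daily notes) can be sketched as below. The example interprets "the most recent emergenceDayRange days" as the newest N files by filename, which is an assumption, and relies on YYYY-MM-DD names sorting chronologically as strings. `EmergenceInputSketch` is a hypothetical name.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

final class EmergenceInputSketch {
    // Picks the newest `dayRange` daily notes; ISO dates sort correctly as strings.
    static List<String> recentDailyNotes(List<String> filenames, int dayRange) {
        return filenames.stream()
                .filter(n -> n.startsWith("memory/") && n.endsWith(".md"))
                .sorted(Comparator.reverseOrder())
                .limit(dayRange)
                .collect(Collectors.toList());
    }

    // Formats each note as "### filename" followed by its content.
    static String format(List<String> names, Map<String, String> contents) {
        return names.stream()
                .map(n -> "### " + n + "\n" + contents.getOrDefault(n, ""))
                .collect(Collectors.joining("\n\n"));
    }
}
```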

CronJob Trigger Mechanism

In the CronJob seed data, each Agent has a corresponding consolidation task:

  • Task type is text, with a trigger message that guides the Agent to perform memory review
  • triggerSource is marked as "cron", so the memory extraction listener automatically skips it when skipCronConversations=true, preventing recursion
  • 5-field cron expressions are automatically converted to Spring's 6-field format (prefixed with 0 for seconds)
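
The cron conversion in the last bullet is a small string transformation. A sketch, with the hypothetical name `CronFormatSketch` and the assumption that anything other than a 5-field expression is passed through unchanged:

```java
final class CronFormatSketch {
    // Prefixes a seconds field of 0 to a 5-field expression; other inputs pass through.
    static String toSpringCron(String expression) {
        String trimmed = expression.trim();
        int fields = trimmed.split("\\s+").length;
        return fields == 5 ? "0 " + trimmed : trimmed;
    }
}
```

The default expression 0 2 * * * therefore becomes 0 0 2 * * *, which Spring reads as second 0, minute 0, hour 2 of every day.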

WorkspaceMemoryTool

WorkspaceMemoryTool is a built-in tool exposed to Agents, allowing them to actively read and write workspace memory files during conversations.

Available Operations

| Method | Description |
| --- | --- |
| list_workspace_memory_files | List all workspace files for an Agent, with optional filename prefix filter, sorted by sort_order |
| read_workspace_memory_file | Read the content of a specific file, returns enabled, fileSize, content, updateTime |
| write_workspace_memory_file | Create or overwrite a file (full write), returns created/overwritten flag |
| edit_workspace_memory_file | Edit file content via exact find-and-replace (incremental update), supports replaceAll parameter |

Usage Examples

List files:

json
// Input
{"agentId": 1, "filenamePrefix": "memory/"}
// Output
{"agentId": 1, "count": 3, "files": [
  {"filename": "memory/2026-04-01.md", "enabled": false, "fileSize": 512, "updateTime": "..."},
  ...
]}

Read a file:

json
// Input
{"agentId": 1, "filename": "MEMORY.md"}
// Output
{"agentId": 1, "filename": "MEMORY.md", "enabled": true, "fileSize": 2048, "content": "...", "updateTime": "..."}

Write a file:

json
// Input
{"agentId": 1, "filename": "memory/2026-04-01.md", "content": "# 2026-04-01\n\n..."}
// Output
{"agentId": 1, "filename": "memory/2026-04-01.md", "created": true, "overwritten": false, "enabled": false, "bytesWritten": 256, "message": "Workspace memory file created"}

Edit a file:

json
// Input
{"agentId": 1, "filename": "MEMORY.md", "oldText": "old content", "newText": "new content", "replaceAll": false}
// Output
{"agentId": 1, "filename": "MEMORY.md", "replacements": 1, "replaceAll": false, "fileSizeAfter": 2100, "message": "Workspace memory file edited successfully"}

Security Restrictions

  • Only .md file extensions are supported
  • Absolute paths (/ or \ prefix) and directory traversal (..) are not allowed
  • The write operation fully overwrites existing files — read first before writing
  • Newly created files have enabled set to false by default (not automatically included in the system prompt). Core files (AGENTS.md, SOUL.md, PROFILE.md, MEMORY.md) are set to enabled=true by seed data
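
The filename restrictions could be enforced with a check along these lines. This is a conservative sketch rather than the tool's actual validation: `FilenameGuardSketch` is a hypothetical name, the extension check is assumed to be case-sensitive, and any occurrence of ".." is rejected, including inside an otherwise harmless name.

```java
final class FilenameGuardSketch {
    // Returns true if the relative filename satisfies the restrictions listed above.
    static boolean isAllowed(String filename) {
        if (filename == null || filename.isBlank()) return false;
        // No absolute paths.
        if (filename.startsWith("/") || filename.startsWith("\\")) return false;
        // No directory traversal (conservative: rejects ".." anywhere in the name).
        if (filename.contains("..")) return false;
        // Only Markdown files (assumed case-sensitive).
        return filename.endsWith(".md");
    }
}
```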

Configuration Reference

Memory Extraction & Consolidation Config

Config prefix: mate.memory, mapped to MemoryProperties.

yaml
mate:
  memory:
    # --- Automatic Memory Extraction ---
    auto-summarize-enabled: true       # Auto-extract memory after conversations
    min-messages-for-summarize: 4      # Minimum message count to trigger extraction
    min-user-message-length: 10        # Minimum user message length to trigger extraction
    skip-cron-conversations: true      # Skip CronJob-triggered conversations
    summary-max-tokens: 1000           # Maximum output tokens for the memory summary
    max-transcript-messages: 30        # Maximum messages in the transcript

    # --- Concurrency Control ---
    cooldown-minutes: 5                # Per-agent extraction cooldown (minutes)

    # --- Memory Consolidation ---
    emergence-enabled: true            # Enable periodic memory consolidation
    emergence-day-range: 7             # Number of days to scan during consolidation

Context Window Config

Config prefix: mate.agent.conversation.window, mapped to ConversationWindowProperties.

yaml
mate:
  agent:
    conversation:
      window:
        default-max-input-tokens: 128000   # Global default maximum input tokens
        compact-trigger-ratio: 0.75        # Compression trigger ratio
        preserve-recent-pairs: 2           # Recent conversation turns to preserve after compression
        summary-max-tokens: 300            # Max tokens for context summary output

API Endpoints

| Method | Path | Description |
| --- | --- | --- |
| POST | /api/v1/memory/{agentId}/emergence | Manually trigger memory consolidation (daily notes merged into MEMORY.md) |
| POST | /api/v1/memory/{agentId}/summarize/{conversationId} | Manually trigger memory extraction for a specific conversation |

Response Format

On success:

json
{
  "code": 200,
  "data": {
    "status": "completed"
  }
}

On failure:

json
{
  "code": 500,
  "msg": "Memory consolidation failed: ..."
}
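
For callers on the JVM, requests to the two endpoints can be built with the JDK's java.net.http API, as sketched below. The base URL is a placeholder supplied by the caller, authentication headers are omitted because this page does not describe them, and the conversationId is assumed to be URL-safe. `MemoryApiRequestSketch` is a hypothetical name.

```java
import java.net.URI;
import java.net.http.HttpRequest;

final class MemoryApiRequestSketch {
    // POST /api/v1/memory/{agentId}/emergence with an empty body.
    static HttpRequest emergenceRequest(String baseUrl, long agentId) {
        return HttpRequest.newBuilder()
                .uri(URI.create(baseUrl + "/api/v1/memory/" + agentId + "/emergence"))
                .POST(HttpRequest.BodyPublishers.noBody())
                .build();
    }

    // POST /api/v1/memory/{agentId}/summarize/{conversationId} with an empty body.
    // Assumption: conversationId contains only URL-safe characters.
    static HttpRequest summarizeRequest(String baseUrl, long agentId, String conversationId) {
        return HttpRequest.newBuilder()
                .uri(URI.create(baseUrl + "/api/v1/memory/" + agentId
                        + "/summarize/" + conversationId))
                .POST(HttpRequest.BodyPublishers.noBody())
                .build();
    }
}
```

A request built this way can be sent with HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString()), and the returned body checked for the code field shown above.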

Prompt Templates

The memory system uses the following prompt templates (located under resources/prompts/):

| Template | Purpose |
| --- | --- |
| memory/summarize-system | Memory extraction system prompt: instructs LLM to analyze conversations, deciding what to store in PROFILE/MEMORY/daily note |
| memory/summarize-user | Memory extraction user prompt: includes date, existing file contents, conversation transcript |
| memory/emergence-system | Memory consolidation system prompt: instructs LLM to extract stable patterns from daily notes |
| memory/emergence-user | Memory consolidation user prompt: includes existing MEMORY.md and recent daily notes |
| context/conversation-summary-system | Context compression system prompt: instructs LLM to generate conversation summaries |
| context/conversation-summary-user | Context compression user prompt: includes conversation content to be compressed |

Key Classes

| Class | Description |
| --- | --- |
| vip.mate.memory.MemoryProperties | Memory extraction & consolidation config properties |
| vip.mate.config.ConversationWindowProperties | Context window config properties |
| vip.mate.memory.event.ConversationCompletedEvent | Conversation completed event (record type) |
| vip.mate.memory.listener.PostConversationMemoryListener | Async event listener |
| vip.mate.memory.service.MemorySummarizationService | Memory extraction service (with cooldown & per-agent locking) |
| vip.mate.memory.service.MemoryEmergenceService | Memory consolidation service |
| vip.mate.memory.controller.MemoryController | REST API controller |
| vip.mate.tool.builtin.WorkspaceMemoryTool | Agent-callable workspace memory tool |
| vip.mate.agent.context.ConversationWindowManager | Conversation context window management (with PTL recovery) |
| vip.mate.agent.context.TokenEstimator | Token count estimation utility |
| vip.mate.workspace.document.WorkspaceFileService | Workspace file CRUD & system prompt building |