Agent Engine
The agent engine is the core reasoning component of MateClaw. It receives user input and decides whether to respond directly, call tools, or decompose the task into a multi-step execution plan. MateClaw implements two agent patterns on top of the spring-ai-alibaba-graph-core StateGraph engine: ReAct and Plan-and-Execute.
Architecture Overview
AgentGraphBuilder
|
|-- buildReActAgent() --> StateGraphReActAgent (CompiledGraph)
|-- buildPlanExecuteAgent() --> StateGraphPlanExecuteAgent (CompiledGraph)
|
+-- buildRuntimeChatModel() --> ChatModel (DashScope / OpenAI-compatible / Anthropic)

All agents are constructed by AgentGraphBuilder. Based on the AgentEntity configuration (type, iteration limit, etc.), it selects the model, registers tools, compiles the state graph, and returns a ready-to-use BaseAgent instance. The runtime model is always resolved from the global default -- the modelName field on AgentEntity is a legacy field and has no effect.
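The dispatch on `agent_type` can be sketched as follows. This is an illustrative simplification, not MateClaw's actual builder code: the class, enum, and method names (other than the `agent_type` values themselves) are hypothetical, and the fallback-to-ReAct behavior for unknown values is an assumption based on `react` being the documented default.

```java
// Hypothetical sketch of AgentGraphBuilder's dispatch on agent_type.
// Only the string values "react" / "plan_execute" come from the docs;
// everything else here is illustrative.
class AgentTypeDispatch {
    enum AgentType { REACT, PLAN_EXECUTE }

    // Maps the persisted agent_type string to the graph variant to build.
    static AgentType resolve(String agentType) {
        return switch (agentType == null ? "" : agentType.trim().toLowerCase()) {
            case "plan_execute" -> AgentType.PLAN_EXECUTE;
            default -> AgentType.REACT; // "react" is the default; unknown values assumed to fall back
        };
    }
}
```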
Core Abstractions
BaseAgent
Every agent extends BaseAgent, which defines the unified conversation interface and state management:
| Method | Description |
|---|---|
| `chat(userMessage, conversationId)` | Synchronous chat; returns the complete text |
| `chatStream(userMessage, conversationId)` | Streaming chat; returns `Flux<String>` |
| `execute(goal, conversationId)` | Executes a task (Plan-Execute entry point; degrades to chat in ReAct) |
| `chatWithReplay(userMessage, conversationId, toolCallPayload)` | Resumes tool execution after approval |
| `chatWithReplayStream(...)` | Streaming resume after approval |
BaseAgent also handles loading conversation history (buildConversationHistory), filtering approval placeholder messages, and converting message formats.
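The contract in the table above can be sketched as a plain Java interface. This is a hedged simplification, not MateClaw's actual class: the real BaseAgent is an abstract class and `chatStream` returns `Flux<String>`; here a plain `Iterable` stands in to keep the sketch dependency-free, and `EchoAgent` is a hypothetical stub illustrating how `execute` degrades to `chat` in ReAct mode.

```java
// Illustrative contract mirroring the BaseAgent methods described above.
interface AgentContract {
    String chat(String userMessage, String conversationId);   // blocking, returns full text
    Iterable<String> chatStream(String userMessage, String conversationId); // Flux<String> in the real class
    String execute(String goal, String conversationId);       // Plan-Execute entry point
    String chatWithReplay(String userMessage, String conversationId, String toolCallPayload);
}

// Hypothetical stub: in ReAct mode, execute simply delegates to chat.
class EchoAgent implements AgentContract {
    public String chat(String m, String c) { return "[" + c + "] " + m; }
    public Iterable<String> chatStream(String m, String c) { return java.util.List.of(chat(m, c)); }
    public String execute(String goal, String c) { return chat(goal, c); }
    public String chatWithReplay(String m, String c, String payload) { return chat(m, c); }
}
```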
AgentState
An agent transitions through these states during its lifecycle:
| State | Description |
|---|---|
| IDLE | Ready for a new message |
| PLANNING | Generating an execution plan |
| EXECUTING | Running tool calls or sub-tasks |
| RUNNING | Active ReAct loop or Plan-Execute graph execution |
| WAITING_USER_INPUT | Waiting for user input |
| DONE | Completed |
| FAILED | Execution failed |
| ERROR | Error state (streaming exceptions, etc.) |
FinishReason
When the state graph terminates, a finish reason is recorded for frontend and logging use:
| Value | Description |
|---|---|
| NORMAL | LLM gave a direct final answer |
| SUMMARIZED | Completed after SummarizingNode compression |
| MAX_ITERATIONS_REACHED | Forced convergence after hitting the iteration limit |
| ERROR_FALLBACK | Degraded answer after an error |
ReAct Agent
StateGraphReActAgent is the default agent type. It implements an explicit, controllable Thought-Action-Observation loop using StateGraph v2.
State Graph Topology
START
  |
  v
ReasoningNode --(ReasoningDispatcher)--> 4-way branch:
  |
  |-- ActionNode --> ObservationNode --(ObservationDispatcher)--> 4-way branch:
  |       back to ReasoningNode | SummarizingNode | LimitExceededNode | FinalAnswerNode
  |-- SummarizingNode --> ReasoningNode
  |-- LimitExceededNode --> FinalAnswerNode
  +-- FinalAnswerNode --> END

Full routing rules:
ReasoningNode --> ReasoningDispatcher:
1. Iteration limit reached --> LimitExceededNode (highest priority)
2. Tool call needed --> ActionNode
3. Summarization needed --> SummarizingNode
4. Can answer directly --> FinalAnswerNode
ActionNode --> ObservationNode (fixed edge)
ObservationNode --> ObservationDispatcher:
0. Awaiting approval --> FinalAnswerNode (graph terminates; replay continues)
1. Iteration limit reached --> LimitExceededNode (forced stop)
2. Error detected --> LimitExceededNode
3. Summarization needed --> SummarizingNode
4. Continue loop --> ReasoningNode
SummarizingNode --> ReasoningNode (fixed edge; re-reason after compression)
LimitExceededNode --> FinalAnswerNode (fixed edge)
FinalAnswerNode --> END

Node Details
ReasoningNode (Thought Phase)
Calls the LLM for a single reasoning step and determines whether tool calls are needed. Key design: `internalToolExecutionEnabled=false` disables the ChatModel's built-in tool loop, giving StateGraph full control over ReAct iteration.
Responsibilities:
- Build a Prompt with system prompt, conversation history, and observation history
- Call the LLM and extract the ToolCall list from the response
- Support the `forced_tool_call` mechanism: after approval, skip LLM invocation and emit pre-approved tool calls directly
- Stream content/thinking deltas in real time via `NodeStreamingChatHelper`
- Check the stop flag (`ChatStreamTracker.isStopRequested`)
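The ReasoningDispatcher's four-way priority (described in the routing rules above) can be sketched as a pure routing function. The names below are illustrative; the real dispatcher reads these signals from graph state rather than taking them as parameters.

```java
// Sketch of ReasoningDispatcher's priority routing.
// Priority: iteration limit > tool call > summarization > direct answer.
class ReasoningDispatch {
    enum Next { LIMIT_EXCEEDED, ACTION, SUMMARIZING, FINAL_ANSWER }

    static Next route(int iteration, int maxIterations,
                      boolean wantsToolCall, boolean needsSummary) {
        if (iteration >= maxIterations) return Next.LIMIT_EXCEEDED; // highest priority
        if (wantsToolCall)              return Next.ACTION;
        if (needsSummary)               return Next.SUMMARIZING;
        return Next.FINAL_ANSWER;                                   // can answer directly
    }
}
```

Because the limit check sits first, a run that is both over budget and asking for a tool still converges instead of looping.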
ActionNode (Tool Execution Phase)
Delegates to ToolExecutionExecutor for tool execution.
Responsibilities:
- Support concurrent tool execution and approval barriers
- Two-phase execution: sequential ToolGuard checks, then segmented concurrent execution
- In `forced_replay` mode, skip ToolGuard checks
- When approval is pending, set `AWAITING_APPROVAL=true`; the graph terminates at the next dispatcher
- Check the stop flag
ObservationNode (Observation Phase)
Processes tool execution results. This is one of the core nodes for iteration control.
Responsibilities:
- Standardize and truncate tool results via `ObservationProcessor`
- Increment the iteration counter
- Determine whether summarization is needed (when context grows too large)
SummarizingNode (Context Compression)
Routed to when any of the following conditions are met:
- Too many entries in observationHistory
- A single tool result exceeds the size threshold
- Multiple rounds of observations are too verbose for direct output
Calls the LLM to compress existing observations, then routes back to ReasoningNode for continued reasoning.
LimitExceededNode (Iteration Limit Handler)
Routed to when the iteration count reaches maxIterations. Instead of throwing an exception, it:
- Performs inline compression on overly long observationHistory
- Injects a "stop tool calls" system instruction
- Has the LLM generate a concise final answer from available information
- Marks `finishReason = MAX_ITERATIONS_REACHED`
FinalAnswerNode (Terminal Node)
Aggregates all termination paths and sets `finalAnswer`, `finalThinking`, and `finishReason`. Priority order:
1. `finalAnswerDraft` (from LimitExceeded/Summarizing paths)
2. `finalAnswer` (from direct Reasoning path)
On the approval-wait path, returns an empty finalAnswer since content has already been streamed.
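This priority can be sketched as a small resolution function. The signature is hypothetical; the real node reads these values from graph state.

```java
// Sketch of FinalAnswerNode's resolution order: a draft produced on the
// LimitExceeded/Summarizing paths wins over the direct-reasoning answer,
// and the approval-wait path returns "" because content was already streamed.
class FinalAnswerResolve {
    static String resolve(String finalAnswerDraft, String finalAnswer, boolean awaitingApproval) {
        if (awaitingApproval) return "";
        if (finalAnswerDraft != null && !finalAnswerDraft.isBlank()) return finalAnswerDraft;
        return finalAnswer == null ? "" : finalAnswer;
    }
}
```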
Example Interaction
User sends: "What is the weather in Beijing today?"
[ReasoningNode] LLM analyzes intent --> needs tool call (WebSearchTool)
[ActionNode] Execute WebSearchTool("Beijing weather today") --> ToolGuard pass
[ObservationNode] Result: Beijing today sunny, 15-26C --> iteration 1/10, continue
[ReasoningNode] LLM determines info is sufficient --> direct answer
[FinalAnswerNode] Aggregate: "Beijing today is sunny, 15-26 C..."

Plan-and-Execute Agent
StateGraphPlanExecuteAgent is designed for complex tasks that require multiple steps. It first assesses whether planning is needed -- simple questions get a direct answer, while complex tasks trigger plan generation and step-by-step execution.
State Graph Topology
START
  |
  v
PlanGenerationNode --(PlanGenerationDispatcher)--> 2-way branch:
  |
  |-- StepExecutionNode --(StepProgressDispatcher)--> 3-way branch:
  |       continue next step | PlanSummaryNode --> END | END (approval stop)
  +-- DirectAnswerNode --> END

Node Details
PlanGenerationNode
Responsibilities:
- Assess whether planning is needed (simple Q&A exits early to DirectAnswerNode)
- Generate a plan as JSON when planning is required (2-6 steps)
- Persist via `PlanningService.createPlan()` to the `mate_plan` and `mate_sub_plan` tables
- Publish a `plan_created` event
StepExecutionNode
Executes the current step with an explicit tool execution loop (`internalToolExecutionEnabled=false`).
- Maximum 5 tool calls per step to prevent runaway loops
- Supports the approval flow: creates a pending approval for guarded tool calls, emits an SSE event, and returns without blocking
- After approval, execution resumes via the replay mechanism
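The per-step tool cap can be sketched as a bounded loop. This is illustrative only: a strategy function stands in for the LLM's decision to call another tool or stop, and the names other than the 5-call limit are hypothetical.

```java
import java.util.function.IntFunction;

// Sketch of a bounded tool-execution loop like the one StepExecutionNode runs
// per step. decide.apply(n) returns the tool to call on iteration n, or null
// to finish; the loop never exceeds the documented 5-call cap.
class BoundedToolLoop {
    static final int MAX_TOOL_CALLS_PER_STEP = 5;

    static int run(IntFunction<String> decide) {
        int calls = 0;
        while (calls < MAX_TOOL_CALLS_PER_STEP && decide.apply(calls) != null) {
            calls++; // execute the tool, feed the result back, ask again
        }
        return calls;
    }
}
```

Even a model that asks for a tool on every turn is cut off after five calls, which is what prevents a runaway step.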
PlanSummaryNode
Aggregates results from all steps, calls the LLM to generate a final summary, and marks the plan as completed via planningService.completePlan().
DirectAnswerNode
When PlanGenerationNode determines no planning is needed, this node passes `direct_answer` through as `final_summary` and ends graph execution.
Data Model
Plans are persisted across two tables:
mate_plan
| Column | Description |
|---|---|
| id | Plan ID |
| conversation_id | Associated conversation |
| goal | The user's original request |
| status | pending / running / completed / failed |
mate_sub_plan
| Column | Description |
|---|---|
| id | Sub-plan ID |
| plan_id | Parent plan |
| step_order | Execution order (1, 2, 3...) |
| description | What this step does |
| status | pending / running / completed / failed |
| result | Output from this step |
Example Interaction
User sends: "Research the Spring AI framework, compare major implementations, and write a brief report."
[PlanGenerationNode] Determines planning is needed --> generates 4-step plan
Step 1: Search for main Spring AI implementations and features
Step 2: Search for pros/cons and community activity of each
Step 3: Compare and create a structured table
Step 4: Write a brief research report
[StepExecutionNode] Execute step 1
LLM --> WebSearchTool("Spring AI framework implementations") --> result
Mark step 1 complete
[StepExecutionNode] Execute step 2 (with step 1 context)
...
[StepExecutionNode] Execute step 3
...
[StepExecutionNode] Execute step 4
...
[PlanSummaryNode] Aggregate all step results, generate final report

DynamicAgent
DynamicAgent loads its configuration from the database (mate_agent table) at runtime. Agents can be created and modified through the UI or API without restarting the application.
AgentGraphBuilder.build(AgentEntity) reads the latest AgentEntity on each build, assembling the system prompt, tool set, and state graph.
Database Fields
| Field | Type | Description |
|---|---|---|
| name | String | Agent name |
| description | String | Agent description |
| agent_type | String | react or plan_execute |
| system_prompt | Text | System prompt |
| max_iterations | Integer | Maximum iteration count (default 10) |
| enabled | Boolean | Whether the agent is active |
| icon | String | Icon (emoji or URL) |
| tags | String | Tags (comma-separated) |
Note: The model_name field is a legacy artifact. The runtime model is always resolved from the global default (configured on the Settings - Models page).
Choosing an Agent Type
When to Use ReAct
- Simple Q&A: a single reasoning step is enough, no tools needed
- Information retrieval: search + summarize, typically 2-3 cycles
- Single-tool operations: read a file, check the time, etc.
- Real-time conversation: scenarios requiring fast responses
When to Use Plan-and-Execute
- Multi-step tasks: multiple operations must be completed in order
- Research reports: multiple searches, comparisons, and summaries
- Complex analysis: gather information first, then synthesize
- Tasks that need trackable progress (each step result is persisted)
| Scenario | Recommended | Reason |
|---|---|---|
| Simple Q&A | ReAct | Fast response; direct-answer path has no overhead |
| Information retrieval | ReAct | Search + summarize in 2-3 cycles |
| Multi-step tasks | Plan-and-Execute | Plan decomposition + step execution + result summary |
| Research reports | Plan-and-Execute | Multiple searches, comparisons, summaries |
| File read/write | ReAct | Clear operation, single tool call |
| Data comparison | Plan-and-Execute | Collect separately then compare |
System Prompt Best Practices
The system prompt is enhanced by AgentGraphBuilder.buildEnhancedPrompt(), which automatically injects skill instructions and workspace context. When writing custom prompts:
- Define the role and responsibilities clearly: tell the agent who it is and what it can do
- Mention available tools: list tool names and when to use them (tool descriptions are auto-injected, but explicit guidance helps the LLM)
- Specify output format: if structured output is needed, state it clearly
- Avoid conflicting instructions: do not contradict MateClaw's built-in behaviors (e.g., tool call format)
Example:
You are a professional technical documentation assistant. Your responsibilities:
1. Search and organize technical materials based on user needs
2. Answer questions using clear, structured formatting
3. Ensure code examples are syntactically correct
4. When unsure, search first rather than fabricating information
Guidelines:
- Cite sources when referencing information
- For time-sensitive questions, get the current date before searching

Configuration Reference
Agent-Level Configuration
| Setting | Default | Description |
|---|---|---|
| maxIterations | 10 | Max ReAct loop cycles; Plan-Execute limits each step to 5 tool calls |
| agentType | react | Agent type: react or plan_execute |
| enabled | true | Whether the agent is active |
Graph-Level Configuration
| Setting | Description |
|---|---|
| recursionLimit | Set at StateGraph compile time: ReAct uses maxIterations * 2 + 5, Plan-Execute uses maxIterations * 3 + 10 |
| ObservationProcessor | Controls tool result truncation thresholds and observation history compression strategy |
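The recursion-limit formulas from the table, as plain arithmetic (the class name is illustrative):

```java
// Compile-time recursion limits derived from maxIterations, per the table above.
class RecursionLimits {
    static int react(int maxIterations)       { return maxIterations * 2 + 5; }
    static int planExecute(int maxIterations) { return maxIterations * 3 + 10; }
}
```

With the default maxIterations of 10, a ReAct graph compiles with a recursion limit of 25 and a Plan-Execute graph with 40.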
Streaming Events
In structured streaming mode (chatStructuredStream), the agent pushes the following SSE event types:
| Event Type | Description |
|---|---|
| phase | Current execution phase (reasoning / action / observation / summarizing, etc.) |
| tool_call_start | Tool call initiated |
| tool_call_end | Tool call completed (with result summary) |
| plan_created | Plan created (Plan-Execute mode) |
| step_start / step_end | Step started/ended (Plan-Execute mode) |
| approval_required | Tool call requires approval |
| _usage_final | Token usage statistics (emitted at stream end) |
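A consumer of this stream has to parse the SSE wire format: `event:` and `data:` lines, with a blank line terminating each event. A minimal, dependency-free parser sketch (not MateClaw code; the field handling follows the SSE specification in simplified form):

```java
import java.util.*;

// Minimal SSE parser: extracts (eventType, data) pairs in arrival order.
class SseParser {
    static List<Map.Entry<String, String>> parse(String raw) {
        List<Map.Entry<String, String>> events = new ArrayList<>();
        String event = "message";          // SSE default event type
        StringBuilder data = new StringBuilder();
        for (String line : raw.split("\n", -1)) {
            if (line.startsWith("event:")) {
                event = line.substring(6).trim();
            } else if (line.startsWith("data:")) {
                if (data.length() > 0) data.append('\n'); // multi-line data joins with \n
                data.append(line.substring(5).trim());
            } else if (line.isEmpty() && data.length() > 0) {
                events.add(Map.entry(event, data.toString())); // blank line ends the event
                event = "message";
                data.setLength(0);
            }
        }
        return events;
    }
}
```

A `phase` event on the wire looks like `event: phase` followed by `data: reasoning` and a blank line; the parser above turns that into a single `("phase", "reasoning")` entry.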
Managing Agents via API
Create an Agent
curl -X POST http://localhost:18088/api/v1/agents \
-H "Content-Type: application/json" \
-H "Authorization: Bearer <token>" \
-d '{
"name": "Tech Assistant",
"description": "A professional technical documentation assistant",
"agentType": "react",
"systemPrompt": "You are a professional technical documentation assistant...",
"maxIterations": 10
}'

List Agents
curl http://localhost:18088/api/v1/agents \
-H "Authorization: Bearer <token>"

Get Agent Details
curl http://localhost:18088/api/v1/agents/1 \
-H "Authorization: Bearer <token>"

Update an Agent
curl -X PUT http://localhost:18088/api/v1/agents/1 \
-H "Content-Type: application/json" \
-H "Authorization: Bearer <token>" \
-d '{
"name": "Tech Assistant v2",
"systemPrompt": "Updated prompt...",
"maxIterations": 15
}'

Delete an Agent
curl -X DELETE http://localhost:18088/api/v1/agents/1 \
-H "Authorization: Bearer <token>"

Streaming Chat
curl -N "http://localhost:18088/api/v1/agents/1/chat/stream?message=hello&conversationId=default" \
-H "Authorization: Bearer <token>"

Debugging Guide
Enable DEBUG Logging
Add to application.yml:
logging:
level:
vip.mate.agent: DEBUG
vip.mate.agent.graph: DEBUG

This outputs detailed execution information for each node, including:
- Agent state transitions (`IDLE -> RUNNING -> IDLE`)
- ReasoningDispatcher / ObservationDispatcher routing decisions
- Iteration counts (`Iteration 1/10`)
- Tool calls and result summaries
- ToolGuard check results
Common Issues
Agent not responding or timing out
- Verify model configuration on the Settings - Models page
- Check logs for `StateGraph chat failed` or the `ERROR` state
Agent stuck in a loop
- Check iteration counts in the logs; verify `maxIterations` is reasonable
- Look for tools repeatedly returning errors, causing LLM retries
- If `MAX_ITERATIONS_REACHED` triggers frequently, increase `maxIterations` or refine the system prompt
Tool calls not working
- Confirm the tool is registered in ToolRegistry
- Check whether ToolGuard is blocking the call
- Review ActionNode log output
Approval flow interrupted
- When awaiting approval, the graph terminates normally (`AWAITING_APPROVAL=true`)
- After the user decides, execution continues via `chatWithReplay` / `chatWithReplayStream`
- If replay fails, verify the `toolCallPayload` format
Next Steps
- Tool System -- Tools that agents can call and the ToolGuard safety mechanism
- Skill System -- Extend agent capabilities with skills
- Configuration Reference -- Agent and global configuration options
