Tools

A tool is a hand the agent can reach out with.

Left to its own devices, a language model is a pattern-matcher wrapped in text. It doesn't know what time it is. It doesn't know what's in your files. It can't search the web, run a command, look at a PDF, delegate to another agent, or open a browser. It can only talk about doing those things.

Tools are how MateClaw fixes this. Each tool is a concrete operation the agent is allowed to invoke — read a file, search the web, execute a shell command, extract text from a PDF, delegate to another agent. When the agent decides it needs one, it emits a tool call, the runtime executes it, and the result comes back as an observation.

Twenty tools ship built-in. You can add any number more through MCP servers, custom skill scripts, or your own @Tool-annotated Spring beans.


How a tool call actually happens

Agent decides it needs a tool
          │
          ▼
  Emits a tool call:  {"name": "WebSearchTool", "args": {"query": "..."}}
          │
          ▼
  ┌─────────────────────┐
  │   Tool registry     │  ← look up the tool by name
  └─────────────────────┘
          │
          ▼
  ┌─────────────────────┐
  │   Tool Guard        │  ← rule-based check: allow / deny / approval
  └─────────────────────┘
          │
     ┌────┴────┐
     │         │
     ▼         ▼
  allowed   approval pending → user decides → allowed / rejected
     │
     ▼
  ┌─────────────────────┐
  │  Execute (timeout)  │  ← async, per-tool timeout
  └─────────────────────┘
          │
          ▼
  Result → observation → agent's next reasoning step

Tool Guard is the gatekeeper. Timeouts are per-tool (so one slow tool can't freeze a turn). Execution can be concurrent inside a single Action phase — if the agent calls three independent tools at once, they run in parallel.

None of this shows up in the agent's prompt. The agent just asks for a tool. The runtime handles everything in front of, during, and after the call.
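To make the flow above concrete, here is a minimal Java sketch of the three stages: registry lookup, guard check, and execution under a per-tool timeout. Every name in it (ToolCallPipelineSketch, RegisteredTool, Verdict) is hypothetical and chosen for illustration; it is not MateClaw's actual API, and a real runtime would suspend the turn on approval instead of returning a placeholder string.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;
import java.util.function.Function;

// Hypothetical names throughout; this is an illustration, not MateClaw's real API.
class ToolCallPipelineSketch {

    enum Verdict { ALLOW, DENY, REQUIRE_APPROVAL }

    // A registered tool: its handler plus its own timeout.
    record RegisteredTool(Function<Map<String, Object>, String> handler, long timeoutMillis) {}

    private final Map<String, RegisteredTool> registry;
    private final Function<String, Verdict> guard;

    ToolCallPipelineSketch(Map<String, RegisteredTool> registry, Function<String, Verdict> guard) {
        this.registry = registry;
        this.guard = guard;
    }

    // Returns the observation text that would be handed back to the agent.
    String call(String name, Map<String, Object> toolArgs) {
        RegisteredTool tool = registry.get(name);            // 1. registry lookup
        if (tool == null) {
            return "Error: unknown tool " + name;
        }
        Verdict verdict = guard.apply(name);                 // 2. guard check
        if (verdict == Verdict.DENY) {
            return "Error: denied by Tool Guard";
        }
        if (verdict == Verdict.REQUIRE_APPROVAL) {
            return "Pending: waiting for user approval";     // a real runtime would suspend here
        }
        try {                                                // 3. async execution with a timeout
            return CompletableFuture
                    .supplyAsync(() -> tool.handler().apply(toolArgs))
                    .orTimeout(tool.timeoutMillis(), TimeUnit.MILLISECONDS)
                    .join();
        } catch (RuntimeException e) {
            // join() wraps tool failures and timeouts; report the underlying cause.
            Throwable cause = e.getCause() != null ? e.getCause() : e;
            return "Error: " + cause;
        }
    }
}
```

Failures come back as observation text rather than exceptions, which matches the idea that the agent only ever sees a result.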


Tool registration — three paths

1. Built-in tools. The twenty tools that ship with MateClaw — registered into the tool table on startup.

2. MCP servers. External processes speaking the Model Context Protocol expose tools dynamically. MateClaw discovers them via tools/list and they appear in the registry alongside built-in ones. See MCP.

Per-agent MCP tool scoping (1.4.0+, #117): when an agent has ticked no specific MCP tool rows, all enabled MCP tools join its tool set automatically; once it ticks specific MCP tools, it is restricted to that set. Agents bound only to skills or built-in tools keep full access to all MCP tools.

3. Skill scripts. Skill packages can ship executable scripts that get wrapped as tools at runtime. See Skills.

Tool discovery is blacklist-style: every discoverable tool is registered by default, and you exclude specific tools explicitly. That way newly added tools are never silently missed.
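The per-agent MCP scoping rule described above comes down to a small set computation. The sketch below is one way to express it; the class and method names are hypothetical, and intersecting the ticked set with the enabled set is an assumption about how disabled tools are treated.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical helper; names are illustrative, not MateClaw's API.
class McpScopingSketch {

    // enabledMcpTools: every MCP tool currently enabled in the registry.
    // tickedByAgent:   the MCP tool rows this agent has explicitly selected.
    static Set<String> effectiveMcpTools(Set<String> enabledMcpTools, Set<String> tickedByAgent) {
        if (tickedByAgent.isEmpty()) {
            // No explicit selection: all enabled MCP tools join automatically.
            return new LinkedHashSet<>(enabledMcpTools);
        }
        // Explicit selection: restrict to ticked tools (assumed: only those still enabled).
        Set<String> result = new LinkedHashSet<>(tickedByAgent);
        result.retainAll(enabledMcpTools);
        return result;
    }
}
```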


Progressive tool disclosure (1.4.0+)

As the tool count grows, the system prompt balloons with dozens of full tool schemas — even when a task needs only one or two of them. Progressive disclosure splits tools into two tiers so the prompt scales with the task, not with the total tool count.

| Tier | How it appears in the system prompt | Callable out of the box? |
| --- | --- | --- |
| CORE | Always advertised in full, with the complete schema | Yes |
| EXTENSION | Only a compressed directory (name + source + one-line description); the full schema stays hidden | No; activate with enable_tool first |

Default tiering: the generative tools (image_generate, music_generate, video_generate, model3d_generate) and browser_use default to EXTENSION; everything else is CORE.

  • Page control: the Tools page has Core and Extension sections with a per-row tier toggle for built-in and channel tools; MCP / ACP tools are locked.
  • Persistence: the tier is stored in mate_tool.disclosure_tier and mate_mcp_server.disclosure_tier.
  • Config: mateclaw.tools.disclosure.mode, default progressive; set it to legacy to restore the old "advertise everything" behavior.

Why — to stop context bloat. The system prompt should scale with what the current task needs, not with how many tools you've installed.
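As a rough illustration of the two tiers, the sketch below renders a tool section where CORE tools carry their full schema and EXTENSION tools appear only as a one-line directory entry. The record fields and output layout are assumptions made for this example, not the format MateClaw actually emits.

```java
import java.util.List;

// Illustrative only: the prompt layout and type names are assumptions.
class DisclosureSketch {

    enum Tier { CORE, EXTENSION }

    record ToolInfo(String name, String source, String description, String schemaJson, Tier tier) {}

    static String renderToolSection(List<ToolInfo> tools) {
        StringBuilder sb = new StringBuilder("Tools:\n");
        for (ToolInfo t : tools) {
            if (t.tier() == Tier.CORE) {
                // CORE: advertised in full, schema included.
                sb.append(t.name()).append(": ").append(t.description())
                  .append("\n  schema: ").append(t.schemaJson()).append('\n');
            }
        }
        sb.append("Extension tools (activate with enable_tool first):\n");
        for (ToolInfo t : tools) {
            if (t.tier() == Tier.EXTENSION) {
                // EXTENSION: name + source + one-line description, no schema.
                sb.append("- ").append(t.name()).append(" [").append(t.source()).append("] ")
                  .append(t.description()).append('\n');
            }
        }
        return sb.toString();
    }
}
```

With this shape, adding another extension tool costs one directory line in the prompt instead of a full schema.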


The twenty built-in tools

| Tool | What it does | Dangerous |
| --- | --- | --- |
| DateTimeTool | Current date/time in any timezone | |
| WebSearchTool | Search via the provider chain (Serper / Tavily / DuckDuckGo / SearXNG) | |
| ReadFileTool | Read file contents | |
| WriteFileTool | Write content to a file | ⚠️ |
| EditFileTool | Find-and-replace edit | ⚠️ |
| ShellExecuteTool | Execute a shell command | ⚠️ |
| FileTypeDetectorTool | Detect MIME type and encoding | |
| DocumentExtractTool | Extract text from PDF, DOCX, XLSX | |
| WorkspaceMemoryTool | Read/write the agent's workspace memory | |
| SkillFileTool | Read and manage SKILL.md files | |
| SkillScriptTool | Execute skill scripts | ⚠️ |
| SkillManageTool | Create / edit / delete skill packages | ⚠️ |
| BrowserUseTool | Drive a headless browser | ⚠️ |
| DelegateAgentTool | Delegate a task to another agent (parallel supported) | |
| MateClawDocTool | Read built-in project documentation | |
| ImageGenerateTool | Text-to-image / image-to-image (1.3.0+) | |
| VideoGenerateTool | Text-to-video / image-to-video generation | |
| DocxRenderTool | (1.3.0+) Markdown → .docx (Word document) | |
| XlsxRenderTool | (1.3.0+) Markdown tables → .xlsx (Excel) | |
| PptxRenderTool | (1.3.0+) Markdown (Marp-style --- slide breaks) → .pptx | |
| PdfRenderTool | (1.3.0+) Markdown → publication-grade PDF (CJK fonts embedded) | |
| CronJobTool | Create and manage scheduled tasks | ⚠️ |
| DatasourceTool | Manage external datasource connections | ⚠️ |
| SqlQueryTool | Execute SQL queries on connected datasources | ⚠️ |
| send_file | (1.4.0+) Deliver an existing server file as a native IM attachment (#199) | |
| enable_tool | (1.4.0+) Activate an extension-tier tool for this conversation | |
| load_skill | (1.4.0+) Load a skill's SKILL.md on demand | |

Plus the MusicGenerateTool from Multimodal. And the 14 Wiki tools from LLM Wiki: wiki_read_page, wiki_read_many, wiki_list_pages, wiki_search_pages, wiki_semantic_search, wiki_compile_page, wiki_trace_source, wiki_create_page, wiki_delete_page, wiki_archive_page, wiki_unarchive_page, wiki_related_pages, wiki_explain_relation, wiki_enrich_page.

DateTimeTool

Returns the current date and time for a given timezone. Zero surprises.

Input:  {"timezone": "America/New_York"}
Output: "2026-04-11T14:30:22"

WebSearchTool

Web search via a provider chain: DuckDuckGo and SearXNG as keyless fallbacks, Serper and Tavily when you have keys. It is configured in Settings → System → Search Service, and changes take effect without a restart.

Input:  {"query": "Spring AI Alibaba latest version", "freshness": "month", "count": 5}
Output: "Spring AI Alibaba 1.1 was released..."

Features:

  • Provider chain: falls through to the next provider on failure. Keyless providers give baseline coverage.
  • Advanced parameters: freshness (day/week/month/year), language, count.
  • Result caching: recent queries are cached.
  • Security wrapping: results are sanitized before they are returned.
  • Provider-native + tool search coexistence: models with their own search (ChatGPT, Gemini) can use that natively while tool search stays available as a fallback.
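The fall-through behavior of the provider chain can be sketched as a simple loop. The Provider type and the way results are labeled below are hypothetical; real providers would make HTTP calls, while this example uses plain functions so it stays self-contained.

```java
import java.util.List;
import java.util.Optional;
import java.util.function.Function;

// Illustrative sketch; Provider and its shape are assumptions, not MateClaw's API.
class SearchChainSketch {

    record Provider(String name, Function<String, String> search) {}

    // Tries each provider in order and returns the first usable result.
    static Optional<String> search(List<Provider> chain, String query) {
        for (Provider provider : chain) {
            try {
                String result = provider.search().apply(query);
                if (result != null && !result.isBlank()) {
                    return Optional.of(provider.name() + ": " + result);
                }
            } catch (RuntimeException e) {
                // This provider failed (missing key, network error); fall through to the next.
            }
        }
        return Optional.empty();
    }
}
```

Placing keyless providers last in the list gives the baseline coverage mentioned above: they are only reached when the keyed providers fail or return nothing.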

ShellExecuteTool

Cross-platform shell execution. Linux/macOS uses /bin/sh -c; Windows uses cmd.exe /D /S /C. Every call is gated by Tool Guard.

Safety design:

  • Timeout: 60s default, 300s hard cap
  • Output caps: stdout and stderr are capped at 10,000 bytes each
  • File-backed output: stdout/stderr are written to a temp file instead of a pipe
  • Structured result: {exitCode, stdout, stderr, timedOut}
  • Dangerous-pattern detection: find -delete, rm -rf /, and piped bash downloads trigger elevated approval

Input:  {"command": "ls -la /tmp"}
Output: "total 48\ndrwxrwxrwt 12 root root..."
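The timeout clamp, output cap, and file-backed output listed above can be sketched with the standard ProcessBuilder API. This is an illustration under assumptions: the class and method names are hypothetical, only the POSIX /bin/sh -c path is shown, and the -1 exit code for a timed-out process is a convention of this sketch.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.concurrent.TimeUnit;

// Illustrative sketch only; class and method names are hypothetical.
class ShellLimitsSketch {
    static final int DEFAULT_TIMEOUT_SECONDS = 60;
    static final int MAX_TIMEOUT_SECONDS = 300;
    static final int MAX_OUTPUT_BYTES = 10_000;

    record ShellResult(int exitCode, String stdout, String stderr, boolean timedOut) {}

    // Missing or non-positive requests get the default; large ones are clamped to the cap.
    static int effectiveTimeout(Integer requestedSeconds) {
        if (requestedSeconds == null || requestedSeconds <= 0) {
            return DEFAULT_TIMEOUT_SECONDS;
        }
        return Math.min(requestedSeconds, MAX_TIMEOUT_SECONDS);
    }

    // Reads at most MAX_OUTPUT_BYTES; a multi-byte character cut at the boundary
    // decodes as a replacement character.
    static String readCapped(Path file) throws IOException {
        try (InputStream in = Files.newInputStream(file)) {
            return new String(in.readNBytes(MAX_OUTPUT_BYTES), StandardCharsets.UTF_8);
        }
    }

    // POSIX-only variant; stdout/stderr go to temp files rather than pipes.
    static ShellResult run(String command, Integer timeoutSeconds)
            throws IOException, InterruptedException {
        Path out = Files.createTempFile("stdout", ".log");
        Path err = Files.createTempFile("stderr", ".log");
        try {
            Process process = new ProcessBuilder("/bin/sh", "-c", command)
                    .redirectOutput(out.toFile())
                    .redirectError(err.toFile())
                    .start();
            boolean finished = process.waitFor(effectiveTimeout(timeoutSeconds), TimeUnit.SECONDS);
            if (!finished) {
                process.destroyForcibly().waitFor();
            }
            int exitCode = finished ? process.exitValue() : -1; // -1: this sketch's "no exit code"
            return new ShellResult(exitCode, readCapped(out), readCapped(err), !finished);
        } finally {
            Files.deleteIfExists(out);
            Files.deleteIfExists(err);
        }
    }
}
```

Redirecting to files means a chatty process cannot block on a full pipe buffer while the runtime is waiting on the timeout.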

ReadFileTool / WriteFileTool / EditFileTool

Read is safe. Write and Edit are both gated by Tool Guard.

DocumentExtractTool

PDF, DOCX, XLSX, and friends become plain text. Scanned documents get OCR fallback where available.

Office document generation (1.3.0+)

Four new tools that render Markdown directly into downloadable Office files — no subprocess fork, no npm dependency. Generated bytes are cached in memory and returned as a one-time download URL:

ToolUse forKey capabilities
| Tool | Use for | Key capabilities |
| --- | --- | --- |
| DocxRenderTool.renderDocx | Reports / memos / contracts / resumes | Headings (# ## ###) / bold (**text**) / lists / tables / images (PNG/JPG/GIF/BMP/SVG → PNG) |
| DocxRenderTool.renderDocxFromFile | Same, but markdown is in a workspace file | Avoids the LLM having to repeat its own large markdown body as a tool argument |
| XlsxRenderTool.renderXlsx | Financial sheets / data exports / templates | Markdown table syntax → multiple sheets (split by ## SheetName) |
| PptxRenderTool.renderPptx | Decks / project plans / briefings | Marp-style --- slide breaks; 16:9 (default) / 4:3 aspect |
| PptxRenderTool.renderPptxFromFile | Same, but markdown in a file | Preferred when the deck body exceeds 5KB |
| PdfRenderTool.renderPdf | Publication-grade documents / weekly reports / templated docs | 1in margins / smart pagination / page numbers / cover page / mixed CJK + Latin (CJK fonts embedded) |

Relationship with the existing skills/docx skill

The skills/docx skill stays: it is good at editing existing .docx files (tracked changes, complex XML operations) and runs npm install docx on first use. The four new tools handle the create-from-scratch path with no npm warm-up cost. Agents prefer these render tools and fall back to the skill only when modifying an existing .docx.

ImageGenerateTool — image edit support from 1.3.0

In v1.2.0 this tool was text-to-image only. v1.3.0 adds two parameters — image and images — for multi-image input editing. See Multimodal.

WorkspaceMemoryTool

Lets an agent read, write, and edit its own workspace memory files — MEMORY.md, PROFILE.md, daily notes, anything under workspace/{agentId}/. Safety rules: .md only, no directory traversal. See Memory.

BrowserUseTool

Drives a headless browser. Navigate, click, type, extract. Every call gated by Tool Guard.

DelegateAgentTool — agents delegating to agents

One agent can hand off a subtask to another:

  • delegateToAgent(agentName, task): call a specific agent by name, run it in an isolated conversation, and return the result
  • listAvailableAgents(): list all available agents with name, type, and description

User: Search for Spring AI news and have Writer summarize it
Agent A: [calls WebSearchTool]
         [calls delegateToAgent(agentName="Writer", task="Summarize: ...")]
         [receives Writer's response]
         Replies with the combined result

Safety:

  • Recursion cap — maximum 3 delegation levels deep
  • Isolated sessions — the delegated agent runs in its own conversation
  • Result truncation — delegated results capped at 4000 characters
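The recursion cap and result truncation above can be pictured as a thin wrapper around the delegated call. The sketch below is hypothetical: the Agent interface, the explicit depth parameter, and the error string are assumptions used to show the two limits, not the real DelegateAgentTool implementation.

```java
// Illustrative sketch; the Agent interface and depth bookkeeping are assumptions.
class DelegationLimitsSketch {
    static final int MAX_DEPTH = 3;
    static final int MAX_RESULT_CHARS = 4000;

    // A delegated agent runs a task and is told how deep in the chain it sits.
    interface Agent {
        String run(String task, int depth);
    }

    static String delegate(Agent target, String task, int currentDepth) {
        if (currentDepth >= MAX_DEPTH) {
            // Recursion cap: refuse to go a fourth level deep.
            return "Error: delegation depth limit (" + MAX_DEPTH + ") reached";
        }
        String result = target.run(task, currentDepth + 1);
        // Result truncation: keep only the first MAX_RESULT_CHARS characters.
        return result.length() <= MAX_RESULT_CHARS
                ? result
                : result.substring(0, MAX_RESULT_CHARS);
    }
}
```

An agent that keeps delegating to itself therefore stops after three levels and receives the error text as its observation.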

MateClawDocTool

Reads the built-in MateClaw project documentation. Lets an agent answer "how does X work in MateClaw" questions by consulting actual docs rather than guessing.

enable_tool — activate an extension-tier tool (1.4.0+)

enable_tool(toolName) activates an EXTENSION-tier tool so it becomes fully callable for the rest of the conversation.

  • Validated — only tools in the agent's effective set can be activated.
  • Takes effect next turn — activation lands on the next reasoning turn of the same ReAct loop (the agent sees the full schema, then emits the real call).
  • Conversation-scoped, not persisted — activation lasts only for the current conversation; nothing is written to the database, and a new conversation reverts to the default tiering.
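One way to picture the conversation-scoped activation is an in-memory map keyed by conversation, with nothing written to storage. The class below is a hypothetical sketch of that idea; the method names and return strings are illustrative.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Illustrative sketch; not MateClaw's real enable_tool implementation.
class EnableToolSketch {

    // In-memory only: activation is never persisted.
    private final Map<String, Set<String>> activatedByConversation = new HashMap<>();

    String enableTool(String conversationId, Set<String> effectiveToolSet, String toolName) {
        if (!effectiveToolSet.contains(toolName)) {
            // Validation: only tools in the agent's effective set can be activated.
            return "Error: " + toolName + " is not available to this agent";
        }
        activatedByConversation
                .computeIfAbsent(conversationId, id -> new HashSet<>())
                .add(toolName);
        return "Enabled " + toolName + "; the full schema is available on the next turn";
    }

    // CORE tools are always callable; EXTENSION tools only after activation here.
    boolean isCallable(String conversationId, String toolName, boolean isCore) {
        return isCore
                || activatedByConversation.getOrDefault(conversationId, Set.of()).contains(toolName);
    }
}
```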

load_skill — load a skill on demand (1.4.0+)

load_skill(skillName, filePath?) pulls a skill's SKILL.md in only when it's needed — omit filePath for the main file, or pass one to read a sub-file inside the skill package.

  • Injected via message history: the loaded content goes into message history, not the system prompt, so the system prompt stays unchanged and the prompt cache is not invalidated.
  • Pinned in later turns: a loaded skill stays pinned for the rest of the conversation, so it doesn't have to be reloaded.
  • Config: mateclaw.skill.disclosure.load-skill-tool.enabled, default true.

See Skills.

send_file — deliver an existing file as a native attachment (1.4.0+, #199)

send_file(filePath, fileName?) reads an existing file on the server and delivers it as a native IM attachment — not a text download link.

  • Generated-file cache: the file is placed in the generated-file cache, and channel adapters (Feishu / DingTalk / Telegram) auto-detect and deliver it.
  • File types: any common file type, up to a 20 MB limit.
  • Contrast with ReadFileTool: ReadFileTool extracts text from a file to feed the agent's reasoning; send_file ships the file as-is to the user.

ReadFileTool — oversized-line paging (1.4.0+, #190)

For files with a very long single line, ReadFileTool adds an optional startColumn (a 1-based character offset within startLine) to resume the tail of that line from where you left off.

  • On truncation it always returns nextStartLine;
  • it additionally returns nextStartColumn when more of that line remains.

Feed both back into the next call to page through a giant single-line file in segments.
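The paging contract can be modeled over an in-memory list of lines. The sketch below is an assumption-laden illustration: the Page record, the maxChars budget, and the rule that only line characters count toward the budget are all invented for the example; only the names startLine, startColumn, nextStartLine, and nextStartColumn come from the description above.

```java
import java.util.List;

// Illustrative model of the paging contract; not ReadFileTool's real code.
class PagedReadSketch {

    // nextStartLine / nextStartColumn are null when absent.
    record Page(String content, Integer nextStartLine, Integer nextStartColumn) {}

    // Reads up to maxChars line characters starting at 1-based startLine / startColumn.
    static Page read(List<String> lines, int startLine, int startColumn, int maxChars) {
        StringBuilder sb = new StringBuilder();
        int used = 0;
        int col = startColumn;
        for (int line = startLine; line <= lines.size(); line++) {
            String text = lines.get(line - 1);
            String rest = text.substring(Math.min(col - 1, text.length()));
            int room = maxChars - used;
            if (rest.length() > room) {
                // Truncated mid-line: resume on the same line at the next column.
                sb.append(rest, 0, room);
                return new Page(sb.toString(), line, col + room);
            }
            sb.append(rest).append('\n');
            used += rest.length();
            col = 1;
            if (used == maxChars && line < lines.size()) {
                // Truncated on a line boundary: only nextStartLine is returned.
                return new Page(sb.toString(), line + 1, null);
            }
        }
        return new Page(sb.toString(), null, null);
    }

    // Feeds both cursors back in until the file is exhausted.
    static String readAll(List<String> lines, int maxChars) {
        StringBuilder all = new StringBuilder();
        Integer line = 1;
        Integer col = 1;
        while (line != null) {
            Page page = read(lines, line, col == null ? 1 : col, maxChars);
            all.append(page.content());
            line = page.nextStartLine();
            col = page.nextStartColumn();
        }
        return all.toString();
    }
}
```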


Tool Guard — the permission layer

Tool Guard is how MateClaw keeps strong tools from doing stupid things. It's rule-based, not a flat dangerous-tools list. Each rule says: for this tool, with these arguments, in this context, do X — where X is allow, deny, or require_approval.

Core pieces:

  • mate_tool_guard_rule — individual rules with tool pattern, optional arg pattern, action
  • mate_tool_guard_config — global config: enabled/disabled, default policy, approval timeout
  • mate_tool_guard_audit_log — every guarded call leaves an entry

Example rule: allow ShellExecuteTool when the command starts with ls, cat, grep, or find. Require approval for anything else.

yaml
mateclaw:
  tool:
    guard:
      enabled: true
      default-policy: require_approval
      rules:
        - tool: ShellExecuteTool
          # (\\s|$) also matches a bare command such as "ls" with no arguments
          arg-pattern: "^(ls|cat|grep|find)(\\s|$)"
          action: allow
        - tool: WriteFileTool
          action: require_approval

You can also manage rules interactively under Settings → Security & Approval. When a rule requires approval, the runtime persists a row in mate_tool_approval and suspends the agent turn. Once the user decides, the agent resumes where it paused. The full mechanism is described in Security & Approval.
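Rule evaluation of this kind can be sketched in a few lines. The example below assumes first-match-wins ordering and a regex searched against the raw command string; both are assumptions, as are all the type names. The sample rules are written in the spirit of the example rule above.

```java
import java.util.List;
import java.util.regex.Pattern;

// Illustrative evaluator; matching order and semantics are assumptions.
class ToolGuardSketch {

    enum Action { ALLOW, DENY, REQUIRE_APPROVAL }

    // argPattern may be null, meaning "any arguments".
    record Rule(String tool, Pattern argPattern, Action action) {}

    static List<Rule> exampleRules() {
        return List.of(
                new Rule("ShellExecuteTool",
                        Pattern.compile("^(ls|cat|grep|find)(\\s|$)"), Action.ALLOW),
                new Rule("WriteFileTool", null, Action.REQUIRE_APPROVAL));
    }

    static Action decide(List<Rule> rules, Action defaultPolicy, String tool, String arguments) {
        for (Rule rule : rules) {
            boolean toolMatches = rule.tool().equals(tool);
            boolean argsMatch = rule.argPattern() == null
                    || rule.argPattern().matcher(arguments).find();
            if (toolMatches && argsMatch) {
                return rule.action();        // first matching rule wins (assumed)
            }
        }
        return defaultPolicy;                // nothing matched: fall back to the default policy
    }
}
```

Note that a prefix pattern like this one would also allow a command such as `ls; rm -rf /tmp/x`, which is one reason a separate dangerous-pattern check is still useful alongside allow rules.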

Declarative hook system

Tool Guard rules are a special case of a more general mechanism — the declarative hook system. Five lifecycle hooks cover every critical moment in tool and LLM execution:

| Hook | Fires when | Typical use |
| --- | --- | --- |
| before_tool | Before tool execution | Argument redaction, context injection, extra validation |
| after_tool | After tool execution | Result filtering, audit logging |
| before_llm | Before LLM call | Prompt enrichment, cache hit check |
| after_llm | After LLM returns | Output filtering, token accounting |
| on_error | On error | Alerting, fallback strategy |

Hooks run in-process. They can transform arguments, transform results, mask sensitive fields, and add audit log entries. You can use hooks for things beyond Tool Guard — like injecting a security policy before every LLM call, or auto-redacting sensitive fields from tool returns.
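As a rough picture of how before_tool and after_tool hooks could wrap a call, the sketch below chains argument transformers before the tool and result transformers after it. The registration methods and types are hypothetical; MateClaw's hooks are declarative, so treat this only as a model of the data flow.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.function.UnaryOperator;

// Illustrative model of hook chaining; not MateClaw's hook API.
class HookSketch {

    private final List<UnaryOperator<Map<String, Object>>> beforeTool = new ArrayList<>();
    private final List<UnaryOperator<String>> afterTool = new ArrayList<>();

    void onBeforeTool(UnaryOperator<Map<String, Object>> hook) {
        beforeTool.add(hook);
    }

    void onAfterTool(UnaryOperator<String> hook) {
        afterTool.add(hook);
    }

    String invoke(Function<Map<String, Object>, String> tool, Map<String, Object> input) {
        Map<String, Object> current = input;
        for (UnaryOperator<Map<String, Object>> hook : beforeTool) {
            current = hook.apply(current);      // before_tool: transform or redact arguments
        }
        String result = tool.apply(current);
        for (UnaryOperator<String> hook : afterTool) {
            result = hook.apply(result);        // after_tool: filter or mask the result
        }
        return result;
    }
}
```

For example, a before_tool hook can replace a `password` argument with `***` and an after_tool hook can mask digits in the returned text, without the tool itself knowing about either.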


Execution: concurrent, isolated, bounded

  • Concurrent execution — within a turn, independent tool calls run in parallel. Guard checks are sequential; execution is concurrent where safe.
  • Per-tool timeouts — every tool has its own timeout. Defaults: fast tools 30s, shell/browser 60s, generation tools up to 300s.
  • Segment isolation — when approvals are needed mid-turn, the segment splits at the approval boundary.
  • Observation truncation — long tool results are automatically truncated before being added to observation history.
  • Error isolation — one tool failure does not abort the turn.
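Concurrent execution, per-tool timeouts, and error isolation fit together naturally with CompletableFuture. The sketch below is a hypothetical illustration: the Call record and the error text format are assumptions, and it leaves out guard checks and observation truncation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

// Illustrative sketch; not MateClaw's real executor.
class ConcurrentToolsSketch {

    record Call(String name, Supplier<String> body, long timeoutMillis) {}

    // Starts every call at once and returns one observation per call, in call order.
    static List<String> runAll(List<Call> calls) {
        List<CompletableFuture<String>> futures = new ArrayList<>();
        for (Call call : calls) {
            futures.add(CompletableFuture
                    .supplyAsync(call.body())
                    .orTimeout(call.timeoutMillis(), TimeUnit.MILLISECONDS)   // per-tool timeout
                    .exceptionally(e -> "Error: " + e));                      // error isolation
        }
        List<String> observations = new ArrayList<>();
        for (CompletableFuture<String> future : futures) {
            observations.add(future.join());
        }
        return observations;
    }
}
```

Because each future converts its own failure or timeout into an error observation, one broken tool never prevents the others from reporting their results.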

Tool management via API

bash
# List all tools
curl http://localhost:18088/api/v1/tools \
  -H "Authorization: Bearer YOUR_JWT_TOKEN"

# Enable / disable
curl -X PUT http://localhost:18088/api/v1/tools/1 \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_JWT_TOKEN" \
  -d '{"enabled": false}'

# Set disclosure tier for a builtin or channel tool
curl -X PUT http://localhost:18088/api/v1/tools/1/disclosure-tier \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_JWT_TOKEN" \
  -d '{"tier": "core"}'

The current REST API manages tool rows, enabled state, and disclosure tier. Direct execution of builtin tools happens through the agent runtime, not through a /tools/{name}/test endpoint.


Creating a custom tool

Option 1: a @Tool-annotated Spring bean

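java
import java.math.BigInteger;

// Assumes Spring AI's tool annotations; adjust the package if your version differs.
import org.springframework.ai.tool.annotation.Tool;
import org.springframework.ai.tool.annotation.ToolParam;
import org.springframework.stereotype.Component;

@Component
public class FactorialTool {

    @Tool(description = "Calculate the factorial of a non-negative integer")
    public String factorial(
            @ToolParam(description = "The number to compute factorial for (0 or greater)") int n) {
        if (n < 0) {
            // The returned text is what the agent sees as the observation.
            return "Error: n must be non-negative";
        }
        // BigInteger avoids the silent overflow a long hits above 20!.
        BigInteger result = BigInteger.ONE;
        for (int i = 2; i <= n; i++) {
            result = result.multiply(BigInteger.valueOf(i));
        }
        return result.toString();
    }
}

  • The class is a Spring @Component
  • Every @Tool method becomes a callable tool
  • Use @ToolParam on every parameter; its description is what the LLM sees
  • The return value is what the agent sees
  • If the tool is dangerous, add a Tool Guard rule for it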

Restart and the tool is live.

Option 2: a skill script

Don't want to write Java? Bundle behavior into a skill package with a SKILL.md and a script. See Skills.

Option 3: an MCP server

Capability already exists as an MCP server? Just add the server configuration. See MCP.


Next

  • Skills — higher-level capabilities built on tools
  • MCP — external tool providers
  • Security & Approval — Tool Guard rules, approval flow, audit log
  • Multimodal — generation tools (image, video, music, TTS, STT)