Add ez-assistant and kerberos service folders

This commit is contained in:
kelin
2026-02-11 14:56:03 -05:00
parent e4e8ae1b87
commit 9ccfb36923
4471 changed files with 746463 additions and 0 deletions

View File

@@ -0,0 +1,126 @@
---
summary: "Agent loop lifecycle, streams, and wait semantics"
read_when:
- You need an exact walkthrough of the agent loop or lifecycle events
---
# Agent Loop (Moltbot)
An agentic loop is the full “real” run of an agent: intake → context assembly → model inference →
tool execution → streaming replies → persistence. Its the authoritative path that turns a message
into actions and a final reply, while keeping session state consistent.
In Moltbot, a loop is a single, serialized run per session that emits lifecycle and stream events
as the model thinks, calls tools, and streams output. This doc explains how that authentic loop is
wired end-to-end.
## Entry points
- Gateway RPC: `agent` and `agent.wait`.
- CLI: `agent` command.
## How it works (high-level)
1) `agent` RPC validates params, resolves session (sessionKey/sessionId), persists session metadata, returns `{ runId, acceptedAt }` immediately.
2) `agentCommand` runs the agent:
- resolves model + thinking/verbose defaults
- loads skills snapshot
- calls `runEmbeddedPiAgent` (pi-agent-core runtime)
- emits **lifecycle end/error** if the embedded loop does not emit one
3) `runEmbeddedPiAgent`:
- serializes runs via per-session + global queues
- resolves model + auth profile and builds the pi session
- subscribes to pi events and streams assistant/tool deltas
- enforces timeout -> aborts run if exceeded
- returns payloads + usage metadata
4) `subscribeEmbeddedPiSession` bridges pi-agent-core events to Moltbot `agent` stream:
- tool events => `stream: "tool"`
- assistant deltas => `stream: "assistant"`
- lifecycle events => `stream: "lifecycle"` (`phase: "start" | "end" | "error"`)
5) `agent.wait` uses `waitForAgentJob`:
- waits for **lifecycle end/error** for `runId`
- returns `{ status: ok|error|timeout, startedAt, endedAt, error? }`
## Queueing + concurrency
- Runs are serialized per session key (session lane) and optionally through a global lane.
- This prevents tool/session races and keeps session history consistent.
- Messaging channels can choose queue modes (collect/steer/followup) that feed this lane system.
See [Command Queue](/concepts/queue).
## Session + workspace preparation
- Workspace is resolved and created; sandboxed runs may redirect to a sandbox workspace root.
- Skills are loaded (or reused from a snapshot) and injected into env and prompt.
- Bootstrap/context files are resolved and injected into the system prompt report.
- A session write lock is acquired; `SessionManager` is opened and prepared before streaming.
## Prompt assembly + system prompt
- System prompt is built from Moltbots base prompt, skills prompt, bootstrap context, and per-run overrides.
- Model-specific limits and compaction reserve tokens are enforced.
- See [System prompt](/concepts/system-prompt) for what the model sees.
## Hook points (where you can intercept)
Moltbot has two hook systems:
- **Internal hooks** (Gateway hooks): event-driven scripts for commands and lifecycle events.
- **Plugin hooks**: extension points inside the agent/tool lifecycle and gateway pipeline.
### Internal hooks (Gateway hooks)
- **`agent:bootstrap`**: runs while building bootstrap files before the system prompt is finalized.
Use this to add/remove bootstrap context files.
- **Command hooks**: `/new`, `/reset`, `/stop`, and other command events (see Hooks doc).
See [Hooks](/hooks) for setup and examples.
### Plugin hooks (agent + gateway lifecycle)
These run inside the agent loop or gateway pipeline:
- **`before_agent_start`**: inject context or override system prompt before the run starts.
- **`agent_end`**: inspect the final message list and run metadata after completion.
- **`before_compaction` / `after_compaction`**: observe or annotate compaction cycles.
- **`before_tool_call` / `after_tool_call`**: intercept tool params/results.
- **`tool_result_persist`**: synchronously transform tool results before they are written to the session transcript.
- **`message_received` / `message_sending` / `message_sent`**: inbound + outbound message hooks.
- **`session_start` / `session_end`**: session lifecycle boundaries.
- **`gateway_start` / `gateway_stop`**: gateway lifecycle events.
See [Plugins](/plugin#plugin-hooks) for the hook API and registration details.
## Streaming + partial replies
- Assistant deltas are streamed from pi-agent-core and emitted as `assistant` events.
- Block streaming can emit partial replies either on `text_end` or `message_end`.
- Reasoning streaming can be emitted as a separate stream or as block replies.
- See [Streaming](/concepts/streaming) for chunking and block reply behavior.
## Tool execution + messaging tools
- Tool start/update/end events are emitted on the `tool` stream.
- Tool results are sanitized for size and image payloads before logging/emitting.
- Messaging tool sends are tracked to suppress duplicate assistant confirmations.
## Reply shaping + suppression
- Final payloads are assembled from:
- assistant text (and optional reasoning)
- inline tool summaries (when verbose + allowed)
- assistant error text when the model errors
- `NO_REPLY` is treated as a silent token and filtered from outgoing payloads.
- Messaging tool duplicates are removed from the final payload list.
- If no renderable payloads remain and a tool errored, a fallback tool error reply is emitted
(unless a messaging tool already sent a user-visible reply).
## Compaction + retries
- Auto-compaction emits `compaction` stream events and can trigger a retry.
- On retry, in-memory buffers and tool summaries are reset to avoid duplicate output.
- See [Compaction](/concepts/compaction) for the compaction pipeline.
## Event streams (today)
- `lifecycle`: emitted by `subscribeEmbeddedPiSession` (and as a fallback by `agentCommand`)
- `assistant`: streamed deltas from pi-agent-core
- `tool`: streamed tool events from pi-agent-core
## Chat channel handling
- Assistant deltas are buffered into chat `delta` messages.
- A chat `final` is emitted on **lifecycle end/error**.
## Timeouts
- `agent.wait` default: 30s (just the wait). `timeoutMs` param overrides.
- Agent runtime: `agents.defaults.timeoutSeconds` default 600s; enforced in `runEmbeddedPiAgent` abort timer.
## Where things can end early
- Agent timeout (abort)
- AbortSignal (cancel)
- Gateway disconnect or RPC timeout
- `agent.wait` timeout (wait-only, does not stop agent)

View File

@@ -0,0 +1,231 @@
---
summary: "Agent workspace: location, layout, and backup strategy"
read_when:
- You need to explain the agent workspace or its file layout
- You want to back up or migrate an agent workspace
---
# Agent workspace
The workspace is the agent's home. It is the only working directory used for
file tools and for workspace context. Keep it private and treat it as memory.
This is separate from `~/.clawdbot/`, which stores config, credentials, and
sessions.
**Important:** the workspace is the **default cwd**, not a hard sandbox. Tools
resolve relative paths against the workspace, but absolute paths can still reach
elsewhere on the host unless sandboxing is enabled. If you need isolation, use
[`agents.defaults.sandbox`](/gateway/sandboxing) (and/or peragent sandbox config).
When sandboxing is enabled and `workspaceAccess` is not `"rw"`, tools operate
inside a sandbox workspace under `~/.clawdbot/sandboxes`, not your host workspace.
## Default location
- Default: `~/clawd`
- If `CLAWDBOT_PROFILE` is set and not `"default"`, the default becomes
`~/clawd-<profile>`.
- Override in `~/.clawdbot/moltbot.json`:
```json5
{
agent: {
workspace: "~/clawd"
}
}
```
`moltbot onboard`, `moltbot configure`, or `moltbot setup` will create the
workspace and seed the bootstrap files if they are missing.
If you already manage the workspace files yourself, you can disable bootstrap
file creation:
```json5
{ agent: { skipBootstrap: true } }
```
## Extra workspace folders
Older installs may have created `~/moltbot`. Keeping multiple workspace
directories around can cause confusing auth or state drift, because only one
workspace is active at a time.
**Recommendation:** keep a single active workspace. If you no longer use the
extra folders, archive or move them to Trash (for example `trash ~/moltbot`).
If you intentionally keep multiple workspaces, make sure
`agents.defaults.workspace` points to the active one.
`moltbot doctor` warns when it detects extra workspace directories.
## Workspace file map (what each file means)
These are the standard files Moltbot expects inside the workspace:
- `AGENTS.md`
- Operating instructions for the agent and how it should use memory.
- Loaded at the start of every session.
- Good place for rules, priorities, and "how to behave" details.
- `SOUL.md`
- Persona, tone, and boundaries.
- Loaded every session.
- `USER.md`
- Who the user is and how to address them.
- Loaded every session.
- `IDENTITY.md`
- The agent's name, vibe, and emoji.
- Created/updated during the bootstrap ritual.
- `TOOLS.md`
- Notes about your local tools and conventions.
- Does not control tool availability; it is only guidance.
- `HEARTBEAT.md`
- Optional tiny checklist for heartbeat runs.
- Keep it short to avoid token burn.
- `BOOT.md`
- Optional startup checklist executed on gateway restart when internal hooks are enabled.
- Keep it short; use the message tool for outbound sends.
- `BOOTSTRAP.md`
- One-time first-run ritual.
- Only created for a brand-new workspace.
- Delete it after the ritual is complete.
- `memory/YYYY-MM-DD.md`
- Daily memory log (one file per day).
- Recommended to read today + yesterday on session start.
- `MEMORY.md` (optional)
- Curated long-term memory.
- Only load in the main, private session (not shared/group contexts).
See [Memory](/concepts/memory) for the workflow and automatic memory flush.
- `skills/` (optional)
- Workspace-specific skills.
- Overrides managed/bundled skills when names collide.
- `canvas/` (optional)
- Canvas UI files for node displays (for example `canvas/index.html`).
If any bootstrap file is missing, Moltbot injects a "missing file" marker into
the session and continues. Large bootstrap files are truncated when injected;
adjust the limit with `agents.defaults.bootstrapMaxChars` (default: 20000).
`moltbot setup` can recreate missing defaults without overwriting existing
files.
## What is NOT in the workspace
These live under `~/.clawdbot/` and should NOT be committed to the workspace repo:
- `~/.clawdbot/moltbot.json` (config)
- `~/.clawdbot/credentials/` (OAuth tokens, API keys)
- `~/.clawdbot/agents/<agentId>/sessions/` (session transcripts + metadata)
- `~/.clawdbot/skills/` (managed skills)
If you need to migrate sessions or config, copy them separately and keep them
out of version control.
## Git backup (recommended, private)
Treat the workspace as private memory. Put it in a **private** git repo so it is
backed up and recoverable.
Run these steps on the machine where the Gateway runs (that is where the
workspace lives).
### 1) Initialize the repo
If git is installed, brand-new workspaces are initialized automatically. If this
workspace is not already a repo, run:
```bash
cd ~/clawd
git init
git add AGENTS.md SOUL.md TOOLS.md IDENTITY.md USER.md HEARTBEAT.md memory/
git commit -m "Add agent workspace"
```
### 2) Add a private remote (beginner-friendly options)
Option A: GitHub web UI
1. Create a new **private** repository on GitHub.
2. Do not initialize with a README (avoids merge conflicts).
3. Copy the HTTPS remote URL.
4. Add the remote and push:
```bash
git branch -M main
git remote add origin <https-url>
git push -u origin main
```
Option B: GitHub CLI (`gh`)
```bash
gh auth login
gh repo create clawd-workspace --private --source . --remote origin --push
```
Option C: GitLab web UI
1. Create a new **private** repository on GitLab.
2. Do not initialize with a README (avoids merge conflicts).
3. Copy the HTTPS remote URL.
4. Add the remote and push:
```bash
git branch -M main
git remote add origin <https-url>
git push -u origin main
```
### 3) Ongoing updates
```bash
git status
git add .
git commit -m "Update memory"
git push
```
## Do not commit secrets
Even in a private repo, avoid storing secrets in the workspace:
- API keys, OAuth tokens, passwords, or private credentials.
- Anything under `~/.clawdbot/`.
- Raw dumps of chats or sensitive attachments.
If you must store sensitive references, use placeholders and keep the real
secret elsewhere (password manager, environment variables, or `~/.clawdbot/`).
Suggested `.gitignore` starter:
```gitignore
.DS_Store
.env
**/*.key
**/*.pem
**/secrets*
```
## Moving the workspace to a new machine
1. Clone the repo to the desired path (default `~/clawd`).
2. Set `agents.defaults.workspace` to that path in `~/.clawdbot/moltbot.json`.
3. Run `moltbot setup --workspace <path>` to seed any missing files.
4. If you need sessions, copy `~/.clawdbot/agents/<agentId>/sessions/` from the
old machine separately.
## Advanced notes
- Multi-agent routing can use different workspaces per agent. See
[Channel routing](/concepts/channel-routing) for routing configuration.
- If `agents.defaults.sandbox` is enabled, non-main sessions can use per-session sandbox
workspaces under `agents.defaults.sandbox.workspaceRoot`.

View File

@@ -0,0 +1,117 @@
---
summary: "Agent runtime (embedded p-mono), workspace contract, and session bootstrap"
read_when:
- Changing agent runtime, workspace bootstrap, or session behavior
---
# Agent Runtime 🤖
Moltbot runs a single embedded agent runtime derived from **p-mono**.
## Workspace (required)
Moltbot uses a single agent workspace directory (`agents.defaults.workspace`) as the agents **only** working directory (`cwd`) for tools and context.
Recommended: use `moltbot setup` to create `~/.clawdbot/moltbot.json` if missing and initialize the workspace files.
Full workspace layout + backup guide: [Agent workspace](/concepts/agent-workspace)
If `agents.defaults.sandbox` is enabled, non-main sessions can override this with
per-session workspaces under `agents.defaults.sandbox.workspaceRoot` (see
[Gateway configuration](/gateway/configuration)).
## Bootstrap files (injected)
Inside `agents.defaults.workspace`, Moltbot expects these user-editable files:
- `AGENTS.md` — operating instructions + “memory”
- `SOUL.md` — persona, boundaries, tone
- `TOOLS.md` — user-maintained tool notes (e.g. `imsg`, `sag`, conventions)
- `BOOTSTRAP.md` — one-time first-run ritual (deleted after completion)
- `IDENTITY.md` — agent name/vibe/emoji
- `USER.md` — user profile + preferred address
On the first turn of a new session, Moltbot injects the contents of these files directly into the agent context.
Blank files are skipped. Large files are trimmed and truncated with a marker so prompts stay lean (read the file for full content).
If a file is missing, Moltbot injects a single “missing file” marker line (and `moltbot setup` will create a safe default template).
`BOOTSTRAP.md` is only created for a **brand new workspace** (no other bootstrap files present). If you delete it after completing the ritual, it should not be recreated on later restarts.
To disable bootstrap file creation entirely (for pre-seeded workspaces), set:
```json5
{ agent: { skipBootstrap: true } }
```
## Built-in tools
Core tools (read/exec/edit/write and related system tools) are always available,
subject to tool policy. `apply_patch` is optional and gated by
`tools.exec.applyPatch`. `TOOLS.md` does **not** control which tools exist; its
guidance for how *you* want them used.
## Skills
Moltbot loads skills from three locations (workspace wins on name conflict):
- Bundled (shipped with the install)
- Managed/local: `~/.clawdbot/skills`
- Workspace: `<workspace>/skills`
Skills can be gated by config/env (see `skills` in [Gateway configuration](/gateway/configuration)).
## p-mono integration
Moltbot reuses pieces of the p-mono codebase (models/tools), but **session management, discovery, and tool wiring are Moltbot-owned**.
- No p-coding agent runtime.
- No `~/.pi/agent` or `<workspace>/.pi` settings are consulted.
## Sessions
Session transcripts are stored as JSONL at:
- `~/.clawdbot/agents/<agentId>/sessions/<SessionId>.jsonl`
The session ID is stable and chosen by Moltbot.
Legacy Pi/Tau session folders are **not** read.
## Steering while streaming
When queue mode is `steer`, inbound messages are injected into the current run.
The queue is checked **after each tool call**; if a queued message is present,
remaining tool calls from the current assistant message are skipped (error tool
results with "Skipped due to queued user message."), then the queued user
message is injected before the next assistant response.
When queue mode is `followup` or `collect`, inbound messages are held until the
current turn ends, then a new agent turn starts with the queued payloads. See
[Queue](/concepts/queue) for mode + debounce/cap behavior.
Block streaming sends completed assistant blocks as soon as they finish; it is
**off by default** (`agents.defaults.blockStreamingDefault: "off"`).
Tune the boundary via `agents.defaults.blockStreamingBreak` (`text_end` vs `message_end`; defaults to text_end).
Control soft block chunking with `agents.defaults.blockStreamingChunk` (defaults to
8001200 chars; prefers paragraph breaks, then newlines; sentences last).
Coalesce streamed chunks with `agents.defaults.blockStreamingCoalesce` to reduce
single-line spam (idle-based merging before send). Non-Telegram channels require
explicit `*.blockStreaming: true` to enable block replies.
Verbose tool summaries are emitted at tool start (no debounce); Control UI
streams tool output via agent events when available.
More details: [Streaming + chunking](/concepts/streaming).
## Model refs
Model refs in config (for example `agents.defaults.model` and `agents.defaults.models`) are parsed by splitting on the **first** `/`.
- Use `provider/model` when configuring models.
- If the model ID itself contains `/` (OpenRouter-style), include the provider prefix (example: `openrouter/moonshotai/kimi-k2`).
- If you omit the provider, Moltbot treats the input as an alias or a model for the **default provider** (only works when there is no `/` in the model ID).
## Configuration (minimal)
At minimum, set:
- `agents.defaults.workspace`
- `channels.whatsapp.allowFrom` (strongly recommended)
---
*Next: [Group Chats](/concepts/group-messages)* 🦞

View File

@@ -0,0 +1,122 @@
---
summary: "WebSocket gateway architecture, components, and client flows"
read_when:
- Working on gateway protocol, clients, or transports
---
# Gateway architecture
Last updated: 2026-01-22
## Overview
- A single longlived **Gateway** owns all messaging surfaces (WhatsApp via
Baileys, Telegram via grammY, Slack, Discord, Signal, iMessage, WebChat).
- Control-plane clients (macOS app, CLI, web UI, automations) connect to the
Gateway over **WebSocket** on the configured bind host (default
`127.0.0.1:18789`).
- **Nodes** (macOS/iOS/Android/headless) also connect over **WebSocket**, but
declare `role: node` with explicit caps/commands.
- One Gateway per host; it is the only place that opens a WhatsApp session.
- A **canvas host** (default `18793`) serves agenteditable HTML and A2UI.
## Components and flows
### Gateway (daemon)
- Maintains provider connections.
- Exposes a typed WS API (requests, responses, serverpush events).
- Validates inbound frames against JSON Schema.
- Emits events like `agent`, `chat`, `presence`, `health`, `heartbeat`, `cron`.
### Clients (mac app / CLI / web admin)
- One WS connection per client.
- Send requests (`health`, `status`, `send`, `agent`, `system-presence`).
- Subscribe to events (`tick`, `agent`, `presence`, `shutdown`).
### Nodes (macOS / iOS / Android / headless)
- Connect to the **same WS server** with `role: node`.
- Provide a device identity in `connect`; pairing is **devicebased** (role `node`) and
approval lives in the device pairing store.
- Expose commands like `canvas.*`, `camera.*`, `screen.record`, `location.get`.
Protocol details:
- [Gateway protocol](/gateway/protocol)
### WebChat
- Static UI that uses the Gateway WS API for chat history and sends.
- In remote setups, connects through the same SSH/Tailscale tunnel as other
clients.
## Connection lifecycle (single client)
```
Client Gateway
| |
|---- req:connect -------->|
|<------ res (ok) ---------| (or res error + close)
| (payload=hello-ok carries snapshot: presence + health)
| |
|<------ event:presence ---|
|<------ event:tick -------|
| |
|------- req:agent ------->|
|<------ res:agent --------| (ack: {runId,status:"accepted"})
|<------ event:agent ------| (streaming)
|<------ res:agent --------| (final: {runId,status,summary})
| |
```
## Wire protocol (summary)
- Transport: WebSocket, text frames with JSON payloads.
- First frame **must** be `connect`.
- After handshake:
- Requests: `{type:"req", id, method, params}``{type:"res", id, ok, payload|error}`
- Events: `{type:"event", event, payload, seq?, stateVersion?}`
- If `CLAWDBOT_GATEWAY_TOKEN` (or `--token`) is set, `connect.params.auth.token`
must match or the socket closes.
- Idempotency keys are required for sideeffecting methods (`send`, `agent`) to
safely retry; the server keeps a shortlived dedupe cache.
- Nodes must include `role: "node"` plus caps/commands/permissions in `connect`.
## Pairing + local trust
- All WS clients (operators + nodes) include a **device identity** on `connect`.
- New device IDs require pairing approval; the Gateway issues a **device token**
for subsequent connects.
- **Local** connects (loopback or the gateway hosts own tailnet address) can be
autoapproved to keep samehost UX smooth.
- **Nonlocal** connects must sign the `connect.challenge` nonce and require
explicit approval.
- Gateway auth (`gateway.auth.*`) still applies to **all** connections, local or
remote.
Details: [Gateway protocol](/gateway/protocol), [Pairing](/start/pairing),
[Security](/gateway/security).
## Protocol typing and codegen
- TypeBox schemas define the protocol.
- JSON Schema is generated from those schemas.
- Swift models are generated from the JSON Schema.
## Remote access
- Preferred: Tailscale or VPN.
- Alternative: SSH tunnel
```bash
ssh -N -L 18789:127.0.0.1:18789 user@host
```
- The same handshake + auth token apply over the tunnel.
- TLS + optional pinning can be enabled for WS in remote setups.
## Operations snapshot
- Start: `moltbot gateway` (foreground, logs to stdout).
- Health: `health` over WS (also included in `hello-ok`).
- Supervision: launchd/systemd for autorestart.
## Invariants
- Exactly one Gateway controls a single Baileys session per host.
- Handshake is mandatory; any nonJSON or nonconnect first frame is a hard close.
- Events are not replayed; clients must refresh on gaps.

View File

@@ -0,0 +1,114 @@
---
summary: "Routing rules per channel (WhatsApp, Telegram, Discord, Slack) and shared context"
read_when:
- Changing channel routing or inbox behavior
---
# Channels & routing
Moltbot routes replies **back to the channel where a message came from**. The
model does not choose a channel; routing is deterministic and controlled by the
host configuration.
## Key terms
- **Channel**: `whatsapp`, `telegram`, `discord`, `slack`, `signal`, `imessage`, `webchat`.
- **AccountId**: perchannel account instance (when supported).
- **AgentId**: an isolated workspace + session store (“brain”).
- **SessionKey**: the bucket key used to store context and control concurrency.
## Session key shapes (examples)
Direct messages collapse to the agents **main** session:
- `agent:<agentId>:<mainKey>` (default: `agent:main:main`)
Groups and channels remain isolated per channel:
- Groups: `agent:<agentId>:<channel>:group:<id>`
- Channels/rooms: `agent:<agentId>:<channel>:channel:<id>`
Threads:
- Slack/Discord threads append `:thread:<threadId>` to the base key.
- Telegram forum topics embed `:topic:<topicId>` in the group key.
Examples:
- `agent:main:telegram:group:-1001234567890:topic:42`
- `agent:main:discord:channel:123456:thread:987654`
## Routing rules (how an agent is chosen)
Routing picks **one agent** for each inbound message:
1. **Exact peer match** (`bindings` with `peer.kind` + `peer.id`).
2. **Guild match** (Discord) via `guildId`.
3. **Team match** (Slack) via `teamId`.
4. **Account match** (`accountId` on the channel).
5. **Channel match** (any account on that channel).
6. **Default agent** (`agents.list[].default`, else first list entry, fallback to `main`).
The matched agent determines which workspace and session store are used.
## Broadcast groups (run multiple agents)
Broadcast groups let you run **multiple agents** for the same peer **when Moltbot would normally reply** (for example: in WhatsApp groups, after mention/activation gating).
Config:
```json5
{
broadcast: {
strategy: "parallel",
"120363403215116621@g.us": ["alfred", "baerbel"],
"+15555550123": ["support", "logger"]
}
}
```
See: [Broadcast Groups](/broadcast-groups).
## Config overview
- `agents.list`: named agent definitions (workspace, model, etc.).
- `bindings`: map inbound channels/accounts/peers to agents.
Example:
```json5
{
agents: {
list: [
{ id: "support", name: "Support", workspace: "~/clawd-support" }
]
},
bindings: [
{ match: { channel: "slack", teamId: "T123" }, agentId: "support" },
{ match: { channel: "telegram", peer: { kind: "group", id: "-100123" } }, agentId: "support" }
]
}
```
## Session storage
Session stores live under the state directory (default `~/.clawdbot`):
- `~/.clawdbot/agents/<agentId>/sessions/sessions.json`
- JSONL transcripts live alongside the store
You can override the store path via `session.store` and `{agentId}` templating.
## WebChat behavior
WebChat attaches to the **selected agent** and defaults to the agents main
session. Because of this, WebChat lets you see crosschannel context for that
agent in one place.
## Reply context
Inbound replies include:
- `ReplyToId`, `ReplyToBody`, and `ReplyToSender` when available.
- Quoted context is appended to `Body` as a `[Replying to ...]` block.
This is consistent across channels.

View File

@@ -0,0 +1,49 @@
---
summary: "Context window + compaction: how Moltbot keeps sessions under model limits"
read_when:
- You want to understand auto-compaction and /compact
- You are debugging long sessions hitting context limits
---
# Context Window & Compaction
Every model has a **context window** (max tokens it can see). Long-running chats accumulate messages and tool results; once the window is tight, Moltbot **compacts** older history to stay within limits.
## What compaction is
Compaction **summarizes older conversation** into a compact summary entry and keeps recent messages intact. The summary is stored in the session history, so future requests use:
- The compaction summary
- Recent messages after the compaction point
Compaction **persists** in the sessions JSONL history.
## Configuration
See [Compaction config & modes](/concepts/compaction) for the `agents.defaults.compaction` settings.
## Auto-compaction (default on)
When a session nears or exceeds the models context window, Moltbot triggers auto-compaction and may retry the original request using the compacted context.
Youll see:
- `🧹 Auto-compaction complete` in verbose mode
- `/status` showing `🧹 Compactions: <count>`
Before compaction, Moltbot can run a **silent memory flush** turn to store
durable notes to disk. See [Memory](/concepts/memory) for details and config.
## Manual compaction
Use `/compact` (optionally with instructions) to force a compaction pass:
```
/compact Focus on decisions and open questions
```
## Context window source
Context window is model-specific. Moltbot uses the model definition from the configured provider catalog to determine limits.
## Compaction vs pruning
- **Compaction**: summarises and **persists** in JSONL.
- **Session pruning**: trims old **tool results** only, **in-memory**, per request.
See [/concepts/session-pruning](/concepts/session-pruning) for pruning details.
## Tips
- Use `/compact` when sessions feel stale or context is bloated.
- Large tool outputs are already truncated; pruning can further reduce tool-result buildup.
- If you need a fresh slate, `/new` or `/reset` starts a new session id.

View File

@@ -0,0 +1,151 @@
---
summary: "Context: what the model sees, how it is built, and how to inspect it"
read_when:
- You want to understand what “context” means in Moltbot
- You are debugging why the model “knows” something (or forgot it)
- You want to reduce context overhead (/context, /status, /compact)
---
# Context
“Context” is **everything Moltbot sends to the model for a run**. It is bounded by the models **context window** (token limit).
Beginner mental model:
- **System prompt** (Moltbot-built): rules, tools, skills list, time/runtime, and injected workspace files.
- **Conversation history**: your messages + the assistants messages for this session.
- **Tool calls/results + attachments**: command output, file reads, images/audio, etc.
Context is *not the same thing* as “memory”: memory can be stored on disk and reloaded later; context is whats inside the models current window.
## Quick start (inspect context)
- `/status` → quick “how full is my window?” view + session settings.
- `/context list` → whats injected + rough sizes (per file + totals).
- `/context detail` → deeper breakdown: per-file, per-tool schema sizes, per-skill entry sizes, and system prompt size.
- `/usage tokens` → append per-reply usage footer to normal replies.
- `/compact` → summarize older history into a compact entry to free window space.
See also: [Slash commands](/tools/slash-commands), [Token use & costs](/token-use), [Compaction](/concepts/compaction).
## Example output
Values vary by model, provider, tool policy, and whats in your workspace.
### `/context list`
```
🧠 Context breakdown
Workspace: <workspaceDir>
Bootstrap max/file: 20,000 chars
Sandbox: mode=non-main sandboxed=false
System prompt (run): 38,412 chars (~9,603 tok) (Project Context 23,901 chars (~5,976 tok))
Injected workspace files:
- AGENTS.md: OK | raw 1,742 chars (~436 tok) | injected 1,742 chars (~436 tok)
- SOUL.md: OK | raw 912 chars (~228 tok) | injected 912 chars (~228 tok)
- TOOLS.md: TRUNCATED | raw 54,210 chars (~13,553 tok) | injected 20,962 chars (~5,241 tok)
- IDENTITY.md: OK | raw 211 chars (~53 tok) | injected 211 chars (~53 tok)
- USER.md: OK | raw 388 chars (~97 tok) | injected 388 chars (~97 tok)
- HEARTBEAT.md: MISSING | raw 0 | injected 0
- BOOTSTRAP.md: OK | raw 0 chars (~0 tok) | injected 0 chars (~0 tok)
Skills list (system prompt text): 2,184 chars (~546 tok) (12 skills)
Tools: read, edit, write, exec, process, browser, message, sessions_send, …
Tool list (system prompt text): 1,032 chars (~258 tok)
Tool schemas (JSON): 31,988 chars (~7,997 tok) (counts toward context; not shown as text)
Tools: (same as above)
Session tokens (cached): 14,250 total / ctx=32,000
```
### `/context detail`
```
🧠 Context breakdown (detailed)
Top skills (prompt entry size):
- frontend-design: 412 chars (~103 tok)
- oracle: 401 chars (~101 tok)
… (+10 more skills)
Top tools (schema size):
- browser: 9,812 chars (~2,453 tok)
- exec: 6,240 chars (~1,560 tok)
… (+N more tools)
```
## What counts toward the context window
Everything the model receives counts, including:
- System prompt (all sections).
- Conversation history.
- Tool calls + tool results.
- Attachments/transcripts (images/audio/files).
- Compaction summaries and pruning artifacts.
- Provider “wrappers” or hidden headers (not visible, still counted).
## How Moltbot builds the system prompt
The system prompt is **Moltbot-owned** and rebuilt each run. It includes:
- Tool list + short descriptions.
- Skills list (metadata only; see below).
- Workspace location.
- Time (UTC + converted user time if configured).
- Runtime metadata (host/OS/model/thinking).
- Injected workspace bootstrap files under **Project Context**.
Full breakdown: [System Prompt](/concepts/system-prompt).
## Injected workspace files (Project Context)
By default, Moltbot injects a fixed set of workspace files (if present):
- `AGENTS.md`
- `SOUL.md`
- `TOOLS.md`
- `IDENTITY.md`
- `USER.md`
- `HEARTBEAT.md`
- `BOOTSTRAP.md` (first-run only)
Large files are truncated per-file using `agents.defaults.bootstrapMaxChars` (default `20000` chars). `/context` shows **raw vs injected** sizes and whether truncation happened.
## Skills: whats injected vs loaded on-demand
The system prompt includes a compact **skills list** (name + description + location). This list has real overhead.
Skill instructions are *not* included by default. The model is expected to `read` the skills `SKILL.md` **only when needed**.
## Tools: there are two costs
Tools affect context in two ways:
1) **Tool list text** in the system prompt (what you see as “Tooling”).
2) **Tool schemas** (JSON). These are sent to the model so it can call tools. They count toward context even though you dont see them as plain text.
`/context detail` breaks down the biggest tool schemas so you can see what dominates.
## Commands, directives, and “inline shortcuts”
Slash commands are handled by the Gateway. There are a few different behaviors:
- **Standalone commands**: a message that is only `/...` runs as a command.
- **Directives**: `/think`, `/verbose`, `/reasoning`, `/elevated`, `/model`, `/queue` are stripped before the model sees the message.
- Directive-only messages persist session settings.
- Inline directives in a normal message act as per-message hints.
- **Inline shortcuts** (allowlisted senders only): certain `/...` tokens inside a normal message can run immediately (example: “hey /status”), and are stripped before the model sees the remaining text.
Details: [Slash commands](/tools/slash-commands).
## Sessions, compaction, and pruning (what persists)
What persists across messages depends on the mechanism:
- **Normal history** persists in the session transcript until compacted/pruned by policy.
- **Compaction** persists a summary into the transcript and keeps recent messages intact.
- **Pruning** removes old tool results from the *in-memory* prompt for a run, but does not rewrite the transcript.
Docs: [Session](/concepts/session), [Compaction](/concepts/compaction), [Session pruning](/concepts/session-pruning).
## What `/context` actually reports
`/context` prefers the latest **run-built** system prompt report when available:
- `System prompt (run)` = captured from the last embedded (tool-capable) run and persisted in the session store.
- `System prompt (estimate)` = computed on the fly when no run report exists (or when running via a CLI backend that doesnt generate the report).
Either way, it reports sizes and top contributors; it does **not** dump the full system prompt or tool schemas.

View File

@@ -0,0 +1,78 @@
---
summary: "Behavior and config for WhatsApp group message handling (mentionPatterns are shared across surfaces)"
read_when:
- Changing group message rules or mentions
---
# Group messages (WhatsApp web channel)
Goal: let Clawd sit in WhatsApp groups, wake up only when pinged, and keep that thread separate from the personal DM session.
Note: `agents.list[].groupChat.mentionPatterns` is now used by Telegram/Discord/Slack/iMessage as well; this doc focuses on WhatsApp-specific behavior. For multi-agent setups, set `agents.list[].groupChat.mentionPatterns` per agent (or use `messages.groupChat.mentionPatterns` as a global fallback).
## Whats implemented (2025-12-03)
- Activation modes: `mention` (default) or `always`. `mention` requires a ping (real WhatsApp @-mentions via `mentionedJids`, regex patterns, or the bots E.164 anywhere in the text). `always` wakes the agent on every message but it should reply only when it can add meaningful value; otherwise it returns the silent token `NO_REPLY`. Defaults can be set in config (`channels.whatsapp.groups`) and overridden per group via `/activation`. When `channels.whatsapp.groups` is set, it also acts as a group allowlist (include `"*"` to allow all).
- Group policy: `channels.whatsapp.groupPolicy` controls whether group messages are accepted (`open|disabled|allowlist`). `allowlist` uses `channels.whatsapp.groupAllowFrom` (fallback: explicit `channels.whatsapp.allowFrom`). Default is `allowlist` (blocked until you add senders).
- Per-group sessions: session keys look like `agent:<agentId>:whatsapp:group:<jid>` so commands such as `/verbose on` or `/think high` (sent as standalone messages) are scoped to that group; personal DM state is untouched. Heartbeats are skipped for group threads.
- Context injection: **pending-only** group messages (default 50) that *did not* trigger a run are prefixed under `[Chat messages since your last reply - for context]`, with the triggering line under `[Current message - respond to this]`. Messages already in the session are not re-injected.
- Sender surfacing: every group batch now ends with `[from: Sender Name (+E164)]` so Pi knows who is speaking.
- Ephemeral/view-once: we unwrap those before extracting text/mentions, so pings inside them still trigger.
- Group system prompt: on the first turn of a group session (and whenever `/activation` changes the mode) we inject a short blurb into the system prompt like `You are replying inside the WhatsApp group "<subject>". Group members: Alice (+44...), Bob (+43...), … Activation: trigger-only … Address the specific sender noted in the message context.` If metadata isnt available we still tell the agent its a group chat.
## Config example (WhatsApp)
Add a `groupChat` block to `~/.clawdbot/moltbot.json` so display-name pings work even when WhatsApp strips the visual `@` in the text body:
```json5
{
channels: {
whatsapp: {
groups: {
"*": { requireMention: true }
}
}
},
agents: {
list: [
{
id: "main",
groupChat: {
historyLimit: 50,
mentionPatterns: [
"@?moltbot",
"\\+?15555550123"
]
}
}
]
}
}
```
Notes:
- The regexes are case-insensitive; they cover a display-name ping like `@moltbot` and the raw number with or without `+`/spaces.
- WhatsApp still sends canonical mentions via `mentionedJids` when someone taps the contact, so the number fallback is rarely needed but is a useful safety net.
### Activation command (owner-only)
Use the group chat command:
- `/activation mention`
- `/activation always`
Only the owner number (from `channels.whatsapp.allowFrom`, or the bots own E.164 when unset) can change this. Send `/status` as a standalone message in the group to see the current activation mode.
## How to use
1) Add your WhatsApp account (the one running Moltbot) to the group.
2) Say `@moltbot …` (or include the number). Only allowlisted senders can trigger it unless you set `groupPolicy: "open"`.
3) The agent prompt will include recent group context plus the trailing `[from: …]` marker so it can address the right person.
4) Session-level directives (`/verbose on`, `/think high`, `/new` or `/reset`, `/compact`) apply only to that groups session; send them as standalone messages so they register. Your personal DM session remains independent.
## Testing / verification
- Manual smoke:
- Send an `@clawd` ping in the group and confirm a reply that references the sender name.
- Send a second ping and verify the history block is included then cleared on the next turn.
- Check gateway logs (run with `--verbose`) to see `inbound web message` entries showing `from: <groupJid>` and the `[from: …]` suffix.
## Known considerations
- Heartbeats are intentionally skipped for groups to avoid noisy broadcasts.
- Echo suppression uses the combined batch string; if you send identical text twice without mentions, only the first will get a response.
- Session store entries will appear as `agent:<agentId>:whatsapp:group:<jid>` in the session store (`~/.clawdbot/agents/<agentId>/sessions/sessions.json` by default); a missing entry just means the group hasnt triggered a run yet.
- Typing indicators in groups follow `agents.defaults.typingMode` (default: `message` when unmentioned).

View File

@@ -0,0 +1,344 @@
---
summary: "Group chat behavior across surfaces (WhatsApp/Telegram/Discord/Slack/Signal/iMessage/Microsoft Teams)"
read_when:
- Changing group chat behavior or mention gating
---
# Groups
Moltbot treats group chats consistently across surfaces: WhatsApp, Telegram, Discord, Slack, Signal, iMessage, Microsoft Teams.
## Beginner intro (2 minutes)
Moltbot “lives” on your own messaging accounts. There is no separate WhatsApp bot user.
If **you** are in a group, Moltbot can see that group and respond there.
Default behavior:
- Groups are restricted (`groupPolicy: "allowlist"`).
- Replies require a mention unless you explicitly disable mention gating.
Translation: allowlisted senders can trigger Moltbot by mentioning it.
> TL;DR
> - **DM access** is controlled by `*.allowFrom`.
> - **Group access** is controlled by `*.groupPolicy` + allowlists (`*.groups`, `*.groupAllowFrom`).
> - **Reply triggering** is controlled by mention gating (`requireMention`, `/activation`).
Quick flow (what happens to a group message):
```
groupPolicy? disabled -> drop
groupPolicy? allowlist -> group allowed? no -> drop
requireMention? yes -> mentioned? no -> store for context only
otherwise -> reply
```
![Group message flow](/images/groups-flow.svg)
If you want...
| Goal | What to set |
|------|-------------|
| Allow all groups but only reply on @mentions | `groups: { "*": { requireMention: true } }` |
| Disable all group replies | `groupPolicy: "disabled"` |
| Only specific groups | `groups: { "<group-id>": { ... } }` (no `"*"` key) |
| Only you can trigger in groups | `groupPolicy: "allowlist"`, `groupAllowFrom: ["+1555..."]` |
## Session keys
- Group sessions use `agent:<agentId>:<channel>:group:<id>` session keys (rooms/channels use `agent:<agentId>:<channel>:channel:<id>`).
- Telegram forum topics add `:topic:<threadId>` to the group id so each topic has its own session.
- Direct chats use the main session (or per-sender if configured).
- Heartbeats are skipped for group sessions.
## Pattern: personal DMs + public groups (single agent)
Yes — this works well if your “personal” traffic is **DMs** and your “public” traffic is **groups**.
Why: in single-agent mode, DMs typically land in the **main** session key (`agent:main:main`), while groups always use **non-main** session keys (`agent:main:<channel>:group:<id>`). If you enable sandboxing with `mode: "non-main"`, those group sessions run in Docker while your main DM session stays on-host.
This gives you one agent “brain” (shared workspace + memory), but two execution postures:
- **DMs**: full tools (host)
- **Groups**: sandbox + restricted tools (Docker)
> If you need truly separate workspaces/personas (“personal” and “public” must never mix), use a second agent + bindings. See [Multi-Agent Routing](/concepts/multi-agent).
Example (DMs on host, groups sandboxed + messaging-only tools):
```json5
{
agents: {
defaults: {
sandbox: {
mode: "non-main", // groups/channels are non-main -> sandboxed
scope: "session", // strongest isolation (one container per group/channel)
workspaceAccess: "none"
}
}
},
tools: {
sandbox: {
tools: {
// If allow is non-empty, everything else is blocked (deny still wins).
allow: ["group:messaging", "group:sessions"],
deny: ["group:runtime", "group:fs", "group:ui", "nodes", "cron", "gateway"]
}
}
}
}
```
Want “groups can only see folder X” instead of “no host access”? Keep `workspaceAccess: "none"` and mount only allowlisted paths into the sandbox:
```json5
{
agents: {
defaults: {
sandbox: {
mode: "non-main",
scope: "session",
workspaceAccess: "none",
docker: {
binds: [
// hostPath:containerPath:mode
"~/FriendsShared:/data:ro"
]
}
}
}
}
}
```
Related:
- Configuration keys and defaults: [Gateway configuration](/gateway/configuration#agentsdefaultssandbox)
- Debugging why a tool is blocked: [Sandbox vs Tool Policy vs Elevated](/gateway/sandbox-vs-tool-policy-vs-elevated)
- Bind mounts details: [Sandboxing](/gateway/sandboxing#custom-bind-mounts)
## Display labels
- UI labels use `displayName` when available, formatted as `<channel>:<token>`.
- `#room` is reserved for rooms/channels; group chats use `g-<slug>` (lowercase, spaces -> `-`, keep `#@+._-`).
## Group policy
Control how group/room messages are handled per channel:
```json5
{
channels: {
whatsapp: {
groupPolicy: "disabled", // "open" | "disabled" | "allowlist"
groupAllowFrom: ["+15551234567"]
},
telegram: {
groupPolicy: "disabled",
groupAllowFrom: ["123456789", "@username"]
},
signal: {
groupPolicy: "disabled",
groupAllowFrom: ["+15551234567"]
},
imessage: {
groupPolicy: "disabled",
groupAllowFrom: ["chat_id:123"]
},
msteams: {
groupPolicy: "disabled",
groupAllowFrom: ["user@org.com"]
},
discord: {
groupPolicy: "allowlist",
guilds: {
"GUILD_ID": { channels: { help: { allow: true } } }
}
},
slack: {
groupPolicy: "allowlist",
channels: { "#general": { allow: true } }
},
matrix: {
groupPolicy: "allowlist",
groupAllowFrom: ["@owner:example.org"],
groups: {
"!roomId:example.org": { allow: true },
"#alias:example.org": { allow: true }
}
}
}
}
```
| Policy | Behavior |
|--------|----------|
| `"open"` | Groups bypass allowlists; mention-gating still applies. |
| `"disabled"` | Block all group messages entirely. |
| `"allowlist"` | Only allow groups/rooms that match the configured allowlist. |
Notes:
- `groupPolicy` is separate from mention-gating (which requires @mentions).
- WhatsApp/Telegram/Signal/iMessage/Microsoft Teams: use `groupAllowFrom` (fallback: explicit `allowFrom`).
- Discord: allowlist uses `channels.discord.guilds.<id>.channels`.
- Slack: allowlist uses `channels.slack.channels`.
- Matrix: allowlist uses `channels.matrix.groups` (room IDs, aliases, or names). Use `channels.matrix.groupAllowFrom` to restrict senders; per-room `users` allowlists are also supported.
- Group DMs are controlled separately (`channels.discord.dm.*`, `channels.slack.dm.*`).
- Telegram allowlist can match user IDs (`"123456789"`, `"telegram:123456789"`, `"tg:123456789"`) or usernames (`"@alice"` or `"alice"`); prefixes are case-insensitive.
- Default is `groupPolicy: "allowlist"`; if your group allowlist is empty, group messages are blocked.
Quick mental model (evaluation order for group messages):
1) `groupPolicy` (open/disabled/allowlist)
2) group allowlists (`*.groups`, `*.groupAllowFrom`, channel-specific allowlist)
3) mention gating (`requireMention`, `/activation`)
## Mention gating (default)
Group messages require a mention unless overridden per group. Defaults live per subsystem under `*.groups."*"`.
Replying to a bot message counts as an implicit mention (when the channel supports reply metadata). This applies to Telegram, WhatsApp, Slack, Discord, and Microsoft Teams.
```json5
{
channels: {
whatsapp: {
groups: {
"*": { requireMention: true },
"123@g.us": { requireMention: false }
}
},
telegram: {
groups: {
"*": { requireMention: true },
"123456789": { requireMention: false }
}
},
imessage: {
groups: {
"*": { requireMention: true },
"123": { requireMention: false }
}
}
},
agents: {
list: [
{
id: "main",
groupChat: {
mentionPatterns: ["@clawd", "moltbot", "\\+15555550123"],
historyLimit: 50
}
}
]
}
}
```
Notes:
- `mentionPatterns` are case-insensitive regexes.
- Surfaces that provide explicit mentions still pass; patterns are a fallback.
- Per-agent override: `agents.list[].groupChat.mentionPatterns` (useful when multiple agents share a group).
- Mention gating is only enforced when mention detection is possible (native mentions or `mentionPatterns` are configured).
- Discord defaults live in `channels.discord.guilds."*"` (overridable per guild/channel).
- Group history context is wrapped uniformly across channels and is **pending-only** (messages skipped due to mention gating); use `messages.groupChat.historyLimit` for the global default and `channels.<channel>.historyLimit` (or `channels.<channel>.accounts.*.historyLimit`) for overrides. Set `0` to disable.
## Group/channel tool restrictions (optional)
Some channel configs support restricting which tools are available **inside a specific group/room/channel**.
- `tools`: allow/deny tools for the whole group.
- `toolsBySender`: per-sender overrides within the group (keys are sender IDs/usernames/emails/phone numbers depending on the channel). Use `"*"` as a wildcard.
Resolution order (most specific wins):
1) group/channel `toolsBySender` match
2) group/channel `tools`
3) default (`"*"`) `toolsBySender` match
4) default (`"*"`) `tools`
Example (Telegram):
```json5
{
channels: {
telegram: {
groups: {
"*": { tools: { deny: ["exec"] } },
"-1001234567890": {
tools: { deny: ["exec", "read", "write"] },
toolsBySender: {
"123456789": { alsoAllow: ["exec"] }
}
}
}
}
}
}
```
Notes:
- Group/channel tool restrictions are applied in addition to global/agent tool policy (deny still wins).
- Some channels use different nesting for rooms/channels (e.g., Discord `guilds.*.channels.*`, Slack `channels.*`, MS Teams `teams.*.channels.*`).
## Group allowlists
When `channels.whatsapp.groups`, `channels.telegram.groups`, or `channels.imessage.groups` is configured, the keys act as a group allowlist. Use `"*"` to allow all groups while still setting default mention behavior.
Common intents (copy/paste):
1) Disable all group replies
```json5
{
channels: { whatsapp: { groupPolicy: "disabled" } }
}
```
2) Allow only specific groups (WhatsApp)
```json5
{
channels: {
whatsapp: {
groups: {
"123@g.us": { requireMention: true },
"456@g.us": { requireMention: false }
}
}
}
}
```
3) Allow all groups but require mention (explicit)
```json5
{
channels: {
whatsapp: {
groups: { "*": { requireMention: true } }
}
}
}
```
4) Only the owner can trigger in groups (WhatsApp)
```json5
{
channels: {
whatsapp: {
groupPolicy: "allowlist",
groupAllowFrom: ["+15551234567"],
groups: { "*": { requireMention: true } }
}
}
}
```
## Activation (owner-only)
Group owners can toggle per-group activation:
- `/activation mention`
- `/activation always`
Owner is determined by `channels.whatsapp.allowFrom` (or the bots self E.164 when unset). Send the command as a standalone message. Other surfaces currently ignore `/activation`.
## Context fields
Group inbound payloads set:
- `ChatType=group`
- `GroupSubject` (if known)
- `GroupMembers` (if known)
- `WasMentioned` (mention gating result)
- Telegram forum topics also include `MessageThreadId` and `IsForum`.
The agent system prompt includes a group intro on the first turn of a new group session. It reminds the model to respond like a human, avoid Markdown tables, and avoid typing literal `\n` sequences.
## iMessage specifics
- Prefer `chat_id:<id>` when routing or allowlisting.
- List chats: `imsg chats --limit 20`.
- Group replies always go back to the same `chat_id`.
## WhatsApp specifics
See [Group messages](/concepts/group-messages) for WhatsApp-only behavior (history injection, mention handling details).

View File

@@ -0,0 +1,132 @@
---
summary: "Markdown formatting pipeline for outbound channels"
read_when:
- You are changing markdown formatting or chunking for outbound channels
- You are adding a new channel formatter or style mapping
- You are debugging formatting regressions across channels
---
# Markdown formatting
Moltbot formats outbound Markdown by converting it into a shared intermediate
representation (IR) before rendering channel-specific output. The IR keeps the
source text intact while carrying style/link spans so chunking and rendering can
stay consistent across channels.
## Goals
- **Consistency:** one parse step, multiple renderers.
- **Safe chunking:** split text before rendering so inline formatting never
breaks across chunks.
- **Channel fit:** map the same IR to Slack mrkdwn, Telegram HTML, and Signal
style ranges without re-parsing Markdown.
## Pipeline
1. **Parse Markdown -> IR**
- IR is plain text plus style spans (bold/italic/strike/code/spoiler) and link spans.
- Offsets are UTF-16 code units so Signal style ranges align with its API.
- Tables are parsed only when a channel opts into table conversion.
2. **Chunk IR (format-first)**
- Chunking happens on the IR text before rendering.
- Inline formatting does not split across chunks; spans are sliced per chunk.
3. **Render per channel**
- **Slack:** mrkdwn tokens (bold/italic/strike/code), links as `<url|label>`.
- **Telegram:** HTML tags (`<b>`, `<i>`, `<s>`, `<code>`, `<pre><code>`, `<a href>`).
- **Signal:** plain text + `text-style` ranges; links become `label (url)` when label differs.
## IR example
Input Markdown:
```markdown
Hello **world** — see [docs](https://docs.molt.bot).
```
IR (schematic):
```json
{
"text": "Hello world — see docs.",
"styles": [
{ "start": 6, "end": 11, "style": "bold" }
],
"links": [
{ "start": 19, "end": 23, "href": "https://docs.molt.bot" }
]
}
```
## Where it is used
- Slack, Telegram, and Signal outbound adapters render from the IR.
- Other channels (WhatsApp, iMessage, MS Teams, Discord) still use plain text or
their own formatting rules, with Markdown table conversion applied before
chunking when enabled.
## Table handling
Markdown tables are not consistently supported across chat clients. Use
`markdown.tables` to control conversion per channel (and per account).
- `code`: render tables as code blocks (default for most channels).
- `bullets`: convert each row into bullet points (default for Signal + WhatsApp).
- `off`: disable table parsing and conversion; raw table text passes through.
Config keys:
```yaml
channels:
discord:
markdown:
tables: code
accounts:
work:
markdown:
tables: off
```
## Chunking rules
- Chunk limits come from channel adapters/config and are applied to the IR text.
- Code fences are preserved as a single block with a trailing newline so channels
render them correctly.
- List prefixes and blockquote prefixes are part of the IR text, so chunking
does not split mid-prefix.
- Inline styles (bold/italic/strike/inline-code/spoiler) are never split across
chunks; the renderer reopens styles inside each chunk.
If you need more on chunking behavior across channels, see
[Streaming + chunking](/concepts/streaming).
## Link policy
- **Slack:** `[label](url)` -> `<url|label>`; bare URLs remain bare. Autolink
is disabled during parse to avoid double-linking.
- **Telegram:** `[label](url)` -> `<a href="url">label</a>` (HTML parse mode).
- **Signal:** `[label](url)` -> `label (url)` unless label matches the URL.
## Spoilers
Spoiler markers (`||spoiler||`) are parsed only for Signal, where they map to
SPOILER style ranges. Other channels treat them as plain text.
## How to add or update a channel formatter
1. **Parse once:** use the shared `markdownToIR(...)` helper with channel-appropriate
options (autolink, heading style, blockquote prefix).
2. **Render:** implement a renderer with `renderMarkdownWithMarkers(...)` and a
style marker map (or Signal style ranges).
3. **Chunk:** call `chunkMarkdownIR(...)` before rendering; render each chunk.
4. **Wire adapter:** update the channel outbound adapter to use the new chunker
and renderer.
5. **Test:** add or update format tests and an outbound delivery test if the
channel uses chunking.
## Common gotchas
- Slack angle-bracket tokens (`<@U123>`, `<#C123>`, `<https://...>`) must be
preserved; escape raw HTML safely.
- Telegram HTML requires escaping text outside tags to avoid broken markup.
- Signal style ranges depend on UTF-16 offsets; do not use code point offsets.
- Preserve trailing newlines for fenced code blocks so closing markers land on
their own line.

View File

@@ -0,0 +1,388 @@
---
summary: "How Moltbot memory works (workspace files + automatic memory flush)"
read_when:
- You want the memory file layout and workflow
- You want to tune the automatic pre-compaction memory flush
---
# Memory
Moltbot memory is **plain Markdown in the agent workspace**. The files are the
source of truth; the model only "remembers" what gets written to disk.
Memory search tools are provided by the active memory plugin (default:
`memory-core`). Disable memory plugins with `plugins.slots.memory = "none"`.
## Memory files (Markdown)
The default workspace layout uses two memory layers:
- `memory/YYYY-MM-DD.md`
- Daily log (append-only).
- Read today + yesterday at session start.
- `MEMORY.md` (optional)
- Curated long-term memory.
- **Only load in the main, private session** (never in group contexts).
These files live under the workspace (`agents.defaults.workspace`, default
`~/clawd`). See [Agent workspace](/concepts/agent-workspace) for the full layout.
## When to write memory
- Decisions, preferences, and durable facts go to `MEMORY.md`.
- Day-to-day notes and running context go to `memory/YYYY-MM-DD.md`.
- If someone says "remember this," write it down (do not keep it in RAM).
- This area is still evolving. It helps to remind the model to store memories; it will know what to do.
- If you want something to stick, **ask the bot to write it** into memory.
## Automatic memory flush (pre-compaction ping)
When a session is **close to auto-compaction**, Moltbot triggers a **silent,
agentic turn** that reminds the model to write durable memory **before** the
context is compacted. The default prompts explicitly say the model *may reply*,
but usually `NO_REPLY` is the correct response so the user never sees this turn.
This is controlled by `agents.defaults.compaction.memoryFlush`:
```json5
{
agents: {
defaults: {
compaction: {
reserveTokensFloor: 20000,
memoryFlush: {
enabled: true,
softThresholdTokens: 4000,
systemPrompt: "Session nearing compaction. Store durable memories now.",
prompt: "Write any lasting notes to memory/YYYY-MM-DD.md; reply with NO_REPLY if nothing to store."
}
}
}
}
}
```
Details:
- **Soft threshold**: flush triggers when the session token estimate crosses
`contextWindow - reserveTokensFloor - softThresholdTokens`.
- **Silent** by default: prompts include `NO_REPLY` so nothing is delivered.
- **Two prompts**: a user prompt plus a system prompt append the reminder.
- **One flush per compaction cycle** (tracked in `sessions.json`).
- **Workspace must be writable**: if the session runs sandboxed with
`workspaceAccess: "ro"` or `"none"`, the flush is skipped.
For the full compaction lifecycle, see
[Session management + compaction](/reference/session-management-compaction).
## Vector memory search
Moltbot can build a small vector index over `MEMORY.md` and `memory/*.md` so
semantic queries can find related notes even when wording differs.
Defaults:
- Enabled by default.
- Watches memory files for changes (debounced).
- Uses remote embeddings by default. If `memorySearch.provider` is not set, Moltbot auto-selects:
1. `local` if a `memorySearch.local.modelPath` is configured and the file exists.
2. `openai` if an OpenAI key can be resolved.
3. `gemini` if a Gemini key can be resolved.
4. Otherwise memory search stays disabled until configured.
- Local mode uses node-llama-cpp and may require `pnpm approve-builds`.
- Uses sqlite-vec (when available) to accelerate vector search inside SQLite.
Remote embeddings **require** an API key for the embedding provider. Moltbot
resolves keys from auth profiles, `models.providers.*.apiKey`, or environment
variables. Codex OAuth only covers chat/completions and does **not** satisfy
embeddings for memory search. For Gemini, use `GEMINI_API_KEY` or
`models.providers.google.apiKey`. When using a custom OpenAI-compatible endpoint,
set `memorySearch.remote.apiKey` (and optional `memorySearch.remote.headers`).
### Gemini embeddings (native)
Set the provider to `gemini` to use the Gemini embeddings API directly:
```json5
agents: {
defaults: {
memorySearch: {
provider: "gemini",
model: "gemini-embedding-001",
remote: {
apiKey: "YOUR_GEMINI_API_KEY"
}
}
}
}
```
Notes:
- `remote.baseUrl` is optional (defaults to the Gemini API base URL).
- `remote.headers` lets you add extra headers if needed.
- Default model: `gemini-embedding-001`.
If you want to use a **custom OpenAI-compatible endpoint** (OpenRouter, vLLM, or a proxy),
you can use the `remote` configuration with the OpenAI provider:
```json5
agents: {
defaults: {
memorySearch: {
provider: "openai",
model: "text-embedding-3-small",
remote: {
baseUrl: "https://api.example.com/v1/",
apiKey: "YOUR_OPENAI_COMPAT_API_KEY",
headers: { "X-Custom-Header": "value" }
}
}
}
}
```
If you don't want to set an API key, use `memorySearch.provider = "local"` or set
`memorySearch.fallback = "none"`.
Fallbacks:
- `memorySearch.fallback` can be `openai`, `gemini`, `local`, or `none`.
- The fallback provider is only used when the primary embedding provider fails.
Batch indexing (OpenAI + Gemini):
- Enabled by default for OpenAI and Gemini embeddings. Set `agents.defaults.memorySearch.remote.batch.enabled = false` to disable.
- Default behavior waits for batch completion; tune `remote.batch.wait`, `remote.batch.pollIntervalMs`, and `remote.batch.timeoutMinutes` if needed.
- Set `remote.batch.concurrency` to control how many batch jobs we submit in parallel (default: 2).
- Batch mode applies when `memorySearch.provider = "openai"` or `"gemini"` and uses the corresponding API key.
- Gemini batch jobs use the async embeddings batch endpoint and require Gemini Batch API availability.
Why OpenAI batch is fast + cheap:
- For large backfills, OpenAI is typically the fastest option we support because we can submit many embedding requests in a single batch job and let OpenAI process them asynchronously.
- OpenAI offers discounted pricing for Batch API workloads, so large indexing runs are usually cheaper than sending the same requests synchronously.
- See the OpenAI Batch API docs and pricing for details:
- https://platform.openai.com/docs/api-reference/batch
- https://platform.openai.com/pricing
Config example:
```json5
agents: {
defaults: {
memorySearch: {
provider: "openai",
model: "text-embedding-3-small",
fallback: "openai",
remote: {
batch: { enabled: true, concurrency: 2 }
},
sync: { watch: true }
}
}
}
```
Tools:
- `memory_search` — returns snippets with file + line ranges.
- `memory_get` — read memory file content by path.
Local mode:
- Set `agents.defaults.memorySearch.provider = "local"`.
- Provide `agents.defaults.memorySearch.local.modelPath` (GGUF or `hf:` URI).
- Optional: set `agents.defaults.memorySearch.fallback = "none"` to avoid remote fallback.
### How the memory tools work
- `memory_search` semantically searches Markdown chunks (~400 token target, 80-token overlap) from `MEMORY.md` + `memory/**/*.md`. It returns snippet text (capped ~700 chars), file path, line range, score, provider/model, and whether we fell back from local → remote embeddings. No full file payload is returned.
- `memory_get` reads a specific memory Markdown file (workspace-relative), optionally from a starting line and for N lines. Paths outside `MEMORY.md` / `memory/` are rejected.
- Both tools are enabled only when `memorySearch.enabled` resolves true for the agent.
### What gets indexed (and when)
- File type: Markdown only (`MEMORY.md`, `memory/**/*.md`).
- Index storage: per-agent SQLite at `~/.clawdbot/memory/<agentId>.sqlite` (configurable via `agents.defaults.memorySearch.store.path`, supports `{agentId}` token).
- Freshness: watcher on `MEMORY.md` + `memory/` marks the index dirty (debounce 1.5s). Sync is scheduled on session start, on search, or on an interval and runs asynchronously. Session transcripts use delta thresholds to trigger background sync.
- Reindex triggers: the index stores the embedding **provider/model + endpoint fingerprint + chunking params**. If any of those change, Moltbot automatically resets and reindexes the entire store.
### Hybrid search (BM25 + vector)
When enabled, Moltbot combines:
- **Vector similarity** (semantic match, wording can differ)
- **BM25 keyword relevance** (exact tokens like IDs, env vars, code symbols)
If full-text search is unavailable on your platform, Moltbot falls back to vector-only search.
#### Why hybrid?
Vector search is great at “this means the same thing”:
- “Mac Studio gateway host” vs “the machine running the gateway”
- “debounce file updates” vs “avoid indexing on every write”
But it can be weak at exact, high-signal tokens:
- IDs (`a828e60`, `b3b9895a…`)
- code symbols (`memorySearch.query.hybrid`)
- error strings (“sqlite-vec unavailable”)
BM25 (full-text) is the opposite: strong at exact tokens, weaker at paraphrases.
Hybrid search is the pragmatic middle ground: **use both retrieval signals** so you get
good results for both “natural language” queries and “needle in a haystack” queries.
#### How we merge results (the current design)
Implementation sketch:
1) Retrieve a candidate pool from both sides:
- **Vector**: top `maxResults * candidateMultiplier` by cosine similarity.
- **BM25**: top `maxResults * candidateMultiplier` by FTS5 BM25 rank (lower is better).
2) Convert BM25 rank into a 0..1-ish score:
- `textScore = 1 / (1 + max(0, bm25Rank))`
3) Union candidates by chunk id and compute a weighted score:
- `finalScore = vectorWeight * vectorScore + textWeight * textScore`
Notes:
- `vectorWeight` + `textWeight` is normalized to 1.0 in config resolution, so weights behave as percentages.
- If embeddings are unavailable (or the provider returns a zero-vector), we still run BM25 and return keyword matches.
- If FTS5 cant be created, we keep vector-only search (no hard failure).
This isnt “IR-theory perfect”, but its simple, fast, and tends to improve recall/precision on real notes.
If we want to get fancier later, common next steps are Reciprocal Rank Fusion (RRF) or score normalization
(min/max or z-score) before mixing.
Config:
```json5
agents: {
defaults: {
memorySearch: {
query: {
hybrid: {
enabled: true,
vectorWeight: 0.7,
textWeight: 0.3,
candidateMultiplier: 4
}
}
}
}
}
```
### Embedding cache
Moltbot can cache **chunk embeddings** in SQLite so reindexing and frequent updates (especially session transcripts) don't re-embed unchanged text.
Config:
```json5
agents: {
defaults: {
memorySearch: {
cache: {
enabled: true,
maxEntries: 50000
}
}
}
}
```
### Session memory search (experimental)
You can optionally index **session transcripts** and surface them via `memory_search`.
This is gated behind an experimental flag.
```json5
agents: {
defaults: {
memorySearch: {
experimental: { sessionMemory: true },
sources: ["memory", "sessions"]
}
}
}
```
Notes:
- Session indexing is **opt-in** (off by default).
- Session updates are debounced and **indexed asynchronously** once they cross delta thresholds (best-effort).
- `memory_search` never blocks on indexing; results can be slightly stale until background sync finishes.
- Results still include snippets only; `memory_get` remains limited to memory files.
- Session indexing is isolated per agent (only that agents session logs are indexed).
- Session logs live on disk (`~/.clawdbot/agents/<agentId>/sessions/*.jsonl`). Any process/user with filesystem access can read them, so treat disk access as the trust boundary. For stricter isolation, run agents under separate OS users or hosts.
Delta thresholds (defaults shown):
```json5
agents: {
defaults: {
memorySearch: {
sync: {
sessions: {
deltaBytes: 100000, // ~100 KB
deltaMessages: 50 // JSONL lines
}
}
}
}
}
```
### SQLite vector acceleration (sqlite-vec)
When the sqlite-vec extension is available, Moltbot stores embeddings in a
SQLite virtual table (`vec0`) and performs vector distance queries in the
database. This keeps search fast without loading every embedding into JS.
Configuration (optional):
```json5
agents: {
defaults: {
memorySearch: {
store: {
vector: {
enabled: true,
extensionPath: "/path/to/sqlite-vec"
}
}
}
}
}
```
Notes:
- `enabled` defaults to true; when disabled, search falls back to in-process
cosine similarity over stored embeddings.
- If the sqlite-vec extension is missing or fails to load, Moltbot logs the
error and continues with the JS fallback (no vector table).
- `extensionPath` overrides the bundled sqlite-vec path (useful for custom builds
or non-standard install locations).
### Local embedding auto-download
- Default local embedding model: `hf:ggml-org/embeddinggemma-300M-GGUF/embeddinggemma-300M-Q8_0.gguf` (~0.6 GB).
- When `memorySearch.provider = "local"`, `node-llama-cpp` resolves `modelPath`; if the GGUF is missing it **auto-downloads** to the cache (or `local.modelCacheDir` if set), then loads it. Downloads resume on retry.
- Native build requirement: run `pnpm approve-builds`, pick `node-llama-cpp`, then `pnpm rebuild node-llama-cpp`.
- Fallback: if local setup fails and `memorySearch.fallback = "openai"`, we automatically switch to remote embeddings (`openai/text-embedding-3-small` unless overridden) and record the reason.
### Custom OpenAI-compatible endpoint example
```json5
agents: {
defaults: {
memorySearch: {
provider: "openai",
model: "text-embedding-3-small",
remote: {
baseUrl: "https://api.example.com/v1/",
apiKey: "YOUR_REMOTE_API_KEY",
headers: {
"X-Organization": "org-id",
"X-Project": "project-id"
}
}
}
}
}
```
Notes:
- `remote.*` takes precedence over `models.providers.openai.*`.
- `remote.headers` merge with OpenAI headers; remote wins on key conflicts. Omit `remote.headers` to use the OpenAI defaults.

View File

@@ -0,0 +1,143 @@
---
summary: "Message flow, sessions, queueing, and reasoning visibility"
read_when:
- Explaining how inbound messages become replies
- Clarifying sessions, queueing modes, or streaming behavior
- Documenting reasoning visibility and usage implications
---
# Messages
This page ties together how Moltbot handles inbound messages, sessions, queueing,
streaming, and reasoning visibility.
## Message flow (high level)
```
Inbound message
-> routing/bindings -> session key
-> queue (if a run is active)
-> agent run (streaming + tools)
-> outbound replies (channel limits + chunking)
```
Key knobs live in configuration:
- `messages.*` for prefixes, queueing, and group behavior.
- `agents.defaults.*` for block streaming and chunking defaults.
- Channel overrides (`channels.whatsapp.*`, `channels.telegram.*`, etc.) for caps and streaming toggles.
See [Configuration](/gateway/configuration) for full schema.
## Inbound dedupe
Channels can redeliver the same message after reconnects. Moltbot keeps a
short-lived cache keyed by channel/account/peer/session/message id so duplicate
deliveries do not trigger another agent run.
## Inbound debouncing
Rapid consecutive messages from the **same sender** can be batched into a single
agent turn via `messages.inbound`. Debouncing is scoped per channel + conversation
and uses the most recent message for reply threading/IDs.
Config (global default + per-channel overrides):
```json5
{
messages: {
inbound: {
debounceMs: 2000,
byChannel: {
whatsapp: 5000,
slack: 1500,
discord: 1500
}
}
}
}
```
Notes:
- Debounce applies to **text-only** messages; media/attachments flush immediately.
- Control commands bypass debouncing so they remain standalone.
## Sessions and devices
Sessions are owned by the gateway, not by clients.
- Direct chats collapse into the agent main session key.
- Groups/channels get their own session keys.
- The session store and transcripts live on the gateway host.
Multiple devices/channels can map to the same session, but history is not fully
synced back to every client. Recommendation: use one primary device for long
conversations to avoid divergent context. The Control UI and TUI always show the
gateway-backed session transcript, so they are the source of truth.
Details: [Session management](/concepts/session).
## Inbound bodies and history context
Moltbot separates the **prompt body** from the **command body**:
- `Body`: prompt text sent to the agent. This may include channel envelopes and
optional history wrappers.
- `CommandBody`: raw user text for directive/command parsing.
- `RawBody`: legacy alias for `CommandBody` (kept for compatibility).
When a channel supplies history, it uses a shared wrapper:
- `[Chat messages since your last reply - for context]`
- `[Current message - respond to this]`
For **non-direct chats** (groups/channels/rooms), the **current message body** is prefixed with the
sender label (same style used for history entries). This keeps real-time and queued/history
messages consistent in the agent prompt.
History buffers are **pending-only**: they include group messages that did *not*
trigger a run (for example, mention-gated messages) and **exclude** messages
already in the session transcript.
Directive stripping only applies to the **current message** section so history
remains intact. Channels that wrap history should set `CommandBody` (or
`RawBody`) to the original message text and keep `Body` as the combined prompt.
History buffers are configurable via `messages.groupChat.historyLimit` (global
default) and per-channel overrides like `channels.slack.historyLimit` or
`channels.telegram.accounts.<id>.historyLimit` (set `0` to disable).
## Queueing and followups
If a run is already active, inbound messages can be queued, steered into the
current run, or collected for a followup turn.
- Configure via `messages.queue` (and `messages.queue.byChannel`).
- Modes: `interrupt`, `steer`, `followup`, `collect`, plus backlog variants.
Details: [Queueing](/concepts/queue).
## Streaming, chunking, and batching
Block streaming sends partial replies as the model produces text blocks.
Chunking respects channel text limits and avoids splitting fenced code.
Key settings:
- `agents.defaults.blockStreamingDefault` (`on|off`, default off)
- `agents.defaults.blockStreamingBreak` (`text_end|message_end`)
- `agents.defaults.blockStreamingChunk` (`minChars|maxChars|breakPreference`)
- `agents.defaults.blockStreamingCoalesce` (idle-based batching)
- `agents.defaults.humanDelay` (human-like pause between block replies)
- Channel overrides: `*.blockStreaming` and `*.blockStreamingCoalesce` (non-Telegram channels require explicit `*.blockStreaming: true`)
Details: [Streaming + chunking](/concepts/streaming).
## Reasoning visibility and tokens
Moltbot can expose or hide model reasoning:
- `/reasoning on|off|stream` controls visibility.
- Reasoning content still counts toward token usage when produced by the model.
- Telegram supports reasoning stream into the draft bubble.
Details: [Thinking + reasoning directives](/tools/thinking) and [Token use](/token-use).
## Prefixes, threading, and replies
Outbound message formatting is centralized in `messages`:
- `messages.responsePrefix` (outbound prefix) and `channels.whatsapp.messagePrefix` (WhatsApp inbound prefix)
- Reply threading via `replyToMode` and per-channel defaults
Details: [Configuration](/gateway/configuration#messages) and channel docs.

View File

@@ -0,0 +1,139 @@
---
summary: "How Moltbot rotates auth profiles and falls back across models"
read_when:
- Diagnosing auth profile rotation, cooldowns, or model fallback behavior
- Updating failover rules for auth profiles or models
---
# Model failover
Moltbot handles failures in two stages:
1) **Auth profile rotation** within the current provider.
2) **Model fallback** to the next model in `agents.defaults.model.fallbacks`.
This doc explains the runtime rules and the data that backs them.
## Auth storage (keys + OAuth)
Moltbot uses **auth profiles** for both API keys and OAuth tokens.
- Secrets live in `~/.clawdbot/agents/<agentId>/agent/auth-profiles.json` (legacy: `~/.clawdbot/agent/auth-profiles.json`).
- Config `auth.profiles` / `auth.order` are **metadata + routing only** (no secrets).
- Legacy import-only OAuth file: `~/.clawdbot/credentials/oauth.json` (imported into `auth-profiles.json` on first use).
More detail: [/concepts/oauth](/concepts/oauth)
Credential types:
- `type: "api_key"``{ provider, key }`
- `type: "oauth"``{ provider, access, refresh, expires, email? }` (+ `projectId`/`enterpriseUrl` for some providers)
## Profile IDs
OAuth logins create distinct profiles so multiple accounts can coexist.
- Default: `provider:default` when no email is available.
- OAuth with email: `provider:<email>` (for example `google-antigravity:user@gmail.com`).
Profiles live in `~/.clawdbot/agents/<agentId>/agent/auth-profiles.json` under `profiles`.
## Rotation order
When a provider has multiple profiles, Moltbot chooses an order like this:
1) **Explicit config**: `auth.order[provider]` (if set).
2) **Configured profiles**: `auth.profiles` filtered by provider.
3) **Stored profiles**: entries in `auth-profiles.json` for the provider.
If no explicit order is configured, Moltbot uses a roundrobin order:
- **Primary key:** profile type (**OAuth before API keys**).
- **Secondary key:** `usageStats.lastUsed` (oldest first, within each type).
- **Cooldown/disabled profiles** are moved to the end, ordered by soonest expiry.
### Session stickiness (cache-friendly)
Moltbot **pins the chosen auth profile per session** to keep provider caches warm.
It does **not** rotate on every request. The pinned profile is reused until:
- the session is reset (`/new` / `/reset`)
- a compaction completes (compaction count increments)
- the profile is in cooldown/disabled
Manual selection via `/model …@<profileId>` sets a **user override** for that session
and is not autorotated until a new session starts.
Autopinned profiles (selected by the session router) are treated as a **preference**:
they are tried first, but Moltbot may rotate to another profile on rate limits/timeouts.
Userpinned profiles stay locked to that profile; if it fails and model fallbacks
are configured, Moltbot moves to the next model instead of switching profiles.
### Why OAuth can “look lost”
If you have both an OAuth profile and an API key profile for the same provider, roundrobin can switch between them across messages unless pinned. To force a single profile:
- Pin with `auth.order[provider] = ["provider:profileId"]`, or
- Use a per-session override via `/model …` with a profile override (when supported by your UI/chat surface).
## Cooldowns
When a profile fails due to auth/ratelimit errors (or a timeout that looks
like rate limiting), Moltbot marks it in cooldown and moves to the next profile.
Format/invalidrequest errors (for example Cloud Code Assist tool call ID
validation failures) are treated as failoverworthy and use the same cooldowns.
Cooldowns use exponential backoff:
- 1 minute
- 5 minutes
- 25 minutes
- 1 hour (cap)
State is stored in `auth-profiles.json` under `usageStats`:
```json
{
"usageStats": {
"provider:profile": {
"lastUsed": 1736160000000,
"cooldownUntil": 1736160600000,
"errorCount": 2
}
}
}
```
## Billing disables
Billing/credit failures (for example “insufficient credits” / “credit balance too low”) are treated as failoverworthy, but theyre usually not transient. Instead of a short cooldown, Moltbot marks the profile as **disabled** (with a longer backoff) and rotates to the next profile/provider.
State is stored in `auth-profiles.json`:
```json
{
"usageStats": {
"provider:profile": {
"disabledUntil": 1736178000000,
"disabledReason": "billing"
}
}
}
```
Defaults:
- Billing backoff starts at **5 hours**, doubles per billing failure, and caps at **24 hours**.
- Backoff counters reset if the profile hasnt failed for **24 hours** (configurable).
## Model fallback
If all profiles for a provider fail, Moltbot moves to the next model in
`agents.defaults.model.fallbacks`. This applies to auth failures, rate limits, and
timeouts that exhausted profile rotation (other errors do not advance fallback).
When a run starts with a model override (hooks or CLI), fallbacks still end at
`agents.defaults.model.primary` after trying any configured fallbacks.
## Related config
See [Gateway configuration](/gateway/configuration) for:
- `auth.profiles` / `auth.order`
- `auth.cooldowns.billingBackoffHours` / `auth.cooldowns.billingBackoffHoursByProvider`
- `auth.cooldowns.billingMaxHours` / `auth.cooldowns.failureWindowHours`
- `agents.defaults.model.primary` / `agents.defaults.model.fallbacks`
- `agents.defaults.imageModel` routing
See [Models](/concepts/models) for the broader model selection and fallback overview.

View File

@@ -0,0 +1,318 @@
---
summary: "Model provider overview with example configs + CLI flows"
read_when:
- You need a provider-by-provider model setup reference
- You want example configs or CLI onboarding commands for model providers
---
# Model providers
This page covers **LLM/model providers** (not chat channels like WhatsApp/Telegram).
For model selection rules, see [/concepts/models](/concepts/models).
## Quick rules
- Model refs use `provider/model` (example: `opencode/claude-opus-4-5`).
- If you set `agents.defaults.models`, it becomes the allowlist.
- CLI helpers: `moltbot onboard`, `moltbot models list`, `moltbot models set <provider/model>`.
## Built-in providers (pi-ai catalog)
Moltbot ships with the piai catalog. These providers require **no**
`models.providers` config; just set auth + pick a model.
### OpenAI
- Provider: `openai`
- Auth: `OPENAI_API_KEY`
- Example model: `openai/gpt-5.2`
- CLI: `moltbot onboard --auth-choice openai-api-key`
```json5
{
agents: { defaults: { model: { primary: "openai/gpt-5.2" } } }
}
```
### Anthropic
- Provider: `anthropic`
- Auth: `ANTHROPIC_API_KEY` or `claude setup-token`
- Example model: `anthropic/claude-opus-4-5`
- CLI: `moltbot onboard --auth-choice token` (paste setup-token) or `moltbot models auth paste-token --provider anthropic`
```json5
{
agents: { defaults: { model: { primary: "anthropic/claude-opus-4-5" } } }
}
```
### OpenAI Code (Codex)
- Provider: `openai-codex`
- Auth: OAuth (ChatGPT)
- Example model: `openai-codex/gpt-5.2`
- CLI: `moltbot onboard --auth-choice openai-codex` or `moltbot models auth login --provider openai-codex`
```json5
{
agents: { defaults: { model: { primary: "openai-codex/gpt-5.2" } } }
}
```
### OpenCode Zen
- Provider: `opencode`
- Auth: `OPENCODE_API_KEY` (or `OPENCODE_ZEN_API_KEY`)
- Example model: `opencode/claude-opus-4-5`
- CLI: `moltbot onboard --auth-choice opencode-zen`
```json5
{
agents: { defaults: { model: { primary: "opencode/claude-opus-4-5" } } }
}
```
### Google Gemini (API key)
- Provider: `google`
- Auth: `GEMINI_API_KEY`
- Example model: `google/gemini-3-pro-preview`
- CLI: `moltbot onboard --auth-choice gemini-api-key`
### Google Vertex / Antigravity / Gemini CLI
- Providers: `google-vertex`, `google-antigravity`, `google-gemini-cli`
- Auth: Vertex uses gcloud ADC; Antigravity/Gemini CLI use their respective auth flows
- Antigravity OAuth is shipped as a bundled plugin (`google-antigravity-auth`, disabled by default).
- Enable: `moltbot plugins enable google-antigravity-auth`
- Login: `moltbot models auth login --provider google-antigravity --set-default`
- Gemini CLI OAuth is shipped as a bundled plugin (`google-gemini-cli-auth`, disabled by default).
- Enable: `moltbot plugins enable google-gemini-cli-auth`
- Login: `moltbot models auth login --provider google-gemini-cli --set-default`
- Note: you do **not** paste a client id or secret into `moltbot.json`. The CLI login flow stores
tokens in auth profiles on the gateway host.
### Z.AI (GLM)
- Provider: `zai`
- Auth: `ZAI_API_KEY`
- Example model: `zai/glm-4.7`
- CLI: `moltbot onboard --auth-choice zai-api-key`
- Aliases: `z.ai/*` and `z-ai/*` normalize to `zai/*`
### Vercel AI Gateway
- Provider: `vercel-ai-gateway`
- Auth: `AI_GATEWAY_API_KEY`
- Example model: `vercel-ai-gateway/anthropic/claude-opus-4.5`
- CLI: `moltbot onboard --auth-choice ai-gateway-api-key`
### Other built-in providers
- OpenRouter: `openrouter` (`OPENROUTER_API_KEY`)
- Example model: `openrouter/anthropic/claude-sonnet-4-5`
- xAI: `xai` (`XAI_API_KEY`)
- Groq: `groq` (`GROQ_API_KEY`)
- Cerebras: `cerebras` (`CEREBRAS_API_KEY`)
- GLM models on Cerebras use ids `zai-glm-4.7` and `zai-glm-4.6`.
- OpenAI-compatible base URL: `https://api.cerebras.ai/v1`.
- Mistral: `mistral` (`MISTRAL_API_KEY`)
- GitHub Copilot: `github-copilot` (`COPILOT_GITHUB_TOKEN` / `GH_TOKEN` / `GITHUB_TOKEN`)
## Providers via `models.providers` (custom/base URL)
Use `models.providers` (or `models.json`) to add **custom** providers or
OpenAI/Anthropiccompatible proxies.
### Moonshot AI (Kimi)
Moonshot uses OpenAI-compatible endpoints, so configure it as a custom provider:
- Provider: `moonshot`
- Auth: `MOONSHOT_API_KEY`
- Example model: `moonshot/kimi-k2-0905-preview`
- Kimi K2 model IDs:
{/* moonshot-kimi-k2-model-refs:start */}
- `moonshot/kimi-k2-0905-preview`
- `moonshot/kimi-k2-turbo-preview`
- `moonshot/kimi-k2-thinking`
- `moonshot/kimi-k2-thinking-turbo`
{/* moonshot-kimi-k2-model-refs:end */}
```json5
{
agents: {
defaults: { model: { primary: "moonshot/kimi-k2-0905-preview" } }
},
models: {
mode: "merge",
providers: {
moonshot: {
baseUrl: "https://api.moonshot.ai/v1",
apiKey: "${MOONSHOT_API_KEY}",
api: "openai-completions",
models: [{ id: "kimi-k2-0905-preview", name: "Kimi K2 0905 Preview" }]
}
}
}
}
```
### Kimi Code
Kimi Code uses a dedicated endpoint and key (separate from Moonshot):
- Provider: `kimi-code`
- Auth: `KIMICODE_API_KEY`
- Example model: `kimi-code/kimi-for-coding`
```json5
{
env: { KIMICODE_API_KEY: "sk-..." },
agents: {
defaults: { model: { primary: "kimi-code/kimi-for-coding" } }
},
models: {
mode: "merge",
providers: {
"kimi-code": {
baseUrl: "https://api.kimi.com/coding/v1",
apiKey: "${KIMICODE_API_KEY}",
api: "openai-completions",
models: [{ id: "kimi-for-coding", name: "Kimi For Coding" }]
}
}
}
}
```
### Qwen OAuth (free tier)
Qwen provides OAuth access to Qwen Coder + Vision via a device-code flow.
Enable the bundled plugin, then log in:
```bash
moltbot plugins enable qwen-portal-auth
moltbot models auth login --provider qwen-portal --set-default
```
Model refs:
- `qwen-portal/coder-model`
- `qwen-portal/vision-model`
See [/providers/qwen](/providers/qwen) for setup details and notes.
### Synthetic
Synthetic provides Anthropic-compatible models behind the `synthetic` provider:
- Provider: `synthetic`
- Auth: `SYNTHETIC_API_KEY`
- Example model: `synthetic/hf:MiniMaxAI/MiniMax-M2.1`
- CLI: `moltbot onboard --auth-choice synthetic-api-key`
```json5
{
agents: {
defaults: { model: { primary: "synthetic/hf:MiniMaxAI/MiniMax-M2.1" } }
},
models: {
mode: "merge",
providers: {
synthetic: {
baseUrl: "https://api.synthetic.new/anthropic",
apiKey: "${SYNTHETIC_API_KEY}",
api: "anthropic-messages",
models: [{ id: "hf:MiniMaxAI/MiniMax-M2.1", name: "MiniMax M2.1" }]
}
}
}
}
```
### MiniMax
MiniMax is configured via `models.providers` because it uses custom endpoints:
- MiniMax (Anthropiccompatible): `--auth-choice minimax-api`
- Auth: `MINIMAX_API_KEY`
See [/providers/minimax](/providers/minimax) for setup details, model options, and config snippets.
### Ollama
Ollama is a local LLM runtime that provides an OpenAI-compatible API:
- Provider: `ollama`
- Auth: None required (local server)
- Example model: `ollama/llama3.3`
- Installation: https://ollama.ai
```bash
# Install Ollama, then pull a model:
ollama pull llama3.3
```
```json5
{
agents: {
defaults: { model: { primary: "ollama/llama3.3" } }
}
}
```
Ollama is automatically detected when running locally at `http://127.0.0.1:11434/v1`. See [/providers/ollama](/providers/ollama) for model recommendations and custom configuration.
### Local proxies (LM Studio, vLLM, LiteLLM, etc.)
Example (OpenAIcompatible):
```json5
{
agents: {
defaults: {
model: { primary: "lmstudio/minimax-m2.1-gs32" },
models: { "lmstudio/minimax-m2.1-gs32": { alias: "Minimax" } }
}
},
models: {
providers: {
lmstudio: {
baseUrl: "http://localhost:1234/v1",
apiKey: "LMSTUDIO_KEY",
api: "openai-completions",
models: [
{
id: "minimax-m2.1-gs32",
name: "MiniMax M2.1",
reasoning: false,
input: ["text"],
cost: { input: 0, output: 0, cacheRead: 0, cacheWrite: 0 },
contextWindow: 200000,
maxTokens: 8192
}
]
}
}
}
}
```
Notes:
- For custom providers, `reasoning`, `input`, `cost`, `contextWindow`, and `maxTokens` are optional.
When omitted, Moltbot defaults to:
- `reasoning: false`
- `input: ["text"]`
- `cost: { input: 0, output: 0, cacheRead: 0, cacheWrite: 0 }`
- `contextWindow: 200000`
- `maxTokens: 8192`
- Recommended: set explicit values that match your proxy/model limits.
## CLI examples
```bash
moltbot onboard --auth-choice opencode-zen
moltbot models set opencode/claude-opus-4-5
moltbot models list
```
See also: [/gateway/configuration](/gateway/configuration) for full configuration examples.

View File

@@ -0,0 +1,202 @@
---
summary: "Models CLI: list, set, aliases, fallbacks, scan, status"
read_when:
- Adding or modifying models CLI (models list/set/scan/aliases/fallbacks)
- Changing model fallback behavior or selection UX
- Updating model scan probes (tools/images)
---
# Models CLI
See [/concepts/model-failover](/concepts/model-failover) for auth profile
rotation, cooldowns, and how that interacts with fallbacks.
Quick provider overview + examples: [/concepts/model-providers](/concepts/model-providers).
## How model selection works
Moltbot selects models in this order:
1) **Primary** model (`agents.defaults.model.primary` or `agents.defaults.model`).
2) **Fallbacks** in `agents.defaults.model.fallbacks` (in order).
3) **Provider auth failover** happens inside a provider before moving to the
next model.
Related:
- `agents.defaults.models` is the allowlist/catalog of models Moltbot can use (plus aliases).
- `agents.defaults.imageModel` is used **only when** the primary model cant accept images.
- Per-agent defaults can override `agents.defaults.model` via `agents.list[].model` plus bindings (see [/concepts/multi-agent](/concepts/multi-agent)).
## Quick model picks (anecdotal)
- **GLM**: a bit better for coding/tool calling.
- **MiniMax**: better for writing and vibes.
## Setup wizard (recommended)
If you dont want to hand-edit config, run the onboarding wizard:
```bash
moltbot onboard
```
It can set up model + auth for common providers, including **OpenAI Code (Codex)
subscription** (OAuth) and **Anthropic** (API key recommended; `claude
setup-token` also supported).
## Config keys (overview)
- `agents.defaults.model.primary` and `agents.defaults.model.fallbacks`
- `agents.defaults.imageModel.primary` and `agents.defaults.imageModel.fallbacks`
- `agents.defaults.models` (allowlist + aliases + provider params)
- `models.providers` (custom providers written into `models.json`)
Model refs are normalized to lowercase. Provider aliases like `z.ai/*` normalize
to `zai/*`.
Provider configuration examples (including OpenCode Zen) live in
[/gateway/configuration](/gateway/configuration#opencode-zen-multi-model-proxy).
## “Model is not allowed” (and why replies stop)
If `agents.defaults.models` is set, it becomes the **allowlist** for `/model` and for
session overrides. When a user selects a model that isnt in that allowlist,
Moltbot returns:
```
Model "provider/model" is not allowed. Use /model to list available models.
```
This happens **before** a normal reply is generated, so the message can feel
like it “didnt respond.” The fix is to either:
- Add the model to `agents.defaults.models`, or
- Clear the allowlist (remove `agents.defaults.models`), or
- Pick a model from `/model list`.
Example allowlist config:
```json5
{
agent: {
model: { primary: "anthropic/claude-sonnet-4-5" },
models: {
"anthropic/claude-sonnet-4-5": { alias: "Sonnet" },
"anthropic/claude-opus-4-5": { alias: "Opus" }
}
}
}
```
## Switching models in chat (`/model`)
You can switch models for the current session without restarting:
```
/model
/model list
/model 3
/model openai/gpt-5.2
/model status
```
Notes:
- `/model` (and `/model list`) is a compact, numbered picker (model family + available providers).
- `/model <#>` selects from that picker.
- `/model status` is the detailed view (auth candidates and, when configured, provider endpoint `baseUrl` + `api` mode).
- Model refs are parsed by splitting on the **first** `/`. Use `provider/model` when typing `/model <ref>`.
- If the model ID itself contains `/` (OpenRouter-style), you must include the provider prefix (example: `/model openrouter/moonshotai/kimi-k2`).
- If you omit the provider, Moltbot treats the input as an alias or a model for the **default provider** (only works when there is no `/` in the model ID).
Full command behavior/config: [Slash commands](/tools/slash-commands).
## CLI commands
```bash
moltbot models list
moltbot models status
moltbot models set <provider/model>
moltbot models set-image <provider/model>
moltbot models aliases list
moltbot models aliases add <alias> <provider/model>
moltbot models aliases remove <alias>
moltbot models fallbacks list
moltbot models fallbacks add <provider/model>
moltbot models fallbacks remove <provider/model>
moltbot models fallbacks clear
moltbot models image-fallbacks list
moltbot models image-fallbacks add <provider/model>
moltbot models image-fallbacks remove <provider/model>
moltbot models image-fallbacks clear
```
`moltbot models` (no subcommand) is a shortcut for `models status`.
### `models list`
Shows configured models by default. Useful flags:
- `--all`: full catalog
- `--local`: local providers only
- `--provider <name>`: filter by provider
- `--plain`: one model per line
- `--json`: machinereadable output
### `models status`
Shows the resolved primary model, fallbacks, image model, and an auth overview
of configured providers. It also surfaces OAuth expiry status for profiles found
in the auth store (warns within 24h by default). `--plain` prints only the
resolved primary model.
OAuth status is always shown (and included in `--json` output). If a configured
provider has no credentials, `models status` prints a **Missing auth** section.
JSON includes `auth.oauth` (warn window + profiles) and `auth.providers`
(effective auth per provider).
Use `--check` for automation (exit `1` when missing/expired, `2` when expiring).
Preferred Anthropic auth is the Claude Code CLI setup-token (run anywhere; paste on the gateway host if needed):
```bash
claude setup-token
moltbot models status
```
## Scanning (OpenRouter free models)
`moltbot models scan` inspects OpenRouters **free model catalog** and can
optionally probe models for tool and image support.
Key flags:
- `--no-probe`: skip live probes (metadata only)
- `--min-params <b>`: minimum parameter size (billions)
- `--max-age-days <days>`: skip older models
- `--provider <name>`: provider prefix filter
- `--max-candidates <n>`: fallback list size
- `--set-default`: set `agents.defaults.model.primary` to the first selection
- `--set-image`: set `agents.defaults.imageModel.primary` to the first image selection
Probing requires an OpenRouter API key (from auth profiles or
`OPENROUTER_API_KEY`). Without a key, use `--no-probe` to list candidates only.
Scan results are ranked by:
1) Image support
2) Tool latency
3) Context size
4) Parameter count
Input
- OpenRouter `/models` list (filter `:free`)
- Requires OpenRouter API key from auth profiles or `OPENROUTER_API_KEY` (see [/environment](/environment))
- Optional filters: `--max-age-days`, `--min-params`, `--provider`, `--max-candidates`
- Probe controls: `--timeout`, `--concurrency`
When run in a TTY, you can select fallbacks interactively. In noninteractive
mode, pass `--yes` to accept defaults.
## Models registry (`models.json`)
Custom providers in `models.providers` are written into `models.json` under the
agent directory (default `~/.clawdbot/agents/<agentId>/models.json`). This file
is merged by default unless `models.mode` is set to `replace`.

View File

@@ -0,0 +1,354 @@
---
summary: "Multi-agent routing: isolated agents, channel accounts, and bindings"
title: Multi-Agent Routing
read_when: "You want multiple isolated agents (workspaces + auth) in one gateway process."
status: active
---
# Multi-Agent Routing
Goal: multiple *isolated* agents (separate workspace + `agentDir` + sessions), plus multiple channel accounts (e.g. two WhatsApps) in one running Gateway. Inbound is routed to an agent via bindings.
## What is “one agent”?
An **agent** is a fully scoped brain with its own:
- **Workspace** (files, AGENTS.md/SOUL.md/USER.md, local notes, persona rules).
- **State directory** (`agentDir`) for auth profiles, model registry, and per-agent config.
- **Session store** (chat history + routing state) under `~/.clawdbot/agents/<agentId>/sessions`.
Auth profiles are **per-agent**. Each agent reads from its own:
```
~/.clawdbot/agents/<agentId>/agent/auth-profiles.json
```
Main agent credentials are **not** shared automatically. Never reuse `agentDir`
across agents (it causes auth/session collisions). If you want to share creds,
copy `auth-profiles.json` into the other agent's `agentDir`.
Skills are per-agent via each workspaces `skills/` folder, with shared skills
available from `~/.clawdbot/skills`. See [Skills: per-agent vs shared](/tools/skills#per-agent-vs-shared-skills).
The Gateway can host **one agent** (default) or **many agents** side-by-side.
**Workspace note:** each agents workspace is the **default cwd**, not a hard
sandbox. Relative paths resolve inside the workspace, but absolute paths can
reach other host locations unless sandboxing is enabled. See
[Sandboxing](/gateway/sandboxing).
## Paths (quick map)
- Config: `~/.clawdbot/moltbot.json` (or `CLAWDBOT_CONFIG_PATH`)
- State dir: `~/.clawdbot` (or `CLAWDBOT_STATE_DIR`)
- Workspace: `~/clawd` (or `~/clawd-<agentId>`)
- Agent dir: `~/.clawdbot/agents/<agentId>/agent` (or `agents.list[].agentDir`)
- Sessions: `~/.clawdbot/agents/<agentId>/sessions`
### Single-agent mode (default)
If you do nothing, Moltbot runs a single agent:
- `agentId` defaults to **`main`**.
- Sessions are keyed as `agent:main:<mainKey>`.
- Workspace defaults to `~/clawd` (or `~/clawd-<profile>` when `CLAWDBOT_PROFILE` is set).
- State defaults to `~/.clawdbot/agents/main/agent`.
## Agent helper
Use the agent wizard to add a new isolated agent:
```bash
moltbot agents add work
```
Then add `bindings` (or let the wizard do it) to route inbound messages.
Verify with:
```bash
moltbot agents list --bindings
```
## Multiple agents = multiple people, multiple personalities
With **multiple agents**, each `agentId` becomes a **fully isolated persona**:
- **Different phone numbers/accounts** (per channel `accountId`).
- **Different personalities** (per-agent workspace files like `AGENTS.md` and `SOUL.md`).
- **Separate auth + sessions** (no cross-talk unless explicitly enabled).
This lets **multiple people** share one Gateway server while keeping their AI “brains” and data isolated.
## One WhatsApp number, multiple people (DM split)
You can route **different WhatsApp DMs** to different agents while staying on **one WhatsApp account**. Match on sender E.164 (like `+15551234567`) with `peer.kind: "dm"`. Replies still come from the same WhatsApp number (no peragent sender identity).
Important detail: direct chats collapse to the agents **main session key**, so true isolation requires **one agent per person**.
Example:
```json5
{
agents: {
list: [
{ id: "alex", workspace: "~/clawd-alex" },
{ id: "mia", workspace: "~/clawd-mia" }
]
},
bindings: [
{ agentId: "alex", match: { channel: "whatsapp", peer: { kind: "dm", id: "+15551230001" } } },
{ agentId: "mia", match: { channel: "whatsapp", peer: { kind: "dm", id: "+15551230002" } } }
],
channels: {
whatsapp: {
dmPolicy: "allowlist",
allowFrom: ["+15551230001", "+15551230002"]
}
}
}
```
Notes:
- DM access control is **global per WhatsApp account** (pairing/allowlist), not per agent.
- For shared groups, bind the group to one agent or use [Broadcast groups](/broadcast-groups).
## Routing rules (how messages pick an agent)
Bindings are **deterministic** and **most-specific wins**:
1. `peer` match (exact DM/group/channel id)
2. `guildId` (Discord)
3. `teamId` (Slack)
4. `accountId` match for a channel
5. channel-level match (`accountId: "*"`)
6. fallback to default agent (`agents.list[].default`, else first list entry, default: `main`)
## Multiple accounts / phone numbers
Channels that support **multiple accounts** (e.g. WhatsApp) use `accountId` to identify
each login. Each `accountId` can be routed to a different agent, so one server can host
multiple phone numbers without mixing sessions.
## Concepts
- `agentId`: one “brain” (workspace, per-agent auth, per-agent session store).
- `accountId`: one channel account instance (e.g. WhatsApp account `"personal"` vs `"biz"`).
- `binding`: routes inbound messages to an `agentId` by `(channel, accountId, peer)` and optionally guild/team ids.
- Direct chats collapse to `agent:<agentId>:<mainKey>` (per-agent “main”; `session.mainKey`).
## Example: two WhatsApps → two agents
`~/.clawdbot/moltbot.json` (JSON5):
```js
{
agents: {
list: [
{
id: "home",
default: true,
name: "Home",
workspace: "~/clawd-home",
agentDir: "~/.clawdbot/agents/home/agent",
},
{
id: "work",
name: "Work",
workspace: "~/clawd-work",
agentDir: "~/.clawdbot/agents/work/agent",
},
],
},
// Deterministic routing: first match wins (most-specific first).
bindings: [
{ agentId: "home", match: { channel: "whatsapp", accountId: "personal" } },
{ agentId: "work", match: { channel: "whatsapp", accountId: "biz" } },
// Optional per-peer override (example: send a specific group to work agent).
{
agentId: "work",
match: {
channel: "whatsapp",
accountId: "personal",
peer: { kind: "group", id: "1203630...@g.us" },
},
},
],
// Off by default: agent-to-agent messaging must be explicitly enabled + allowlisted.
tools: {
agentToAgent: {
enabled: false,
allow: ["home", "work"],
},
},
channels: {
whatsapp: {
accounts: {
personal: {
// Optional override. Default: ~/.clawdbot/credentials/whatsapp/personal
// authDir: "~/.clawdbot/credentials/whatsapp/personal",
},
biz: {
// Optional override. Default: ~/.clawdbot/credentials/whatsapp/biz
// authDir: "~/.clawdbot/credentials/whatsapp/biz",
},
},
},
},
}
```
## Example: WhatsApp daily chat + Telegram deep work
Split by channel: route WhatsApp to a fast everyday agent and Telegram to an Opus agent.
```json5
{
agents: {
list: [
{
id: "chat",
name: "Everyday",
workspace: "~/clawd-chat",
model: "anthropic/claude-sonnet-4-5"
},
{
id: "opus",
name: "Deep Work",
workspace: "~/clawd-opus",
model: "anthropic/claude-opus-4-5"
}
]
},
bindings: [
{ agentId: "chat", match: { channel: "whatsapp" } },
{ agentId: "opus", match: { channel: "telegram" } }
]
}
```
Notes:
- If you have multiple accounts for a channel, add `accountId` to the binding (for example `{ channel: "whatsapp", accountId: "personal" }`).
- To route a single DM/group to Opus while keeping the rest on chat, add a `match.peer` binding for that peer; peer matches always win over channel-wide rules.
## Example: same channel, one peer to Opus
Keep WhatsApp on the fast agent, but route one DM to Opus:
```json5
{
agents: {
list: [
{ id: "chat", name: "Everyday", workspace: "~/clawd-chat", model: "anthropic/claude-sonnet-4-5" },
{ id: "opus", name: "Deep Work", workspace: "~/clawd-opus", model: "anthropic/claude-opus-4-5" }
]
},
bindings: [
{ agentId: "opus", match: { channel: "whatsapp", peer: { kind: "dm", id: "+15551234567" } } },
{ agentId: "chat", match: { channel: "whatsapp" } }
]
}
```
Peer bindings always win, so keep them above the channel-wide rule.
## Family agent bound to a WhatsApp group
Bind a dedicated family agent to a single WhatsApp group, with mention gating
and a tighter tool policy:
```json5
{
agents: {
list: [
{
id: "family",
name: "Family",
workspace: "~/clawd-family",
identity: { name: "Family Bot" },
groupChat: {
mentionPatterns: ["@family", "@familybot", "@Family Bot"]
},
sandbox: {
mode: "all",
scope: "agent"
},
tools: {
allow: ["exec", "read", "sessions_list", "sessions_history", "sessions_send", "sessions_spawn", "session_status"],
deny: ["write", "edit", "apply_patch", "browser", "canvas", "nodes", "cron"]
}
}
]
},
bindings: [
{
agentId: "family",
match: {
channel: "whatsapp",
peer: { kind: "group", id: "120363999999999999@g.us" }
}
}
]
}
```
Notes:
- Tool allow/deny lists are **tools**, not skills. If a skill needs to run a
binary, ensure `exec` is allowed and the binary exists in the sandbox.
- For stricter gating, set `agents.list[].groupChat.mentionPatterns` and keep
group allowlists enabled for the channel.
## Per-Agent Sandbox and Tool Configuration
Starting with v2026.1.6, each agent can have its own sandbox and tool restrictions:
```js
{
agents: {
list: [
{
id: "personal",
workspace: "~/clawd-personal",
sandbox: {
mode: "off", // No sandbox for personal agent
},
// No tool restrictions - all tools available
},
{
id: "family",
workspace: "~/clawd-family",
sandbox: {
mode: "all", // Always sandboxed
scope: "agent", // One container per agent
docker: {
// Optional one-time setup after container creation
setupCommand: "apt-get update && apt-get install -y git curl",
},
},
tools: {
allow: ["read"], // Only read tool
deny: ["exec", "write", "edit", "apply_patch"], // Deny others
},
},
],
},
}
```
Note: `setupCommand` lives under `sandbox.docker` and runs once on container creation.
Per-agent `sandbox.docker.*` overrides are ignored when the resolved scope is `"shared"`.
**Benefits:**
- **Security isolation**: Restrict tools for untrusted agents
- **Resource control**: Sandbox specific agents while keeping others on host
- **Flexible policies**: Different permissions per agent
Note: `tools.elevated` is **global** and sender-based; it is not configurable per agent.
If you need per-agent boundaries, use `agents.list[].tools` to deny `exec`.
For group targeting, use `agents.list[].groupChat.mentionPatterns` so @mentions map cleanly to the intended agent.
See [Multi-Agent Sandbox & Tools](/multi-agent-sandbox-tools) for detailed examples.

View File

@@ -0,0 +1,135 @@
---
summary: "OAuth in Moltbot: token exchange, storage, and multi-account patterns"
read_when:
- You want to understand Moltbot OAuth end-to-end
- You hit token invalidation / logout issues
- You want setup-token or OAuth auth flows
- You want multiple accounts or profile routing
---
# OAuth
Moltbot supports “subscription auth” via OAuth for providers that offer it (notably **OpenAI Codex (ChatGPT OAuth)**). For Anthropic subscriptions, use the **setup-token** flow. This page explains:
- how the OAuth **token exchange** works (PKCE)
- where tokens are **stored** (and why)
- how to handle **multiple accounts** (profiles + per-session overrides)
Moltbot also supports **provider plugins** that ship their own OAuth or APIkey
flows. Run them via:
```bash
moltbot models auth login --provider <id>
```
## The token sink (why it exists)
OAuth providers commonly mint a **new refresh token** during login/refresh flows. Some providers (or OAuth clients) can invalidate older refresh tokens when a new one is issued for the same user/app.
Practical symptom:
- you log in via Moltbot *and* via Claude Code / Codex CLI → one of them randomly gets “logged out” later
To reduce that, Moltbot treats `auth-profiles.json` as a **token sink**:
- the runtime reads credentials from **one place**
- we can keep multiple profiles and route them deterministically
## Storage (where tokens live)
Secrets are stored **per-agent**:
- Auth profiles (OAuth + API keys): `~/.clawdbot/agents/<agentId>/agent/auth-profiles.json`
- Runtime cache (managed automatically; dont edit): `~/.clawdbot/agents/<agentId>/agent/auth.json`
Legacy import-only file (still supported, but not the main store):
- `~/.clawdbot/credentials/oauth.json` (imported into `auth-profiles.json` on first use)
All of the above also respect `$CLAWDBOT_STATE_DIR` (state dir override). Full reference: [/gateway/configuration](/gateway/configuration#auth-storage-oauth--api-keys)
## Anthropic setup-token (subscription auth)
Run `claude setup-token` on any machine, then paste it into Moltbot:
```bash
moltbot models auth setup-token --provider anthropic
```
If you generated the token elsewhere, paste it manually:
```bash
moltbot models auth paste-token --provider anthropic
```
Verify:
```bash
moltbot models status
```
## OAuth exchange (how login works)
Moltbots interactive login flows are implemented in `@mariozechner/pi-ai` and wired into the wizards/commands.
### Anthropic (Claude Pro/Max) setup-token
Flow shape:
1) run `claude setup-token`
2) paste the token into Moltbot
3) store as a token auth profile (no refresh)
The wizard path is `moltbot onboard` → auth choice `setup-token` (Anthropic).
### OpenAI Codex (ChatGPT OAuth)
Flow shape (PKCE):
1) generate PKCE verifier/challenge + random `state`
2) open `https://auth.openai.com/oauth/authorize?...`
3) try to capture callback on `http://127.0.0.1:1455/auth/callback`
4) if callback cant bind (or youre remote/headless), paste the redirect URL/code
5) exchange at `https://auth.openai.com/oauth/token`
6) extract `accountId` from the access token and store `{ access, refresh, expires, accountId }`
Wizard path is `moltbot onboard` → auth choice `openai-codex`.
## Refresh + expiry
Profiles store an `expires` timestamp.
At runtime:
- if `expires` is in the future → use the stored access token
- if expired → refresh (under a file lock) and overwrite the stored credentials
The refresh flow is automatic; you generally don't need to manage tokens manually.
## Multiple accounts (profiles) + routing
Two patterns:
### 1) Preferred: separate agents
If you want “personal” and “work” to never interact, use isolated agents (separate sessions + credentials + workspace):
```bash
moltbot agents add work
moltbot agents add personal
```
Then configure auth per-agent (wizard) and route chats to the right agent.
### 2) Advanced: multiple profiles in one agent
`auth-profiles.json` supports multiple profile IDs for the same provider.
Pick which profile is used:
- globally via config ordering (`auth.order`)
- per-session via `/model ...@<profileId>`
Example (session override):
- `/model Opus@anthropic:work`
How to see what profile IDs exist:
- `moltbot channels list --json` (shows `auth[]`)
Related docs:
- [/concepts/model-failover](/concepts/model-failover) (rotation + cooldown rules)
- [/tools/slash-commands](/tools/slash-commands) (command surface)

View File

@@ -0,0 +1,98 @@
---
summary: "How Moltbot presence entries are produced, merged, and displayed"
read_when:
- Debugging the Instances tab
- Investigating duplicate or stale instance rows
- Changing gateway WS connect or system-event beacons
---
# Presence
Moltbot “presence” is a lightweight, besteffort view of:
- the **Gateway** itself, and
- **clients connected to the Gateway** (mac app, WebChat, CLI, etc.)
Presence is used primarily to render the macOS apps **Instances** tab and to
provide quick operator visibility.
## Presence fields (what shows up)
Presence entries are structured objects with fields like:
- `instanceId` (optional but strongly recommended): stable client identity (usually `connect.client.instanceId`)
- `host`: humanfriendly host name
- `ip`: besteffort IP address
- `version`: client version string
- `deviceFamily` / `modelIdentifier`: hardware hints
- `mode`: `ui`, `webchat`, `cli`, `backend`, `probe`, `test`, `node`, ...
- `lastInputSeconds`: “seconds since last user input” (if known)
- `reason`: `self`, `connect`, `node-connected`, `periodic`, ...
- `ts`: last update timestamp (ms since epoch)
## Producers (where presence comes from)
Presence entries are produced by multiple sources and **merged**.
### 1) Gateway self entry
The Gateway always seeds a “self” entry at startup so UIs show the gateway host
even before any clients connect.
### 2) WebSocket connect
Every WS client begins with a `connect` request. On successful handshake the
Gateway upserts a presence entry for that connection.
#### Why oneoff CLI commands dont show up
The CLI often connects for short, oneoff commands. To avoid spamming the
Instances list, `client.mode === "cli"` is **not** turned into a presence entry.
### 3) `system-event` beacons
Clients can send richer periodic beacons via the `system-event` method. The mac
app uses this to report host name, IP, and `lastInputSeconds`.
### 4) Node connects (role: node)
When a node connects over the Gateway WebSocket with `role: node`, the Gateway
upserts a presence entry for that node (same flow as other WS clients).
## Merge + dedupe rules (why `instanceId` matters)
Presence entries are stored in a single inmemory map:
- Entries are keyed by a **presence key**.
- The best key is a stable `instanceId` (from `connect.client.instanceId`) that survives restarts.
- Keys are caseinsensitive.
If a client reconnects without a stable `instanceId`, it may show up as a
**duplicate** row.
## TTL and bounded size
Presence is intentionally ephemeral:
- **TTL:** entries older than 5 minutes are pruned
- **Max entries:** 200 (oldest dropped first)
This keeps the list fresh and avoids unbounded memory growth.
## Remote/tunnel caveat (loopback IPs)
When a client connects over an SSH tunnel / local port forward, the Gateway may
see the remote address as `127.0.0.1`. To avoid overwriting a good clientreported
IP, loopback remote addresses are ignored.
## Consumers
### macOS Instances tab
The macOS app renders the output of `system-presence` and applies a small status
indicator (Active/Idle/Stale) based on the age of the last update.
## Debugging tips
- To see the raw list, call `system-presence` against the Gateway.
- If you see duplicates:
- confirm clients send a stable `client.instanceId` in the handshake
- confirm periodic beacons use the same `instanceId`
- check whether the connectionderived entry is missing `instanceId` (duplicates are expected)

View File

@@ -0,0 +1,77 @@
---
summary: "Command queue design that serializes inbound auto-reply runs"
read_when:
- Changing auto-reply execution or concurrency
---
# Command Queue (2026-01-16)
We serialize inbound auto-reply runs (all channels) through a tiny in-process queue to prevent multiple agent runs from colliding, while still allowing safe parallelism across sessions.
## Why
- Auto-reply runs can be expensive (LLM calls) and can collide when multiple inbound messages arrive close together.
- Serializing avoids competing for shared resources (session files, logs, CLI stdin) and reduces the chance of upstream rate limits.
## How it works
- A lane-aware FIFO queue drains each lane with a configurable concurrency cap (default 1 for unconfigured lanes; main defaults to 4, subagent to 8).
- `runEmbeddedPiAgent` enqueues by **session key** (lane `session:<key>`) to guarantee only one active run per session.
- Each session run is then queued into a **global lane** (`main` by default) so overall parallelism is capped by `agents.defaults.maxConcurrent`.
- When verbose logging is enabled, queued runs emit a short notice if they waited more than ~2s before starting.
- Typing indicators still fire immediately on enqueue (when supported by the channel) so user experience is unchanged while we wait our turn.
## Queue modes (per channel)
Inbound messages can steer the current run, wait for a followup turn, or do both:
- `steer`: inject immediately into the current run (cancels pending tool calls after the next tool boundary). If not streaming, falls back to followup.
- `followup`: enqueue for the next agent turn after the current run ends.
- `collect`: coalesce all queued messages into a **single** followup turn (default). If messages target different channels/threads, they drain individually to preserve routing.
- `steer-backlog` (aka `steer+backlog`): steer now **and** preserve the message for a followup turn.
- `interrupt` (legacy): abort the active run for that session, then run the newest message.
- `queue` (legacy alias): same as `steer`.
Steer-backlog means you can get a followup response after the steered run, so
streaming surfaces can look like duplicates. Prefer `collect`/`steer` if you want
one response per inbound message.
Send `/queue collect` as a standalone command (per-session) or set `messages.queue.byChannel.discord: "collect"`.
Defaults (when unset in config):
- All surfaces → `collect`
Configure globally or per channel via `messages.queue`:
```json5
{
messages: {
queue: {
mode: "collect",
debounceMs: 1000,
cap: 20,
drop: "summarize",
byChannel: { discord: "collect" }
}
}
}
```
## Queue options
Options apply to `followup`, `collect`, and `steer-backlog` (and to `steer` when it falls back to followup):
- `debounceMs`: wait for quiet before starting a followup turn (prevents “continue, continue”).
- `cap`: max queued messages per session.
- `drop`: overflow policy (`old`, `new`, `summarize`).
Summarize keeps a short bullet list of dropped messages and injects it as a synthetic followup prompt.
Defaults: `debounceMs: 1000`, `cap: 20`, `drop: summarize`.
## Per-session overrides
- Send `/queue <mode>` as a standalone command to store the mode for the current session.
- Options can be combined: `/queue collect debounce:2s cap:25 drop:summarize`
- `/queue default` or `/queue reset` clears the session override.
## Scope and guarantees
- Applies to auto-reply agent runs across all inbound channels that use the gateway reply pipeline (WhatsApp web, Telegram, Slack, Discord, Signal, iMessage, webchat, etc.).
- Default lane (`main`) is process-wide for inbound + main heartbeats; set `agents.defaults.maxConcurrent` to allow multiple sessions in parallel.
- Additional lanes may exist (e.g. `cron`, `subagent`) so background jobs can run in parallel without blocking inbound replies.
- Per-session lanes guarantee that only one agent run touches a given session at a time.
- No external dependencies or background worker threads; pure TypeScript + promises.
## Troubleshooting
- If commands seem stuck, enable verbose logs and look for “queued for …ms” lines to confirm the queue is draining.
- If you need queue depth, enable verbose logs and watch for queue timing lines.

View File

@@ -0,0 +1,60 @@
---
summary: "Retry policy for outbound provider calls"
read_when:
- Updating provider retry behavior or defaults
- Debugging provider send errors or rate limits
---
# Retry policy
## Goals
- Retry per HTTP request, not per multi-step flow.
- Preserve ordering by retrying only the current step.
- Avoid duplicating non-idempotent operations.
## Defaults
- Attempts: 3
- Max delay cap: 30000 ms
- Jitter: 0.1 (10 percent)
- Provider defaults:
- Telegram min delay: 400 ms
- Discord min delay: 500 ms
## Behavior
### Discord
- Retries only on rate-limit errors (HTTP 429).
- Uses Discord `retry_after` when available, otherwise exponential backoff.
### Telegram
- Retries on transient errors (429, timeout, connect/reset/closed, temporarily unavailable).
- Uses `retry_after` when available, otherwise exponential backoff.
- Markdown parse errors are not retried; they fall back to plain text.
## Configuration
Set retry policy per provider in `~/.clawdbot/moltbot.json`:
```json5
{
channels: {
telegram: {
retry: {
attempts: 3,
minDelayMs: 400,
maxDelayMs: 30000,
jitter: 0.1
}
},
discord: {
retry: {
attempts: 3,
minDelayMs: 500,
maxDelayMs: 30000,
jitter: 0.1
}
}
}
}
```
## Notes
- Retries apply per request (message send, media upload, reaction, poll, sticker).
- Composite flows do not retry completed steps.

View File

@@ -0,0 +1,104 @@
---
summary: "Session pruning: tool-result trimming to reduce context bloat"
read_when:
- You want to reduce LLM context growth from tool outputs
- You are tuning agents.defaults.contextPruning
---
# Session Pruning
Session pruning trims **old tool results** from the in-memory context right before each LLM call. It does **not** rewrite the on-disk session history (`*.jsonl`).
## When it runs
- When `mode: "cache-ttl"` is enabled and the last Anthropic call for the session is older than `ttl`.
- Only affects the messages sent to the model for that request.
- Only active for Anthropic API calls (and OpenRouter Anthropic models).
- For best results, match `ttl` to your model `cacheControlTtl`.
- After a prune, the TTL window resets so subsequent requests keep cache until `ttl` expires again.
## Smart defaults (Anthropic)
- **OAuth or setup-token** profiles: enable `cache-ttl` pruning and set heartbeat to `1h`.
- **API key** profiles: enable `cache-ttl` pruning, set heartbeat to `30m`, and default `cacheControlTtl` to `1h` on Anthropic models.
- If you set any of these values explicitly, Moltbot does **not** override them.
## What this improves (cost + cache behavior)
- **Why prune:** Anthropic prompt caching only applies within the TTL. If a session goes idle past the TTL, the next request re-caches the full prompt unless you trim it first.
- **What gets cheaper:** pruning reduces the **cacheWrite** size for that first request after the TTL expires.
- **Why the TTL reset matters:** once pruning runs, the cache window resets, so followup requests can reuse the freshly cached prompt instead of re-caching the full history again.
- **What it does not do:** pruning doesnt add tokens or “double” costs; it only changes what gets cached on that first postTTL request.
## What can be pruned
- Only `toolResult` messages.
- User + assistant messages are **never** modified.
- The last `keepLastAssistants` assistant messages are protected; tool results after that cutoff are not pruned.
- If there arent enough assistant messages to establish the cutoff, pruning is skipped.
- Tool results containing **image blocks** are skipped (never trimmed/cleared).
## Context window estimation
Pruning uses an estimated context window (chars ≈ tokens × 4). The window size is resolved in this order:
1) Model definition `contextWindow` (from the model registry).
2) `models.providers.*.models[].contextWindow` override.
3) `agents.defaults.contextTokens`.
4) Default `200000` tokens.
## Mode
### cache-ttl
- Pruning only runs if the last Anthropic call is older than `ttl` (default `5m`).
- When it runs: same soft-trim + hard-clear behavior as before.
## Soft vs hard pruning
- **Soft-trim**: only for oversized tool results.
- Keeps head + tail, inserts `...`, and appends a note with the original size.
- Skips results with image blocks.
- **Hard-clear**: replaces the entire tool result with `hardClear.placeholder`.
## Tool selection
- `tools.allow` / `tools.deny` support `*` wildcards.
- Deny wins.
- Matching is case-insensitive.
- Empty allow list => all tools allowed.
## Interaction with other limits
- Built-in tools already truncate their own output; session pruning is an extra layer that prevents long-running chats from accumulating too much tool output in the model context.
- Compaction is separate: compaction summarizes and persists, pruning is transient per request. See [/concepts/compaction](/concepts/compaction).
## Defaults (when enabled)
- `ttl`: `"5m"`
- `keepLastAssistants`: `3`
- `softTrimRatio`: `0.3`
- `hardClearRatio`: `0.5`
- `minPrunableToolChars`: `50000`
- `softTrim`: `{ maxChars: 4000, headChars: 1500, tailChars: 1500 }`
- `hardClear`: `{ enabled: true, placeholder: "[Old tool result content cleared]" }`
## Examples
Default (off):
```json5
{
agent: {
contextPruning: { mode: "off" }
}
}
```
Enable TTL-aware pruning:
```json5
{
agent: {
contextPruning: { mode: "cache-ttl", ttl: "5m" }
}
}
```
Restrict pruning to specific tools:
```json5
{
agent: {
contextPruning: {
mode: "cache-ttl",
tools: { allow: ["exec", "read"], deny: ["*image*"] }
}
}
}
```
See config reference: [Gateway Configuration](/gateway/configuration)

View File

@@ -0,0 +1,171 @@
---
summary: "Agent session tools for listing sessions, fetching history, and sending cross-session messages"
read_when:
- Adding or modifying session tools
---
# Session Tools
Goal: small, hard-to-misuse tool set so agents can list sessions, fetch history, and send to another session.
## Tool Names
- `sessions_list`
- `sessions_history`
- `sessions_send`
- `sessions_spawn`
## Key Model
- Main direct chat bucket is always the literal key `"main"` (resolved to the current agents main key).
- Group chats use `agent:<agentId>:<channel>:group:<id>` or `agent:<agentId>:<channel>:channel:<id>` (pass the full key).
- Cron jobs use `cron:<job.id>`.
- Hooks use `hook:<uuid>` unless explicitly set.
- Node sessions use `node-<nodeId>` unless explicitly set.
`global` and `unknown` are reserved values and are never listed. If `session.scope = "global"`, we alias it to `main` for all tools so callers never see `global`.
## sessions_list
List sessions as an array of rows.
Parameters:
- `kinds?: string[]` filter: any of `"main" | "group" | "cron" | "hook" | "node" | "other"`
- `limit?: number` max rows (default: server default, clamp e.g. 200)
- `activeMinutes?: number` only sessions updated within N minutes
- `messageLimit?: number` 0 = no messages (default 0); >0 = include last N messages
Behavior:
- `messageLimit > 0` fetches `chat.history` per session and includes the last N messages.
- Tool results are filtered out in list output; use `sessions_history` for tool messages.
- When running in a **sandboxed** agent session, session tools default to **spawned-only visibility** (see below).
Row shape (JSON):
- `key`: session key (string)
- `kind`: `main | group | cron | hook | node | other`
- `channel`: `whatsapp | telegram | discord | signal | imessage | webchat | internal | unknown`
- `displayName` (group display label if available)
- `updatedAt` (ms)
- `sessionId`
- `model`, `contextTokens`, `totalTokens`
- `thinkingLevel`, `verboseLevel`, `systemSent`, `abortedLastRun`
- `sendPolicy` (session override if set)
- `lastChannel`, `lastTo`
- `deliveryContext` (normalized `{ channel, to, accountId }` when available)
- `transcriptPath` (best-effort path derived from store dir + sessionId)
- `messages?` (only when `messageLimit > 0`)
## sessions_history
Fetch transcript for one session.
Parameters:
- `sessionKey` (required; accepts session key or `sessionId` from `sessions_list`)
- `limit?: number` max messages (server clamps)
- `includeTools?: boolean` (default false)
Behavior:
- `includeTools=false` filters `role: "toolResult"` messages.
- Returns messages array in the raw transcript format.
- When given a `sessionId`, Moltbot resolves it to the corresponding session key (missing ids error).
## sessions_send
Send a message into another session.
Parameters:
- `sessionKey` (required; accepts session key or `sessionId` from `sessions_list`)
- `message` (required)
- `timeoutSeconds?: number` (default >0; 0 = fire-and-forget)
Behavior:
- `timeoutSeconds = 0`: enqueue and return `{ runId, status: "accepted" }`.
- `timeoutSeconds > 0`: wait up to N seconds for completion, then return `{ runId, status: "ok", reply }`.
- If wait times out: `{ runId, status: "timeout", error }`. Run continues; call `sessions_history` later.
- If the run fails: `{ runId, status: "error", error }`.
- Announce delivery runs after the primary run completes and is best-effort; `status: "ok"` does not guarantee the announce was delivered.
- Waits via gateway `agent.wait` (server-side) so reconnects don't drop the wait.
- Agent-to-agent message context is injected for the primary run.
- After the primary run completes, Moltbot runs a **reply-back loop**:
- Round 2+ alternates between requester and target agents.
- Reply exactly `REPLY_SKIP` to stop the pingpong.
- Max turns is `session.agentToAgent.maxPingPongTurns` (05, default 5).
- Once the loop ends, Moltbot runs the **agenttoagent announce step** (target agent only):
- Reply exactly `ANNOUNCE_SKIP` to stay silent.
- Any other reply is sent to the target channel.
- Announce step includes the original request + round1 reply + latest pingpong reply.
## Channel Field
- For groups, `channel` is the channel recorded on the session entry.
- For direct chats, `channel` maps from `lastChannel`.
- For cron/hook/node, `channel` is `internal`.
- If missing, `channel` is `unknown`.
## Security / Send Policy
Policy-based blocking by channel/chat type (not per session id).
```json
{
"session": {
"sendPolicy": {
"rules": [
{
"match": { "channel": "discord", "chatType": "group" },
"action": "deny"
}
],
"default": "allow"
}
}
}
```
Runtime override (per session entry):
- `sendPolicy: "allow" | "deny"` (unset = inherit config)
- Settable via `sessions.patch` or owner-only `/send on|off|inherit` (standalone message).
Enforcement points:
- `chat.send` / `agent` (gateway)
- auto-reply delivery logic
## sessions_spawn
Spawn a sub-agent run in an isolated session and announce the result back to the requester chat channel.
Parameters:
- `task` (required)
- `label?` (optional; used for logs/UI)
- `agentId?` (optional; spawn under another agent id if allowed)
- `model?` (optional; overrides the sub-agent model; invalid values error)
- `runTimeoutSeconds?` (default 0; when set, aborts the sub-agent run after N seconds)
- `cleanup?` (`delete|keep`, default `keep`)
Allowlist:
- `agents.list[].subagents.allowAgents`: list of agent ids allowed via `agentId` (`["*"]` to allow any). Default: only the requester agent.
Discovery:
- Use `agents_list` to discover which agent ids are allowed for `sessions_spawn`.
Behavior:
- Starts a new `agent:<agentId>:subagent:<uuid>` session with `deliver: false`.
- Sub-agents default to the full tool set **minus session tools** (configurable via `tools.subagents.tools`).
- Sub-agents are not allowed to call `sessions_spawn` (no sub-agent → sub-agent spawning).
- Always non-blocking: returns `{ status: "accepted", runId, childSessionKey }` immediately.
- After completion, Moltbot runs a sub-agent **announce step** and posts the result to the requester chat channel.
- Reply exactly `ANNOUNCE_SKIP` during the announce step to stay silent.
- Announce replies are normalized to `Status`/`Result`/`Notes`; `Status` comes from runtime outcome (not model text).
- Sub-agent sessions are auto-archived after `agents.defaults.subagents.archiveAfterMinutes` (default: 60).
- Announce replies include a stats line (runtime, tokens, sessionKey/sessionId, transcript path, and optional cost).
## Sandbox Session Visibility
Sandboxed sessions can use session tools, but by default they only see sessions they spawned via `sessions_spawn`.
Config:
```json5
{
agents: {
defaults: {
sandbox: {
// default: "spawned"
sessionToolsVisibility: "spawned" // or "all"
}
}
}
}
```

View File

@@ -0,0 +1,150 @@
---
summary: "Session management rules, keys, and persistence for chats"
read_when:
- Modifying session handling or storage
---
# Session Management
Moltbot treats **one direct-chat session per agent** as primary. Direct chats collapse to `agent:<agentId>:<mainKey>` (default `main`), while group/channel chats get their own keys. `session.mainKey` is honored.
Use `session.dmScope` to control how **direct messages** are grouped:
- `main` (default): all DMs share the main session for continuity.
- `per-peer`: isolate by sender id across channels.
- `per-channel-peer`: isolate by channel + sender (recommended for multi-user inboxes).
Use `session.identityLinks` to map provider-prefixed peer ids to a canonical identity so the same person shares a DM session across channels when using `per-peer` or `per-channel-peer`.
## Gateway is the source of truth
All session state is **owned by the gateway** (the “master” Moltbot). UI clients (macOS app, WebChat, etc.) must query the gateway for session lists and token counts instead of reading local files.
- In **remote mode**, the session store you care about lives on the remote gateway host, not your Mac.
- Token counts shown in UIs come from the gateways store fields (`inputTokens`, `outputTokens`, `totalTokens`, `contextTokens`). Clients do not parse JSONL transcripts to “fix up” totals.
## Where state lives
- On the **gateway host**:
- Store file: `~/.clawdbot/agents/<agentId>/sessions/sessions.json` (per agent).
- Transcripts: `~/.clawdbot/agents/<agentId>/sessions/<SessionId>.jsonl` (Telegram topic sessions use `.../<SessionId>-topic-<threadId>.jsonl`).
- The store is a map `sessionKey -> { sessionId, updatedAt, ... }`. Deleting entries is safe; they are recreated on demand.
- Group entries may include `displayName`, `channel`, `subject`, `room`, and `space` to label sessions in UIs.
- Session entries include `origin` metadata (label + routing hints) so UIs can explain where a session came from.
- Moltbot does **not** read legacy Pi/Tau session folders.
## Session pruning
Moltbot trims **old tool results** from the in-memory context right before LLM calls by default.
This does **not** rewrite JSONL history. See [/concepts/session-pruning](/concepts/session-pruning).
## Pre-compaction memory flush
When a session nears auto-compaction, Moltbot can run a **silent memory flush**
turn that reminds the model to write durable notes to disk. This only runs when
the workspace is writable. See [Memory](/concepts/memory) and
[Compaction](/concepts/compaction).
## Mapping transports → session keys
- Direct chats follow `session.dmScope` (default `main`).
- `main`: `agent:<agentId>:<mainKey>` (continuity across devices/channels).
- Multiple phone numbers and channels can map to the same agent main key; they act as transports into one conversation.
- `per-peer`: `agent:<agentId>:dm:<peerId>`.
- `per-channel-peer`: `agent:<agentId>:<channel>:dm:<peerId>`.
- If `session.identityLinks` matches a provider-prefixed peer id (for example `telegram:123`), the canonical key replaces `<peerId>` so the same person shares a session across channels.
- Group chats isolate state: `agent:<agentId>:<channel>:group:<id>` (rooms/channels use `agent:<agentId>:<channel>:channel:<id>`).
- Telegram forum topics append `:topic:<threadId>` to the group id for isolation.
- Legacy `group:<id>` keys are still recognized for migration.
- Inbound contexts may still use `group:<id>`; the channel is inferred from `Provider` and normalized to the canonical `agent:<agentId>:<channel>:group:<id>` form.
- Other sources:
- Cron jobs: `cron:<job.id>`
- Webhooks: `hook:<uuid>` (unless explicitly set by the hook)
- Node runs: `node-<nodeId>`
## Lifecycle
- Reset policy: sessions are reused until they expire, and expiry is evaluated on the next inbound message.
- Daily reset: defaults to **4:00 AM local time on the gateway host**. A session is stale once its last update is earlier than the most recent daily reset time.
- Idle reset (optional): `idleMinutes` adds a sliding idle window. When both daily and idle resets are configured, **whichever expires first** forces a new session.
- Legacy idle-only: if you set `session.idleMinutes` without any `session.reset`/`resetByType` config, Moltbot stays in idle-only mode for backward compatibility.
- Per-type overrides (optional): `resetByType` lets you override the policy for `dm`, `group`, and `thread` sessions (thread = Slack/Discord threads, Telegram topics, Matrix threads when provided by the connector).
- Per-channel overrides (optional): `resetByChannel` overrides the reset policy for a channel (applies to all session types for that channel and takes precedence over `reset`/`resetByType`).
- Reset triggers: exact `/new` or `/reset` (plus any extras in `resetTriggers`) start a fresh session id and pass the remainder of the message through. `/new <model>` accepts a model alias, `provider/model`, or provider name (fuzzy match) to set the new session model. If `/new` or `/reset` is sent alone, Moltbot runs a short “hello” greeting turn to confirm the reset.
- Manual reset: delete specific keys from the store or remove the JSONL transcript; the next message recreates them.
- Isolated cron jobs always mint a fresh `sessionId` per run (no idle reuse).
## Send policy (optional)
Block delivery for specific session types without listing individual ids.
```json5
{
session: {
sendPolicy: {
rules: [
{ action: "deny", match: { channel: "discord", chatType: "group" } },
{ action: "deny", match: { keyPrefix: "cron:" } }
],
default: "allow"
}
}
}
```
Runtime override (owner only):
- `/send on` → allow for this session
- `/send off` → deny for this session
- `/send inherit` → clear override and use config rules
Send these as standalone messages so they register.
## Configuration (optional rename example)
```json5
// ~/.clawdbot/moltbot.json
{
session: {
scope: "per-sender", // keep group keys separate
dmScope: "main", // DM continuity (set per-channel-peer for shared inboxes)
identityLinks: {
alice: ["telegram:123456789", "discord:987654321012345678"]
},
reset: {
// Defaults: mode=daily, atHour=4 (gateway host local time).
// If you also set idleMinutes, whichever expires first wins.
mode: "daily",
atHour: 4,
idleMinutes: 120
},
resetByType: {
thread: { mode: "daily", atHour: 4 },
dm: { mode: "idle", idleMinutes: 240 },
group: { mode: "idle", idleMinutes: 120 }
},
resetByChannel: {
discord: { mode: "idle", idleMinutes: 10080 }
},
resetTriggers: ["/new", "/reset"],
store: "~/.clawdbot/agents/{agentId}/sessions/sessions.json",
mainKey: "main",
}
}
```
## Inspecting
- `moltbot status` — shows store path and recent sessions.
- `moltbot sessions --json` — dumps every entry (filter with `--active <minutes>`).
- `moltbot gateway call sessions.list --params '{}'` — fetch sessions from the running gateway (use `--url`/`--token` for remote gateway access).
- Send `/status` as a standalone message in chat to see whether the agent is reachable, how much of the session context is used, current thinking/verbose toggles, and when your WhatsApp web creds were last refreshed (helps spot relink needs).
- Send `/context list` or `/context detail` to see whats in the system prompt and injected workspace files (and the biggest context contributors).
- Send `/stop` as a standalone message to abort the current run, clear queued followups for that session, and stop any sub-agent runs spawned from it (the reply includes the stopped count).
- Send `/compact` (optional instructions) as a standalone message to summarize older context and free up window space. See [/concepts/compaction](/concepts/compaction).
- JSONL transcripts can be opened directly to review full turns.
## Tips
- Keep the primary key dedicated to 1:1 traffic; let groups keep their own keys.
- When automating cleanup, delete individual keys instead of the whole store to preserve context elsewhere.
## Session origin metadata
Each session entry records where it came from (best-effort) in `origin`:
- `label`: human label (resolved from conversation label + group subject/channel)
- `provider`: normalized channel id (including extensions)
- `from`/`to`: raw routing ids from the inbound envelope
- `accountId`: provider account id (when multi-account)
- `threadId`: thread/topic id when the channel supports it
The origin fields are populated for direct messages, channels, and groups. If a
connector only updates delivery routing (for example, to keep a DM main session
fresh), it should still provide inbound context so the session keeps its
explainer metadata. Extensions can do this by sending `ConversationLabel`,
`GroupSubject`, `GroupChannel`, `GroupSpace`, and `SenderName` in the inbound
context and calling `recordSessionMetaFromInbound` (or passing the same context
to `updateLastRoute`).

View File

@@ -0,0 +1,8 @@
---
summary: "Alias for session management docs"
read_when:
- You looked for docs/sessions.md; canonical doc lives in docs/session.md
---
# Sessions
Canonical session management docs live in [Session management](/concepts/session).

View File

@@ -0,0 +1,123 @@
---
summary: "Streaming + chunking behavior (block replies, draft streaming, limits)"
read_when:
- Explaining how streaming or chunking works on channels
- Changing block streaming or channel chunking behavior
- Debugging duplicate/early block replies or draft streaming
---
# Streaming + chunking
Moltbot has two separate “streaming” layers:
- **Block streaming (channels):** emit completed **blocks** as the assistant writes. These are normal channel messages (not token deltas).
- **Token-ish streaming (Telegram only):** update a **draft bubble** with partial text while generating; final message is sent at the end.
There is **no real token streaming** to external channel messages today. Telegram draft streaming is the only partial-stream surface.
## Block streaming (channel messages)
Block streaming sends assistant output in coarse chunks as it becomes available.
```
Model output
└─ text_delta/events
├─ (blockStreamingBreak=text_end)
│ └─ chunker emits blocks as buffer grows
└─ (blockStreamingBreak=message_end)
└─ chunker flushes at message_end
└─ channel send (block replies)
```
Legend:
- `text_delta/events`: model stream events (may be sparse for non-streaming models).
- `chunker`: `EmbeddedBlockChunker` applying min/max bounds + break preference.
- `channel send`: actual outbound messages (block replies).
**Controls:**
- `agents.defaults.blockStreamingDefault`: `"on"`/`"off"` (default off).
- Channel overrides: `*.blockStreaming` (and per-account variants) to force `"on"`/`"off"` per channel.
- `agents.defaults.blockStreamingBreak`: `"text_end"` or `"message_end"`.
- `agents.defaults.blockStreamingChunk`: `{ minChars, maxChars, breakPreference? }`.
- `agents.defaults.blockStreamingCoalesce`: `{ minChars?, maxChars?, idleMs? }` (merge streamed blocks before send).
- Channel hard cap: `*.textChunkLimit` (e.g., `channels.whatsapp.textChunkLimit`).
- Channel chunk mode: `*.chunkMode` (`length` default, `newline` splits on blank lines (paragraph boundaries) before length chunking).
- Discord soft cap: `channels.discord.maxLinesPerMessage` (default 17) splits tall replies to avoid UI clipping.
**Boundary semantics:**
- `text_end`: stream blocks as soon as chunker emits; flush on each `text_end`.
- `message_end`: wait until assistant message finishes, then flush buffered output.
`message_end` still uses the chunker if the buffered text exceeds `maxChars`, so it can emit multiple chunks at the end.
## Chunking algorithm (low/high bounds)
Block chunking is implemented by `EmbeddedBlockChunker`:
- **Low bound:** dont emit until buffer >= `minChars` (unless forced).
- **High bound:** prefer splits before `maxChars`; if forced, split at `maxChars`.
- **Break preference:** `paragraph``newline``sentence``whitespace` → hard break.
- **Code fences:** never split inside fences; when forced at `maxChars`, close + reopen the fence to keep Markdown valid.
`maxChars` is clamped to the channel `textChunkLimit`, so you cant exceed per-channel caps.
## Coalescing (merge streamed blocks)
When block streaming is enabled, Moltbot can **merge consecutive block chunks**
before sending them out. This reduces “single-line spam” while still providing
progressive output.
- Coalescing waits for **idle gaps** (`idleMs`) before flushing.
- Buffers are capped by `maxChars` and will flush if they exceed it.
- `minChars` prevents tiny fragments from sending until enough text accumulates
(final flush always sends remaining text).
- Joiner is derived from `blockStreamingChunk.breakPreference`
(`paragraph``\n\n`, `newline``\n`, `sentence` → space).
- Channel overrides are available via `*.blockStreamingCoalesce` (including per-account configs).
- Default coalesce `minChars` is bumped to 1500 for Signal/Slack/Discord unless overridden.
## Human-like pacing between blocks
When block streaming is enabled, you can add a **randomized pause** between
block replies (after the first block). This makes multi-bubble responses feel
more natural.
- Config: `agents.defaults.humanDelay` (override per agent via `agents.list[].humanDelay`).
- Modes: `off` (default), `natural` (8002500ms), `custom` (`minMs`/`maxMs`).
- Applies only to **block replies**, not final replies or tool summaries.
## “Stream chunks or everything”
This maps to:
- **Stream chunks:** `blockStreamingDefault: "on"` + `blockStreamingBreak: "text_end"` (emit as you go). Non-Telegram channels also need `*.blockStreaming: true`.
- **Stream everything at end:** `blockStreamingBreak: "message_end"` (flush once, possibly multiple chunks if very long).
- **No block streaming:** `blockStreamingDefault: "off"` (only final reply).
**Channel note:** For non-Telegram channels, block streaming is **off unless**
`*.blockStreaming` is explicitly set to `true`. Telegram can stream drafts
(`channels.telegram.streamMode`) without block replies.
Config location reminder: the `blockStreaming*` defaults live under
`agents.defaults`, not the root config.
## Telegram draft streaming (token-ish)
Telegram is the only channel with draft streaming:
- Uses Bot API `sendMessageDraft` in **private chats with topics**.
- `channels.telegram.streamMode: "partial" | "block" | "off"`.
- `partial`: draft updates with the latest stream text.
- `block`: draft updates in chunked blocks (same chunker rules).
- `off`: no draft streaming.
- Draft chunk config (only for `streamMode: "block"`): `channels.telegram.draftChunk` (defaults: `minChars: 200`, `maxChars: 800`).
- Draft streaming is separate from block streaming; block replies are off by default and only enabled by `*.blockStreaming: true` on non-Telegram channels.
- Final reply is still a normal message.
- `/reasoning stream` writes reasoning into the draft bubble (Telegram only).
When draft streaming is active, Moltbot disables block streaming for that reply to avoid double-streaming.
```
Telegram (private + topics)
└─ sendMessageDraft (draft bubble)
├─ streamMode=partial → update latest text
└─ streamMode=block → chunker updates draft
└─ final reply → normal message
```
Legend:
- `sendMessageDraft`: Telegram draft bubble (not a real message).
- `final reply`: normal Telegram message send.

View File

@@ -0,0 +1,110 @@
---
summary: "What the Moltbot system prompt contains and how it is assembled"
read_when:
- Editing system prompt text, tools list, or time/heartbeat sections
- Changing workspace bootstrap or skills injection behavior
---
# System Prompt
Moltbot builds a custom system prompt for every agent run. The prompt is **Moltbot-owned** and does not use the p-coding-agent default prompt.
The prompt is assembled by Moltbot and injected into each agent run.
## Structure
The prompt is intentionally compact and uses fixed sections:
- **Tooling**: current tool list + short descriptions.
- **Skills** (when available): tells the model how to load skill instructions on demand.
- **Moltbot Self-Update**: how to run `config.apply` and `update.run`.
- **Workspace**: working directory (`agents.defaults.workspace`).
- **Documentation**: local path to Moltbot docs (repo or npm package) and when to read them.
- **Workspace Files (injected)**: indicates bootstrap files are included below.
- **Sandbox** (when enabled): indicates sandboxed runtime, sandbox paths, and whether elevated exec is available.
- **Current Date & Time**: user-local time, timezone, and time format.
- **Reply Tags**: optional reply tag syntax for supported providers.
- **Heartbeats**: heartbeat prompt and ack behavior.
- **Runtime**: host, OS, node, model, repo root (when detected), thinking level (one line).
- **Reasoning**: current visibility level + /reasoning toggle hint.
## Prompt modes
Moltbot can render smaller system prompts for sub-agents. The runtime sets a
`promptMode` for each run (not a user-facing config):
- `full` (default): includes all sections above.
- `minimal`: used for sub-agents; omits **Skills**, **Memory Recall**, **Moltbot
Self-Update**, **Model Aliases**, **User Identity**, **Reply Tags**,
**Messaging**, **Silent Replies**, and **Heartbeats**. Tooling, Workspace,
Sandbox, Current Date & Time (when known), Runtime, and injected context stay
available.
- `none`: returns only the base identity line.
When `promptMode=minimal`, extra injected prompts are labeled **Subagent
Context** instead of **Group Chat Context**.
## Workspace bootstrap injection
Bootstrap files are trimmed and appended under **Project Context** so the model sees identity and profile context without needing explicit reads:
- `AGENTS.md`
- `SOUL.md`
- `TOOLS.md`
- `IDENTITY.md`
- `USER.md`
- `HEARTBEAT.md`
- `BOOTSTRAP.md` (only on brand-new workspaces)
Large files are truncated with a marker. The max per-file size is controlled by
`agents.defaults.bootstrapMaxChars` (default: 20000). Missing files inject a
short missing-file marker.
Internal hooks can intercept this step via `agent:bootstrap` to mutate or replace
the injected bootstrap files (for example swapping `SOUL.md` for an alternate persona).
To inspect how much each injected file contributes (raw vs injected, truncation, plus tool schema overhead), use `/context list` or `/context detail`. See [Context](/concepts/context).
## Time handling
The system prompt includes a dedicated **Current Date & Time** section when the
user timezone is known. To keep the prompt cache-stable, it now only includes
the **time zone** (no dynamic clock or time format).
Use `session_status` when the agent needs the current time; the status card
includes a timestamp line.
Configure with:
- `agents.defaults.userTimezone`
- `agents.defaults.timeFormat` (`auto` | `12` | `24`)
See [Date & Time](/date-time) for full behavior details.
## Skills
When eligible skills exist, Moltbot injects a compact **available skills list**
(`formatSkillsForPrompt`) that includes the **file path** for each skill. The
prompt instructs the model to use `read` to load the SKILL.md at the listed
location (workspace, managed, or bundled). If no skills are eligible, the
Skills section is omitted.
```
<available_skills>
<skill>
<name>...</name>
<description>...</description>
<location>...</location>
</skill>
</available_skills>
```
This keeps the base prompt small while still enabling targeted skill usage.
## Documentation
When available, the system prompt includes a **Documentation** section that points to the
local Moltbot docs directory (either `docs/` in the repo workspace or the bundled npm
package docs) and also notes the public mirror, source repo, community Discord, and
ClawdHub (https://clawdhub.com) for skills discovery. The prompt instructs the model to consult local docs first
for Moltbot behavior, commands, configuration, or architecture, and to run
`moltbot status` itself when possible (asking the user only when it lacks access).

View File

@@ -0,0 +1,89 @@
---
summary: "Timezone handling for agents, envelopes, and prompts"
read_when:
- You need to understand how timestamps are normalized for the model
- Configuring the user timezone for system prompts
---
# Timezones
Moltbot standardizes timestamps so the model sees a **single reference time**.
## Message envelopes (local by default)
Inbound messages are wrapped in an envelope like:
```
[Provider ... 2026-01-05 16:26 PST] message text
```
The timestamp in the envelope is **host-local by default**, with minutes precision.
You can override this with:
```json5
{
agents: {
defaults: {
envelopeTimezone: "local", // "utc" | "local" | "user" | IANA timezone
envelopeTimestamp: "on", // "on" | "off"
envelopeElapsed: "on" // "on" | "off"
}
}
}
```
- `envelopeTimezone: "utc"` uses UTC.
- `envelopeTimezone: "user"` uses `agents.defaults.userTimezone` (falls back to host timezone).
- Use an explicit IANA timezone (e.g., `"Europe/Vienna"`) for a fixed offset.
- `envelopeTimestamp: "off"` removes absolute timestamps from envelope headers.
- `envelopeElapsed: "off"` removes elapsed time suffixes (the `+2m` style).
### Examples
**Local (default):**
```
[Signal Alice +1555 2026-01-18 00:19 PST] hello
```
**Fixed timezone:**
```
[Signal Alice +1555 2026-01-18 06:19 GMT+1] hello
```
**Elapsed time:**
```
[Signal Alice +1555 +2m 2026-01-18T05:19Z] follow-up
```
## Tool payloads (raw provider data + normalized fields)
Tool calls (`channels.discord.readMessages`, `channels.slack.readMessages`, etc.) return **raw provider timestamps**.
We also attach normalized fields for consistency:
- `timestampMs` (UTC epoch milliseconds)
- `timestampUtc` (ISO 8601 UTC string)
Raw provider fields are preserved.
## User timezone for the system prompt
Set `agents.defaults.userTimezone` to tell the model the user's local time zone. If it is
unset, Moltbot resolves the **host timezone at runtime** (no config write).
```json5
{
agents: { defaults: { userTimezone: "America/Chicago" } }
}
```
The system prompt includes:
- `Current Date & Time` section with local time and timezone
- `Time format: 12-hour` or `24-hour`
You can control the prompt format with `agents.defaults.timeFormat` (`auto` | `12` | `24`).
See [Date & Time](/date-time) for the full behavior and examples.

View File

@@ -0,0 +1,281 @@
---
summary: "TypeBox schemas as the single source of truth for the gateway protocol"
read_when:
- Updating protocol schemas or codegen
---
# TypeBox as protocol source of truth
Last updated: 2026-01-10
TypeBox is a TypeScript-first schema library. We use it to define the **Gateway
WebSocket protocol** (handshake, request/response, server events). Those schemas
drive **runtime validation**, **JSON Schema export**, and **Swift codegen** for
the macOS app. One source of truth; everything else is generated.
If you want the higher-level protocol context, start with
[Gateway architecture](/concepts/architecture).
## Mental model (30 seconds)
Every Gateway WS message is one of three frames:
- **Request**: `{ type: "req", id, method, params }`
- **Response**: `{ type: "res", id, ok, payload | error }`
- **Event**: `{ type: "event", event, payload, seq?, stateVersion? }`
The first frame **must** be a `connect` request. After that, clients can call
methods (e.g. `health`, `send`, `chat.send`) and subscribe to events (e.g.
`presence`, `tick`, `agent`).
Connection flow (minimal):
```
Client Gateway
|---- req:connect -------->|
|<---- res:hello-ok --------|
|<---- event:tick ----------|
|---- req:health ---------->|
|<---- res:health ----------|
```
Common methods + events:
| Category | Examples | Notes |
| --- | --- | --- |
| Core | `connect`, `health`, `status` | `connect` must be first |
| Messaging | `send`, `poll`, `agent`, `agent.wait` | side-effects need `idempotencyKey` |
| Chat | `chat.history`, `chat.send`, `chat.abort`, `chat.inject` | WebChat uses these |
| Sessions | `sessions.list`, `sessions.patch`, `sessions.delete` | session admin |
| Nodes | `node.list`, `node.invoke`, `node.pair.*` | Gateway WS + node actions |
| Events | `tick`, `presence`, `agent`, `chat`, `health`, `shutdown` | server push |
Authoritative list lives in `src/gateway/server.ts` (`METHODS`, `EVENTS`).
## Where the schemas live
- Source: `src/gateway/protocol/schema.ts`
- Runtime validators (AJV): `src/gateway/protocol/index.ts`
- Server handshake + method dispatch: `src/gateway/server.ts`
- Node client: `src/gateway/client.ts`
- Generated JSON Schema: `dist/protocol.schema.json`
- Generated Swift models: `apps/macos/Sources/MoltbotProtocol/GatewayModels.swift`
## Current pipeline
- `pnpm protocol:gen`
- writes JSON Schema (draft07) to `dist/protocol.schema.json`
- `pnpm protocol:gen:swift`
- generates Swift gateway models
- `pnpm protocol:check`
- runs both generators and verifies the output is committed
## How the schemas are used at runtime
- **Server side**: every inbound frame is validated with AJV. The handshake only
accepts a `connect` request whose params match `ConnectParams`.
- **Client side**: the JS client validates event and response frames before
using them.
- **Method surface**: the Gateway advertises the supported `methods` and
`events` in `hello-ok`.
## Example frames
Connect (first message):
```json
{
"type": "req",
"id": "c1",
"method": "connect",
"params": {
"minProtocol": 2,
"maxProtocol": 2,
"client": {
"id": "moltbot-macos",
"displayName": "macos",
"version": "1.0.0",
"platform": "macos 15.1",
"mode": "ui",
"instanceId": "A1B2"
}
}
}
```
Hello-ok response:
```json
{
"type": "res",
"id": "c1",
"ok": true,
"payload": {
"type": "hello-ok",
"protocol": 2,
"server": { "version": "dev", "connId": "ws-1" },
"features": { "methods": ["health"], "events": ["tick"] },
"snapshot": { "presence": [], "health": {}, "stateVersion": { "presence": 0, "health": 0 }, "uptimeMs": 0 },
"policy": { "maxPayload": 1048576, "maxBufferedBytes": 1048576, "tickIntervalMs": 30000 }
}
}
```
Request + response:
```json
{ "type": "req", "id": "r1", "method": "health" }
```
```json
{ "type": "res", "id": "r1", "ok": true, "payload": { "ok": true } }
```
Event:
```json
{ "type": "event", "event": "tick", "payload": { "ts": 1730000000 }, "seq": 12 }
```
## Minimal client (Node.js)
Smallest useful flow: connect + health.
```ts
import { WebSocket } from "ws";
const ws = new WebSocket("ws://127.0.0.1:18789");
ws.on("open", () => {
ws.send(JSON.stringify({
type: "req",
id: "c1",
method: "connect",
params: {
minProtocol: 3,
maxProtocol: 3,
client: {
id: "cli",
displayName: "example",
version: "dev",
platform: "node",
mode: "cli"
}
}
}));
});
ws.on("message", (data) => {
const msg = JSON.parse(String(data));
if (msg.type === "res" && msg.id === "c1" && msg.ok) {
ws.send(JSON.stringify({ type: "req", id: "h1", method: "health" }));
}
if (msg.type === "res" && msg.id === "h1") {
console.log("health:", msg.payload);
ws.close();
}
});
```
## Worked example: add a method endtoend
Example: add a new `system.echo` request that returns `{ ok: true, text }`.
1) **Schema (source of truth)**
Add to `src/gateway/protocol/schema.ts`:
```ts
export const SystemEchoParamsSchema = Type.Object(
{ text: NonEmptyString },
{ additionalProperties: false },
);
export const SystemEchoResultSchema = Type.Object(
{ ok: Type.Boolean(), text: NonEmptyString },
{ additionalProperties: false },
);
```
Add both to `ProtocolSchemas` and export types:
```ts
SystemEchoParams: SystemEchoParamsSchema,
SystemEchoResult: SystemEchoResultSchema,
```
```ts
export type SystemEchoParams = Static<typeof SystemEchoParamsSchema>;
export type SystemEchoResult = Static<typeof SystemEchoResultSchema>;
```
2) **Validation**
In `src/gateway/protocol/index.ts`, export an AJV validator:
```ts
export const validateSystemEchoParams =
ajv.compile<SystemEchoParams>(SystemEchoParamsSchema);
```
3) **Server behavior**
Add a handler in `src/gateway/server-methods/system.ts`:
```ts
export const systemHandlers: GatewayRequestHandlers = {
"system.echo": ({ params, respond }) => {
const text = String(params.text ?? "");
respond(true, { ok: true, text });
},
};
```
Register it in `src/gateway/server-methods.ts` (already merges `systemHandlers`),
then add `"system.echo"` to `METHODS` in `src/gateway/server.ts`.
4) **Regenerate**
```bash
pnpm protocol:check
```
5) **Tests + docs**
Add a server test in `src/gateway/server.*.test.ts` and note the method in docs.
## Swift codegen behavior
The Swift generator emits:
- `GatewayFrame` enum with `req`, `res`, `event`, and `unknown` cases
- Strongly typed payload structs/enums
- `ErrorCode` values and `GATEWAY_PROTOCOL_VERSION`
Unknown frame types are preserved as raw payloads for forward compatibility.
## Versioning + compatibility
- `PROTOCOL_VERSION` lives in `src/gateway/protocol/schema.ts`.
- Clients send `minProtocol` + `maxProtocol`; the server rejects mismatches.
- The Swift models keep unknown frame types to avoid breaking older clients.
## Schema patterns and conventions
- Most objects use `additionalProperties: false` for strict payloads.
- `NonEmptyString` is the default for IDs and method/event names.
- The top-level `GatewayFrame` uses a **discriminator** on `type`.
- Methods with side effects usually require an `idempotencyKey` in params
(example: `send`, `poll`, `agent`, `chat.send`).
## Live schema JSON
Generated JSON Schema is in the repo at `dist/protocol.schema.json`. The
published raw file is typically available at:
- https://raw.githubusercontent.com/moltbot/moltbot/main/dist/protocol.schema.json
## When you change schemas
1) Update the TypeBox schemas.
2) Run `pnpm protocol:check`.
3) Commit the regenerated schema + Swift models.

View File

@@ -0,0 +1,59 @@
---
summary: "When Moltbot shows typing indicators and how to tune them"
read_when:
- Changing typing indicator behavior or defaults
---
# Typing indicators
Typing indicators are sent to the chat channel while a run is active. Use
`agents.defaults.typingMode` to control **when** typing starts and `typingIntervalSeconds`
to control **how often** it refreshes.
## Defaults
When `agents.defaults.typingMode` is **unset**, Moltbot keeps the legacy behavior:
- **Direct chats**: typing starts immediately once the model loop begins.
- **Group chats with a mention**: typing starts immediately.
- **Group chats without a mention**: typing starts only when message text begins streaming.
- **Heartbeat runs**: typing is disabled.
## Modes
Set `agents.defaults.typingMode` to one of:
- `never` — no typing indicator, ever.
- `instant` — start typing **as soon as the model loop begins**, even if the run
later returns only the silent reply token.
- `thinking` — start typing on the **first reasoning delta** (requires
`reasoningLevel: "stream"` for the run).
- `message` — start typing on the **first non-silent text delta** (ignores
the `NO_REPLY` silent token).
Order of “how early it fires”:
`never``message``thinking``instant`
## Configuration
```json5
{
agent: {
typingMode: "thinking",
typingIntervalSeconds: 6
}
}
```
You can override mode or cadence per session:
```json5
{
session: {
typingMode: "message",
typingIntervalSeconds: 4
}
}
```
## Notes
- `message` mode wont show typing for silent-only replies (e.g. the `NO_REPLY`
token used to suppress output).
- `thinking` only fires if the run streams reasoning (`reasoningLevel: "stream"`).
If the model doesnt emit reasoning deltas, typing wont start.
- Heartbeats never show typing, regardless of mode.
- `typingIntervalSeconds` controls the **refresh cadence**, not the start time.
The default is 6 seconds.

View File

@@ -0,0 +1,30 @@
---
summary: "Usage tracking surfaces and credential requirements"
read_when:
- You are wiring provider usage/quota surfaces
- You need to explain usage tracking behavior or auth requirements
---
# Usage tracking
## What it is
- Pulls provider usage/quota directly from their usage endpoints.
- No estimated costs; only the provider-reported windows.
## Where it shows up
- `/status` in chats: emojirich status card with session tokens + estimated cost (API key only). Provider usage shows for the **current model provider** when available.
- `/usage off|tokens|full` in chats: per-response usage footer (OAuth shows tokens only).
- `/usage cost` in chats: local cost summary aggregated from Moltbot session logs.
- CLI: `moltbot status --usage` prints a full per-provider breakdown.
- CLI: `moltbot channels list` prints the same usage snapshot alongside provider config (use `--no-usage` to skip).
- macOS menu bar: “Usage” section under Context (only if available).
## Providers + credentials
- **Anthropic (Claude)**: OAuth tokens in auth profiles.
- **GitHub Copilot**: OAuth tokens in auth profiles.
- **Gemini CLI**: OAuth tokens in auth profiles.
- **Antigravity**: OAuth tokens in auth profiles.
- **OpenAI Codex**: OAuth tokens in auth profiles (accountId used when present).
- **MiniMax**: API key (coding plan key; `MINIMAX_CODE_PLAN_KEY` or `MINIMAX_API_KEY`); uses the 5hour coding plan window.
- **z.ai**: API key via env/config/auth store.
Usage is hidden if no matching OAuth/API credentials exist.