Add ez-assistant and kerberos service folders

This commit is contained in:
kelin
2026-02-11 14:56:03 -05:00
parent e4e8ae1b87
commit 9ccfb36923
4471 changed files with 746463 additions and 0 deletions

View File

@@ -0,0 +1,51 @@
---
summary: "Direct `moltbot agent` CLI runs (with optional delivery)"
read_when:
- Adding or modifying the agent CLI entrypoint
---
# `moltbot agent` (direct agent runs)
`moltbot agent` runs a single agent turn without needing an inbound chat message.
By default it goes **through the Gateway**; add `--local` to force the embedded
runtime on the current machine.
## Behavior
- Required: `--message <text>`
- Session selection:
- `--to <dest>` derives the session key (group/channel targets preserve isolation; direct chats collapse to `main`), **or**
- `--session-id <id>` reuses an existing session by id, **or**
- `--agent <id>` targets a configured agent directly (uses that agent's `main` session key)
- Runs the same embedded agent runtime as normal inbound replies.
- Thinking/verbose flags persist into the session store.
- Output:
- default: prints reply text (plus `MEDIA:<url>` lines)
- `--json`: prints structured payload + metadata
- Optional delivery back to a channel with `--deliver` + `--channel` (target formats match `moltbot message --target`).
- Use `--reply-channel`/`--reply-to`/`--reply-account` to override delivery without changing the session.
If the Gateway is unreachable, the CLI **falls back** to the embedded local run.
## Examples
```bash
moltbot agent --to +15555550123 --message "status update"
moltbot agent --agent ops --message "Summarize logs"
moltbot agent --session-id 1234 --message "Summarize inbox" --thinking medium
moltbot agent --to +15555550123 --message "Trace logs" --verbose on --json
moltbot agent --to +15555550123 --message "Summon reply" --deliver
moltbot agent --agent ops --message "Generate report" --deliver --reply-channel slack --reply-to "#reports"
```
## Flags
- `--local`: run locally (requires model provider API keys in your shell)
- `--deliver`: send the reply to the chosen channel
- `--channel`: delivery channel (`whatsapp|telegram|discord|googlechat|slack|signal|imessage`, default: `whatsapp`)
- `--reply-to`: delivery target override
- `--reply-channel`: delivery channel override
- `--reply-account`: delivery account id override
- `--thinking <off|minimal|low|medium|high|xhigh>`: persist thinking level (GPT-5.2 + Codex models only)
- `--verbose <on|full|off>`: persist verbose level
- `--timeout <seconds>`: override agent timeout
- `--json`: output structured JSON

View File

@@ -0,0 +1,49 @@
---
summary: "Apply multi-file patches with the apply_patch tool"
read_when:
- You need structured file edits across multiple files
- You want to document or debug patch-based edits
---
# apply_patch tool
Apply file changes using a structured patch format. This is ideal for multi-file
or multi-hunk edits where a single `edit` call would be brittle.
The tool accepts a single `input` string that wraps one or more file operations:
```
*** Begin Patch
*** Add File: path/to/file.txt
+line 1
+line 2
*** Update File: src/app.ts
@@
-old line
+new line
*** Delete File: obsolete.txt
*** End Patch
```
## Parameters
- `input` (required): Full patch contents including `*** Begin Patch` and `*** End Patch`.
## Notes
- Paths are resolved relative to the workspace root.
- Use `*** Move to:` within an `*** Update File:` hunk to rename files.
- `*** End of File` marks an EOF-only insert when needed.
- Experimental and disabled by default. Enable with `tools.exec.applyPatch.enabled`.
- OpenAI-only (including OpenAI Codex). Optionally gate by model via
`tools.exec.applyPatch.allowModels`.
- Config is only under `tools.exec`.
## Example
```json
{
"tool": "apply_patch",
"input": "*** Begin Patch\n*** Update File: src/index.ts\n@@\n-const foo = 1\n+const foo = 2\n*** End Patch"
}
```

View File

@@ -0,0 +1,129 @@
---
summary: "Fix Chrome/Brave/Edge/Chromium CDP startup issues for Moltbot browser control on Linux"
read_when: "Browser control fails on Linux, especially with snap Chromium"
---
# Browser Troubleshooting (Linux)
## Problem: "Failed to start Chrome CDP on port 18800"
Moltbot's browser control server fails to launch Chrome/Brave/Edge/Chromium with the error:
```
{"error":"Error: Failed to start Chrome CDP on port 18800 for profile \"clawd\"."}
```
### Root Cause
On Ubuntu (and many Linux distros), the default Chromium installation is a **snap package**. Snap's AppArmor confinement interferes with how Moltbot spawns and monitors the browser process.
The `apt install chromium` command installs a stub package that redirects to snap:
```
Note, selecting 'chromium-browser' instead of 'chromium'
chromium-browser is already the newest version (2:1snap1-0ubuntu2).
```
This is NOT a real browser — it's just a wrapper.
### Solution 1: Install Google Chrome (Recommended)
Install the official Google Chrome `.deb` package, which is not sandboxed by snap:
```bash
wget https://dl.google.com/linux/direct/google-chrome-stable_current_amd64.deb
sudo dpkg -i google-chrome-stable_current_amd64.deb
sudo apt --fix-broken install -y # if there are dependency errors
```
Then update your Moltbot config (`~/.clawdbot/moltbot.json`):
```json
{
"browser": {
"enabled": true,
"executablePath": "/usr/bin/google-chrome-stable",
"headless": true,
"noSandbox": true
}
}
```
### Solution 2: Use Snap Chromium with Attach-Only Mode
If you must use snap Chromium, configure Moltbot to attach to a manually-started browser:
1. Update config:
```json
{
"browser": {
"enabled": true,
"attachOnly": true,
"headless": true,
"noSandbox": true
}
}
```
2. Start Chromium manually:
```bash
chromium-browser --headless --no-sandbox --disable-gpu \
--remote-debugging-port=18800 \
--user-data-dir=$HOME/.clawdbot/browser/clawd/user-data \
about:blank &
```
3. Optionally create a systemd user service to auto-start Chrome:
```ini
# ~/.config/systemd/user/clawd-browser.service
[Unit]
Description=Clawd Browser (Chrome CDP)
After=network.target
[Service]
ExecStart=/snap/bin/chromium --headless --no-sandbox --disable-gpu --remote-debugging-port=18800 --user-data-dir=%h/.clawdbot/browser/clawd/user-data about:blank
Restart=on-failure
RestartSec=5
[Install]
WantedBy=default.target
```
Enable with: `systemctl --user enable --now clawd-browser.service`
### Verifying the Browser Works
Check status:
```bash
curl -s http://127.0.0.1:18791/ | jq '{running, pid, chosenBrowser}'
```
Test browsing:
```bash
curl -s -X POST http://127.0.0.1:18791/start
curl -s http://127.0.0.1:18791/tabs
```
### Config Reference
| Option | Description | Default |
|--------|-------------|---------|
| `browser.enabled` | Enable browser control | `true` |
| `browser.executablePath` | Path to a Chromium-based browser binary (Chrome/Brave/Edge/Chromium) | auto-detected (prefers default browser when Chromium-based) |
| `browser.headless` | Run without GUI | `false` |
| `browser.noSandbox` | Add `--no-sandbox` flag (needed for some Linux setups) | `false` |
| `browser.attachOnly` | Don't launch browser, only attach to existing | `false` |
| `browser.cdpPort` | Chrome DevTools Protocol port | `18800` |
### Problem: "Chrome extension relay is running, but no tab is connected"
Youre using the `chrome` profile (extension relay). It expects the Moltbot
browser extension to be attached to a live tab.
Fix options:
1. **Use the managed browser:** `moltbot browser start --browser-profile clawd`
(or set `browser.defaultProfile: "clawd"`).
2. **Use the extension relay:** install the extension, open a tab, and click the
Moltbot extension icon to attach it.
Notes:
- The `chrome` profile uses your **system default Chromium browser** when possible.
- Local `clawd` profiles auto-assign `cdpPort`/`cdpUrl`; only set those for remote CDP.

View File

@@ -0,0 +1,67 @@
---
summary: "Manual logins for browser automation + X/Twitter posting"
read_when:
- You need to log into sites for browser automation
- You want to post updates to X/Twitter
---
# Browser login + X/Twitter posting
## Manual login (recommended)
When a site requires login, **sign in manually** in the **host** browser profile (the clawd browser).
Do **not** give the model your credentials. Automated logins often trigger antibot defenses and can lock the account.
Back to the main browser docs: [Browser](/tools/browser).
## Which Chrome profile is used?
Moltbot controls a **dedicated Chrome profile** (named `clawd`, orangetinted UI). This is separate from your daily browser profile.
Two easy ways to access it:
1) **Ask the agent to open the browser** and then log in yourself.
2) **Open it via CLI**:
```bash
moltbot browser start
moltbot browser open https://x.com
```
If you have multiple profiles, pass `--browser-profile <name>` (the default is `clawd`).
## X/Twitter: recommended flow
- **Read/search/threads:** use the **bird** CLI skill (no browser, stable).
- Repo: https://github.com/steipete/bird
- **Post updates:** use the **host** browser (manual login).
## Sandboxing + host browser access
Sandboxed browser sessions are **more likely** to trigger bot detection. For X/Twitter (and other strict sites), prefer the **host** browser.
If the agent is sandboxed, the browser tool defaults to the sandbox. To allow host control:
```json5
{
agents: {
defaults: {
sandbox: {
mode: "non-main",
browser: {
allowHostControl: true
}
}
}
}
}
```
Then target the host browser:
```bash
moltbot browser open https://x.com --browser-profile clawd --target host
```
Or disable sandboxing for the agent that posts updates.

View File

@@ -0,0 +1,536 @@
---
summary: "Integrated browser control service + action commands"
read_when:
- Adding agent-controlled browser automation
- Debugging why clawd is interfering with your own Chrome
- Implementing browser settings + lifecycle in the macOS app
---
# Browser (clawd-managed)
Moltbot can run a **dedicated Chrome/Brave/Edge/Chromium profile** that the agent controls.
It is isolated from your personal browser and is managed through a small local
control service inside the Gateway (loopback only).
Beginner view:
- Think of it as a **separate, agent-only browser**.
- The `clawd` profile does **not** touch your personal browser profile.
- The agent can **open tabs, read pages, click, and type** in a safe lane.
- The default `chrome` profile uses the **system default Chromium browser** via the
extension relay; switch to `clawd` for the isolated managed browser.
## What you get
- A separate browser profile named **clawd** (orange accent by default).
- Deterministic tab control (list/open/focus/close).
- Agent actions (click/type/drag/select), snapshots, screenshots, PDFs.
- Optional multi-profile support (`clawd`, `work`, `remote`, ...).
This browser is **not** your daily driver. It is a safe, isolated surface for
agent automation and verification.
## Quick start
```bash
moltbot browser --browser-profile clawd status
moltbot browser --browser-profile clawd start
moltbot browser --browser-profile clawd open https://example.com
moltbot browser --browser-profile clawd snapshot
```
If you get “Browser disabled”, enable it in config (see below) and restart the
Gateway.
## Profiles: `clawd` vs `chrome`
- `clawd`: managed, isolated browser (no extension required).
- `chrome`: extension relay to your **system browser** (requires the Moltbot
extension to be attached to a tab).
Set `browser.defaultProfile: "clawd"` if you want managed mode by default.
## Configuration
Browser settings live in `~/.clawdbot/moltbot.json`.
```json5
{
browser: {
enabled: true, // default: true
// cdpUrl: "http://127.0.0.1:18792", // legacy single-profile override
remoteCdpTimeoutMs: 1500, // remote CDP HTTP timeout (ms)
remoteCdpHandshakeTimeoutMs: 3000, // remote CDP WebSocket handshake timeout (ms)
defaultProfile: "chrome",
color: "#FF4500",
headless: false,
noSandbox: false,
attachOnly: false,
executablePath: "/Applications/Brave Browser.app/Contents/MacOS/Brave Browser",
profiles: {
clawd: { cdpPort: 18800, color: "#FF4500" },
work: { cdpPort: 18801, color: "#0066CC" },
remote: { cdpUrl: "http://10.0.0.42:9222", color: "#00AA00" }
}
}
}
```
Notes:
- The browser control service binds to loopback on a port derived from `gateway.port`
(default: `18791`, which is gateway + 2). The relay uses the next port (`18792`).
- If you override the Gateway port (`gateway.port` or `CLAWDBOT_GATEWAY_PORT`),
the derived browser ports shift to stay in the same “family”.
- `cdpUrl` defaults to the relay port when unset.
- `remoteCdpTimeoutMs` applies to remote (non-loopback) CDP reachability checks.
- `remoteCdpHandshakeTimeoutMs` applies to remote CDP WebSocket reachability checks.
- `attachOnly: true` means “never launch a local browser; only attach if it is already running.”
- `color` + per-profile `color` tint the browser UI so you can see which profile is active.
- Default profile is `chrome` (extension relay). Use `defaultProfile: "clawd"` for the managed browser.
- Auto-detect order: system default browser if Chromium-based; otherwise Chrome → Brave → Edge → Chromium → Chrome Canary.
- Local `clawd` profiles auto-assign `cdpPort`/`cdpUrl` — set those only for remote CDP.
## Use Brave (or another Chromium-based browser)
If your **system default** browser is Chromium-based (Chrome/Brave/Edge/etc),
Moltbot uses it automatically. Set `browser.executablePath` to override
auto-detection:
CLI example:
```bash
moltbot config set browser.executablePath "/usr/bin/google-chrome"
```
```json5
// macOS
{
browser: {
executablePath: "/Applications/Brave Browser.app/Contents/MacOS/Brave Browser"
}
}
// Windows
{
browser: {
executablePath: "C:\\Program Files\\BraveSoftware\\Brave-Browser\\Application\\brave.exe"
}
}
// Linux
{
browser: {
executablePath: "/usr/bin/brave-browser"
}
}
```
## Local vs remote control
- **Local control (default):** the Gateway starts the loopback control service and can launch a local browser.
- **Remote control (node host):** run a node host on the machine that has the browser; the Gateway proxies browser actions to it.
- **Remote CDP:** set `browser.profiles.<name>.cdpUrl` (or `browser.cdpUrl`) to
attach to a remote Chromium-based browser. In this case, Moltbot will not launch a local browser.
Remote CDP URLs can include auth:
- Query tokens (e.g., `https://provider.example?token=<token>`)
- HTTP Basic auth (e.g., `https://user:pass@provider.example`)
Moltbot preserves the auth when calling `/json/*` endpoints and when connecting
to the CDP WebSocket. Prefer environment variables or secrets managers for
tokens instead of committing them to config files.
## Node browser proxy (zero-config default)
If you run a **node host** on the machine that has your browser, Moltbot can
auto-route browser tool calls to that node without any extra browser config.
This is the default path for remote gateways.
Notes:
- The node host exposes its local browser control server via a **proxy command**.
- Profiles come from the nodes own `browser.profiles` config (same as local).
- Disable if you dont want it:
- On the node: `nodeHost.browserProxy.enabled=false`
- On the gateway: `gateway.nodes.browser.mode="off"`
## Browserless (hosted remote CDP)
[Browserless](https://browserless.io) is a hosted Chromium service that exposes
CDP endpoints over HTTPS. You can point a Moltbot browser profile at a
Browserless region endpoint and authenticate with your API key.
Example:
```json5
{
browser: {
enabled: true,
defaultProfile: "browserless",
remoteCdpTimeoutMs: 2000,
remoteCdpHandshakeTimeoutMs: 4000,
profiles: {
browserless: {
cdpUrl: "https://production-sfo.browserless.io?token=<BROWSERLESS_API_KEY>",
color: "#00AA00"
}
}
}
}
```
Notes:
- Replace `<BROWSERLESS_API_KEY>` with your real Browserless token.
- Choose the region endpoint that matches your Browserless account (see their docs).
## Security
Key ideas:
- Browser control is loopback-only; access flows through the Gateways auth or node pairing.
- Keep the Gateway and any node hosts on a private network (Tailscale); avoid public exposure.
- Treat remote CDP URLs/tokens as secrets; prefer env vars or a secrets manager.
Remote CDP tips:
- Prefer HTTPS endpoints and short-lived tokens where possible.
- Avoid embedding long-lived tokens directly in config files.
## Profiles (multi-browser)
Moltbot supports multiple named profiles (routing configs). Profiles can be:
- **clawd-managed**: a dedicated Chromium-based browser instance with its own user data directory + CDP port
- **remote**: an explicit CDP URL (Chromium-based browser running elsewhere)
- **extension relay**: your existing Chrome tab(s) via the local relay + Chrome extension
Defaults:
- The `clawd` profile is auto-created if missing.
- The `chrome` profile is built-in for the Chrome extension relay (points at `http://127.0.0.1:18792` by default).
- Local CDP ports allocate from **1880018899** by default.
- Deleting a profile moves its local data directory to Trash.
All control endpoints accept `?profile=<name>`; the CLI uses `--browser-profile`.
## Chrome extension relay (use your existing Chrome)
Moltbot can also drive **your existing Chrome tabs** (no separate “clawd” Chrome instance) via a local CDP relay + a Chrome extension.
Full guide: [Chrome extension](/tools/chrome-extension)
Flow:
- The Gateway runs locally (same machine) or a node host runs on the browser machine.
- A local **relay server** listens at a loopback `cdpUrl` (default: `http://127.0.0.1:18792`).
- You click the **Moltbot Browser Relay** extension icon on a tab to attach (it does not auto-attach).
- The agent controls that tab via the normal `browser` tool, by selecting the right profile.
If the Gateway runs elsewhere, run a node host on the browser machine so the Gateway can proxy browser actions.
### Sandboxed sessions
If the agent session is sandboxed, the `browser` tool may default to `target="sandbox"` (sandbox browser).
Chrome extension relay takeover requires host browser control, so either:
- run the session unsandboxed, or
- set `agents.defaults.sandbox.browser.allowHostControl: true` and use `target="host"` when calling the tool.
### Setup
1) Load the extension (dev/unpacked):
```bash
moltbot browser extension install
```
- Chrome → `chrome://extensions` → enable “Developer mode”
- “Load unpacked” → select the directory printed by `moltbot browser extension path`
- Pin the extension, then click it on the tab you want to control (badge shows `ON`).
2) Use it:
- CLI: `moltbot browser --browser-profile chrome tabs`
- Agent tool: `browser` with `profile="chrome"`
Optional: if you want a different name or relay port, create your own profile:
```bash
moltbot browser create-profile \
--name my-chrome \
--driver extension \
--cdp-url http://127.0.0.1:18792 \
--color "#00AA00"
```
Notes:
- This mode relies on Playwright-on-CDP for most operations (screenshots/snapshots/actions).
- Detach by clicking the extension icon again.
## Isolation guarantees
- **Dedicated user data dir**: never touches your personal browser profile.
- **Dedicated ports**: avoids `9222` to prevent collisions with dev workflows.
- **Deterministic tab control**: target tabs by `targetId`, not “last tab”.
## Browser selection
When launching locally, Moltbot picks the first available:
1. Chrome
2. Brave
3. Edge
4. Chromium
5. Chrome Canary
You can override with `browser.executablePath`.
Platforms:
- macOS: checks `/Applications` and `~/Applications`.
- Linux: looks for `google-chrome`, `brave`, `microsoft-edge`, `chromium`, etc.
- Windows: checks common install locations.
## Control API (optional)
For local integrations only, the Gateway exposes a small loopback HTTP API:
- Status/start/stop: `GET /`, `POST /start`, `POST /stop`
- Tabs: `GET /tabs`, `POST /tabs/open`, `POST /tabs/focus`, `DELETE /tabs/:targetId`
- Snapshot/screenshot: `GET /snapshot`, `POST /screenshot`
- Actions: `POST /navigate`, `POST /act`
- Hooks: `POST /hooks/file-chooser`, `POST /hooks/dialog`
- Downloads: `POST /download`, `POST /wait/download`
- Debugging: `GET /console`, `POST /pdf`
- Debugging: `GET /errors`, `GET /requests`, `POST /trace/start`, `POST /trace/stop`, `POST /highlight`
- Network: `POST /response/body`
- State: `GET /cookies`, `POST /cookies/set`, `POST /cookies/clear`
- State: `GET /storage/:kind`, `POST /storage/:kind/set`, `POST /storage/:kind/clear`
- Settings: `POST /set/offline`, `POST /set/headers`, `POST /set/credentials`, `POST /set/geolocation`, `POST /set/media`, `POST /set/timezone`, `POST /set/locale`, `POST /set/device`
All endpoints accept `?profile=<name>`.
### Playwright requirement
Some features (navigate/act/AI snapshot/role snapshot, element screenshots, PDF) require
Playwright. If Playwright isnt installed, those endpoints return a clear 501
error. ARIA snapshots and basic screenshots still work for clawd-managed Chrome.
For the Chrome extension relay driver, ARIA snapshots and screenshots require Playwright.
If you see `Playwright is not available in this gateway build`, install the full
Playwright package (not `playwright-core`) and restart the gateway, or reinstall
Moltbot with browser support.
## How it works (internal)
High-level flow:
- A small **control server** accepts HTTP requests.
- It connects to Chromium-based browsers (Chrome/Brave/Edge/Chromium) via **CDP**.
- For advanced actions (click/type/snapshot/PDF), it uses **Playwright** on top
of CDP.
- When Playwright is missing, only non-Playwright operations are available.
This design keeps the agent on a stable, deterministic interface while letting
you swap local/remote browsers and profiles.
## CLI quick reference
All commands accept `--browser-profile <name>` to target a specific profile.
All commands also accept `--json` for machine-readable output (stable payloads).
Basics:
- `moltbot browser status`
- `moltbot browser start`
- `moltbot browser stop`
- `moltbot browser tabs`
- `moltbot browser tab`
- `moltbot browser tab new`
- `moltbot browser tab select 2`
- `moltbot browser tab close 2`
- `moltbot browser open https://example.com`
- `moltbot browser focus abcd1234`
- `moltbot browser close abcd1234`
Inspection:
- `moltbot browser screenshot`
- `moltbot browser screenshot --full-page`
- `moltbot browser screenshot --ref 12`
- `moltbot browser screenshot --ref e12`
- `moltbot browser snapshot`
- `moltbot browser snapshot --format aria --limit 200`
- `moltbot browser snapshot --interactive --compact --depth 6`
- `moltbot browser snapshot --efficient`
- `moltbot browser snapshot --labels`
- `moltbot browser snapshot --selector "#main" --interactive`
- `moltbot browser snapshot --frame "iframe#main" --interactive`
- `moltbot browser console --level error`
- `moltbot browser errors --clear`
- `moltbot browser requests --filter api --clear`
- `moltbot browser pdf`
- `moltbot browser responsebody "**/api" --max-chars 5000`
Actions:
- `moltbot browser navigate https://example.com`
- `moltbot browser resize 1280 720`
- `moltbot browser click 12 --double`
- `moltbot browser click e12 --double`
- `moltbot browser type 23 "hello" --submit`
- `moltbot browser press Enter`
- `moltbot browser hover 44`
- `moltbot browser scrollintoview e12`
- `moltbot browser drag 10 11`
- `moltbot browser select 9 OptionA OptionB`
- `moltbot browser download e12 /tmp/report.pdf`
- `moltbot browser waitfordownload /tmp/report.pdf`
- `moltbot browser upload /tmp/file.pdf`
- `moltbot browser fill --fields '[{"ref":"1","type":"text","value":"Ada"}]'`
- `moltbot browser dialog --accept`
- `moltbot browser wait --text "Done"`
- `moltbot browser wait "#main" --url "**/dash" --load networkidle --fn "window.ready===true"`
- `moltbot browser evaluate --fn '(el) => el.textContent' --ref 7`
- `moltbot browser highlight e12`
- `moltbot browser trace start`
- `moltbot browser trace stop`
State:
- `moltbot browser cookies`
- `moltbot browser cookies set session abc123 --url "https://example.com"`
- `moltbot browser cookies clear`
- `moltbot browser storage local get`
- `moltbot browser storage local set theme dark`
- `moltbot browser storage session clear`
- `moltbot browser set offline on`
- `moltbot browser set headers --json '{"X-Debug":"1"}'`
- `moltbot browser set credentials user pass`
- `moltbot browser set credentials --clear`
- `moltbot browser set geo 37.7749 -122.4194 --origin "https://example.com"`
- `moltbot browser set geo --clear`
- `moltbot browser set media dark`
- `moltbot browser set timezone America/New_York`
- `moltbot browser set locale en-US`
- `moltbot browser set device "iPhone 14"`
Notes:
- `upload` and `dialog` are **arming** calls; run them before the click/press
that triggers the chooser/dialog.
- `upload` can also set file inputs directly via `--input-ref` or `--element`.
- `snapshot`:
- `--format ai` (default when Playwright is installed): returns an AI snapshot with numeric refs (`aria-ref="<n>"`).
- `--format aria`: returns the accessibility tree (no refs; inspection only).
- `--efficient` (or `--mode efficient`): compact role snapshot preset (interactive + compact + depth + lower maxChars).
- Config default (tool/CLI only): set `browser.snapshotDefaults.mode: "efficient"` to use efficient snapshots when the caller does not pass a mode (see [Gateway configuration](/gateway/configuration#browser-clawd-managed-browser)).
- Role snapshot options (`--interactive`, `--compact`, `--depth`, `--selector`) force a role-based snapshot with refs like `ref=e12`.
- `--frame "<iframe selector>"` scopes role snapshots to an iframe (pairs with role refs like `e12`).
- `--interactive` outputs a flat, easy-to-pick list of interactive elements (best for driving actions).
- `--labels` adds a viewport-only screenshot with overlayed ref labels (prints `MEDIA:<path>`).
- `click`/`type`/etc require a `ref` from `snapshot` (either numeric `12` or role ref `e12`).
CSS selectors are intentionally not supported for actions.
## Snapshots and refs
Moltbot supports two “snapshot” styles:
- **AI snapshot (numeric refs)**: `moltbot browser snapshot` (default; `--format ai`)
- Output: a text snapshot that includes numeric refs.
- Actions: `moltbot browser click 12`, `moltbot browser type 23 "hello"`.
- Internally, the ref is resolved via Playwrights `aria-ref`.
- **Role snapshot (role refs like `e12`)**: `moltbot browser snapshot --interactive` (or `--compact`, `--depth`, `--selector`, `--frame`)
- Output: a role-based list/tree with `[ref=e12]` (and optional `[nth=1]`).
- Actions: `moltbot browser click e12`, `moltbot browser highlight e12`.
- Internally, the ref is resolved via `getByRole(...)` (plus `nth()` for duplicates).
- Add `--labels` to include a viewport screenshot with overlayed `e12` labels.
Ref behavior:
- Refs are **not stable across navigations**; if something fails, re-run `snapshot` and use a fresh ref.
- If the role snapshot was taken with `--frame`, role refs are scoped to that iframe until the next role snapshot.
## Wait power-ups
You can wait on more than just time/text:
- Wait for URL (globs supported by Playwright):
- `moltbot browser wait --url "**/dash"`
- Wait for load state:
- `moltbot browser wait --load networkidle`
- Wait for a JS predicate:
- `moltbot browser wait --fn "window.ready===true"`
- Wait for a selector to become visible:
- `moltbot browser wait "#main"`
These can be combined:
```bash
moltbot browser wait "#main" \
--url "**/dash" \
--load networkidle \
--fn "window.ready===true" \
--timeout-ms 15000
```
## Debug workflows
When an action fails (e.g. “not visible”, “strict mode violation”, “covered”):
1. `moltbot browser snapshot --interactive`
2. Use `click <ref>` / `type <ref>` (prefer role refs in interactive mode)
3. If it still fails: `moltbot browser highlight <ref>` to see what Playwright is targeting
4. If the page behaves oddly:
- `moltbot browser errors --clear`
- `moltbot browser requests --filter api --clear`
5. For deep debugging: record a trace:
- `moltbot browser trace start`
- reproduce the issue
- `moltbot browser trace stop` (prints `TRACE:<path>`)
## JSON output
`--json` is for scripting and structured tooling.
Examples:
```bash
moltbot browser status --json
moltbot browser snapshot --interactive --json
moltbot browser requests --filter api --json
moltbot browser cookies --json
```
Role snapshots in JSON include `refs` plus a small `stats` block (lines/chars/refs/interactive) so tools can reason about payload size and density.
## State and environment knobs
These are useful for “make the site behave like X” workflows:
- Cookies: `cookies`, `cookies set`, `cookies clear`
- Storage: `storage local|session get|set|clear`
- Offline: `set offline on|off`
- Headers: `set headers --json '{"X-Debug":"1"}'` (or `--clear`)
- HTTP basic auth: `set credentials user pass` (or `--clear`)
- Geolocation: `set geo <lat> <lon> --origin "https://example.com"` (or `--clear`)
- Media: `set media dark|light|no-preference|none`
- Timezone / locale: `set timezone ...`, `set locale ...`
- Device / viewport:
- `set device "iPhone 14"` (Playwright device presets)
- `set viewport 1280 720`
## Security & privacy
- The clawd browser profile may contain logged-in sessions; treat it as sensitive.
- `browser act kind=evaluate` / `moltbot browser evaluate` and `wait --fn`
execute arbitrary JavaScript in the page context. Prompt injection can steer
this. Disable it with `browser.evaluateEnabled=false` if you do not need it.
- For logins and anti-bot notes (X/Twitter, etc.), see [Browser login + X/Twitter posting](/tools/browser-login).
- Keep the Gateway/node host private (loopback or tailnet-only).
- Remote CDP endpoints are powerful; tunnel and protect them.
## Troubleshooting
For Linux-specific issues (especially snap Chromium), see
[Browser troubleshooting](/tools/browser-linux-troubleshooting).
## Agent tools + how control works
The agent gets **one tool** for browser automation:
- `browser` — status/start/stop/tabs/open/focus/close/snapshot/screenshot/navigate/act
How it maps:
- `browser snapshot` returns a stable UI tree (AI or ARIA).
- `browser act` uses the snapshot `ref` IDs to click/type/drag/select.
- `browser screenshot` captures pixels (full page or element).
- `browser` accepts:
- `profile` to choose a named browser profile (clawd, chrome, or remote CDP).
- `target` (`sandbox` | `host` | `node`) to select where the browser lives.
- In sandboxed sessions, `target: "host"` requires `agents.defaults.sandbox.browser.allowHostControl=true`.
- If `target` is omitted: sandboxed sessions default to `sandbox`, non-sandbox sessions default to `host`.
- If a browser-capable node is connected, the tool may auto-route to it unless you pin `target="host"` or `target="node"`.
This keeps the agent deterministic and avoids brittle selectors.

View File

@@ -0,0 +1,168 @@
---
summary: "Chrome extension: let Moltbot drive your existing Chrome tab"
read_when:
- You want the agent to drive an existing Chrome tab (toolbar button)
- You need remote Gateway + local browser automation via Tailscale
- You want to understand the security implications of browser takeover
---
# Chrome extension (browser relay)
The Moltbot Chrome extension lets the agent control your **existing Chrome tabs** (your normal Chrome window) instead of launching a separate clawd-managed Chrome profile.
Attach/detach happens via a **single Chrome toolbar button**.
## What it is (concept)
There are three parts:
- **Browser control service** (Gateway or node): the API the agent/tool calls (via the Gateway)
- **Local relay server** (loopback CDP): bridges between the control server and the extension (`http://127.0.0.1:18792` by default)
- **Chrome MV3 extension**: attaches to the active tab using `chrome.debugger` and pipes CDP messages to the relay
Moltbot then controls the attached tab through the normal `browser` tool surface (selecting the right profile).
## Install / load (unpacked)
1) Install the extension to a stable local path:
```bash
moltbot browser extension install
```
2) Print the installed extension directory path:
```bash
moltbot browser extension path
```
3) Chrome → `chrome://extensions`
- Enable “Developer mode”
- “Load unpacked” → select the directory printed above
4) Pin the extension.
## Updates (no build step)
The extension ships inside the Moltbot release (npm package) as static files. There is no separate “build” step.
After upgrading Moltbot:
- Re-run `moltbot browser extension install` to refresh the installed files under your Moltbot state directory.
- Chrome → `chrome://extensions` → click “Reload” on the extension.
## Use it (no extra config)
Moltbot ships with a built-in browser profile named `chrome` that targets the extension relay on the default port.
Use it:
- CLI: `moltbot browser --browser-profile chrome tabs`
- Agent tool: `browser` with `profile="chrome"`
If you want a different name or a different relay port, create your own profile:
```bash
moltbot browser create-profile \
--name my-chrome \
--driver extension \
--cdp-url http://127.0.0.1:18792 \
--color "#00AA00"
```
## Attach / detach (toolbar button)
- Open the tab you want Moltbot to control.
- Click the extension icon.
- Badge shows `ON` when attached.
- Click again to detach.
## Which tab does it control?
- It does **not** automatically control “whatever tab youre looking at”.
- It controls **only the tab(s) you explicitly attached** by clicking the toolbar button.
- To switch: open the other tab and click the extension icon there.
## Badge + common errors
- `ON`: attached; Moltbot can drive that tab.
- `…`: connecting to the local relay.
- `!`: relay not reachable (most common: browser relay server isnt running on this machine).
If you see `!`:
- Make sure the Gateway is running locally (default setup), or run a node host on this machine if the Gateway runs elsewhere.
- Open the extension Options page; it shows whether the relay is reachable.
## Remote Gateway (use a node host)
### Local Gateway (same machine as Chrome) — usually **no extra steps**
If the Gateway runs on the same machine as Chrome, it starts the browser control service on loopback
and auto-starts the relay server. The extension talks to the local relay; the CLI/tool calls go to the Gateway.
### Remote Gateway (Gateway runs elsewhere) — **run a node host**
If your Gateway runs on another machine, start a node host on the machine that runs Chrome.
The Gateway will proxy browser actions to that node; the extension + relay stay local to the browser machine.
If multiple nodes are connected, pin one with `gateway.nodes.browser.node` or set `gateway.nodes.browser.mode`.
## Sandboxing (tool containers)
If your agent session is sandboxed (`agents.defaults.sandbox.mode != "off"`), the `browser` tool can be restricted:
- By default, sandboxed sessions often target the **sandbox browser** (`target="sandbox"`), not your host Chrome.
- Chrome extension relay takeover requires controlling the **host** browser control server.
Options:
- Easiest: use the extension from a **non-sandboxed** session/agent.
- Or allow host browser control for sandboxed sessions:
```json5
{
agents: {
defaults: {
sandbox: {
browser: {
allowHostControl: true
}
}
}
}
}
```
Then ensure the tool isnt denied by tool policy, and (if needed) call `browser` with `target="host"`.
Debugging: `moltbot sandbox explain`
## Remote access tips
- Keep the Gateway and node host on the same tailnet; avoid exposing relay ports to LAN or public Internet.
- Pair nodes intentionally; disable browser proxy routing if you dont want remote control (`gateway.nodes.browser.mode="off"`).
## How “extension path” works
`moltbot browser extension path` prints the **installed** on-disk directory containing the extension files.
The CLI intentionally does **not** print a `node_modules` path. Always run `moltbot browser extension install` first to copy the extension to a stable location under your Moltbot state directory.
If you move or delete that install directory, Chrome will mark the extension as broken until you reload it from a valid path.
## Security implications (read this)
This is powerful and risky. Treat it like giving the model “hands on your browser”.
- The extension uses Chromes debugger API (`chrome.debugger`). When attached, the model can:
- click/type/navigate in that tab
- read page content
- access whatever the tabs logged-in session can access
- **This is not isolated** like the dedicated clawd-managed profile.
- If you attach to your daily-driver profile/tab, youre granting access to that account state.
Recommendations:
- Prefer a dedicated Chrome profile (separate from your personal browsing) for extension relay usage.
- Keep the Gateway and any node hosts tailnet-only; rely on Gateway auth + node pairing.
- Avoid exposing relay ports over LAN (`0.0.0.0`) and avoid Funnel (public).
Related:
- Browser tool overview: [Browser](/tools/browser)
- Security audit: [Security](/gateway/security)
- Tailscale setup: [Tailscale](/gateway/tailscale)

View File

@@ -0,0 +1,201 @@
---
summary: "ClawdHub guide: public skills registry + CLI workflows"
read_when:
- Introducing ClawdHub to new users
- Installing, searching, or publishing skills
- Explaining ClawdHub CLI flags and sync behavior
---
# ClawdHub
ClawdHub is the **public skill registry for Moltbot**. It is a free service: all skills are public, open, and visible to everyone for sharing and reuse. A skill is just a folder with a `SKILL.md` file (plus supporting text files). You can browse skills in the web app or use the CLI to search, install, update, and publish skills.
Site: [clawdhub.com](https://clawdhub.com)
## Who this is for (beginner-friendly)
If you want to add new capabilities to your Moltbot agent, ClawdHub is the easiest way to find and install skills. You do not need to know how the backend works. You can:
- Search for skills by plain language.
- Install a skill into your workspace.
- Update skills later with one command.
- Back up your own skills by publishing them.
## Quick start (non-technical)
1) Install the CLI (see next section).
2) Search for something you need:
- `clawdhub search "calendar"`
3) Install a skill:
- `clawdhub install <skill-slug>`
4) Start a new Moltbot session so it picks up the new skill.
## Install the CLI
Pick one:
```bash
npm i -g clawdhub
```
```bash
pnpm add -g clawdhub
```
## How it fits into Moltbot
By default, the CLI installs skills into `./skills` under your current working directory. If a Moltbot workspace is configured, `clawdhub` falls back to that workspace unless you override `--workdir` (or `CLAWDHUB_WORKDIR`). Moltbot loads workspace skills from `<workspace>/skills` and will pick them up in the **next** session. If you already use `~/.clawdbot/skills` or bundled skills, workspace skills take precedence.
For more detail on how skills are loaded, shared, and gated, see
[Skills](/tools/skills).
## What the service provides (features)
- **Public browsing** of skills and their `SKILL.md` content.
- **Search** powered by embeddings (vector search), not just keywords.
- **Versioning** with semver, changelogs, and tags (including `latest`).
- **Downloads** as a zip per version.
- **Stars and comments** for community feedback.
- **Moderation** hooks for approvals and audits.
- **CLI-friendly API** for automation and scripting.
## CLI commands and parameters
Global options (apply to all commands):
- `--workdir <dir>`: Working directory (default: current dir; falls back to Moltbot workspace).
- `--dir <dir>`: Skills directory, relative to workdir (default: `skills`).
- `--site <url>`: Site base URL (browser login).
- `--registry <url>`: Registry API base URL.
- `--no-input`: Disable prompts (non-interactive).
- `-V, --cli-version`: Print CLI version.
Auth:
- `clawdhub login` (browser flow) or `clawdhub login --token <token>`
- `clawdhub logout`
- `clawdhub whoami`
Options:
- `--token <token>`: Paste an API token.
- `--label <label>`: Label stored for browser login tokens (default: `CLI token`).
- `--no-browser`: Do not open a browser (requires `--token`).
Search:
- `clawdhub search "query"`
- `--limit <n>`: Max results.
Install:
- `clawdhub install <slug>`
- `--version <version>`: Install a specific version.
- `--force`: Overwrite if the folder already exists.
Update:
- `clawdhub update <slug>`
- `clawdhub update --all`
- `--version <version>`: Update to a specific version (single slug only).
- `--force`: Overwrite when local files do not match any published version.
List:
- `clawdhub list` (reads `.clawdhub/lock.json`)
Publish:
- `clawdhub publish <path>`
- `--slug <slug>`: Skill slug.
- `--name <name>`: Display name.
- `--version <version>`: Semver version.
- `--changelog <text>`: Changelog text (can be empty).
- `--tags <tags>`: Comma-separated tags (default: `latest`).
Delete/undelete (owner/admin only):
- `clawdhub delete <slug> --yes`
- `clawdhub undelete <slug> --yes`
Sync (scan local skills + publish new/updated):
- `clawdhub sync`
- `--root <dir...>`: Extra scan roots.
- `--all`: Upload everything without prompts.
- `--dry-run`: Show what would be uploaded.
- `--bump <type>`: `patch|minor|major` for updates (default: `patch`).
- `--changelog <text>`: Changelog for non-interactive updates.
- `--tags <tags>`: Comma-separated tags (default: `latest`).
- `--concurrency <n>`: Registry checks (default: 4).
## Common workflows for agents
### Search for skills
```bash
clawdhub search "postgres backups"
```
### Download new skills
```bash
clawdhub install my-skill-pack
```
### Update installed skills
```bash
clawdhub update --all
```
### Back up your skills (publish or sync)
For a single skill folder:
```bash
clawdhub publish ./my-skill --slug my-skill --name "My Skill" --version 1.0.0 --tags latest
```
To scan and back up many skills at once:
```bash
clawdhub sync --all
```
## Advanced details (technical)
### Versioning and tags
- Each publish creates a new **semver** `SkillVersion`.
- Tags (like `latest`) point to a version; moving tags lets you roll back.
- Changelogs are attached per version and can be empty when syncing or publishing updates.
### Local changes vs registry versions
Updates compare the local skill contents to registry versions using a content hash. If local files do not match any published version, the CLI asks before overwriting (or requires `--force` in non-interactive runs).
### Sync scanning and fallback roots
`clawdhub sync` scans your current workdir first. If no skills are found, it falls back to known legacy locations (for example `~/moltbot/skills` and `~/.clawdbot/skills`). This is designed to find older skill installs without extra flags.
### Storage and lockfile
- Installed skills are recorded in `.clawdhub/lock.json` under your workdir.
- Auth tokens are stored in the ClawdHub CLI config file (override via `CLAWDHUB_CONFIG_PATH`).
### Telemetry (install counts)
When you run `clawdhub sync` while logged in, the CLI sends a minimal snapshot to compute install counts. You can disable this entirely:
```bash
export CLAWDHUB_DISABLE_TELEMETRY=1
```
## Environment variables
- `CLAWDHUB_SITE`: Override the site URL.
- `CLAWDHUB_REGISTRY`: Override the registry API URL.
- `CLAWDHUB_CONFIG_PATH`: Override where the CLI stores the token/config.
- `CLAWDHUB_WORKDIR`: Override the default workdir.
- `CLAWDHUB_DISABLE_TELEMETRY=1`: Disable telemetry on `sync`.

View File

@@ -0,0 +1,41 @@
# Creating Custom Skills 🛠
Moltbot is designed to be easily extensible. "Skills" are the primary way to add new capabilities to your assistant.
## What is a Skill?
A skill is a directory containing a `SKILL.md` file (which provides instructions and tool definitions to the LLM) and optionally some scripts or resources.
## Step-by-Step: Your First Skill
### 1. Create the Directory
Skills live in your workspace, usually `~/clawd/skills/`. Create a new folder for your skill:
```bash
mkdir -p ~/clawd/skills/hello-world
```
### 2. Define the `SKILL.md`
Create a `SKILL.md` file in that directory. This file uses YAML frontmatter for metadata and Markdown for instructions.
```markdown
---
name: hello_world
description: A simple skill that says hello.
---
# Hello World Skill
When the user asks for a greeting, use the `echo` tool to say "Hello from your custom skill!".
```
### 3. Add Tools (Optional)
You can define custom tools in the frontmatter or instruct the agent to use existing system tools (like `bash` or `browser`).
### 4. Refresh Moltbot
Ask your agent to "refresh skills" or restart the gateway. Moltbot will discover the new directory and index the `SKILL.md`.
## Best Practices
- **Be Concise**: Instruct the model on *what* to do, not how to be an AI.
- **Safety First**: If your skill uses `bash`, ensure the prompts don't allow arbitrary command injection from untrusted user input.
- **Test Locally**: Use `moltbot agent --message "use my new skill"` to test.
## Shared Skills
You can also browse and contribute skills to [ClawdHub](https://clawdhub.com).

View File

@@ -0,0 +1,49 @@
---
summary: "Elevated exec mode and /elevated directives"
read_when:
- Adjusting elevated mode defaults, allowlists, or slash command behavior
---
# Elevated Mode (/elevated directives)
## What it does
- `/elevated on` runs on the gateway host and keeps exec approvals (same as `/elevated ask`).
- `/elevated full` runs on the gateway host **and** auto-approves exec (skips exec approvals).
- `/elevated ask` runs on the gateway host but keeps exec approvals (same as `/elevated on`).
- `on`/`ask` do **not** force `exec.security=full`; configured security/ask policy still applies.
- Only changes behavior when the agent is **sandboxed** (otherwise exec already runs on the host).
- Directive forms: `/elevated on|off|ask|full`, `/elev on|off|ask|full`.
- Only `on|off|ask|full` are accepted; anything else returns a hint and does not change state.
## What it controls (and what it doesnt)
- **Availability gates**: `tools.elevated` is the global baseline. `agents.list[].tools.elevated` can further restrict elevated per agent (both must allow).
- **Per-session state**: `/elevated on|off|ask|full` sets the elevated level for the current session key.
- **Inline directive**: `/elevated on|ask|full` inside a message applies to that message only.
- **Groups**: In group chats, elevated directives are only honored when the agent is mentioned. Command-only messages that bypass mention requirements are treated as mentioned.
- **Host execution**: elevated forces `exec` onto the gateway host; `full` also sets `security=full`.
- **Approvals**: `full` skips exec approvals; `on`/`ask` honor them when allowlist/ask rules require.
- **Unsandboxed agents**: no-op for location; only affects gating, logging, and status.
- **Tool policy still applies**: if `exec` is denied by tool policy, elevated cannot be used.
- **Separate from `/exec`**: `/exec` adjusts per-session defaults for authorized senders and does not require elevated.
## Resolution order
1. Inline directive on the message (applies only to that message).
2. Session override (set by sending a directive-only message).
3. Global default (`agents.defaults.elevatedDefault` in config).
## Setting a session default
- Send a message that is **only** the directive (whitespace allowed), e.g. `/elevated full`.
- Confirmation reply is sent (`Elevated mode set to full...` / `Elevated mode disabled.`).
- If elevated access is disabled or the sender is not on the approved allowlist, the directive replies with an actionable error and does not change session state.
- Send `/elevated` (or `/elevated:`) with no argument to see the current elevated level.
## Availability + allowlists
- Feature gate: `tools.elevated.enabled` (default can be off via config even if the code supports it).
- Sender allowlist: `tools.elevated.allowFrom` with per-provider allowlists (e.g. `discord`, `whatsapp`).
- Per-agent gate: `agents.list[].tools.elevated.enabled` (optional; can only further restrict).
- Per-agent allowlist: `agents.list[].tools.elevated.allowFrom` (optional; when set, the sender must match **both** global + per-agent allowlists).
- Discord fallback: if `tools.elevated.allowFrom.discord` is omitted, the `channels.discord.dm.allowFrom` list is used as a fallback. Set `tools.elevated.allowFrom.discord` (even `[]`) to override. Per-agent allowlists do **not** use the fallback.
- All gates must pass; otherwise elevated is treated as unavailable.
## Logging + status
- Elevated exec calls are logged at info level.
- Session status includes elevated mode (e.g. `elevated=ask`, `elevated=full`).

View File

@@ -0,0 +1,226 @@
---
summary: "Exec approvals, allowlists, and sandbox escape prompts"
read_when:
- Configuring exec approvals or allowlists
- Implementing exec approval UX in the macOS app
- Reviewing sandbox escape prompts and implications
---
# Exec approvals
Exec approvals are the **companion app / node host guardrail** for letting a sandboxed agent run
commands on a real host (`gateway` or `node`). Think of it like a safety interlock:
commands are allowed only when policy + allowlist + (optional) user approval all agree.
Exec approvals are **in addition** to tool policy and elevated gating (unless elevated is set to `full`, which skips approvals).
Effective policy is the **stricter** of `tools.exec.*` and approvals defaults; if an approvals field is omitted, the `tools.exec` value is used.
If the companion app UI is **not available**, any request that requires a prompt is
resolved by the **ask fallback** (default: deny).
## Where it applies
Exec approvals are enforced locally on the execution host:
- **gateway host** → `moltbot` process on the gateway machine
- **node host** → node runner (macOS companion app or headless node host)
macOS split:
- **node host service** forwards `system.run` to the **macOS app** over local IPC.
- **macOS app** enforces approvals + executes the command in UI context.
## Settings and storage
Approvals live in a local JSON file on the execution host:
`~/.clawdbot/exec-approvals.json`
Example schema:
```json
{
"version": 1,
"socket": {
"path": "~/.clawdbot/exec-approvals.sock",
"token": "base64url-token"
},
"defaults": {
"security": "deny",
"ask": "on-miss",
"askFallback": "deny",
"autoAllowSkills": false
},
"agents": {
"main": {
"security": "allowlist",
"ask": "on-miss",
"askFallback": "deny",
"autoAllowSkills": true,
"allowlist": [
{
"id": "B0C8C0B3-2C2D-4F8A-9A3C-5A4B3C2D1E0F",
"pattern": "~/Projects/**/bin/rg",
"lastUsedAt": 1737150000000,
"lastUsedCommand": "rg -n TODO",
"lastResolvedPath": "/Users/user/Projects/.../bin/rg"
}
]
}
}
}
```
## Policy knobs
### Security (`exec.security`)
- **deny**: block all host exec requests.
- **allowlist**: allow only allowlisted commands.
- **full**: allow everything (equivalent to elevated).
### Ask (`exec.ask`)
- **off**: never prompt.
- **on-miss**: prompt only when allowlist does not match.
- **always**: prompt on every command.
### Ask fallback (`askFallback`)
If a prompt is required but no UI is reachable, fallback decides:
- **deny**: block.
- **allowlist**: allow only if allowlist matches.
- **full**: allow.
## Allowlist (per agent)
Allowlists are **per agent**. If multiple agents exist, switch which agent youre
editing in the macOS app. Patterns are **case-insensitive glob matches**.
Patterns should resolve to **binary paths** (basename-only entries are ignored).
Legacy `agents.default` entries are migrated to `agents.main` on load.
Examples:
- `~/Projects/**/bin/bird`
- `~/.local/bin/*`
- `/opt/homebrew/bin/rg`
Each allowlist entry tracks:
- **id** stable UUID used for UI identity (optional)
- **last used** timestamp
- **last used command**
- **last resolved path**
## Auto-allow skill CLIs
When **Auto-allow skill CLIs** is enabled, executables referenced by known skills
are treated as allowlisted on nodes (macOS node or headless node host). This uses
`skills.bins` over the Gateway RPC to fetch the skill bin list. Disable this if you want strict manual allowlists.
## Safe bins (stdin-only)
`tools.exec.safeBins` defines a small list of **stdin-only** binaries (for example `jq`)
that can run in allowlist mode **without** explicit allowlist entries. Safe bins reject
positional file args and path-like tokens, so they can only operate on the incoming stream.
Shell chaining and redirections are not auto-allowed in allowlist mode.
Shell chaining (`&&`, `||`, `;`) is allowed when every top-level segment satisfies the allowlist
(including safe bins or skill auto-allow). Redirections remain unsupported in allowlist mode.
Default safe bins: `jq`, `grep`, `cut`, `sort`, `uniq`, `head`, `tail`, `tr`, `wc`.
## Control UI editing
Use the **Control UI → Nodes → Exec approvals** card to edit defaults, peragent
overrides, and allowlists. Pick a scope (Defaults or an agent), tweak the policy,
add/remove allowlist patterns, then **Save**. The UI shows **last used** metadata
per pattern so you can keep the list tidy.
The target selector chooses **Gateway** (local approvals) or a **Node**. Nodes
must advertise `system.execApprovals.get/set` (macOS app or headless node host).
If a node does not advertise exec approvals yet, edit its local
`~/.clawdbot/exec-approvals.json` directly.
CLI: `moltbot approvals` supports gateway or node editing (see [Approvals CLI](/cli/approvals)).
## Approval flow
When a prompt is required, the gateway broadcasts `exec.approval.requested` to operator clients.
The Control UI and macOS app resolve it via `exec.approval.resolve`, then the gateway forwards the
approved request to the node host.
When approvals are required, the exec tool returns immediately with an approval id. Use that id to
correlate later system events (`Exec finished` / `Exec denied`). If no decision arrives before the
timeout, the request is treated as an approval timeout and surfaced as a denial reason.
The confirmation dialog includes:
- command + args
- cwd
- agent id
- resolved executable path
- host + policy metadata
Actions:
- **Allow once** → run now
- **Always allow** → add to allowlist + run
- **Deny** → block
## Approval forwarding to chat channels
You can forward exec approval prompts to any chat channel (including plugin channels) and approve
them with `/approve`. This uses the normal outbound delivery pipeline.
Config:
```json5
{
approvals: {
exec: {
enabled: true,
mode: "session", // "session" | "targets" | "both"
agentFilter: ["main"],
sessionFilter: ["discord"], // substring or regex
targets: [
{ channel: "slack", to: "U12345678" },
{ channel: "telegram", to: "123456789" }
]
}
}
}
```
Reply in chat:
```
/approve <id> allow-once
/approve <id> allow-always
/approve <id> deny
```
### macOS IPC flow
```
Gateway -> Node Service (WS)
| IPC (UDS + token + HMAC + TTL)
v
Mac App (UI + approvals + system.run)
```
Security notes:
- Unix socket mode `0600`, token stored in `exec-approvals.json`.
- Same-UID peer check.
- Challenge/response (nonce + HMAC token + request hash) + short TTL.
## System events
Exec lifecycle is surfaced as system messages:
- `Exec running` (only if the command exceeds the running notice threshold)
- `Exec finished`
- `Exec denied`
These are posted to the agents session after the node reports the event.
Gateway-host exec approvals emit the same lifecycle events when the command finishes (and optionally when running longer than the threshold).
Approval-gated execs reuse the approval id as the `runId` in these messages for easy correlation.
## Implications
- **full** is powerful; prefer allowlists when possible.
- **ask** keeps you in the loop while still allowing fast approvals.
- Per-agent allowlists prevent one agents approvals from leaking into others.
- Approvals only apply to host exec requests from **authorized senders**. Unauthorized senders cannot issue `/exec`.
- `/exec security=full` is a session-level convenience for authorized operators and skips approvals by design.
To hard-block host exec, set approvals security to `deny` or deny the `exec` tool via tool policy.
Related:
- [Exec tool](/tools/exec)
- [Elevated mode](/tools/elevated)
- [Skills](/tools/skills)

View File

@@ -0,0 +1,167 @@
---
summary: "Exec tool usage, stdin modes, and TTY support"
read_when:
- Using or modifying the exec tool
- Debugging stdin or TTY behavior
---
# Exec tool
Run shell commands in the workspace. Supports foreground + background execution via `process`.
If `process` is disallowed, `exec` runs synchronously and ignores `yieldMs`/`background`.
Background sessions are scoped per agent; `process` only sees sessions from the same agent.
## Parameters
- `command` (required)
- `workdir` (defaults to cwd)
- `env` (key/value overrides)
- `yieldMs` (default 10000): auto-background after delay
- `background` (bool): background immediately
- `timeout` (seconds, default 1800): kill on expiry
- `pty` (bool): run in a pseudo-terminal when available (TTY-only CLIs, coding agents, terminal UIs)
- `host` (`sandbox | gateway | node`): where to execute
- `security` (`deny | allowlist | full`): enforcement mode for `gateway`/`node`
- `ask` (`off | on-miss | always`): approval prompts for `gateway`/`node`
- `node` (string): node id/name for `host=node`
- `elevated` (bool): request elevated mode (gateway host); `security=full` is only forced when elevated resolves to `full`
Notes:
- `host` defaults to `sandbox`.
- `elevated` is ignored when sandboxing is off (exec already runs on the host).
- `gateway`/`node` approvals are controlled by `~/.clawdbot/exec-approvals.json`.
- `node` requires a paired node (companion app or headless node host).
- If multiple nodes are available, set `exec.node` or `tools.exec.node` to select one.
- On non-Windows hosts, exec uses `SHELL` when set; if `SHELL` is `fish`, it prefers `bash` (or `sh`)
from `PATH` to avoid fish-incompatible scripts, then falls back to `SHELL` if neither exists.
- Important: sandboxing is **off by default**. If sandboxing is off, `host=sandbox` runs directly on
the gateway host (no container) and **does not require approvals**. To require approvals, run with
`host=gateway` and configure exec approvals (or enable sandboxing).
## Config
- `tools.exec.notifyOnExit` (default: true): when true, backgrounded exec sessions enqueue a system event and request a heartbeat on exit.
- `tools.exec.approvalRunningNoticeMs` (default: 10000): emit a single “running” notice when an approval-gated exec runs longer than this (0 disables).
- `tools.exec.host` (default: `sandbox`)
- `tools.exec.security` (default: `deny` for sandbox, `allowlist` for gateway + node when unset)
- `tools.exec.ask` (default: `on-miss`)
- `tools.exec.node` (default: unset)
- `tools.exec.pathPrepend`: list of directories to prepend to `PATH` for exec runs.
- `tools.exec.safeBins`: stdin-only safe binaries that can run without explicit allowlist entries.
Example:
```json5
{
tools: {
exec: {
pathPrepend: ["~/bin", "/opt/oss/bin"]
}
}
}
```
### PATH handling
- `host=gateway`: merges your login-shell `PATH` into the exec environment (unless the exec call
already sets `env.PATH`). The daemon itself still runs with a minimal `PATH`:
- macOS: `/opt/homebrew/bin`, `/usr/local/bin`, `/usr/bin`, `/bin`
- Linux: `/usr/local/bin`, `/usr/bin`, `/bin`
- `host=sandbox`: runs `sh -lc` (login shell) inside the container, so `/etc/profile` may reset `PATH`.
Moltbot prepends `env.PATH` after profile sourcing via an internal env var (no shell interpolation);
`tools.exec.pathPrepend` applies here too.
- `host=node`: only env overrides you pass are sent to the node. `tools.exec.pathPrepend` only applies
if the exec call already sets `env.PATH`. Headless node hosts accept `PATH` only when it prepends
the node host PATH (no replacement). macOS nodes drop `PATH` overrides entirely.
Per-agent node binding (use the agent list index in config):
```bash
moltbot config get agents.list
moltbot config set agents.list[0].tools.exec.node "node-id-or-name"
```
Control UI: the Nodes tab includes a small “Exec node binding” panel for the same settings.
## Session overrides (`/exec`)
Use `/exec` to set **per-session** defaults for `host`, `security`, `ask`, and `node`.
Send `/exec` with no arguments to show the current values.
Example:
```
/exec host=gateway security=allowlist ask=on-miss node=mac-1
```
## Authorization model
`/exec` is only honored for **authorized senders** (channel allowlists/pairing plus `commands.useAccessGroups`).
It updates **session state only** and does not write config. To hard-disable exec, deny it via tool
policy (`tools.deny: ["exec"]` or per-agent). Host approvals still apply unless you explicitly set
`security=full` and `ask=off`.
## Exec approvals (companion app / node host)
Sandboxed agents can require per-request approval before `exec` runs on the gateway or node host.
See [Exec approvals](/tools/exec-approvals) for the policy, allowlist, and UI flow.
When approvals are required, the exec tool returns immediately with
`status: "approval-pending"` and an approval id. Once approved (or denied / timed out),
the Gateway emits system events (`Exec finished` / `Exec denied`). If the command is still
running after `tools.exec.approvalRunningNoticeMs`, a single `Exec running` notice is emitted.
## Allowlist + safe bins
Allowlist enforcement matches **resolved binary paths only** (no basename matches). When
`security=allowlist`, shell commands are auto-allowed only if every pipeline segment is
allowlisted or a safe bin. Chaining (`;`, `&&`, `||`) and redirections are rejected in
allowlist mode.
## Examples
Foreground:
```json
{"tool":"exec","command":"ls -la"}
```
Background + poll:
```json
{"tool":"exec","command":"npm run build","yieldMs":1000}
{"tool":"process","action":"poll","sessionId":"<id>"}
```
Send keys (tmux-style):
```json
{"tool":"process","action":"send-keys","sessionId":"<id>","keys":["Enter"]}
{"tool":"process","action":"send-keys","sessionId":"<id>","keys":["C-c"]}
{"tool":"process","action":"send-keys","sessionId":"<id>","keys":["Up","Up","Enter"]}
```
Submit (send CR only):
```json
{"tool":"process","action":"submit","sessionId":"<id>"}
```
Paste (bracketed by default):
```json
{"tool":"process","action":"paste","sessionId":"<id>","text":"line1\nline2\n"}
```
## apply_patch (experimental)
`apply_patch` is a subtool of `exec` for structured multi-file edits.
Enable it explicitly:
```json5
{
tools: {
exec: {
applyPatch: { enabled: true, allowModels: ["gpt-5.2"] }
}
}
}
```
Notes:
- Only available for OpenAI/OpenAI Codex models.
- Tool policy still applies; `allow: ["exec"]` implicitly allows `apply_patch`.
- Config lives under `tools.exec.applyPatch`.

View File

@@ -0,0 +1,58 @@
---
summary: "Firecrawl fallback for web_fetch (anti-bot + cached extraction)"
read_when:
- You want Firecrawl-backed web extraction
- You need a Firecrawl API key
- You want anti-bot extraction for web_fetch
---
# Firecrawl
Moltbot can use **Firecrawl** as a fallback extractor for `web_fetch`. It is a hosted
content extraction service that supports bot circumvention and caching, which helps
with JS-heavy sites or pages that block plain HTTP fetches.
## Get an API key
1) Create a Firecrawl account and generate an API key.
2) Store it in config or set `FIRECRAWL_API_KEY` in the gateway environment.
## Configure Firecrawl
```json5
{
tools: {
web: {
fetch: {
firecrawl: {
apiKey: "FIRECRAWL_API_KEY_HERE",
baseUrl: "https://api.firecrawl.dev",
onlyMainContent: true,
maxAgeMs: 172800000,
timeoutSeconds: 60
}
}
}
}
}
```
Notes:
- `firecrawl.enabled` defaults to true when an API key is present.
- `maxAgeMs` controls how old cached results can be (ms). Default is 2 days.
## Stealth / bot circumvention
Firecrawl exposes a **proxy mode** parameter for bot circumvention (`basic`, `stealth`, or `auto`).
Moltbot always uses `proxy: "auto"` plus `storeInCache: true` for Firecrawl requests.
If proxy is omitted, Firecrawl defaults to `auto`. `auto` retries with stealth proxies if a basic attempt fails, which may use more credits
than basic-only scraping.
## How `web_fetch` uses Firecrawl
`web_fetch` extraction order:
1) Readability (local)
2) Firecrawl (if configured)
3) Basic HTML cleanup (last fallback)
See [Web tools](/tools/web) for the full web tool setup.

View File

@@ -0,0 +1,450 @@
---
summary: "Agent tool surface for Moltbot (browser, canvas, nodes, message, cron) replacing legacy `moltbot-*` skills"
read_when:
- Adding or modifying agent tools
- Retiring or changing `moltbot-*` skills
---
# Tools (Moltbot)
Moltbot exposes **first-class agent tools** for browser, canvas, nodes, and cron.
These replace the old `moltbot-*` skills: the tools are typed, no shelling,
and the agent should rely on them directly.
## Disabling tools
You can globally allow/deny tools via `tools.allow` / `tools.deny` in `moltbot.json`
(deny wins). This prevents disallowed tools from being sent to model providers.
```json5
{
tools: { deny: ["browser"] }
}
```
Notes:
- Matching is case-insensitive.
- `*` wildcards are supported (`"*"` means all tools).
- If `tools.allow` only references unknown or unloaded plugin tool names, Moltbot logs a warning and ignores the allowlist so core tools stay available.
## Tool profiles (base allowlist)
`tools.profile` sets a **base tool allowlist** before `tools.allow`/`tools.deny`.
Per-agent override: `agents.list[].tools.profile`.
Profiles:
- `minimal`: `session_status` only
- `coding`: `group:fs`, `group:runtime`, `group:sessions`, `group:memory`, `image`
- `messaging`: `group:messaging`, `sessions_list`, `sessions_history`, `sessions_send`, `session_status`
- `full`: no restriction (same as unset)
Example (messaging-only by default, allow Slack + Discord tools too):
```json5
{
tools: {
profile: "messaging",
allow: ["slack", "discord"]
}
}
```
Example (coding profile, but deny exec/process everywhere):
```json5
{
tools: {
profile: "coding",
deny: ["group:runtime"]
}
}
```
Example (global coding profile, messaging-only support agent):
```json5
{
tools: { profile: "coding" },
agents: {
list: [
{
id: "support",
tools: { profile: "messaging", allow: ["slack"] }
}
]
}
}
```
## Provider-specific tool policy
Use `tools.byProvider` to **further restrict** tools for specific providers
(or a single `provider/model`) without changing your global defaults.
Per-agent override: `agents.list[].tools.byProvider`.
This is applied **after** the base tool profile and **before** allow/deny lists,
so it can only narrow the tool set.
Provider keys accept either `provider` (e.g. `google-antigravity`) or
`provider/model` (e.g. `openai/gpt-5.2`).
Example (keep global coding profile, but minimal tools for Google Antigravity):
```json5
{
tools: {
profile: "coding",
byProvider: {
"google-antigravity": { profile: "minimal" }
}
}
}
```
Example (provider/model-specific allowlist for a flaky endpoint):
```json5
{
tools: {
allow: ["group:fs", "group:runtime", "sessions_list"],
byProvider: {
"openai/gpt-5.2": { allow: ["group:fs", "sessions_list"] }
}
}
}
```
Example (agent-specific override for a single provider):
```json5
{
agents: {
list: [
{
id: "support",
tools: {
byProvider: {
"google-antigravity": { allow: ["message", "sessions_list"] }
}
}
}
]
}
}
```
## Tool groups (shorthands)
Tool policies (global, agent, sandbox) support `group:*` entries that expand to multiple tools.
Use these in `tools.allow` / `tools.deny`.
Available groups:
- `group:runtime`: `exec`, `bash`, `process`
- `group:fs`: `read`, `write`, `edit`, `apply_patch`
- `group:sessions`: `sessions_list`, `sessions_history`, `sessions_send`, `sessions_spawn`, `session_status`
- `group:memory`: `memory_search`, `memory_get`
- `group:web`: `web_search`, `web_fetch`
- `group:ui`: `browser`, `canvas`
- `group:automation`: `cron`, `gateway`
- `group:messaging`: `message`
- `group:nodes`: `nodes`
- `group:moltbot`: all built-in Moltbot tools (excludes provider plugins)
Example (allow only file tools + browser):
```json5
{
tools: {
allow: ["group:fs", "browser"]
}
}
```
## Plugins + tools
Plugins can register **additional tools** (and CLI commands) beyond the core set.
See [Plugins](/plugin) for install + config, and [Skills](/tools/skills) for how
tool usage guidance is injected into prompts. Some plugins ship their own skills
alongside tools (for example, the voice-call plugin).
Optional plugin tools:
- [Lobster](/tools/lobster): typed workflow runtime with resumable approvals (requires the Lobster CLI on the gateway host).
- [LLM Task](/tools/llm-task): JSON-only LLM step for structured workflow output (optional schema validation).
## Tool inventory
### `apply_patch`
Apply structured patches across one or more files. Use for multi-hunk edits.
Experimental: enable via `tools.exec.applyPatch.enabled` (OpenAI models only).
### `exec`
Run shell commands in the workspace.
Core parameters:
- `command` (required)
- `yieldMs` (auto-background after timeout, default 10000)
- `background` (immediate background)
- `timeout` (seconds; kills the process if exceeded, default 1800)
- `elevated` (bool; run on host if elevated mode is enabled/allowed; only changes behavior when the agent is sandboxed)
- `host` (`sandbox | gateway | node`)
- `security` (`deny | allowlist | full`)
- `ask` (`off | on-miss | always`)
- `node` (node id/name for `host=node`)
- Need a real TTY? Set `pty: true`.
Notes:
- Returns `status: "running"` with a `sessionId` when backgrounded.
- Use `process` to poll/log/write/kill/clear background sessions.
- If `process` is disallowed, `exec` runs synchronously and ignores `yieldMs`/`background`.
- `elevated` is gated by `tools.elevated` plus any `agents.list[].tools.elevated` override (both must allow) and is an alias for `host=gateway` + `security=full`.
- `elevated` only changes behavior when the agent is sandboxed (otherwise its a no-op).
- `host=node` can target a macOS companion app or a headless node host (`moltbot node run`).
- gateway/node approvals and allowlists: [Exec approvals](/tools/exec-approvals).
### `process`
Manage background exec sessions.
Core actions:
- `list`, `poll`, `log`, `write`, `kill`, `clear`, `remove`
Notes:
- `poll` returns new output and exit status when complete.
- `log` supports line-based `offset`/`limit` (omit `offset` to grab the last N lines).
- `process` is scoped per agent; sessions from other agents are not visible.
### `web_search`
Search the web using Brave Search API.
Core parameters:
- `query` (required)
- `count` (110; default from `tools.web.search.maxResults`)
Notes:
- Requires a Brave API key (recommended: `moltbot configure --section web`, or set `BRAVE_API_KEY`).
- Enable via `tools.web.search.enabled`.
- Responses are cached (default 15 min).
- See [Web tools](/tools/web) for setup.
### `web_fetch`
Fetch and extract readable content from a URL (HTML → markdown/text).
Core parameters:
- `url` (required)
- `extractMode` (`markdown` | `text`)
- `maxChars` (truncate long pages)
Notes:
- Enable via `tools.web.fetch.enabled`.
- Responses are cached (default 15 min).
- For JS-heavy sites, prefer the browser tool.
- See [Web tools](/tools/web) for setup.
- See [Firecrawl](/tools/firecrawl) for the optional anti-bot fallback.
### `browser`
Control the dedicated clawd browser.
Core actions:
- `status`, `start`, `stop`, `tabs`, `open`, `focus`, `close`
- `snapshot` (aria/ai)
- `screenshot` (returns image block + `MEDIA:<path>`)
- `act` (UI actions: click/type/press/hover/drag/select/fill/resize/wait/evaluate)
- `navigate`, `console`, `pdf`, `upload`, `dialog`
Profile management:
- `profiles` — list all browser profiles with status
- `create-profile` — create new profile with auto-allocated port (or `cdpUrl`)
- `delete-profile` — stop browser, delete user data, remove from config (local only)
- `reset-profile` — kill orphan process on profile's port (local only)
Common parameters:
- `profile` (optional; defaults to `browser.defaultProfile`)
- `target` (`sandbox` | `host` | `node`)
- `node` (optional; picks a specific node id/name)
Notes:
- Requires `browser.enabled=true` (default is `true`; set `false` to disable).
- All actions accept optional `profile` parameter for multi-instance support.
- When `profile` is omitted, uses `browser.defaultProfile` (defaults to "chrome").
- Profile names: lowercase alphanumeric + hyphens only (max 64 chars).
- Port range: 18800-18899 (~100 profiles max).
- Remote profiles are attach-only (no start/stop/reset).
- If a browser-capable node is connected, the tool may auto-route to it (unless you pin `target`).
- `snapshot` defaults to `ai` when Playwright is installed; use `aria` for the accessibility tree.
- `snapshot` also supports role-snapshot options (`interactive`, `compact`, `depth`, `selector`) which return refs like `e12`.
- `act` requires `ref` from `snapshot` (numeric `12` from AI snapshots, or `e12` from role snapshots); use `evaluate` for rare CSS selector needs.
- Avoid `act``wait` by default; use it only in exceptional cases (no reliable UI state to wait on).
- `upload` can optionally pass a `ref` to auto-click after arming.
- `upload` also supports `inputRef` (aria ref) or `element` (CSS selector) to set `<input type="file">` directly.
### `canvas`
Drive the node Canvas (present, eval, snapshot, A2UI).
Core actions:
- `present`, `hide`, `navigate`, `eval`
- `snapshot` (returns image block + `MEDIA:<path>`)
- `a2ui_push`, `a2ui_reset`
Notes:
- Uses gateway `node.invoke` under the hood.
- If no `node` is provided, the tool picks a default (single connected node or local mac node).
- A2UI is v0.8 only (no `createSurface`); the CLI rejects v0.9 JSONL with line errors.
- Quick smoke: `moltbot nodes canvas a2ui push --node <id> --text "Hello from A2UI"`.
### `nodes`
Discover and target paired nodes; send notifications; capture camera/screen.
Core actions:
- `status`, `describe`
- `pending`, `approve`, `reject` (pairing)
- `notify` (macOS `system.notify`)
- `run` (macOS `system.run`)
- `camera_snap`, `camera_clip`, `screen_record`
- `location_get`
Notes:
- Camera/screen commands require the node app to be foregrounded.
- Images return image blocks + `MEDIA:<path>`.
- Videos return `FILE:<path>` (mp4).
- Location returns a JSON payload (lat/lon/accuracy/timestamp).
- `run` params: `command` argv array; optional `cwd`, `env` (`KEY=VAL`), `commandTimeoutMs`, `invokeTimeoutMs`, `needsScreenRecording`.
Example (`run`):
```json
{
"action": "run",
"node": "office-mac",
"command": ["echo", "Hello"],
"env": ["FOO=bar"],
"commandTimeoutMs": 12000,
"invokeTimeoutMs": 45000,
"needsScreenRecording": false
}
```
### `image`
Analyze an image with the configured image model.
Core parameters:
- `image` (required path or URL)
- `prompt` (optional; defaults to "Describe the image.")
- `model` (optional override)
- `maxBytesMb` (optional size cap)
Notes:
- Only available when `agents.defaults.imageModel` is configured (primary or fallbacks), or when an implicit image model can be inferred from your default model + configured auth (best-effort pairing).
- Uses the image model directly (independent of the main chat model).
### `message`
Send messages and channel actions across Discord/Google Chat/Slack/Telegram/WhatsApp/Signal/iMessage/MS Teams.
Core actions:
- `send` (text + optional media; MS Teams also supports `card` for Adaptive Cards)
- `poll` (WhatsApp/Discord/MS Teams polls)
- `react` / `reactions` / `read` / `edit` / `delete`
- `pin` / `unpin` / `list-pins`
- `permissions`
- `thread-create` / `thread-list` / `thread-reply`
- `search`
- `sticker`
- `member-info` / `role-info`
- `emoji-list` / `emoji-upload` / `sticker-upload`
- `role-add` / `role-remove`
- `channel-info` / `channel-list`
- `voice-status`
- `event-list` / `event-create`
- `timeout` / `kick` / `ban`
Notes:
- `send` routes WhatsApp via the Gateway; other channels go direct.
- `poll` uses the Gateway for WhatsApp and MS Teams; Discord polls go direct.
- When a message tool call is bound to an active chat session, sends are constrained to that sessions target to avoid cross-context leaks.
### `cron`
Manage Gateway cron jobs and wakeups.
Core actions:
- `status`, `list`
- `add`, `update`, `remove`, `run`, `runs`
- `wake` (enqueue system event + optional immediate heartbeat)
Notes:
- `add` expects a full cron job object (same schema as `cron.add` RPC).
- `update` uses `{ id, patch }`.
### `gateway`
Restart or apply updates to the running Gateway process (in-place).
Core actions:
- `restart` (authorizes + sends `SIGUSR1` for in-process restart; `moltbot gateway` restart in-place)
- `config.get` / `config.schema`
- `config.apply` (validate + write config + restart + wake)
- `config.patch` (merge partial update + restart + wake)
- `update.run` (run update + restart + wake)
Notes:
- Use `delayMs` (defaults to 2000) to avoid interrupting an in-flight reply.
- `restart` is disabled by default; enable with `commands.restart: true`.
### `sessions_list` / `sessions_history` / `sessions_send` / `sessions_spawn` / `session_status`
List sessions, inspect transcript history, or send to another session.
Core parameters:
- `sessions_list`: `kinds?`, `limit?`, `activeMinutes?`, `messageLimit?` (0 = none)
- `sessions_history`: `sessionKey` (or `sessionId`), `limit?`, `includeTools?`
- `sessions_send`: `sessionKey` (or `sessionId`), `message`, `timeoutSeconds?` (0 = fire-and-forget)
- `sessions_spawn`: `task`, `label?`, `agentId?`, `model?`, `runTimeoutSeconds?`, `cleanup?`
- `session_status`: `sessionKey?` (default current; accepts `sessionId`), `model?` (`default` clears override)
Notes:
- `main` is the canonical direct-chat key; global/unknown are hidden.
- `messageLimit > 0` fetches last N messages per session (tool messages filtered).
- `sessions_send` waits for final completion when `timeoutSeconds > 0`.
- Delivery/announce happens after completion and is best-effort; `status: "ok"` confirms the agent run finished, not that the announce was delivered.
- `sessions_spawn` starts a sub-agent run and posts an announce reply back to the requester chat.
- `sessions_spawn` is non-blocking and returns `status: "accepted"` immediately.
- `sessions_send` runs a replyback pingpong (reply `REPLY_SKIP` to stop; max turns via `session.agentToAgent.maxPingPongTurns`, 05).
- After the pingpong, the target agent runs an **announce step**; reply `ANNOUNCE_SKIP` to suppress the announcement.
### `agents_list`
List agent ids that the current session may target with `sessions_spawn`.
Notes:
- Result is restricted to per-agent allowlists (`agents.list[].subagents.allowAgents`).
- When `["*"]` is configured, the tool includes all configured agents and marks `allowAny: true`.
## Parameters (common)
Gateway-backed tools (`canvas`, `nodes`, `cron`):
- `gatewayUrl` (default `ws://127.0.0.1:18789`)
- `gatewayToken` (if auth enabled)
- `timeoutMs`
Browser tool:
- `profile` (optional; defaults to `browser.defaultProfile`)
- `target` (`sandbox` | `host` | `node`)
- `node` (optional; pin a specific node id/name)
## Recommended agent flows
Browser automation:
1) `browser``status` / `start`
2) `snapshot` (ai or aria)
3) `act` (click/type/press)
4) `screenshot` if you need visual confirmation
Canvas render:
1) `canvas``present`
2) `a2ui_push` (optional)
3) `snapshot`
Node targeting:
1) `nodes``status`
2) `describe` on the chosen node
3) `notify` / `run` / `camera_snap` / `screen_record`
## Safety
- Avoid direct `system.run`; use `nodes``run` only with explicit user consent.
- Respect user consent for camera/screen capture.
- Use `status/describe` to ensure permissions before invoking media commands.
## How tools are presented to the agent
Tools are exposed in two parallel channels:
1) **System prompt text**: a human-readable list + guidance.
2) **Tool schema**: the structured function definitions sent to the model API.
That means the agent sees both “what tools exist” and “how to call them.” If a tool
doesnt appear in the system prompt or the schema, the model cannot call it.

View File

@@ -0,0 +1,114 @@
---
summary: "JSON-only LLM tasks for workflows (optional plugin tool)"
read_when:
- You want a JSON-only LLM step inside workflows
- You need schema-validated LLM output for automation
---
# LLM Task
`llm-task` is an **optional plugin tool** that runs a JSON-only LLM task and
returns structured output (optionally validated against JSON Schema).
This is ideal for workflow engines like Lobster: you can add a single LLM step
without writing custom Moltbot code for each workflow.
## Enable the plugin
1) Enable the plugin:
```json
{
"plugins": {
"entries": {
"llm-task": { "enabled": true }
}
}
}
```
2) Allowlist the tool (it is registered with `optional: true`):
```json
{
"agents": {
"list": [
{
"id": "main",
"tools": { "allow": ["llm-task"] }
}
]
}
}
```
## Config (optional)
```json
{
"plugins": {
"entries": {
"llm-task": {
"enabled": true,
"config": {
"defaultProvider": "openai-codex",
"defaultModel": "gpt-5.2",
"defaultAuthProfileId": "main",
"allowedModels": ["openai-codex/gpt-5.2"],
"maxTokens": 800,
"timeoutMs": 30000
}
}
}
}
}
```
`allowedModels` is an allowlist of `provider/model` strings. If set, any request
outside the list is rejected.
## Tool parameters
- `prompt` (string, required)
- `input` (any, optional)
- `schema` (object, optional JSON Schema)
- `provider` (string, optional)
- `model` (string, optional)
- `authProfileId` (string, optional)
- `temperature` (number, optional)
- `maxTokens` (number, optional)
- `timeoutMs` (number, optional)
## Output
Returns `details.json` containing the parsed JSON (and validates against
`schema` when provided).
## Example: Lobster workflow step
```lobster
clawd.invoke --tool llm-task --action json --args-json '{
"prompt": "Given the input email, return intent and draft.",
"input": {
"subject": "Hello",
"body": "Can you help?"
},
"schema": {
"type": "object",
"properties": {
"intent": { "type": "string" },
"draft": { "type": "string" }
},
"required": ["intent", "draft"],
"additionalProperties": false
}
}'
```
## Safety notes
- The tool is **JSON-only** and instructs the model to output only JSON (no
code fences, no commentary).
- No tools are exposed to the model for this run.
- Treat output as untrusted unless you validate with `schema`.
- Put approvals before any side-effecting step (send, post, exec).

View File

@@ -0,0 +1,338 @@
---
title: Lobster
summary: "Typed workflow runtime for Moltbot with resumable approval gates."
description: Typed workflow runtime for Moltbot — composable pipelines with approval gates.
read_when:
- You want deterministic multi-step workflows with explicit approvals
- You need to resume a workflow without re-running earlier steps
---
# Lobster
Lobster is a workflow shell that lets Moltbot run multi-step tool sequences as a single, deterministic operation with explicit approval checkpoints.
## Hook
Your assistant can build the tools that manage itself. Ask for a workflow, and 30 minutes later you have a CLI plus pipelines that run as one call. Lobster is the missing piece: deterministic pipelines, explicit approvals, and resumable state.
## Why
Today, complex workflows require many back-and-forth tool calls. Each call costs tokens, and the LLM has to orchestrate every step. Lobster moves that orchestration into a typed runtime:
- **One call instead of many**: Moltbot runs one Lobster tool call and gets a structured result.
- **Approvals built in**: Side effects (send email, post comment) halt the workflow until explicitly approved.
- **Resumable**: Halted workflows return a token; approve and resume without re-running everything.
## Why a DSL instead of plain programs?
Lobster is intentionally small. The goal is not "a new language," it's a predictable, AI-friendly pipeline spec with first-class approvals and resume tokens.
- **Approve/resume is built in**: A normal program can prompt a human, but it cant *pause and resume* with a durable token without you inventing that runtime yourself.
- **Determinism + auditability**: Pipelines are data, so theyre easy to log, diff, replay, and review.
- **Constrained surface for AI**: A tiny grammar + JSON piping reduces “creative” code paths and makes validation realistic.
- **Safety policy baked in**: Timeouts, output caps, sandbox checks, and allowlists are enforced by the runtime, not each script.
- **Still programmable**: Each step can call any CLI or script. If you want JS/TS, generate `.lobster` files from code.
## How it works
Moltbot launches the local `lobster` CLI in **tool mode** and parses a JSON envelope from stdout.
If the pipeline pauses for approval, the tool returns a `resumeToken` so you can continue later.
## Pattern: small CLI + JSON pipes + approvals
Build tiny commands that speak JSON, then chain them into a single Lobster call. (Example command names below — swap in your own.)
```bash
inbox list --json
inbox categorize --json
inbox apply --json
```
```json
{
"action": "run",
"pipeline": "exec --json --shell 'inbox list --json' | exec --stdin json --shell 'inbox categorize --json' | exec --stdin json --shell 'inbox apply --json' | approve --preview-from-stdin --limit 5 --prompt 'Apply changes?'",
"timeoutMs": 30000
}
```
If the pipeline requests approval, resume with the token:
```json
{
"action": "resume",
"token": "<resumeToken>",
"approve": true
}
```
AI triggers the workflow; Lobster executes the steps. Approval gates keep side effects explicit and auditable.
Example: map input items into tool calls:
```bash
gog.gmail.search --query 'newer_than:1d' \
| clawd.invoke --tool message --action send --each --item-key message --args-json '{"provider":"telegram","to":"..."}'
```
## JSON-only LLM steps (llm-task)
For workflows that need a **structured LLM step**, enable the optional
`llm-task` plugin tool and call it from Lobster. This keeps the workflow
deterministic while still letting you classify/summarize/draft with a model.
Enable the tool:
```json
{
"plugins": {
"entries": {
"llm-task": { "enabled": true }
}
},
"agents": {
"list": [
{
"id": "main",
"tools": { "allow": ["llm-task"] }
}
]
}
}
```
Use it in a pipeline:
```lobster
clawd.invoke --tool llm-task --action json --args-json '{
"prompt": "Given the input email, return intent and draft.",
"input": { "subject": "Hello", "body": "Can you help?" },
"schema": {
"type": "object",
"properties": {
"intent": { "type": "string" },
"draft": { "type": "string" }
},
"required": ["intent", "draft"],
"additionalProperties": false
}
}'
```
See [LLM Task](/tools/llm-task) for details and configuration options.
## Workflow files (.lobster)
Lobster can run YAML/JSON workflow files with `name`, `args`, `steps`, `env`, `condition`, and `approval` fields. In Moltbot tool calls, set `pipeline` to the file path.
```yaml
name: inbox-triage
args:
tag:
default: "family"
steps:
- id: collect
command: inbox list --json
- id: categorize
command: inbox categorize --json
stdin: $collect.stdout
- id: approve
command: inbox apply --approve
stdin: $categorize.stdout
approval: required
- id: execute
command: inbox apply --execute
stdin: $categorize.stdout
condition: $approve.approved
```
Notes:
- `stdin: $step.stdout` and `stdin: $step.json` pass a prior steps output.
- `condition` (or `when`) can gate steps on `$step.approved`.
## Install Lobster
Install the Lobster CLI on the **same host** that runs the Moltbot Gateway (see the [Lobster repo](https://github.com/moltbot/lobster)), and ensure `lobster` is on `PATH`.
If you want to use a custom binary location, pass an **absolute** `lobsterPath` in the tool call.
## Enable the tool
Lobster is an **optional** plugin tool (not enabled by default).
Recommended (additive, safe):
```json
{
"tools": {
"alsoAllow": ["lobster"]
}
}
```
Or per-agent:
```json
{
"agents": {
"list": [
{
"id": "main",
"tools": {
"alsoAllow": ["lobster"]
}
}
]
}
}
```
Avoid using `tools.allow: ["lobster"]` unless you intend to run in restrictive allowlist mode.
Note: allowlists are opt-in for optional plugins. If your allowlist only names
plugin tools (like `lobster`), Moltbot keeps core tools enabled. To restrict core
tools, include the core tools or groups you want in the allowlist too.
## Example: Email triage
Without Lobster:
```
User: "Check my email and draft replies"
→ clawd calls gmail.list
→ LLM summarizes
→ User: "draft replies to #2 and #5"
→ LLM drafts
→ User: "send #2"
→ clawd calls gmail.send
(repeat daily, no memory of what was triaged)
```
With Lobster:
```json
{
"action": "run",
"pipeline": "email.triage --limit 20",
"timeoutMs": 30000
}
```
Returns a JSON envelope (truncated):
```json
{
"ok": true,
"status": "needs_approval",
"output": [{ "summary": "5 need replies, 2 need action" }],
"requiresApproval": {
"type": "approval_request",
"prompt": "Send 2 draft replies?",
"items": [],
"resumeToken": "..."
}
}
```
User approves → resume:
```json
{
"action": "resume",
"token": "<resumeToken>",
"approve": true
}
```
One workflow. Deterministic. Safe.
## Tool parameters
### `run`
Run a pipeline in tool mode.
```json
{
"action": "run",
"pipeline": "gog.gmail.search --query 'newer_than:1d' | email.triage",
"cwd": "/path/to/workspace",
"timeoutMs": 30000,
"maxStdoutBytes": 512000
}
```
Run a workflow file with args:
```json
{
"action": "run",
"pipeline": "/path/to/inbox-triage.lobster",
"argsJson": "{\"tag\":\"family\"}"
}
```
### `resume`
Continue a halted workflow after approval.
```json
{
"action": "resume",
"token": "<resumeToken>",
"approve": true
}
```
### Optional inputs
- `lobsterPath`: Absolute path to the Lobster binary (omit to use `PATH`).
- `cwd`: Working directory for the pipeline (defaults to the current process working directory).
- `timeoutMs`: Kill the subprocess if it exceeds this duration (default: 20000).
- `maxStdoutBytes`: Kill the subprocess if stdout exceeds this size (default: 512000).
- `argsJson`: JSON string passed to `lobster run --args-json` (workflow files only).
## Output envelope
Lobster returns a JSON envelope with one of three statuses:
- `ok` → finished successfully
- `needs_approval` → paused; `requiresApproval.resumeToken` is required to resume
- `cancelled` → explicitly denied or cancelled
The tool surfaces the envelope in both `content` (pretty JSON) and `details` (raw object).
## Approvals
If `requiresApproval` is present, inspect the prompt and decide:
- `approve: true` → resume and continue side effects
- `approve: false` → cancel and finalize the workflow
Use `approve --preview-from-stdin --limit N` to attach a JSON preview to approval requests without custom jq/heredoc glue. Resume tokens are now compact: Lobster stores workflow resume state under its state dir and hands back a small token key.
## OpenProse
OpenProse pairs well with Lobster: use `/prose` to orchestrate multi-agent prep, then run a Lobster pipeline for deterministic approvals. If a Prose program needs Lobster, allow the `lobster` tool for sub-agents via `tools.subagents.tools`. See [OpenProse](/prose).
## Safety
- **Local subprocess only** — no network calls from the plugin itself.
- **No secrets** — Lobster doesn't manage OAuth; it calls clawd tools that do.
- **Sandbox-aware** — disabled when the tool context is sandboxed.
- **Hardened** — `lobsterPath` must be absolute if specified; timeouts and output caps enforced.
## Troubleshooting
- **`lobster subprocess timed out`** → increase `timeoutMs`, or split a long pipeline.
- **`lobster output exceeded maxStdoutBytes`** → raise `maxStdoutBytes` or reduce output size.
- **`lobster returned invalid JSON`** → ensure the pipeline runs in tool mode and prints only JSON.
- **`lobster failed (code …)`** → run the same pipeline in a terminal to inspect stderr.
## Learn more
- [Plugins](/plugin)
- [Plugin tool authoring](/plugins/agent-tools)
## Case study: community workflows
One public example: a “second brain” CLI + Lobster pipelines that manage three Markdown vaults (personal, partner, shared). The CLI emits JSON for stats, inbox listings, and stale scans; Lobster chains those commands into workflows like `weekly-review`, `inbox-triage`, `memory-consolidation`, and `shared-task-sync`, each with approval gates. AI handles judgment (categorization) when available and falls back to deterministic rules when not.
- Thread: https://x.com/plattenschieber/status/2014508656335770033
- Repo: https://github.com/bloomedai/brain-cli

View File

@@ -0,0 +1,20 @@
---
summary: "Reaction semantics shared across channels"
read_when:
- Working on reactions in any channel
---
# Reaction tooling
Shared reaction semantics across channels:
- `emoji` is required when adding a reaction.
- `emoji=""` removes the bot's reaction(s) when supported.
- `remove: true` removes the specified emoji when supported (requires `emoji`).
Channel notes:
- **Discord/Slack**: empty `emoji` removes all of the bot's reactions on the message; `remove: true` removes just that emoji.
- **Google Chat**: empty `emoji` removes the app's reactions on the message; `remove: true` removes just that emoji.
- **Telegram**: empty `emoji` removes the bot's reactions; `remove: true` also removes reactions but still requires a non-empty `emoji` for tool validation.
- **WhatsApp**: empty `emoji` removes the bot reaction; `remove: true` maps to empty emoji (still requires `emoji`).
- **Signal**: inbound reaction notifications emit system events when `channels.signal.reactionNotifications` is enabled.

View File

@@ -0,0 +1,75 @@
---
summary: "Skills config schema and examples"
read_when:
- Adding or modifying skills config
- Adjusting bundled allowlist or install behavior
---
# Skills Config
All skills-related configuration lives under `skills` in `~/.clawdbot/moltbot.json`.
```json5
{
skills: {
allowBundled: ["gemini", "peekaboo"],
load: {
extraDirs: [
"~/Projects/agent-scripts/skills",
"~/Projects/oss/some-skill-pack/skills"
],
watch: true,
watchDebounceMs: 250
},
install: {
preferBrew: true,
nodeManager: "npm" // npm | pnpm | yarn | bun (Gateway runtime still Node; bun not recommended)
},
entries: {
"nano-banana-pro": {
enabled: true,
apiKey: "GEMINI_KEY_HERE",
env: {
GEMINI_API_KEY: "GEMINI_KEY_HERE"
}
},
peekaboo: { enabled: true },
sag: { enabled: false }
}
}
}
```
## Fields
- `allowBundled`: optional allowlist for **bundled** skills only. When set, only
bundled skills in the list are eligible (managed/workspace skills unaffected).
- `load.extraDirs`: additional skill directories to scan (lowest precedence).
- `load.watch`: watch skill folders and refresh the skills snapshot (default: true).
- `load.watchDebounceMs`: debounce for skill watcher events in milliseconds (default: 250).
- `install.preferBrew`: prefer brew installers when available (default: true).
- `install.nodeManager`: node installer preference (`npm` | `pnpm` | `yarn` | `bun`, default: npm).
This only affects **skill installs**; the Gateway runtime should still be Node
(Bun not recommended for WhatsApp/Telegram).
- `entries.<skillKey>`: per-skill overrides.
Per-skill fields:
- `enabled`: set `false` to disable a skill even if its bundled/installed.
- `env`: environment variables injected for the agent run (only if not already set).
- `apiKey`: optional convenience for skills that declare a primary env var.
## Notes
- Keys under `entries` map to the skill name by default. If a skill defines
`metadata.clawdbot.skillKey`, use that key instead.
- Changes to skills are picked up on the next agent turn when the watcher is enabled.
### Sandboxed skills + env vars
When a session is **sandboxed**, skill processes run inside Docker. The sandbox
does **not** inherit the host `process.env`.
Use one of:
- `agents.defaults.sandbox.docker.env` (or per-agent `agents.list[].sandbox.docker.env`)
- bake the env into your custom sandbox image
Global `env` and `skills.entries.<skill>.env/apiKey` apply to **host** runs only.

View File

@@ -0,0 +1,267 @@
---
summary: "Skills: managed vs workspace, gating rules, and config/env wiring"
read_when:
- Adding or modifying skills
- Changing skill gating or load rules
---
# Skills (Moltbot)
Moltbot uses **[AgentSkills](https://agentskills.io)-compatible** skill folders to teach the agent how to use tools. Each skill is a directory containing a `SKILL.md` with YAML frontmatter and instructions. Moltbot loads **bundled skills** plus optional local overrides, and filters them at load time based on environment, config, and binary presence.
## Locations and precedence
Skills are loaded from **three** places:
1) **Bundled skills**: shipped with the install (npm package or Moltbot.app)
2) **Managed/local skills**: `~/.clawdbot/skills`
3) **Workspace skills**: `<workspace>/skills`
If a skill name conflicts, precedence is:
`<workspace>/skills` (highest) → `~/.clawdbot/skills` → bundled skills (lowest)
Additionally, you can configure extra skill folders (lowest precedence) via
`skills.load.extraDirs` in `~/.clawdbot/moltbot.json`.
## Per-agent vs shared skills
In **multi-agent** setups, each agent has its own workspace. That means:
- **Per-agent skills** live in `<workspace>/skills` for that agent only.
- **Shared skills** live in `~/.clawdbot/skills` (managed/local) and are visible
to **all agents** on the same machine.
- **Shared folders** can also be added via `skills.load.extraDirs` (lowest
precedence) if you want a common skills pack used by multiple agents.
If the same skill name exists in more than one place, the usual precedence
applies: workspace wins, then managed/local, then bundled.
## Plugins + skills
Plugins can ship their own skills by listing `skills` directories in
`moltbot.plugin.json` (paths relative to the plugin root). Plugin skills load
when the plugin is enabled and participate in the normal skill precedence rules.
You can gate them via `metadata.clawdbot.requires.config` on the plugins config
entry. See [Plugins](/plugin) for discovery/config and [Tools](/tools) for the
tool surface those skills teach.
## ClawdHub (install + sync)
ClawdHub is the public skills registry for Moltbot. Browse at
https://clawdhub.com. Use it to discover, install, update, and back up skills.
Full guide: [ClawdHub](/tools/clawdhub).
Common flows:
- Install a skill into your workspace:
- `clawdhub install <skill-slug>`
- Update all installed skills:
- `clawdhub update --all`
- Sync (scan + publish updates):
- `clawdhub sync --all`
By default, `clawdhub` installs into `./skills` under your current working
directory (or falls back to the configured Moltbot workspace). Moltbot picks
that up as `<workspace>/skills` on the next session.
## Security notes
- Treat third-party skills as **trusted code**. Read them before enabling.
- Prefer sandboxed runs for untrusted inputs and risky tools. See [Sandboxing](/gateway/sandboxing).
- `skills.entries.*.env` and `skills.entries.*.apiKey` inject secrets into the **host** process
for that agent turn (not the sandbox). Keep secrets out of prompts and logs.
- For a broader threat model and checklists, see [Security](/gateway/security).
## Format (AgentSkills + Pi-compatible)
`SKILL.md` must include at least:
```markdown
---
name: nano-banana-pro
description: Generate or edit images via Gemini 3 Pro Image
---
```
Notes:
- We follow the AgentSkills spec for layout/intent.
- The parser used by the embedded agent supports **single-line** frontmatter keys only.
- `metadata` should be a **single-line JSON object**.
- Use `{baseDir}` in instructions to reference the skill folder path.
- Optional frontmatter keys:
- `homepage` — URL surfaced as “Website” in the macOS Skills UI (also supported via `metadata.clawdbot.homepage`).
- `user-invocable``true|false` (default: `true`). When `true`, the skill is exposed as a user slash command.
- `disable-model-invocation``true|false` (default: `false`). When `true`, the skill is excluded from the model prompt (still available via user invocation).
- `command-dispatch``tool` (optional). When set to `tool`, the slash command bypasses the model and dispatches directly to a tool.
- `command-tool` — tool name to invoke when `command-dispatch: tool` is set.
- `command-arg-mode``raw` (default). For tool dispatch, forwards the raw args string to the tool (no core parsing).
The tool is invoked with params:
`{ command: "<raw args>", commandName: "<slash command>", skillName: "<skill name>" }`.
## Gating (load-time filters)
Moltbot **filters skills at load time** using `metadata` (single-line JSON):
```markdown
---
name: nano-banana-pro
description: Generate or edit images via Gemini 3 Pro Image
metadata: {"moltbot":{"requires":{"bins":["uv"],"env":["GEMINI_API_KEY"],"config":["browser.enabled"]},"primaryEnv":"GEMINI_API_KEY"}}
---
```
Fields under `metadata.clawdbot`:
- `always: true` — always include the skill (skip other gates).
- `emoji` — optional emoji used by the macOS Skills UI.
- `homepage` — optional URL shown as “Website” in the macOS Skills UI.
- `os` — optional list of platforms (`darwin`, `linux`, `win32`). If set, the skill is only eligible on those OSes.
- `requires.bins` — list; each must exist on `PATH`.
- `requires.anyBins` — list; at least one must exist on `PATH`.
- `requires.env` — list; env var must exist **or** be provided in config.
- `requires.config` — list of `moltbot.json` paths that must be truthy.
- `primaryEnv` — env var name associated with `skills.entries.<name>.apiKey`.
- `install` — optional array of installer specs used by the macOS Skills UI (brew/node/go/uv/download).
Note on sandboxing:
- `requires.bins` is checked on the **host** at skill load time.
- If an agent is sandboxed, the binary must also exist **inside the container**.
Install it via `agents.defaults.sandbox.docker.setupCommand` (or a custom image).
`setupCommand` runs once after the container is created.
Package installs also require network egress, a writable root FS, and a root user in the sandbox.
Example: the `summarize` skill (`skills/summarize/SKILL.md`) needs the `summarize` CLI
in the sandbox container to run there.
Installer example:
```markdown
---
name: gemini
description: Use Gemini CLI for coding assistance and Google search lookups.
metadata: {"moltbot":{"emoji":"♊️","requires":{"bins":["gemini"]},"install":[{"id":"brew","kind":"brew","formula":"gemini-cli","bins":["gemini"],"label":"Install Gemini CLI (brew)"}]}}
---
```
Notes:
- If multiple installers are listed, the gateway picks a **single** preferred option (brew when available, otherwise node).
- If all installers are `download`, Moltbot lists each entry so you can see the available artifacts.
- Installer specs can include `os: ["darwin"|"linux"|"win32"]` to filter options by platform.
- Node installs honor `skills.install.nodeManager` in `moltbot.json` (default: npm; options: npm/pnpm/yarn/bun).
This only affects **skill installs**; the Gateway runtime should still be Node
(Bun is not recommended for WhatsApp/Telegram).
- Go installs: if `go` is missing and `brew` is available, the gateway installs Go via Homebrew first and sets `GOBIN` to Homebrews `bin` when possible.
- Download installs: `url` (required), `archive` (`tar.gz` | `tar.bz2` | `zip`), `extract` (default: auto when archive detected), `stripComponents`, `targetDir` (default: `~/.clawdbot/tools/<skillKey>`).
If no `metadata.clawdbot` is present, the skill is always eligible (unless
disabled in config or blocked by `skills.allowBundled` for bundled skills).
## Config overrides (`~/.clawdbot/moltbot.json`)
Bundled/managed skills can be toggled and supplied with env values:
```json5
{
skills: {
entries: {
"nano-banana-pro": {
enabled: true,
apiKey: "GEMINI_KEY_HERE",
env: {
GEMINI_API_KEY: "GEMINI_KEY_HERE"
},
config: {
endpoint: "https://example.invalid",
model: "nano-pro"
}
},
peekaboo: { enabled: true },
sag: { enabled: false }
}
}
}
```
Note: if the skill name contains hyphens, quote the key (JSON5 allows quoted keys).
Config keys match the **skill name** by default. If a skill defines
`metadata.clawdbot.skillKey`, use that key under `skills.entries`.
Rules:
- `enabled: false` disables the skill even if its bundled/installed.
- `env`: injected **only if** the variable isnt already set in the process.
- `apiKey`: convenience for skills that declare `metadata.clawdbot.primaryEnv`.
- `config`: optional bag for custom per-skill fields; custom keys must live here.
- `allowBundled`: optional allowlist for **bundled** skills only. If set, only
bundled skills in the list are eligible (managed/workspace skills unaffected).
## Environment injection (per agent run)
When an agent run starts, Moltbot:
1) Reads skill metadata.
2) Applies any `skills.entries.<key>.env` or `skills.entries.<key>.apiKey` to
`process.env`.
3) Builds the system prompt with **eligible** skills.
4) Restores the original environment after the run ends.
This is **scoped to the agent run**, not a global shell environment.
## Session snapshot (performance)
Moltbot snapshots the eligible skills **when a session starts** and reuses that list for subsequent turns in the same session. Changes to skills or config take effect on the next new session.
Skills can also refresh mid-session when the skills watcher is enabled or when a new eligible remote node appears (see below). Think of this as a **hot reload**: the refreshed list is picked up on the next agent turn.
## Remote macOS nodes (Linux gateway)
If the Gateway is running on Linux but a **macOS node** is connected **with `system.run` allowed** (Exec approvals security not set to `deny`), Moltbot can treat macOS-only skills as eligible when the required binaries are present on that node. The agent should execute those skills via the `nodes` tool (typically `nodes.run`).
This relies on the node reporting its command support and on a bin probe via `system.run`. If the macOS node goes offline later, the skills remain visible; invocations may fail until the node reconnects.
## Skills watcher (auto-refresh)
By default, Moltbot watches skill folders and bumps the skills snapshot when `SKILL.md` files change. Configure this under `skills.load`:
```json5
{
skills: {
load: {
watch: true,
watchDebounceMs: 250
}
}
}
```
## Token impact (skills list)
When skills are eligible, Moltbot injects a compact XML list of available skills into the system prompt (via `formatSkillsForPrompt` in `pi-coding-agent`). The cost is deterministic:
- **Base overhead (only when ≥1 skill):** 195 characters.
- **Per skill:** 97 characters + the length of the XML-escaped `<name>`, `<description>`, and `<location>` values.
Formula (characters):
```
total = 195 + Σ (97 + len(name_escaped) + len(description_escaped) + len(location_escaped))
```
Notes:
- XML escaping expands `& < > " '` into entities (`&amp;`, `&lt;`, etc.), increasing length.
- Token counts vary by model tokenizer. A rough OpenAI-style estimate is ~4 chars/token, so **97 chars ≈ 24 tokens** per skill plus your actual field lengths.
## Managed skills lifecycle
Moltbot ships a baseline set of skills as **bundled skills** as part of the
install (npm package or Moltbot.app). `~/.clawdbot/skills` exists for local
overrides (for example, pinning/patching a skill without changing the bundled
copy). Workspace skills are user-owned and override both on name conflicts.
## Config reference
See [Skills config](/tools/skills-config) for the full configuration schema.
## Looking for more skills?
Browse https://clawdhub.com.
---

View File

@@ -0,0 +1,190 @@
---
summary: "Slash commands: text vs native, config, and supported commands"
read_when:
- Using or configuring chat commands
- Debugging command routing or permissions
---
# Slash commands
Commands are handled by the Gateway. Most commands must be sent as a **standalone** message that starts with `/`.
The host-only bash chat command uses `! <cmd>` (with `/bash <cmd>` as an alias).
There are two related systems:
- **Commands**: standalone `/...` messages.
- **Directives**: `/think`, `/verbose`, `/reasoning`, `/elevated`, `/exec`, `/model`, `/queue`.
- Directives are stripped from the message before the model sees it.
- In normal chat messages (not directive-only), they are treated as “inline hints” and do **not** persist session settings.
- In directive-only messages (the message contains only directives), they persist to the session and reply with an acknowledgement.
- Directives are only applied for **authorized senders** (channel allowlists/pairing plus `commands.useAccessGroups`).
Unauthorized senders see directives treated as plain text.
There are also a few **inline shortcuts** (allowlisted/authorized senders only): `/help`, `/commands`, `/status`, `/whoami` (`/id`).
They run immediately, are stripped before the model sees the message, and the remaining text continues through the normal flow.
## Config
```json5
{
commands: {
native: "auto",
nativeSkills: "auto",
text: true,
bash: false,
bashForegroundMs: 2000,
config: false,
debug: false,
restart: false,
useAccessGroups: true
}
}
```
- `commands.text` (default `true`) enables parsing `/...` in chat messages.
- On surfaces without native commands (WhatsApp/WebChat/Signal/iMessage/Google Chat/MS Teams), text commands still work even if you set this to `false`.
- `commands.native` (default `"auto"`) registers native commands.
- Auto: on for Discord/Telegram; off for Slack (until you add slash commands); ignored for providers without native support.
- Set `channels.discord.commands.native`, `channels.telegram.commands.native`, or `channels.slack.commands.native` to override per provider (bool or `"auto"`).
- `false` clears previously registered commands on Discord/Telegram at startup. Slack commands are managed in the Slack app and are not removed automatically.
- `commands.nativeSkills` (default `"auto"`) registers **skill** commands natively when supported.
- Auto: on for Discord/Telegram; off for Slack (Slack requires creating a slash command per skill).
- Set `channels.discord.commands.nativeSkills`, `channels.telegram.commands.nativeSkills`, or `channels.slack.commands.nativeSkills` to override per provider (bool or `"auto"`).
- `commands.bash` (default `false`) enables `! <cmd>` to run host shell commands (`/bash <cmd>` is an alias; requires `tools.elevated` allowlists).
- `commands.bashForegroundMs` (default `2000`) controls how long bash waits before switching to background mode (`0` backgrounds immediately).
- `commands.config` (default `false`) enables `/config` (reads/writes `moltbot.json`).
- `commands.debug` (default `false`) enables `/debug` (runtime-only overrides).
- `commands.useAccessGroups` (default `true`) enforces allowlists/policies for commands.
## Command list
Text + native (when enabled):
- `/help`
- `/commands`
- `/skill <name> [input]` (run a skill by name)
- `/status` (show current status; includes provider usage/quota for the current model provider when available)
- `/allowlist` (list/add/remove allowlist entries)
- `/approve <id> allow-once|allow-always|deny` (resolve exec approval prompts)
- `/context [list|detail|json]` (explain “context”; `detail` shows per-file + per-tool + per-skill + system prompt size)
- `/whoami` (show your sender id; alias: `/id`)
- `/subagents list|stop|log|info|send` (inspect, stop, log, or message sub-agent runs for the current session)
- `/config show|get|set|unset` (persist config to disk, owner-only; requires `commands.config: true`)
- `/debug show|set|unset|reset` (runtime overrides, owner-only; requires `commands.debug: true`)
- `/usage off|tokens|full|cost` (per-response usage footer or local cost summary)
- `/tts off|always|inbound|tagged|status|provider|limit|summary|audio` (control TTS; see [/tts](/tts))
- Discord: native command is `/voice` (Discord reserves `/tts`); text `/tts` still works.
- `/stop`
- `/restart`
- `/dock-telegram` (alias: `/dock_telegram`) (switch replies to Telegram)
- `/dock-discord` (alias: `/dock_discord`) (switch replies to Discord)
- `/dock-slack` (alias: `/dock_slack`) (switch replies to Slack)
- `/activation mention|always` (groups only)
- `/send on|off|inherit` (owner-only)
- `/reset` or `/new [model]` (optional model hint; remainder is passed through)
- `/think <off|minimal|low|medium|high|xhigh>` (dynamic choices by model/provider; aliases: `/thinking`, `/t`)
- `/verbose on|full|off` (alias: `/v`)
- `/reasoning on|off|stream` (alias: `/reason`; when on, sends a separate message prefixed `Reasoning:`; `stream` = Telegram draft only)
- `/elevated on|off|ask|full` (alias: `/elev`; `full` skips exec approvals)
- `/exec host=<sandbox|gateway|node> security=<deny|allowlist|full> ask=<off|on-miss|always> node=<id>` (send `/exec` to show current)
- `/model <name>` (alias: `/models`; or `/<alias>` from `agents.defaults.models.*.alias`)
- `/queue <mode>` (plus options like `debounce:2s cap:25 drop:summarize`; send `/queue` to see current settings)
- `/bash <command>` (host-only; alias for `! <command>`; requires `commands.bash: true` + `tools.elevated` allowlists)
Text-only:
- `/compact [instructions]` (see [/concepts/compaction](/concepts/compaction))
- `! <command>` (host-only; one at a time; use `!poll` + `!stop` for long-running jobs)
- `!poll` (check output / status; accepts optional `sessionId`; `/bash poll` also works)
- `!stop` (stop the running bash job; accepts optional `sessionId`; `/bash stop` also works)
Notes:
- Commands accept an optional `:` between the command and args (e.g. `/think: high`, `/send: on`, `/help:`).
- `/new <model>` accepts a model alias, `provider/model`, or a provider name (fuzzy match); if no match, the text is treated as the message body.
- For full provider usage breakdown, use `moltbot status --usage`.
- `/allowlist add|remove` requires `commands.config=true` and honors channel `configWrites`.
- `/usage` controls the per-response usage footer; `/usage cost` prints a local cost summary from Moltbot session logs.
- `/restart` is disabled by default; set `commands.restart: true` to enable it.
- `/verbose` is meant for debugging and extra visibility; keep it **off** in normal use.
- `/reasoning` (and `/verbose`) are risky in group settings: they may reveal internal reasoning or tool output you did not intend to expose. Prefer leaving them off, especially in group chats.
- **Fast path:** command-only messages from allowlisted senders are handled immediately (bypass queue + model).
- **Group mention gating:** command-only messages from allowlisted senders bypass mention requirements.
- **Inline shortcuts (allowlisted senders only):** certain commands also work when embedded in a normal message and are stripped before the model sees the remaining text.
- Example: `hey /status` triggers a status reply, and the remaining text continues through the normal flow.
- Currently: `/help`, `/commands`, `/status`, `/whoami` (`/id`).
- Unauthorized command-only messages are silently ignored, and inline `/...` tokens are treated as plain text.
- **Skill commands:** `user-invocable` skills are exposed as slash commands. Names are sanitized to `a-z0-9_` (max 32 chars); collisions get numeric suffixes (e.g. `_2`).
- `/skill <name> [input]` runs a skill by name (useful when native command limits prevent per-skill commands).
- By default, skill commands are forwarded to the model as a normal request.
- Skills may optionally declare `command-dispatch: tool` to route the command directly to a tool (deterministic, no model).
- Example: `/prose` (OpenProse plugin) — see [OpenProse](/prose).
- **Native command arguments:** Discord uses autocomplete for dynamic options (and button menus when you omit required args). Telegram and Slack show a button menu when a command supports choices and you omit the arg.
## Usage surfaces (what shows where)
- **Provider usage/quota** (example: “Claude 80% left”) shows up in `/status` for the current model provider when usage tracking is enabled.
- **Per-response tokens/cost** is controlled by `/usage off|tokens|full` (appended to normal replies).
- `/model status` is about **models/auth/endpoints**, not usage.
## Model selection (`/model`)
`/model` is implemented as a directive.
Examples:
```
/model
/model list
/model 3
/model openai/gpt-5.2
/model opus@anthropic:default
/model status
```
Notes:
- `/model` and `/model list` show a compact, numbered picker (model family + available providers).
- `/model <#>` selects from that picker (and prefers the current provider when possible).
- `/model status` shows the detailed view, including configured provider endpoint (`baseUrl`) and API mode (`api`) when available.
## Debug overrides
`/debug` lets you set **runtime-only** config overrides (memory, not disk). Owner-only. Disabled by default; enable with `commands.debug: true`.
Examples:
```
/debug show
/debug set messages.responsePrefix="[moltbot]"
/debug set channels.whatsapp.allowFrom=["+1555","+4477"]
/debug unset messages.responsePrefix
/debug reset
```
Notes:
- Overrides apply immediately to new config reads, but do **not** write to `moltbot.json`.
- Use `/debug reset` to clear all overrides and return to the on-disk config.
## Config updates
`/config` writes to your on-disk config (`moltbot.json`). Owner-only. Disabled by default; enable with `commands.config: true`.
Examples:
```
/config show
/config show messages.responsePrefix
/config get messages.responsePrefix
/config set messages.responsePrefix="[moltbot]"
/config unset messages.responsePrefix
```
Notes:
- Config is validated before write; invalid changes are rejected.
- `/config` updates persist across restarts.
## Surface notes
- **Text commands** run in the normal chat session (DMs share `main`, groups have their own session).
- **Native commands** use isolated sessions:
- Discord: `agent:<agentId>:discord:slash:<userId>`
- Slack: `agent:<agentId>:slack:slash:<userId>` (prefix configurable via `channels.slack.slashCommand.sessionPrefix`)
- Telegram: `telegram:slash:<userId>` (targets the chat session via `CommandTargetSessionKey`)
- **`/stop`** targets the active chat session so it can abort the current run.
- **Slack:** `channels.slack.slashCommand` is still supported for a single `/clawd`-style command. If you enable `commands.native`, you must create one Slack slash command per built-in command (same names as `/help`). Command argument menus for Slack are delivered as ephemeral Block Kit buttons.

View File

@@ -0,0 +1,137 @@
---
summary: "Sub-agents: spawning isolated agent runs that announce results back to the requester chat"
read_when:
- You want background/parallel work via the agent
- You are changing sessions_spawn or sub-agent tool policy
---
# Sub-agents
Sub-agents are background agent runs spawned from an existing agent run. They run in their own session (`agent:<agentId>:subagent:<uuid>`) and, when finished, **announce** their result back to the requester chat channel.
## Slash command
Use `/subagents` to inspect or control sub-agent runs for the **current session**:
- `/subagents list`
- `/subagents stop <id|#|all>`
- `/subagents log <id|#> [limit] [tools]`
- `/subagents info <id|#>`
- `/subagents send <id|#> <message>`
`/subagents info` shows run metadata (status, timestamps, session id, transcript path, cleanup).
Primary goals:
- Parallelize “research / long task / slow tool” work without blocking the main run.
- Keep sub-agents isolated by default (session separation + optional sandboxing).
- Keep the tool surface hard to misuse: sub-agents do **not** get session tools by default.
- Avoid nested fan-out: sub-agents cannot spawn sub-agents.
Cost note: each sub-agent has its **own** context and token usage. For heavy or repetitive
tasks, set a cheaper model for sub-agents and keep your main agent on a higher-quality model.
You can configure this via `agents.defaults.subagents.model` or per-agent overrides.
## Tool
Use `sessions_spawn`:
- Starts a sub-agent run (`deliver: false`, global lane: `subagent`)
- Then runs an announce step and posts the announce reply to the requester chat channel
- Default model: inherits the caller unless you set `agents.defaults.subagents.model` (or per-agent `agents.list[].subagents.model`); an explicit `sessions_spawn.model` still wins.
Tool params:
- `task` (required)
- `label?` (optional)
- `agentId?` (optional; spawn under another agent id if allowed)
- `model?` (optional; overrides the sub-agent model; invalid values are skipped and the sub-agent runs on the default model with a warning in the tool result)
- `thinking?` (optional; overrides thinking level for the sub-agent run)
- `runTimeoutSeconds?` (default `0`; when set, the sub-agent run is aborted after N seconds)
- `cleanup?` (`delete|keep`, default `keep`)
Allowlist:
- `agents.list[].subagents.allowAgents`: list of agent ids that can be targeted via `agentId` (`["*"]` to allow any). Default: only the requester agent.
Discovery:
- Use `agents_list` to see which agent ids are currently allowed for `sessions_spawn`.
Auto-archive:
- Sub-agent sessions are automatically archived after `agents.defaults.subagents.archiveAfterMinutes` (default: 60).
- Archive uses `sessions.delete` and renames the transcript to `*.deleted.<timestamp>` (same folder).
- `cleanup: "delete"` archives immediately after announce (still keeps the transcript via rename).
- Auto-archive is best-effort; pending timers are lost if the gateway restarts.
- `runTimeoutSeconds` does **not** auto-archive; it only stops the run. The session remains until auto-archive.
## Authentication
Sub-agent auth is resolved by **agent id**, not by session type:
- The sub-agent session key is `agent:<agentId>:subagent:<uuid>`.
- The auth store is loaded from that agents `agentDir`.
- The main agents auth profiles are merged in as a **fallback**; agent profiles override main profiles on conflicts.
Note: the merge is additive, so main profiles are always available as fallbacks. Fully isolated auth per agent is not supported yet.
## Announce
Sub-agents report back via an announce step:
- The announce step runs inside the sub-agent session (not the requester session).
- If the sub-agent replies exactly `ANNOUNCE_SKIP`, nothing is posted.
- Otherwise the announce reply is posted to the requester chat channel via a follow-up `agent` call (`deliver=true`).
- Announce replies preserve thread/topic routing when available (Slack threads, Telegram topics, Matrix threads).
- Announce messages are normalized to a stable template:
- `Status:` derived from the run outcome (`success`, `error`, `timeout`, or `unknown`).
- `Result:` the summary content from the announce step (or `(not available)` if missing).
- `Notes:` error details and other useful context.
- `Status` is not inferred from model output; it comes from runtime outcome signals.
Announce payloads include a stats line at the end (even when wrapped):
- Runtime (e.g., `runtime 5m12s`)
- Token usage (input/output/total)
- Estimated cost when model pricing is configured (`models.providers.*.models[].cost`)
- `sessionKey`, `sessionId`, and transcript path (so the main agent can fetch history via `sessions_history` or inspect the file on disk)
## Tool Policy (sub-agent tools)
By default, sub-agents get **all tools except session tools**:
- `sessions_list`
- `sessions_history`
- `sessions_send`
- `sessions_spawn`
Override via config:
```json5
{
agents: {
defaults: {
subagents: {
maxConcurrent: 1
}
}
},
tools: {
subagents: {
tools: {
// deny wins
deny: ["gateway", "cron"],
// if allow is set, it becomes allow-only (deny still wins)
// allow: ["read", "exec", "process"]
}
}
}
}
```
## Concurrency
Sub-agents use a dedicated in-process queue lane:
- Lane name: `subagent`
- Concurrency: `agents.defaults.subagents.maxConcurrent` (default `8`)
## Stopping
- Sending `/stop` in the requester chat aborts the requester session and stops any active sub-agent runs spawned from it.
## Limitations
- Sub-agent announce is **best-effort**. If the gateway restarts, pending “announce back” work is lost.
- Sub-agents still share the same gateway process resources; treat `maxConcurrent` as a safety valve.
- `sessions_spawn` is always non-blocking: it returns `{ status: "accepted", runId, childSessionKey }` immediately.
- Sub-agent context only injects `AGENTS.md` + `TOOLS.md` (no `SOUL.md`, `IDENTITY.md`, `USER.md`, `HEARTBEAT.md`, or `BOOTSTRAP.md`).

View File

@@ -0,0 +1,62 @@
---
summary: "Directive syntax for /think + /verbose and how they affect model reasoning"
read_when:
- Adjusting thinking or verbose directive parsing or defaults
---
# Thinking Levels (/think directives)
## What it does
- Inline directive in any inbound body: `/t <level>`, `/think:<level>`, or `/thinking <level>`.
- Levels (aliases): `off | minimal | low | medium | high | xhigh` (GPT-5.2 + Codex models only)
- minimal → “think”
- low → “think hard”
- medium → “think harder”
- high → “ultrathink” (max budget)
- xhigh → “ultrathink+” (GPT-5.2 + Codex models only)
- `highest`, `max` map to `high`.
- Provider notes:
- Z.AI (`zai/*`) only supports binary thinking (`on`/`off`). Any non-`off` level is treated as `on` (mapped to `low`).
## Resolution order
1. Inline directive on the message (applies only to that message).
2. Session override (set by sending a directive-only message).
3. Global default (`agents.defaults.thinkingDefault` in config).
4. Fallback: low for reasoning-capable models; off otherwise.
## Setting a session default
- Send a message that is **only** the directive (whitespace allowed), e.g. `/think:medium` or `/t high`.
- That sticks for the current session (per-sender by default); cleared by `/think:off` or session idle reset.
- Confirmation reply is sent (`Thinking level set to high.` / `Thinking disabled.`). If the level is invalid (e.g. `/thinking big`), the command is rejected with a hint and the session state is left unchanged.
- Send `/think` (or `/think:`) with no argument to see the current thinking level.
## Application by agent
- **Embedded Pi**: the resolved level is passed to the in-process Pi agent runtime.
## Verbose directives (/verbose or /v)
- Levels: `on` (minimal) | `full` | `off` (default).
- Directive-only message toggles session verbose and replies `Verbose logging enabled.` / `Verbose logging disabled.`; invalid levels return a hint without changing state.
- `/verbose off` stores an explicit session override; clear it via the Sessions UI by choosing `inherit`.
- Inline directive affects only that message; session/global defaults apply otherwise.
- Send `/verbose` (or `/verbose:`) with no argument to see the current verbose level.
- When verbose is on, agents that emit structured tool results (Pi, other JSON agents) send each tool call back as its own metadata-only message, prefixed with `<emoji> <tool-name>: <arg>` when available (path/command). These tool summaries are sent as soon as each tool starts (separate bubbles), not as streaming deltas.
- When verbose is `full`, tool outputs are also forwarded after completion (separate bubble, truncated to a safe length). If you toggle `/verbose on|full|off` while a run is in-flight, subsequent tool bubbles honor the new setting.
## Reasoning visibility (/reasoning)
- Levels: `on|off|stream`.
- Directive-only message toggles whether thinking blocks are shown in replies.
- When enabled, reasoning is sent as a **separate message** prefixed with `Reasoning:`.
- `stream` (Telegram only): streams reasoning into the Telegram draft bubble while the reply is generating, then sends the final answer without reasoning.
- Alias: `/reason`.
- Send `/reasoning` (or `/reasoning:`) with no argument to see the current reasoning level.
## Related
- Elevated mode docs live in [Elevated mode](/tools/elevated).
## Heartbeats
- Heartbeat probe body is the configured heartbeat prompt (default: `Read HEARTBEAT.md if it exists (workspace context). Follow it strictly. Do not infer or repeat old tasks from prior chats. If nothing needs attention, reply HEARTBEAT_OK.`). Inline directives in a heartbeat message apply as usual (but avoid changing session defaults from heartbeats).
- Heartbeat delivery defaults to the final payload only. To also send the separate `Reasoning:` message (when available), set `agents.defaults.heartbeat.includeReasoning: true` or per-agent `agents.list[].heartbeat.includeReasoning: true`.
## Web chat UI
- The web chat thinking selector mirrors the session's stored level from the inbound session store/config when the page loads.
- Picking another level applies only to the next message (`thinkingOnce`); after sending, the selector snaps back to the stored session level.
- To change the session default, send a `/think:<level>` directive (as before); the selector will reflect it after the next reload.

View File

@@ -0,0 +1,257 @@
---
summary: "Web search + fetch tools (Brave Search API, Perplexity direct/OpenRouter)"
read_when:
- You want to enable web_search or web_fetch
- You need Brave Search API key setup
- You want to use Perplexity Sonar for web search
---
# Web tools
Moltbot ships two lightweight web tools:
- `web_search` — Search the web via Brave Search API (default) or Perplexity Sonar (direct or via OpenRouter).
- `web_fetch` — HTTP fetch + readable extraction (HTML → markdown/text).
These are **not** browser automation. For JS-heavy sites or logins, use the
[Browser tool](/tools/browser).
## How it works
- `web_search` calls your configured provider and returns results.
- **Brave** (default): returns structured results (title, URL, snippet).
- **Perplexity**: returns AI-synthesized answers with citations from real-time web search.
- Results are cached by query for 15 minutes (configurable).
- `web_fetch` does a plain HTTP GET and extracts readable content
(HTML → markdown/text). It does **not** execute JavaScript.
- `web_fetch` is enabled by default (unless explicitly disabled).
## Choosing a search provider
| Provider | Pros | Cons | API Key |
|----------|------|------|---------|
| **Brave** (default) | Fast, structured results, free tier | Traditional search results | `BRAVE_API_KEY` |
| **Perplexity** | AI-synthesized answers, citations, real-time | Requires Perplexity or OpenRouter access | `OPENROUTER_API_KEY` or `PERPLEXITY_API_KEY` |
See [Brave Search setup](/brave-search) and [Perplexity Sonar](/perplexity) for provider-specific details.
Set the provider in config:
```json5
{
tools: {
web: {
search: {
provider: "brave" // or "perplexity"
}
}
}
}
```
Example: switch to Perplexity Sonar (direct API):
```json5
{
tools: {
web: {
search: {
provider: "perplexity",
perplexity: {
apiKey: "pplx-...",
baseUrl: "https://api.perplexity.ai",
model: "perplexity/sonar-pro"
}
}
}
}
}
```
## Getting a Brave API key
1) Create a Brave Search API account at https://brave.com/search/api/
2) In the dashboard, choose the **Data for Search** plan (not “Data for AI”) and generate an API key.
3) Run `moltbot configure --section web` to store the key in config (recommended), or set `BRAVE_API_KEY` in your environment.
Brave provides a free tier plus paid plans; check the Brave API portal for the
current limits and pricing.
### Where to set the key (recommended)
**Recommended:** run `moltbot configure --section web`. It stores the key in
`~/.clawdbot/moltbot.json` under `tools.web.search.apiKey`.
**Environment alternative:** set `BRAVE_API_KEY` in the Gateway process
environment. For a gateway install, put it in `~/.clawdbot/.env` (or your
service environment). See [Env vars](/help/faq#how-does-moltbot-load-environment-variables).
## Using Perplexity (direct or via OpenRouter)
Perplexity Sonar models have built-in web search capabilities and return AI-synthesized
answers with citations. You can use them via OpenRouter (no credit card required - supports
crypto/prepaid).
### Getting an OpenRouter API key
1) Create an account at https://openrouter.ai/
2) Add credits (supports crypto, prepaid, or credit card)
3) Generate an API key in your account settings
### Setting up Perplexity search
```json5
{
tools: {
web: {
search: {
enabled: true,
provider: "perplexity",
perplexity: {
// API key (optional if OPENROUTER_API_KEY or PERPLEXITY_API_KEY is set)
apiKey: "sk-or-v1-...",
// Base URL (key-aware default if omitted)
baseUrl: "https://openrouter.ai/api/v1",
// Model (defaults to perplexity/sonar-pro)
model: "perplexity/sonar-pro"
}
}
}
}
}
```
**Environment alternative:** set `OPENROUTER_API_KEY` or `PERPLEXITY_API_KEY` in the Gateway
environment. For a gateway install, put it in `~/.clawdbot/.env`.
If no base URL is set, Moltbot chooses a default based on the API key source:
- `PERPLEXITY_API_KEY` or `pplx-...``https://api.perplexity.ai`
- `OPENROUTER_API_KEY` or `sk-or-...``https://openrouter.ai/api/v1`
- Unknown key formats → OpenRouter (safe fallback)
### Available Perplexity models
| Model | Description | Best for |
|-------|-------------|----------|
| `perplexity/sonar` | Fast Q&A with web search | Quick lookups |
| `perplexity/sonar-pro` (default) | Multi-step reasoning with web search | Complex questions |
| `perplexity/sonar-reasoning-pro` | Chain-of-thought analysis | Deep research |
## web_search
Search the web using your configured provider.
### Requirements
- `tools.web.search.enabled` must not be `false` (default: enabled)
- API key for your chosen provider:
- **Brave**: `BRAVE_API_KEY` or `tools.web.search.apiKey`
- **Perplexity**: `OPENROUTER_API_KEY`, `PERPLEXITY_API_KEY`, or `tools.web.search.perplexity.apiKey`
### Config
```json5
{
tools: {
web: {
search: {
enabled: true,
apiKey: "BRAVE_API_KEY_HERE", // optional if BRAVE_API_KEY is set
maxResults: 5,
timeoutSeconds: 30,
cacheTtlMinutes: 15
}
}
}
}
```
### Tool parameters
- `query` (required)
- `count` (110; default from config)
- `country` (optional): 2-letter country code for region-specific results (e.g., "DE", "US", "ALL"). If omitted, Brave chooses its default region.
- `search_lang` (optional): ISO language code for search results (e.g., "de", "en", "fr")
- `ui_lang` (optional): ISO language code for UI elements
- `freshness` (optional, Brave only): filter by discovery time (`pd`, `pw`, `pm`, `py`, or `YYYY-MM-DDtoYYYY-MM-DD`)
**Examples:**
```javascript
// German-specific search
await web_search({
query: "TV online schauen",
count: 10,
country: "DE",
search_lang: "de"
});
// French search with French UI
await web_search({
query: "actualités",
country: "FR",
search_lang: "fr",
ui_lang: "fr"
});
// Recent results (past week)
await web_search({
query: "TMBG interview",
freshness: "pw"
});
```
## web_fetch
Fetch a URL and extract readable content.
### Requirements
- `tools.web.fetch.enabled` must not be `false` (default: enabled)
- Optional Firecrawl fallback: set `tools.web.fetch.firecrawl.apiKey` or `FIRECRAWL_API_KEY`.
### Config
```json5
{
tools: {
web: {
fetch: {
enabled: true,
maxChars: 50000,
timeoutSeconds: 30,
cacheTtlMinutes: 15,
maxRedirects: 3,
userAgent: "Mozilla/5.0 (Macintosh; Intel Mac OS X 14_7_2) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/122.0.0.0 Safari/537.36",
readability: true,
firecrawl: {
enabled: true,
apiKey: "FIRECRAWL_API_KEY_HERE", // optional if FIRECRAWL_API_KEY is set
baseUrl: "https://api.firecrawl.dev",
onlyMainContent: true,
maxAgeMs: 86400000, // ms (1 day)
timeoutSeconds: 60
}
}
}
}
}
```
### Tool parameters
- `url` (required, http/https only)
- `extractMode` (`markdown` | `text`)
- `maxChars` (truncate long pages)
Notes:
- `web_fetch` uses Readability (main-content extraction) first, then Firecrawl (if configured). If both fail, the tool returns an error.
- Firecrawl requests use bot-circumvention mode and cache results by default.
- `web_fetch` sends a Chrome-like User-Agent and `Accept-Language` by default; override `userAgent` if needed.
- `web_fetch` blocks private/internal hostnames and re-checks redirects (limit with `maxRedirects`).
- `web_fetch` is best-effort extraction; some sites will need the browser tool.
- See [Firecrawl](/tools/firecrawl) for key setup and service details.
- Responses are cached (default 15 minutes) to reduce repeated fetches.
- If you use tool profiles/allowlists, add `web_search`/`web_fetch` or `group:web`.
- If the Brave key is missing, `web_search` returns a short setup hint with a docs link.