Add ez-assistant and kerberos service folders

This commit is contained in:
kelin
2026-02-11 14:56:03 -05:00
parent e4e8ae1b87
commit 9ccfb36923
4471 changed files with 746463 additions and 0 deletions


@@ -0,0 +1,126 @@
---
summary: "Use Anthropic Claude via API keys or setup-token in Moltbot"
read_when:
- You want to use Anthropic models in Moltbot
- You want setup-token instead of API keys
---
# Anthropic (Claude)
Anthropic builds the **Claude** model family and provides access via an API.
In Moltbot you can authenticate with an API key or a **setup-token**.
## Option A: Anthropic API key
**Best for:** standard API access and usage-based billing.
Create your API key in the Anthropic Console.
### CLI setup
```bash
moltbot onboard
# choose: Anthropic API key
# or non-interactive
moltbot onboard --anthropic-api-key "$ANTHROPIC_API_KEY"
```
### Config snippet
```json5
{
env: { ANTHROPIC_API_KEY: "sk-ant-..." },
agents: { defaults: { model: { primary: "anthropic/claude-opus-4-5" } } }
}
```
## Prompt caching (Anthropic API)
Moltbot does **not** override Anthropic's default cache TTL unless you set it.
This is **API-only**; subscription auth does not honor TTL settings.
To set the TTL per model, use `cacheControlTtl` in the model `params`:
```json5
{
agents: {
defaults: {
models: {
"anthropic/claude-opus-4-5": {
params: { cacheControlTtl: "5m" } // or "1h"
}
}
}
}
}
```
Moltbot includes the `extended-cache-ttl-2025-04-11` beta flag for Anthropic API
requests; keep it if you override provider headers (see [/gateway/configuration](/gateway/configuration)).
## Option B: Claude setup-token
**Best for:** using your Claude subscription.
### Where to get a setup-token
Setup-tokens are created by the **Claude Code CLI**, not the Anthropic Console. You can run this on **any machine**:
```bash
claude setup-token
```
Paste the token into Moltbot (wizard: **Anthropic token (paste setup-token)**), or run it on the gateway host:
```bash
moltbot models auth setup-token --provider anthropic
```
If you generated the token on a different machine, paste it:
```bash
moltbot models auth paste-token --provider anthropic
```
### CLI setup
```bash
# Paste a setup-token during onboarding
moltbot onboard --auth-choice setup-token
```
### Config snippet
```json5
{
agents: { defaults: { model: { primary: "anthropic/claude-opus-4-5" } } }
}
```
## Notes
- Generate the setup-token with `claude setup-token` and paste it, or run `moltbot models auth setup-token` on the gateway host.
- If you see “OAuth token refresh failed …” on a Claude subscription, re-auth with a setup-token. See [/gateway/troubleshooting#oauth-token-refresh-failed-anthropic-claude-subscription](/gateway/troubleshooting#oauth-token-refresh-failed-anthropic-claude-subscription).
- Auth details + reuse rules are in [/concepts/oauth](/concepts/oauth).
## Troubleshooting
**401 errors / token suddenly invalid**
- Claude subscription auth can expire or be revoked. Re-run `claude setup-token`
and paste it into the **gateway host**.
- If the Claude CLI login lives on a different machine, use
`moltbot models auth paste-token --provider anthropic` on the gateway host.
**No API key found for provider "anthropic"**
- Auth is **per agent**. New agents don't inherit the main agent's keys.
- Re-run onboarding for that agent, or paste a setup-token / API key on the
gateway host, then verify with `moltbot models status`.
**No credentials found for profile `anthropic:default`**
- Run `moltbot models status` to see which auth profile is active.
- Re-run onboarding, or paste a setup-token / API key for that profile.
**No available auth profile (all in cooldown/unavailable)**
- Check `moltbot models status --json` for `auth.unusableProfiles` (example below).
- Add another Anthropic profile or wait for cooldown.
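For example (assumes `jq` is installed):
```bash
# Inspect which auth profiles are currently unusable
moltbot models status --json | jq '.auth.unusableProfiles'
```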
More: [/gateway/troubleshooting](/gateway/troubleshooting) and [/help/faq](/help/faq).


@@ -0,0 +1,145 @@
---
summary: "Use Claude Max/Pro subscription as an OpenAI-compatible API endpoint"
read_when:
- You want to use Claude Max subscription with OpenAI-compatible tools
- You want a local API server that wraps Claude Code CLI
- You want to save money by using subscription instead of API keys
---
# Claude Max API Proxy
**claude-max-api-proxy** is a community tool that exposes your Claude Max/Pro subscription as an OpenAI-compatible API endpoint. This allows you to use your subscription with any tool that supports the OpenAI API format.
## Why Use This?
| Approach | Cost | Best For |
|----------|------|----------|
| Anthropic API | Pay per token (~$15/M input, $75/M output for Opus) | Production apps, high volume |
| Claude Max subscription | $200/month flat | Personal use, development, unlimited usage |
If you have a Claude Max subscription and want to use it with OpenAI-compatible tools, this proxy can save you significant money.
## How It Works
```
Your App → claude-max-api-proxy → Claude Code CLI → Anthropic (via subscription)
(OpenAI format) (converts format) (uses your login)
```
The proxy:
1. Accepts OpenAI-format requests at `http://localhost:3456/v1/chat/completions`
2. Converts them to Claude Code CLI commands
3. Returns responses in OpenAI format (streaming supported)
## Installation
```bash
# Requires Node.js 20+ and Claude Code CLI
npm install -g claude-max-api-proxy
# Verify Claude CLI is authenticated
claude --version
```
## Usage
### Start the server
```bash
claude-max-api
# Server runs at http://localhost:3456
```
### Test it
```bash
# Health check
curl http://localhost:3456/health
# List models
curl http://localhost:3456/v1/models
# Chat completion
curl http://localhost:3456/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
"model": "claude-opus-4",
"messages": [{"role": "user", "content": "Hello!"}]
}'
```
### With Moltbot
You can point Moltbot at the proxy as a custom OpenAI-compatible endpoint:
```json5
{
env: {
OPENAI_API_KEY: "not-needed",
OPENAI_BASE_URL: "http://localhost:3456/v1"
},
agents: {
defaults: {
model: { primary: "openai/claude-opus-4" }
}
}
}
```
## Available Models
| Model ID | Maps To |
|----------|---------|
| `claude-opus-4` | Claude Opus 4 |
| `claude-sonnet-4` | Claude Sonnet 4 |
| `claude-haiku-4` | Claude Haiku 4 |
## Auto-Start on macOS
Create a LaunchAgent to run the proxy automatically:
```bash
cat > ~/Library/LaunchAgents/com.claude-max-api.plist << 'EOF'
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
<key>Label</key>
<string>com.claude-max-api</string>
<key>RunAtLoad</key>
<true/>
<key>KeepAlive</key>
<true/>
<key>ProgramArguments</key>
<array>
<string>/usr/local/bin/node</string>
<string>/usr/local/lib/node_modules/claude-max-api-proxy/dist/server/standalone.js</string>
</array>
<key>EnvironmentVariables</key>
<dict>
<key>PATH</key>
<string>/usr/local/bin:/opt/homebrew/bin:~/.local/bin:/usr/bin:/bin</string><!-- launchd does not expand ~; replace ~/.local/bin with an absolute path -->
</dict>
</dict>
</plist>
EOF
launchctl bootstrap gui/$(id -u) ~/Library/LaunchAgents/com.claude-max-api.plist
```
## Links
- **npm:** https://www.npmjs.com/package/claude-max-api-proxy
- **GitHub:** https://github.com/atalovesyou/claude-max-api-proxy
- **Issues:** https://github.com/atalovesyou/claude-max-api-proxy/issues
## Notes
- This is a **community tool**, not officially supported by Anthropic or Moltbot
- Requires an active Claude Max/Pro subscription with Claude Code CLI authenticated
- The proxy runs locally and does not send data to any third-party servers
- Streaming responses are fully supported (see the example below)
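Since the endpoint speaks the OpenAI chat-completions format, streaming uses the standard `"stream": true` flag (a sketch; assumes the proxy emits OpenAI-style SSE chunks as its docs state):
```bash
# Streaming chat completion via the proxy (-N disables curl buffering)
curl -N http://localhost:3456/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-opus-4",
    "messages": [{"role": "user", "content": "Write a haiku about proxies."}],
    "stream": true
  }'
```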
## See Also
- [Anthropic provider](/providers/anthropic) - Native Moltbot integration with Claude setup-token or API keys
- [OpenAI provider](/providers/openai) - For OpenAI/Codex subscriptions


@@ -0,0 +1,89 @@
---
summary: "Deepgram transcription for inbound voice notes"
read_when:
- You want Deepgram speech-to-text for audio attachments
- You need a quick Deepgram config example
---
# Deepgram (Audio Transcription)
Deepgram is a speech-to-text API. In Moltbot it is used for **inbound audio/voice note
transcription** via `tools.media.audio`.
When enabled, Moltbot uploads the audio file to Deepgram and injects the transcript
into the reply pipeline (`{{Transcript}}` + `[Audio]` block). This is **not streaming**;
it uses the pre-recorded transcription endpoint.
Website: https://deepgram.com
Docs: https://developers.deepgram.com
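For reference, Moltbot's upload is roughly equivalent to this direct call to Deepgram's pre-recorded endpoint (a sketch; the exact parameters Moltbot sends are internal, and the file name here is hypothetical):
```bash
# Pre-recorded transcription (not streaming); Token auth per Deepgram's docs
curl -X POST "https://api.deepgram.com/v1/listen?model=nova-3&smart_format=true" \
  -H "Authorization: Token $DEEPGRAM_API_KEY" \
  -H "Content-Type: audio/ogg" \
  --data-binary @voice-note.ogg
```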
## Quick start
1) Set your API key:
```
DEEPGRAM_API_KEY=dg_...
```
2) Enable the provider:
```json5
{
tools: {
media: {
audio: {
enabled: true,
models: [{ provider: "deepgram", model: "nova-3" }]
}
}
}
}
```
## Options
- `model`: Deepgram model id (default: `nova-3`)
- `language`: language hint (optional)
- `tools.media.audio.providerOptions.deepgram.detect_language`: enable language detection (optional)
- `tools.media.audio.providerOptions.deepgram.punctuate`: enable punctuation (optional)
- `tools.media.audio.providerOptions.deepgram.smart_format`: enable smart formatting (optional)
Example with language:
```json5
{
tools: {
media: {
audio: {
enabled: true,
models: [
{ provider: "deepgram", model: "nova-3", language: "en" }
]
}
}
}
}
```
Example with Deepgram options:
```json5
{
tools: {
media: {
audio: {
enabled: true,
providerOptions: {
deepgram: {
detect_language: true,
punctuate: true,
smart_format: true
}
},
models: [{ provider: "deepgram", model: "nova-3" }]
}
}
}
}
```
## Notes
- Authentication follows the standard provider auth order; `DEEPGRAM_API_KEY` is the simplest path.
- Override endpoints or headers with `tools.media.audio.baseUrl` and `tools.media.audio.headers` when using a proxy.
- Output follows the same audio rules as other providers (size caps, timeouts, transcript injection).


@@ -0,0 +1,70 @@
---
summary: "Sign in to GitHub Copilot from Moltbot using the device flow"
read_when:
- You want to use GitHub Copilot as a model provider
- You need the `moltbot models auth login-github-copilot` flow
---
# GitHub Copilot
## What is GitHub Copilot?
GitHub Copilot is GitHub's AI coding assistant. It provides access to Copilot
models for your GitHub account and plan. Moltbot can use Copilot as a model
provider in two different ways.
## Two ways to use Copilot in Moltbot
### 1) Built-in GitHub Copilot provider (`github-copilot`)
Use the native device-login flow to obtain a GitHub token, then exchange it for
Copilot API tokens when Moltbot runs. This is the **default** and simplest path
because it does not require VS Code.
### 2) Copilot Proxy plugin (`copilot-proxy`)
Use the **Copilot Proxy** VS Code extension as a local bridge. Moltbot talks to
the proxy's `/v1` endpoint and uses the model list you configure there. Choose
this when you already run Copilot Proxy in VS Code or need to route through it.
You must enable the plugin and keep the VS Code extension running.
The rest of this page covers the built-in provider (`github-copilot`). The login
command runs the GitHub device flow, saves an auth profile, and updates your config
to use that profile.
## CLI setup
```bash
moltbot models auth login-github-copilot
```
You'll be prompted to visit a URL and enter a one-time code. Keep the terminal
open until it completes.
### Optional flags
```bash
moltbot models auth login-github-copilot --profile-id github-copilot:work
moltbot models auth login-github-copilot --yes
```
## Set a default model
```bash
moltbot models set github-copilot/gpt-4o
```
### Config snippet
```json5
{
agents: { defaults: { model: { primary: "github-copilot/gpt-4o" } } }
}
```
## Notes
- Requires an interactive TTY; run it directly in a terminal.
- Copilot model availability depends on your plan; if a model is rejected, try
  another ID (for example `github-copilot/gpt-4.1`); the check below shows what resolved.
- The login stores a GitHub token in the auth profile store and exchanges it for a
Copilot API token when Moltbot runs.
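To verify the login and see which models resolved for your plan:
```bash
moltbot models status                        # active auth profiles
moltbot models list | grep github-copilot    # Copilot models your plan exposes
```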


@@ -0,0 +1,31 @@
---
summary: "GLM model family overview + how to use it in Moltbot"
read_when:
- You want GLM models in Moltbot
- You need the model naming convention and setup
---
# GLM models
GLM is a **model family** (not a company) available through the Z.AI platform. In Moltbot, GLM
models are accessed via the `zai` provider and model IDs like `zai/glm-4.7`.
## CLI setup
```bash
moltbot onboard --auth-choice zai-api-key
```
## Config snippet
```json5
{
env: { ZAI_API_KEY: "sk-..." },
agents: { defaults: { model: { primary: "zai/glm-4.7" } } }
}
```
## Notes
- GLM versions and availability can change; check Z.AI's docs for the latest.
- Example model IDs include `glm-4.7` and `glm-4.6`.
- For provider details, see [/providers/zai](/providers/zai).


@@ -0,0 +1,59 @@
---
summary: "Model providers (LLMs) supported by Moltbot"
read_when:
- You want to choose a model provider
- You need a quick overview of supported LLM backends
---
# Model Providers
Moltbot can use many LLM providers. Pick a provider, authenticate, then set the
default model as `provider/model`.
Looking for chat channel docs (WhatsApp, Telegram, Discord, Slack, the Mattermost plugin, etc.)? See [Channels](/channels).
## Highlight: Venius (Venice AI)
Venius is our recommended Venice AI setup for privacy-first inference with an option to use Opus for hard tasks.
- Default: `venice/llama-3.3-70b`
- Best overall: `venice/claude-opus-45` (Opus remains the strongest)
See [Venice AI](/providers/venice).
## Quick start
1) Authenticate with the provider (usually via `moltbot onboard`).
2) Set the default model:
```json5
{
agents: { defaults: { model: { primary: "anthropic/claude-opus-4-5" } } }
}
```
## Provider docs
- [OpenAI (API + Codex)](/providers/openai)
- [Anthropic (API + Claude Code CLI)](/providers/anthropic)
- [Qwen (OAuth)](/providers/qwen)
- [OpenRouter](/providers/openrouter)
- [Vercel AI Gateway](/providers/vercel-ai-gateway)
- [Moonshot AI (Kimi + Kimi Code)](/providers/moonshot)
- [OpenCode Zen](/providers/opencode)
- [Amazon Bedrock](/bedrock)
- [Z.AI](/providers/zai)
- [GLM models](/providers/glm)
- [MiniMax](/providers/minimax)
- [Venius (Venice AI, privacy-focused)](/providers/venice)
- [Ollama (local models)](/providers/ollama)
## Transcription providers
- [Deepgram (audio transcription)](/providers/deepgram)
## Community tools
- [Claude Max API Proxy](/providers/claude-max-api-proxy) - Use Claude Max/Pro subscription as an OpenAI-compatible API endpoint
For the full provider catalog (xAI, Groq, Mistral, etc.) and advanced configuration,
see [Model providers](/concepts/model-providers).


@@ -0,0 +1,183 @@
---
summary: "Use MiniMax M2.1 in Moltbot"
read_when:
- You want MiniMax models in Moltbot
- You need MiniMax setup guidance
---
# MiniMax
MiniMax is an AI company that builds the **M2/M2.1** model family. The current
coding-focused release is **MiniMax M2.1** (December 23, 2025), built for
real-world complex tasks.
Source: [MiniMax M2.1 release note](https://www.minimax.io/news/minimax-m21)
## Model overview (M2.1)
MiniMax highlights these improvements in M2.1:
- Stronger **multi-language coding** (Rust, Java, Go, C++, Kotlin, Objective-C, TS/JS).
- Better **web/app development** and aesthetic output quality (including native mobile).
- Improved **composite instruction** handling for office-style workflows, building on
interleaved thinking and integrated constraint execution.
- **More concise responses** with lower token usage and faster iteration loops.
- Stronger **tool/agent framework** compatibility and context management (Claude Code,
Droid/Factory AI, Cline, Kilo Code, Roo Code, BlackBox).
- Higher-quality **dialogue and technical writing** outputs.
## MiniMax M2.1 vs MiniMax M2.1 Lightning
- **Speed:** Lightning is the “fast” variant in MiniMax's pricing docs.
- **Cost:** Pricing shows the same input cost, but Lightning has higher output cost.
- **Coding plan routing:** The Lightning back-end isn't directly available on the MiniMax
coding plan. MiniMax auto-routes most requests to Lightning, but falls back to the
regular M2.1 back-end during traffic spikes.
## Choose a setup
### MiniMax M2.1 — recommended
**Best for:** hosted MiniMax with Anthropic-compatible API.
Configure via CLI:
- Run `moltbot configure`
- Select **Model/auth**
- Choose **MiniMax M2.1**
```json5
{
env: { MINIMAX_API_KEY: "sk-..." },
agents: { defaults: { model: { primary: "minimax/MiniMax-M2.1" } } },
models: {
mode: "merge",
providers: {
minimax: {
baseUrl: "https://api.minimax.io/anthropic",
apiKey: "${MINIMAX_API_KEY}",
api: "anthropic-messages",
models: [
{
id: "MiniMax-M2.1",
name: "MiniMax M2.1",
reasoning: false,
input: ["text"],
cost: { input: 15, output: 60, cacheRead: 2, cacheWrite: 10 },
contextWindow: 200000,
maxTokens: 8192
}
]
}
}
}
}
```
### MiniMax M2.1 as fallback (Opus primary)
**Best for:** keep Opus 4.5 as primary, fail over to MiniMax M2.1.
```json5
{
env: { MINIMAX_API_KEY: "sk-..." },
agents: {
defaults: {
models: {
"anthropic/claude-opus-4-5": { alias: "opus" },
"minimax/MiniMax-M2.1": { alias: "minimax" }
},
model: {
primary: "anthropic/claude-opus-4-5",
fallbacks: ["minimax/MiniMax-M2.1"]
}
}
}
}
```
### Optional: Local via LM Studio (manual)
**Best for:** local inference with LM Studio.
We have seen strong results with MiniMax M2.1 on powerful hardware (e.g. a
desktop/server) using LM Studio's local server.
Configure manually via `moltbot.json`:
```json5
{
agents: {
defaults: {
model: { primary: "lmstudio/minimax-m2.1-gs32" },
models: { "lmstudio/minimax-m2.1-gs32": { alias: "Minimax" } }
}
},
models: {
mode: "merge",
providers: {
lmstudio: {
baseUrl: "http://127.0.0.1:1234/v1",
apiKey: "lmstudio",
api: "openai-responses",
models: [
{
id: "minimax-m2.1-gs32",
name: "MiniMax M2.1 GS32",
reasoning: false,
input: ["text"],
cost: { input: 0, output: 0, cacheRead: 0, cacheWrite: 0 },
contextWindow: 196608,
maxTokens: 8192
}
]
}
}
}
}
```
## Configure via `moltbot configure`
Use the interactive config wizard to set MiniMax without editing JSON:
1) Run `moltbot configure`.
2) Select **Model/auth**.
3) Choose **MiniMax M2.1**.
4) Pick your default model when prompted.
## Configuration options
- `models.providers.minimax.baseUrl`: prefer `https://api.minimax.io/anthropic` (Anthropic-compatible); `https://api.minimax.io/v1` is optional for OpenAI-compatible payloads.
- `models.providers.minimax.api`: prefer `anthropic-messages`; `openai-completions` is optional for OpenAI-compatible payloads.
- `models.providers.minimax.apiKey`: MiniMax API key (`MINIMAX_API_KEY`).
- `models.providers.minimax.models`: define `id`, `name`, `reasoning`, `contextWindow`, `maxTokens`, `cost`.
- `agents.defaults.models`: alias models you want in the allowlist.
- `models.mode`: keep `merge` if you want to add MiniMax alongside built-ins.
## Notes
- Model refs are `minimax/<model>`.
- Coding Plan usage API: `https://api.minimaxi.com/v1/api/openplatform/coding_plan/remains` (requires a coding plan key; curl sketch below).
- Update pricing values in `models.json` if you need exact cost tracking.
- Referral link for MiniMax Coding Plan (10% off): https://platform.minimax.io/subscribe/coding-plan?code=DbXJTRClnb&source=link
- See [/concepts/model-providers](/concepts/model-providers) for provider rules.
- Use `moltbot models list` and `moltbot models set minimax/MiniMax-M2.1` to switch.
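A minimal check of the usage endpoint mentioned above (a sketch; the Bearer header is an assumption, so confirm the auth scheme in MiniMax's docs):
```bash
# Remaining coding-plan quota (requires a coding plan key)
curl -s "https://api.minimaxi.com/v1/api/openplatform/coding_plan/remains" \
  -H "Authorization: Bearer $MINIMAX_API_KEY"
```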
## Troubleshooting
### “Unknown model: minimax/MiniMax-M2.1”
This usually means the **MiniMax provider isn't configured** (no provider entry
and no MiniMax auth profile/env key found). A fix for this detection is in
**2026.1.12** (unreleased at the time of writing). Fix by:
- Upgrading to **2026.1.12** (or run from source `main`), then restarting the gateway.
- Running `moltbot configure` and selecting **MiniMax M2.1**, or
- Adding the `models.providers.minimax` block manually, or
- Setting `MINIMAX_API_KEY` (or a MiniMax auth profile) so the provider can be injected.
Model IDs are **case-sensitive**; match them exactly:
- `minimax/MiniMax-M2.1`
- `minimax/MiniMax-M2.1-lightning`
Then recheck with:
```bash
moltbot models list
```


@@ -0,0 +1,48 @@
---
summary: "Model providers (LLMs) supported by Moltbot"
read_when:
- You want to choose a model provider
- You want quick setup examples for LLM auth + model selection
---
# Model Providers
Moltbot can use many LLM providers. Pick one, authenticate, then set the default
model as `provider/model`.
## Highlight: Venius (Venice AI)
Venius is our recommended Venice AI setup for privacy-first inference with an option to use Opus for the hardest tasks.
- Default: `venice/llama-3.3-70b`
- Best overall: `venice/claude-opus-45` (Opus remains the strongest)
See [Venice AI](/providers/venice).
## Quick start (two steps)
1) Authenticate with the provider (usually via `moltbot onboard`).
2) Set the default model:
```json5
{
agents: { defaults: { model: { primary: "anthropic/claude-opus-4-5" } } }
}
```
## Supported providers (starter set)
- [OpenAI (API + Codex)](/providers/openai)
- [Anthropic (API + Claude Code CLI)](/providers/anthropic)
- [OpenRouter](/providers/openrouter)
- [Vercel AI Gateway](/providers/vercel-ai-gateway)
- [Moonshot AI (Kimi + Kimi Code)](/providers/moonshot)
- [Synthetic](/providers/synthetic)
- [OpenCode Zen](/providers/opencode)
- [Z.AI](/providers/zai)
- [GLM models](/providers/glm)
- [MiniMax](/providers/minimax)
- [Venius (Venice AI)](/providers/venice)
- [Amazon Bedrock](/bedrock)
For the full provider catalog (xAI, Groq, Mistral, etc.) and advanced configuration,
see [Model providers](/concepts/model-providers).


@@ -0,0 +1,151 @@
---
summary: "Configure Moonshot K2 vs Kimi Code (separate providers + keys)"
read_when:
- You want Moonshot K2 (Moonshot Open Platform) vs Kimi Code setup
- You need to understand separate endpoints, keys, and model refs
- You want copy/paste config for either provider
---
# Moonshot AI (Kimi)
Moonshot provides the Kimi API with OpenAI-compatible endpoints. Configure the
provider and set the default model to `moonshot/kimi-k2-0905-preview`, or use
Kimi Code with `kimi-code/kimi-for-coding`.
Current Kimi K2 model IDs:
{/* moonshot-kimi-k2-ids:start */}
- `kimi-k2-0905-preview`
- `kimi-k2-turbo-preview`
- `kimi-k2-thinking`
- `kimi-k2-thinking-turbo`
{/* moonshot-kimi-k2-ids:end */}
```bash
moltbot onboard --auth-choice moonshot-api-key
```
Kimi Code:
```bash
moltbot onboard --auth-choice kimi-code-api-key
```
Note: Moonshot and Kimi Code are separate providers. Keys are not interchangeable, endpoints differ, and model refs differ (Moonshot uses `moonshot/...`, Kimi Code uses `kimi-code/...`).
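After onboarding, switch between the two with `moltbot models set`:
```bash
moltbot models set moonshot/kimi-k2-0905-preview   # Moonshot Open Platform
moltbot models set kimi-code/kimi-for-coding       # Kimi Code (separate key)
```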
## Config snippet (Moonshot API)
```json5
{
env: { MOONSHOT_API_KEY: "sk-..." },
agents: {
defaults: {
model: { primary: "moonshot/kimi-k2-0905-preview" },
models: {
// moonshot-kimi-k2-aliases:start
"moonshot/kimi-k2-0905-preview": { alias: "Kimi K2" },
"moonshot/kimi-k2-turbo-preview": { alias: "Kimi K2 Turbo" },
"moonshot/kimi-k2-thinking": { alias: "Kimi K2 Thinking" },
"moonshot/kimi-k2-thinking-turbo": { alias: "Kimi K2 Thinking Turbo" }
// moonshot-kimi-k2-aliases:end
}
}
},
models: {
mode: "merge",
providers: {
moonshot: {
baseUrl: "https://api.moonshot.ai/v1",
apiKey: "${MOONSHOT_API_KEY}",
api: "openai-completions",
models: [
// moonshot-kimi-k2-models:start
{
id: "kimi-k2-0905-preview",
name: "Kimi K2 0905 Preview",
reasoning: false,
input: ["text"],
cost: { input: 0, output: 0, cacheRead: 0, cacheWrite: 0 },
contextWindow: 256000,
maxTokens: 8192
},
{
id: "kimi-k2-turbo-preview",
name: "Kimi K2 Turbo",
reasoning: false,
input: ["text"],
cost: { input: 0, output: 0, cacheRead: 0, cacheWrite: 0 },
contextWindow: 256000,
maxTokens: 8192
},
{
id: "kimi-k2-thinking",
name: "Kimi K2 Thinking",
reasoning: true,
input: ["text"],
cost: { input: 0, output: 0, cacheRead: 0, cacheWrite: 0 },
contextWindow: 256000,
maxTokens: 8192
},
{
id: "kimi-k2-thinking-turbo",
name: "Kimi K2 Thinking Turbo",
reasoning: true,
input: ["text"],
cost: { input: 0, output: 0, cacheRead: 0, cacheWrite: 0 },
contextWindow: 256000,
maxTokens: 8192
}
// moonshot-kimi-k2-models:end
]
}
}
}
}
```
## Kimi Code
```json5
{
env: { KIMICODE_API_KEY: "sk-..." },
agents: {
defaults: {
model: { primary: "kimi-code/kimi-for-coding" },
models: {
"kimi-code/kimi-for-coding": { alias: "Kimi Code" }
}
}
},
models: {
mode: "merge",
providers: {
"kimi-code": {
baseUrl: "https://api.kimi.com/coding/v1",
apiKey: "${KIMICODE_API_KEY}",
api: "openai-completions",
models: [
{
id: "kimi-for-coding",
name: "Kimi For Coding",
reasoning: true,
input: ["text"],
cost: { input: 0, output: 0, cacheRead: 0, cacheWrite: 0 },
contextWindow: 262144,
maxTokens: 32768,
headers: { "User-Agent": "KimiCLI/0.77" },
compat: { supportsDeveloperRole: false }
}
]
}
}
}
}
```
## Notes
- Moonshot model refs use `moonshot/<modelId>`. Kimi Code model refs use `kimi-code/<modelId>`.
- Override pricing and context metadata in `models.providers` if needed.
- If Moonshot publishes different context limits for a model, adjust
`contextWindow` accordingly.
- Use `https://api.moonshot.cn/v1` if you need the China endpoint.


@@ -0,0 +1,219 @@
---
summary: "Run Moltbot with Ollama (local LLM runtime)"
read_when:
- You want to run Moltbot with local models via Ollama
- You need Ollama setup and configuration guidance
---
# Ollama
Ollama is a local LLM runtime that makes it easy to run open-source models on your machine. Moltbot integrates with Ollama's OpenAI-compatible API and can **auto-discover tool-capable models** when you opt in with `OLLAMA_API_KEY` (or an auth profile) and do not define an explicit `models.providers.ollama` entry.
## Quick start
1) Install Ollama: https://ollama.ai
2) Pull a model:
```bash
ollama pull llama3.3
# or
ollama pull qwen2.5-coder:32b
# or
ollama pull deepseek-r1:32b
```
3) Enable Ollama for Moltbot (any value works; Ollama doesn't require a real key):
```bash
# Set environment variable
export OLLAMA_API_KEY="ollama-local"
# Or configure in your config file
moltbot config set models.providers.ollama.apiKey "ollama-local"
```
4) Use Ollama models:
```json5
{
agents: {
defaults: {
model: { primary: "ollama/llama3.3" }
}
}
}
```
## Model discovery (implicit provider)
When you set `OLLAMA_API_KEY` (or an auth profile) and **do not** define `models.providers.ollama`, Moltbot discovers models from the local Ollama instance at `http://127.0.0.1:11434`:
- Queries `/api/tags` and `/api/show`
- Keeps only models that report `tools` capability
- Marks `reasoning` when the model reports `thinking`
- Reads `contextWindow` from `model_info["<arch>.context_length"]` when available
- Sets `maxTokens` to 10× the context window
- Sets all costs to `0`
This avoids manual model entries while keeping the catalog aligned with Ollama's capabilities.
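You can run the same probes Moltbot uses during discovery (field names vary by Ollama version, so treat the capability check as approximate):
```bash
# What Moltbot queries during discovery
curl -s http://127.0.0.1:11434/api/tags                              # installed models
curl -s http://127.0.0.1:11434/api/show -d '{"model": "llama3.3"}'   # capabilities + context length
```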
To see what models are available:
```bash
ollama list
moltbot models list
```
To add a new model, simply pull it with Ollama:
```bash
ollama pull mistral
```
The new model will be automatically discovered and available to use.
If you set `models.providers.ollama` explicitly, auto-discovery is skipped and you must define models manually (see below).
## Configuration
### Basic setup (implicit discovery)
The simplest way to enable Ollama is via environment variable:
```bash
export OLLAMA_API_KEY="ollama-local"
```
### Explicit setup (manual models)
Use explicit config when:
- Ollama runs on another host/port.
- You want to force specific context windows or model lists.
- You want to include models that do not report tool support.
```json5
{
models: {
providers: {
ollama: {
// Use a host that includes /v1 for OpenAI-compatible APIs
baseUrl: "http://ollama-host:11434/v1",
apiKey: "ollama-local",
api: "openai-completions",
models: [
{
id: "llama3.3",
name: "Llama 3.3",
reasoning: false,
input: ["text"],
cost: { input: 0, output: 0, cacheRead: 0, cacheWrite: 0 },
contextWindow: 8192,
maxTokens: 8192 * 10
}
]
}
}
}
}
```
If `OLLAMA_API_KEY` is set, you can omit `apiKey` in the provider entry and Moltbot will fill it for availability checks.
### Custom base URL (explicit config)
If Ollama is running on a different host or port (explicit config disables auto-discovery, so define models manually):
```json5
{
models: {
providers: {
ollama: {
apiKey: "ollama-local",
baseUrl: "http://ollama-host:11434/v1"
}
}
}
}
```
### Model selection
Once configured, all your Ollama models are available:
```json5
{
agents: {
defaults: {
model: {
primary: "ollama/llama3.3",
fallbacks: ["ollama/qwen2.5-coder:32b"]
}
}
}
}
```
## Advanced
### Reasoning models
Moltbot marks models as reasoning-capable when Ollama reports `thinking` in `/api/show`:
```bash
ollama pull deepseek-r1:32b
```
### Model Costs
Ollama is free and runs locally, so all model costs are set to $0.
### Context windows
For auto-discovered models, Moltbot uses the context window reported by Ollama when available, otherwise it defaults to `8192`. You can override `contextWindow` and `maxTokens` in explicit provider config.
## Troubleshooting
### Ollama not detected
Make sure Ollama is running, that `OLLAMA_API_KEY` (or an auth profile) is set, and that you did **not** define an explicit `models.providers.ollama` entry:
```bash
ollama serve
```
And that the API is accessible:
```bash
curl http://localhost:11434/api/tags
```
### No models available
Moltbot only auto-discovers models that report tool support. If your model isn't listed, either:
- Pull a tool-capable model, or
- Define the model explicitly in `models.providers.ollama`.
To add models:
```bash
ollama list # See what's installed
ollama pull llama3.3 # Pull a model
```
### Connection refused
Check that Ollama is running on the correct port:
```bash
# Check if Ollama is running
ps aux | grep ollama
# Or restart Ollama
ollama serve
```
## See Also
- [Model Providers](/concepts/model-providers) - Overview of all providers
- [Model Selection](/concepts/models) - How to choose models
- [Configuration](/gateway/configuration) - Full config reference


@@ -0,0 +1,60 @@
---
summary: "Use OpenAI via API keys or Codex subscription in Moltbot"
read_when:
- You want to use OpenAI models in Moltbot
- You want Codex subscription auth instead of API keys
---
# OpenAI
OpenAI provides developer APIs for GPT models. Codex supports **ChatGPT sign-in** for subscription
access or **API key** sign-in for usage-based access. Codex cloud requires ChatGPT sign-in.
## Option A: OpenAI API key (OpenAI Platform)
**Best for:** direct API access and usage-based billing.
Get your API key from the OpenAI dashboard.
### CLI setup
```bash
moltbot onboard --auth-choice openai-api-key
# or non-interactive
moltbot onboard --openai-api-key "$OPENAI_API_KEY"
```
### Config snippet
```json5
{
env: { OPENAI_API_KEY: "sk-..." },
agents: { defaults: { model: { primary: "openai/gpt-5.2" } } }
}
```
## Option B: OpenAI Code (Codex) subscription
**Best for:** using ChatGPT/Codex subscription access instead of an API key.
Codex cloud requires ChatGPT sign-in, while the Codex CLI supports ChatGPT or API key sign-in.
### CLI setup
```bash
# Run Codex OAuth in the wizard
moltbot onboard --auth-choice openai-codex
# Or run OAuth directly
moltbot models auth login --provider openai-codex
```
### Config snippet
```json5
{
agents: { defaults: { model: { primary: "openai-codex/gpt-5.2" } } }
}
```
## Notes
- Model refs always use `provider/model` (see [/concepts/models](/concepts/models)).
- Auth details + reuse rules are in [/concepts/oauth](/concepts/oauth).


@@ -0,0 +1,34 @@
---
summary: "Use OpenCode Zen (curated models) with Moltbot"
read_when:
- You want OpenCode Zen for model access
- You want a curated list of coding-friendly models
---
# OpenCode Zen
OpenCode Zen is a **curated list of models** recommended by the OpenCode team for coding agents.
It is an optional, hosted model access path that uses an API key and the `opencode` provider.
Zen is currently in beta.
## CLI setup
```bash
moltbot onboard --auth-choice opencode-zen
# or non-interactive
moltbot onboard --opencode-zen-api-key "$OPENCODE_API_KEY"
```
## Config snippet
```json5
{
env: { OPENCODE_API_KEY: "sk-..." },
agents: { defaults: { model: { primary: "opencode/claude-opus-4-5" } } }
}
```
## Notes
- `OPENCODE_ZEN_API_KEY` is also supported.
- You sign in to Zen, add billing details, and copy your API key.
- OpenCode Zen bills per request; check the OpenCode dashboard for details.


@@ -0,0 +1,35 @@
---
summary: "Use OpenRouter's unified API to access many models in Moltbot"
read_when:
- You want a single API key for many LLMs
- You want to run models via OpenRouter in Moltbot
---
# OpenRouter
OpenRouter provides a **unified API** that routes requests to many models behind a single
endpoint and API key. It is OpenAI-compatible, so most OpenAI SDKs work by switching the base URL.
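Because it is OpenAI-compatible, a raw request is just a base-URL and key swap (shown for illustration; Moltbot handles this for you):
```bash
curl https://openrouter.ai/api/v1/chat/completions \
  -H "Authorization: Bearer $OPENROUTER_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "anthropic/claude-sonnet-4-5",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
```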
## CLI setup
```bash
moltbot onboard --auth-choice apiKey --token-provider openrouter --token "$OPENROUTER_API_KEY"
```
## Config snippet
```json5
{
env: { OPENROUTER_API_KEY: "sk-or-..." },
agents: {
defaults: {
model: { primary: "openrouter/anthropic/claude-sonnet-4-5" }
}
}
}
```
## Notes
- Model refs are `openrouter/<provider>/<model>`.
- For more model/provider options, see [/concepts/model-providers](/concepts/model-providers).
- OpenRouter uses a Bearer token with your API key under the hood.


@@ -0,0 +1,51 @@
---
summary: "Use Qwen OAuth (free tier) in Moltbot"
read_when:
- You want to use Qwen with Moltbot
- You want free-tier OAuth access to Qwen Coder
---
# Qwen
Qwen provides a free-tier OAuth flow for Qwen Coder and Qwen Vision models
(2,000 requests/day, subject to Qwen rate limits).
## Enable the plugin
```bash
moltbot plugins enable qwen-portal-auth
```
Restart the Gateway after enabling.
## Authenticate
```bash
moltbot models auth login --provider qwen-portal --set-default
```
This runs the Qwen device-code OAuth flow and writes a provider entry to your
`models.json` (plus a `qwen` alias for quick switching).
## Model IDs
- `qwen-portal/coder-model`
- `qwen-portal/vision-model`
Switch models with:
```bash
moltbot models set qwen-portal/coder-model
```
## Reuse Qwen Code CLI login
If you already logged in with the Qwen Code CLI, Moltbot will sync credentials
from `~/.qwen/oauth_creds.json` when it loads the auth store. You still need a
`models.providers.qwen-portal` entry (use the login command above to create one).
## Notes
- Tokens auto-refresh; re-run the login command if refresh fails or access is revoked.
- Default base URL: `https://portal.qwen.ai/v1` (override with
`models.providers.qwen-portal.baseUrl` if Qwen provides a different endpoint).
- See [Model providers](/concepts/model-providers) for provider-wide rules.


@@ -0,0 +1,97 @@
---
summary: "Use Synthetic's Anthropic-compatible API in Moltbot"
read_when:
- You want to use Synthetic as a model provider
- You need a Synthetic API key or base URL setup
---
# Synthetic
Synthetic exposes Anthropic-compatible endpoints. Moltbot registers it as the
`synthetic` provider and uses the Anthropic Messages API.
## Quick setup
1) Set `SYNTHETIC_API_KEY` (or run the wizard below).
2) Run onboarding:
```bash
moltbot onboard --auth-choice synthetic-api-key
```
The default model is set to:
```
synthetic/hf:MiniMaxAI/MiniMax-M2.1
```
## Config example
```json5
{
env: { SYNTHETIC_API_KEY: "sk-..." },
agents: {
defaults: {
model: { primary: "synthetic/hf:MiniMaxAI/MiniMax-M2.1" },
models: { "synthetic/hf:MiniMaxAI/MiniMax-M2.1": { alias: "MiniMax M2.1" } }
}
},
models: {
mode: "merge",
providers: {
synthetic: {
baseUrl: "https://api.synthetic.new/anthropic",
apiKey: "${SYNTHETIC_API_KEY}",
api: "anthropic-messages",
models: [
{
id: "hf:MiniMaxAI/MiniMax-M2.1",
name: "MiniMax M2.1",
reasoning: false,
input: ["text"],
cost: { input: 0, output: 0, cacheRead: 0, cacheWrite: 0 },
contextWindow: 192000,
maxTokens: 65536
}
]
}
}
}
}
```
Note: Moltbot's Anthropic client appends `/v1` to the base URL, so use
`https://api.synthetic.new/anthropic` (not `/anthropic/v1`). If Synthetic changes
its base URL, override `models.providers.synthetic.baseUrl`.
## Model catalog
All models below use cost `0` (input/output/cache).
| Model ID | Context window | Max tokens | Reasoning | Input |
| --- | --- | --- | --- | --- |
| `hf:MiniMaxAI/MiniMax-M2.1` | 192000 | 65536 | false | text |
| `hf:moonshotai/Kimi-K2-Thinking` | 256000 | 8192 | true | text |
| `hf:zai-org/GLM-4.7` | 198000 | 128000 | false | text |
| `hf:deepseek-ai/DeepSeek-R1-0528` | 128000 | 8192 | false | text |
| `hf:deepseek-ai/DeepSeek-V3-0324` | 128000 | 8192 | false | text |
| `hf:deepseek-ai/DeepSeek-V3.1` | 128000 | 8192 | false | text |
| `hf:deepseek-ai/DeepSeek-V3.1-Terminus` | 128000 | 8192 | false | text |
| `hf:deepseek-ai/DeepSeek-V3.2` | 159000 | 8192 | false | text |
| `hf:meta-llama/Llama-3.3-70B-Instruct` | 128000 | 8192 | false | text |
| `hf:meta-llama/Llama-4-Maverick-17B-128E-Instruct-FP8` | 524000 | 8192 | false | text |
| `hf:moonshotai/Kimi-K2-Instruct-0905` | 256000 | 8192 | false | text |
| `hf:openai/gpt-oss-120b` | 128000 | 8192 | false | text |
| `hf:Qwen/Qwen3-235B-A22B-Instruct-2507` | 256000 | 8192 | false | text |
| `hf:Qwen/Qwen3-Coder-480B-A35B-Instruct` | 256000 | 8192 | false | text |
| `hf:Qwen/Qwen3-VL-235B-A22B-Instruct` | 250000 | 8192 | false | text + image |
| `hf:zai-org/GLM-4.5` | 128000 | 128000 | false | text |
| `hf:zai-org/GLM-4.6` | 198000 | 128000 | false | text |
| `hf:deepseek-ai/DeepSeek-V3` | 128000 | 8192 | false | text |
| `hf:Qwen/Qwen3-235B-A22B-Thinking-2507` | 256000 | 8192 | true | text |
## Notes
- Model refs use `synthetic/<modelId>` (switch commands below).
- If you enable a model allowlist (`agents.defaults.models`), add every model you
plan to use.
- See [Model providers](/concepts/model-providers) for provider rules.
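To confirm the catalog loaded and switch the active model:
```bash
moltbot models list | grep synthetic
moltbot models set synthetic/hf:MiniMaxAI/MiniMax-M2.1
```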


@@ -0,0 +1,264 @@
---
summary: "Use Venice AI privacy-focused models in Moltbot"
read_when:
- You want privacy-focused inference in Moltbot
- You want Venice AI setup guidance
---
# Venice AI (Venius highlight)
**Venius** is our highlight Venice setup for privacy-first inference with optional anonymized access to proprietary models.
Venice AI provides privacy-focused AI inference with support for uncensored models and access to major proprietary models through their anonymized proxy. All inference is private by default—no training on your data, no logging.
## Why Venice in Moltbot
- **Private inference** for open-source models (no logging).
- **Uncensored models** when you need them.
- **Anonymized access** to proprietary models (Opus/GPT/Gemini) when quality matters.
- OpenAI-compatible `/v1` endpoints.
## Privacy Modes
Venice offers two privacy levels — understanding this is key to choosing your model:
| Mode | Description | Models |
|------|-------------|--------|
| **Private** | Fully private. Prompts/responses are **never stored or logged**. Ephemeral. | Llama, Qwen, DeepSeek, Venice Uncensored, etc. |
| **Anonymized** | Proxied through Venice with metadata stripped. The underlying provider (OpenAI, Anthropic) sees anonymized requests. | Claude, GPT, Gemini, Grok, Kimi, MiniMax |
## Features
- **Privacy-focused**: Choose between "private" (fully private) and "anonymized" (proxied) modes
- **Uncensored models**: Access to models without content restrictions
- **Major model access**: Use Claude, GPT-5.2, Gemini, Grok via Venice's anonymized proxy
- **OpenAI-compatible API**: Standard `/v1` endpoints for easy integration
- **Streaming**: ✅ Supported on all models
- **Function calling**: ✅ Supported on select models (check model capabilities)
- **Vision**: ✅ Supported on models with vision capability
- **No hard rate limits**: Fair-use throttling may apply for extreme usage
## Setup
### 1. Get API Key
1. Sign up at [venice.ai](https://venice.ai)
2. Go to **Settings → API Keys → Create new key**
3. Copy your API key (format: `vapi_xxxxxxxxxxxx`)
### 2. Configure Moltbot
**Option A: Environment Variable**
```bash
export VENICE_API_KEY="vapi_xxxxxxxxxxxx"
```
**Option B: Interactive Setup (Recommended)**
```bash
moltbot onboard --auth-choice venice-api-key
```
This will:
1. Prompt for your API key (or use existing `VENICE_API_KEY`)
2. Show all available Venice models
3. Let you pick your default model
4. Configure the provider automatically
**Option C: Non-interactive**
```bash
moltbot onboard --non-interactive \
--auth-choice venice-api-key \
--venice-api-key "vapi_xxxxxxxxxxxx"
```
### 3. Verify Setup
```bash
moltbot chat --model venice/llama-3.3-70b "Hello, are you working?"
```
## Model Selection
After setup, Moltbot shows all available Venice models. Pick based on your needs:
- **Default (our pick)**: `venice/llama-3.3-70b` for private, balanced performance.
- **Best overall quality**: `venice/claude-opus-45` for hard jobs (Opus remains the strongest).
- **Privacy**: Choose "private" models for fully private inference.
- **Capability**: Choose "anonymized" models to access Claude, GPT, Gemini via Venice's proxy.
Change your default model anytime:
```bash
moltbot models set venice/claude-opus-45
moltbot models set venice/llama-3.3-70b
```
List all available models:
```bash
moltbot models list | grep venice
```
## Configure via `moltbot configure`
1. Run `moltbot configure`
2. Select **Model/auth**
3. Choose **Venice AI**
## Which Model Should I Use?
| Use Case | Recommended Model | Why |
|----------|-------------------|-----|
| **General chat** | `llama-3.3-70b` | Good all-around, fully private |
| **Best overall quality** | `claude-opus-45` | Opus remains the strongest for hard tasks |
| **Privacy + Claude quality** | `claude-opus-45` | Best reasoning via anonymized proxy |
| **Coding** | `qwen3-coder-480b-a35b-instruct` | Code-optimized, 262k context |
| **Vision tasks** | `qwen3-vl-235b-a22b` | Best private vision model |
| **Uncensored** | `venice-uncensored` | No content restrictions |
| **Fast + cheap** | `qwen3-4b` | Lightweight, still capable |
| **Complex reasoning** | `deepseek-v3.2` | Strong reasoning, private |
## Available Models (25 Total)
### Private Models (15) — Fully Private, No Logging
| Model ID | Name | Context (tokens) | Features |
|----------|------|------------------|----------|
| `llama-3.3-70b` | Llama 3.3 70B | 131k | General |
| `llama-3.2-3b` | Llama 3.2 3B | 131k | Fast, lightweight |
| `hermes-3-llama-3.1-405b` | Hermes 3 Llama 3.1 405B | 131k | Complex tasks |
| `qwen3-235b-a22b-thinking-2507` | Qwen3 235B Thinking | 131k | Reasoning |
| `qwen3-235b-a22b-instruct-2507` | Qwen3 235B Instruct | 131k | General |
| `qwen3-coder-480b-a35b-instruct` | Qwen3 Coder 480B | 262k | Code |
| `qwen3-next-80b` | Qwen3 Next 80B | 262k | General |
| `qwen3-vl-235b-a22b` | Qwen3 VL 235B | 262k | Vision |
| `qwen3-4b` | Venice Small (Qwen3 4B) | 32k | Fast, reasoning |
| `deepseek-v3.2` | DeepSeek V3.2 | 163k | Reasoning |
| `venice-uncensored` | Venice Uncensored | 32k | Uncensored |
| `mistral-31-24b` | Venice Medium (Mistral) | 131k | Vision |
| `google-gemma-3-27b-it` | Gemma 3 27B Instruct | 202k | Vision |
| `openai-gpt-oss-120b` | OpenAI GPT OSS 120B | 131k | General |
| `zai-org-glm-4.7` | GLM 4.7 | 202k | Reasoning, multilingual |
### Anonymized Models (10) — Via Venice Proxy
| Model ID | Original | Context (tokens) | Features |
|----------|----------|------------------|----------|
| `claude-opus-45` | Claude Opus 4.5 | 202k | Reasoning, vision |
| `claude-sonnet-45` | Claude Sonnet 4.5 | 202k | Reasoning, vision |
| `openai-gpt-52` | GPT-5.2 | 262k | Reasoning |
| `openai-gpt-52-codex` | GPT-5.2 Codex | 262k | Reasoning, vision |
| `gemini-3-pro-preview` | Gemini 3 Pro | 202k | Reasoning, vision |
| `gemini-3-flash-preview` | Gemini 3 Flash | 262k | Reasoning, vision |
| `grok-41-fast` | Grok 4.1 Fast | 262k | Reasoning, vision |
| `grok-code-fast-1` | Grok Code Fast 1 | 262k | Reasoning, code |
| `kimi-k2-thinking` | Kimi K2 Thinking | 262k | Reasoning |
| `minimax-m21` | MiniMax M2.1 | 202k | Reasoning |
## Model Discovery
Moltbot automatically discovers models from the Venice API when `VENICE_API_KEY` is set. If the API is unreachable, it falls back to a static catalog.
The `/models` endpoint is public (no auth needed for listing), but inference requires a valid API key.
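Since `/models` is public, you can check the live catalog without a key:
```bash
# Public model listing (inference still requires a valid API key)
curl -s https://api.venice.ai/api/v1/models
```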
## Streaming & Tool Support
| Feature | Support |
|---------|---------|
| **Streaming** | ✅ All models |
| **Function calling** | ✅ Most models (check `supportsFunctionCalling` in API) |
| **Vision/Images** | ✅ Models marked with "Vision" feature |
| **JSON mode** | ✅ Supported via `response_format` |
## Pricing
Venice uses a credit-based system. Check [venice.ai/pricing](https://venice.ai/pricing) for current rates:
- **Private models**: Generally lower cost
- **Anonymized models**: Similar to direct API pricing + small Venice fee
## Comparison: Venice vs Direct API
| Aspect | Venice (Anonymized) | Direct API |
|--------|---------------------|------------|
| **Privacy** | Metadata stripped, anonymized | Your account linked |
| **Latency** | +10-50ms (proxy) | Direct |
| **Features** | Most features supported | Full features |
| **Billing** | Venice credits | Provider billing |
## Usage Examples
```bash
# Use default private model
moltbot chat --model venice/llama-3.3-70b
# Use Claude via Venice (anonymized)
moltbot chat --model venice/claude-opus-45
# Use uncensored model
moltbot chat --model venice/venice-uncensored
# Use vision model with image
moltbot chat --model venice/qwen3-vl-235b-a22b
# Use coding model
moltbot chat --model venice/qwen3-coder-480b-a35b-instruct
```
## Troubleshooting
### API key not recognized
```bash
echo $VENICE_API_KEY
moltbot models list | grep venice
```
Ensure the key starts with `vapi_`.
### Model not available
The Venice model catalog updates dynamically. Run `moltbot models list` to see currently available models. Some models may be temporarily offline.
### Connection issues
Venice API is at `https://api.venice.ai/api/v1`. Ensure your network allows HTTPS connections.
## Config file example
```json5
{
env: { VENICE_API_KEY: "vapi_..." },
agents: { defaults: { model: { primary: "venice/llama-3.3-70b" } } },
models: {
mode: "merge",
providers: {
venice: {
baseUrl: "https://api.venice.ai/api/v1",
apiKey: "${VENICE_API_KEY}",
api: "openai-completions",
models: [
{
id: "llama-3.3-70b",
name: "Llama 3.3 70B",
reasoning: false,
input: ["text"],
cost: { input: 0, output: 0, cacheRead: 0, cacheWrite: 0 },
contextWindow: 131072,
maxTokens: 8192
}
]
}
}
}
}
```
## Links
- [Venice AI](https://venice.ai)
- [API Documentation](https://docs.venice.ai)
- [Pricing](https://venice.ai/pricing)
- [Status](https://status.venice.ai)


@@ -0,0 +1,50 @@
---
title: "Vercel AI Gateway"
summary: "Vercel AI Gateway setup (auth + model selection)"
read_when:
- You want to use Vercel AI Gateway with Moltbot
- You need the API key env var or CLI auth choice
---
# Vercel AI Gateway
The [Vercel AI Gateway](https://vercel.com/ai-gateway) provides a unified API to access hundreds of models through a single endpoint.
- Provider: `vercel-ai-gateway`
- Auth: `AI_GATEWAY_API_KEY`
- API: Anthropic Messages compatible
## Quick start
1) Set the API key (recommended: store it for the Gateway):
```bash
moltbot onboard --auth-choice ai-gateway-api-key
```
2) Set a default model:
```json5
{
agents: {
defaults: {
model: { primary: "vercel-ai-gateway/anthropic/claude-opus-4.5" }
}
}
}
```
## Non-interactive example
```bash
moltbot onboard --non-interactive \
--mode local \
--auth-choice ai-gateway-api-key \
--ai-gateway-api-key "$AI_GATEWAY_API_KEY"
```
## Environment note
If the Gateway runs as a daemon (launchd/systemd), make sure `AI_GATEWAY_API_KEY`
is available to that process (for example, in `~/.clawdbot/.env` or via
`env.shellEnv`).
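For example, using the dotenv file mentioned above (path as given in this note; adjust for your install, and restart the Gateway afterwards):
```bash
# Make the key visible to the daemonized Gateway
echo 'AI_GATEWAY_API_KEY=your-key-here' >> ~/.clawdbot/.env
```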


@@ -0,0 +1,34 @@
---
summary: "Use Z.AI (GLM models) with Moltbot"
read_when:
- You want Z.AI / GLM models in Moltbot
- You need a simple ZAI_API_KEY setup
---
# Z.AI
Z.AI is the API platform for **GLM** models. It provides REST APIs for GLM and uses API keys
for authentication. Create your API key in the Z.AI console. Moltbot uses the `zai` provider
with a Z.AI API key.
## CLI setup
```bash
moltbot onboard --auth-choice zai-api-key
# or non-interactive
moltbot onboard --zai-api-key "$ZAI_API_KEY"
```
## Config snippet
```json5
{
env: { ZAI_API_KEY: "sk-..." },
agents: { defaults: { model: { primary: "zai/glm-4.7" } } }
}
```
## Notes
- GLM models are available as `zai/<model>` (example: `zai/glm-4.7`); see the switch example below.
- See [/providers/glm](/providers/glm) for the model family overview.
- Z.AI uses Bearer auth with your API key.
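To verify the provider and switch models after onboarding:
```bash
moltbot models list | grep zai
moltbot models set zai/glm-4.7
```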