绿色记忆:OpenClaw: Architecture, Components, and Deployment Notes

Four Months, 343,000 Stars

On November 24, 2025, an open source project named OpenClaw quietly appeared on GitHub. Four months later it had passed 343,000 stars, putting it among the fastest-growing non-aggregating open source projects in recent memory and ahead of React, Vue, and Tailwind CSS over comparable early windows. The pitch behind that curve is simple: an AI assistant that runs on your own device and belongs to you.

OpenClaw describes itself as "Your own personal AI assistant. Any OS. Any Platform. The lobster way." The interesting part is not the slogan but the position behind it. Local-first is not an optional mode layered onto a cloud product; it is the premise the rest of the system inherits. In a market where AI assistants increasingly centralize data and execution, that position lines up directly with developer demand for data sovereignty.

"The lobster way" also shows up in the project's internal language. The workflow orchestration tool is called Lobster. Community members call themselves lobsters. Even the GitHub chant, "EXFOLIATE! EXFOLIATE!", points at the same idea: growth through repeated shedding and rebuilds. OpenClaw treats aggressive refactoring as part of how the architecture evolves.

From an engineering angle, the more interesting story is the stack: TypeScript ESM, a pnpm monorepo, 230 Plugin SDK export paths, and 24 channel integrations inside one codebase. Rust and Go tools replace older JavaScript tooling in several places. The repository also folds in the TypeScript native Go compiler preview and handles QEMU cross-compilation carefully in its Docker build. Those choices say more about the project than any feature checklist.

The project is released under the MIT license and backed by sponsors including OpenAI, NVIDIA, Vercel, Blacksmith, and Convex. A cloud AI company funding a local-first open source competitor is not a trivial detail. This note reads the OpenClaw repository (github.com/openclaw/openclaw), version v2026.4.1, from the code outward: repository layout, plugin architecture, channels, runtime, memory, security, and native clients. All code references come from the upstream repository rather than secondary commentary.

Four-Layer Architecture

Before diving into the code, establish the system shape first. OpenClaw can be read as four layers:

Gateway (control plane): A WebSocket service (default ws://127.0.0.1:18789) that handles session management, configuration delivery, Cron scheduling, Webhooks, and health checks. It also serves the Control UI (Lit 3 + Vite) and the Canvas host (A2UI).
Agent / Pi Runtime: Based on @mariozechner/pi-agent-core@0.64.0, runs in RPC mode, supports Tool Streaming and Block Streaming, accesses 25+ model providers, and has Auth rotation and Failover capabilities.
Channels + Skills (channels and skills layer): Covers 24 messaging platforms and interacts with the core through the 230 contract paths of the Plugin SDK; the ClawHub market, the before_install security hook, and tools such as Browser, Canvas, Nodes, Cron, and Sessions also sit in this layer.
Memory (memory layer): Built from the 13 sub-modules of memory-core. It persists to local Markdown files and uses sqlite-vec vector search and LanceDB as storage backends, carrying user-editable preferences and long-term context.

Gateway is the single control-plane entry point. Every client, including the CLI, Web UI, macOS app, and iOS or Android nodes, connects to Gateway over WebSocket. The Agent runtime sits under Gateway in RPC mode, receives messages from each channel, calls models and tools, and routes the result back to the channel where the request started. Skills and channels talk to the core through the 230 exported Plugin SDK paths, while the Memory layer gives the Agent long-lived context across sessions.

Each layer also has a hard boundary. Gateway defines a typed WebSocket protocol in src/gateway/protocol/schema.ts. The Agent layer exposes capabilities through Pi's RPC surface. The Plugin SDK is the only legal import surface between extensions and the core. The Memory layer is split into 13 smaller modules to avoid monolithic coupling. The rest of the article walks through those layers one by one.

Project History And Release Model

MoltBot → ClawdBot → OpenClaw

OpenClaw did not start from scratch. Before its current name it went through two earlier stages, MoltBot and ClawdBot. Traces of that history remain in the codebase: the scripts field of package.json still keeps a "moltbot:rpc" command that points to exactly the same implementation as "openclaw:rpc", and the documentation domain docs.molt.bot still redirects to docs.openclaw.ai with an HTTP 301.

The project is led by the Austrian developer Peter Steinberger (GitHub: @steipete), who has 14,756 commits in the repository, far more than the second-ranked contributor (1,690). Steinberger was previously known for his contributions to the iOS SDK ecosystem. He has since moved into AI Agent platform development and brought his habit of high-frequency iteration and aggressive refactoring to OpenClaw.

Calendar Versioning And Release Channels

OpenClaw uses calendar versioning rather than semantic versioning. The version format is vYYYY.M.D (for example, v2026.4.1), which directly reflects the release date. When there are multiple releases on the same day, a patch suffix is appended: vYYYY.M.D-N.

The release channel is divided into three layers, mapped through npm dist-tag:

Channel	npm dist-tag	Tag format	Applicable scenarios
stable	latest	vYYYY.M.D	Production environment, default installation
beta	beta	vYYYY.M.D-beta.N	Pre-release verification, macOS App may be absent
dev	dev	main branch head	Develop and debug, release on demand

To switch channels, use openclaw update --channel stable|beta|dev. A beta release must carry the -beta.N suffix in its npm version number. Publishing an unsuffixed version number under --tag beta is not allowed, because doing so would consume that version identifier. This release rule is recorded explicitly in the repository's AGENTS.md.
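The tag formats above are regular enough to parse mechanically. The following is a minimal sketch, not code from the repository: parseCalver and the CalverTag shape are hypothetical names, and the dev channel is left out because it tracks the head of main rather than a tag.

```typescript
// Hypothetical helper: parses OpenClaw-style calendar version tags.
// The tag formats come from the article; the names are illustrative only.
type TaggedChannel = "stable" | "beta";

interface CalverTag {
  year: number;
  month: number;
  day: number;
  channel: TaggedChannel;
  /** N from a same-day "-N" suffix or from "-beta.N"; null when absent. */
  sequence: number | null;
}

// vYYYY.M.D, optionally followed by "-N" or "-beta.N".
const CALVER_PATTERN = /^v(\d{4})\.(\d{1,2})\.(\d{1,2})(?:-(beta\.)?(\d+))?$/;

function parseCalver(tag: string): CalverTag | null {
  const match = CALVER_PATTERN.exec(tag);
  if (!match) return null;
  const [, y, mo, d, beta, seq] = match;
  const month = Number(mo);
  const day = Number(d);
  // Coarse range check only; it does not validate days per month.
  if (month < 1 || month > 12 || day < 1 || day > 31) return null;
  return {
    year: Number(y),
    month,
    day,
    channel: beta ? "beta" : "stable",
    sequence: seq === undefined ? null : Number(seq),
  };
}
```

A release script could use a check like this to refuse publishing under the beta dist-tag when the parsed channel is not "beta".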

v2026.3.31: A Dense Breaking Release

At the time of writing, the latest stable version is v2026.3.31 (released on 2026-03-31). It contains six breaking changes, a density that reflects OpenClaw's aggressive iteration style:

Nodes/exec Refactoring: Removed the duplicate nodes.run shell wrapper in CLI and Agent nodes tools, and all node shell executions use the exec host=node path. Node-specific capabilities are reserved for nodes invoke and specialized media/location/notify operations.
Plugin SDK legacy path deprecation: The old provider compatibility subpath and the old bundled provider setup and channel-runtime compatibility shims are deprecated and a migration warning is issued. The currently documented openclaw/plugin-sdk/* entries plus the local api.ts and runtime-api.ts barrel files are the only path forward.
Plugin Installation Security Tightening: Critical-level findings from the built-in dangerous-code scan, as well as installation scan failures, are now denied by default (fail closed). Some plugins that previously installed successfully now need the --dangerously-force-unsafe-install flag to be passed explicitly in order to continue.
Gateway authentication tightening: trusted-proxy mode rejects mixed shared token configurations; local-direct fallback requires the use of configured tokens and no longer implicitly authenticates callers on the same host.
Node command gating: Node commands remain disabled until node pairing is approved. Simply completing device pairing is no longer sufficient to expose declared node commands.
Reduced Trust Surface for Node Events: Node-originated runs now execute on a reduced trusted surface. Notification-driven or node-triggered processes that rely on broader host/session tool access may need to be adjusted.

This strategy, in which any release may contain breaking changes, fits the choice of calendar versioning: since no semantic compatibility promise is made, each snapshot is simply identified by its release date. In practice, users should pin a specific version and read the CHANGELOG before upgrading.

v2026.3.28: xAI Integration And Security Tightening

The previous notable release v2026.3.28 (released on 2026-03-29) also contains a number of important changes:

xAI integration: bundled xAI provider migrated to Responses API, added native x_search (Grok web search tool), and integrated optional x_search configuration steps in openclaw onboard.
MiniMax image generation: Added image generation and image editing capabilities for the MiniMax image-01 model, supporting aspect ratio control.
Qwen Authentication Migration: Removed the deprecated qwen-portal-auth OAuth integration and migrated to Model Studio API Key mode.
Plugins/hooks approval mechanism: The before_tool_call hook adds the asynchronous requireApproval capability. The plugin can pause the tool execution and prompt the user for approval through channels such as Telegram buttons, Discord interactions, /approve commands, etc.
Microsoft Teams upgrade: Migrated to the official Teams SDK, adding streaming replies and AI annotations for 1:1 conversations.
Gateway OpenAI Compatibility: Added /v1/models and /v1/embeddings endpoints so that Gateway can be directly called by OpenAI-compatible third-party tools.

Repository Structure

Top-Level Layout

OpenClaw is a pnpm workspace monorepo. The core layout of the root directory is as follows:

openclaw/
├── src/                 # Core source code
│   ├── cli/             # CLI command entry and progress bar
│   ├── commands/        # Implementation of each subcommand
│   ├── gateway/         # Gateway control plane (including protocol/ subdirectory)
│   ├── channels/        # Core channel implementation
│   ├── routing/         # Message routing
│   ├── plugins/         # Plugin discovery, loading, registration
│   ├── plugin-sdk/      # Public plugin contract (the only legal import surface)
│   ├── infra/           # Infrastructure (SQLite, file locks, etc.)
│   └── media/           # Media processing pipeline
├── apps/
│   ├── macos/           # SwiftUI + AppKit menu bar application
│   ├── ios/             # Xcode + SwiftUI
│   └── android/         # Kotlin + Gradle
├── extensions/          # Internal extensions (bundled plugin workspace tree)
├── packages/            # Shared packages
├── skills/              # Built-in Skills (distributed with npm package)
├── ui/                  # Web Control UI (Lit 3 + Vite)
├── docs/                # Mintlify documentation
├── test/                # E2E tests
└── scripts/             # Build/publish/check scripts (60+)

This directory tree reveals OpenClaw's engineering philosophy: The core is as thin as possible and the boundary is as hard as possible. src/ stores all TypeScript core code, extensions/ stores built-in extensions (bundled plugin workspace tree), and apps/ stores three native clients. The import relationship between the three is one-way: extensions/ can only call core capabilities through openclaw/plugin-sdk/*, and apps/ communicates with the core through the Gateway WebSocket protocol. Any reverse dependencies will be intercepted by CI's architecture guard script.

extensions/ Versus packages/

extensions/ is the bundled plugin workspace tree: built-in extensions distributed with the npm package. Channel plugins such as Matrix, Zalo, ZaloUser, and Voice Call live here, along with diagnostic telemetry (diagnostics-otel). Each extension is an independent pnpm workspace package with its own package.json and openclaw.plugin.json manifest. An extension's runtime dependencies must be declared in its own dependencies and must not be added to the root package.json (unless the core uses the same dependency). workspace:* is prohibited in dependencies because npm install cannot resolve the workspace protocol. openclaw itself belongs in devDependencies or peerDependencies, and openclaw/plugin-sdk is resolved through a jiti alias at runtime.

packages/ stores pure shared library packages, does not contain a plugin list, and does not use the plugin loading pipeline. They provide utility functions and type definitions that are reusable across packages.

What skills/ And docs/ Do

The skills/ directory stores built-in (bundled) Skills. They are distributed with the npm package and can be used right after installation. Unlike third-party Skills on ClawHub, built-in Skills do not require clawhub install and do not go through the before_install security check pipeline. Each Skill is described by a SKILL.md file, which is injected into the system prompt when the Agent runs.

docs/ is built using the Mintlify framework and deployed at docs.openclaw.ai. Links within documents use root-relative paths (such as [Config](/configuration)) without the .md extension. The document supports Chinese translation. The Chinese version is located at docs/zh-CN/ and is automatically generated by the scripts/docs-i18n script, supplemented by the glossary docs/.i18n/glossary.zh-CN.json and the translation memory docs/.i18n/zh-CN.tm.jsonl to ensure terminology consistency.

scripts/: 60+ Build And Operations Scripts

The scripts/ directory contains more than 60 standalone script files, and package.json adds 198 npm scripts entries on top of them. Together they make up OpenClaw's build automation. The scripts fall into the following categories:

Build script: tsdown-build.mjs (main build entry), runtime-postbuild.mjs (post-build processing), bundle-a2ui.sh (Canvas A2UI packaging), ui.js (Web UI build)
Code inspection script: check-extension-plugin-sdk-boundary.mjs (extension import boundary check, three modes), check-plugin-extension-import-boundary.mjs (the core must not be reverse-imported into the extension), check-no-pairing-store-group-auth.mjs (security authentication audit)
Release script: openclaw-npm-release-check.ts (pre-release verification), plugin-npm-release-plan.ts (plugin release plan), openclaw-npm-postpublish-verify.ts (post-release verification)
Platform scripts: package-mac-app.sh (macOS packaging), ios-configure-signing.sh (iOS signing), build-release-aab.ts (Android AAB build)
Test scripts: test-parallel.mjs (parallel test orchestrator), test-live.mjs (real API Key test), 8 e2e/*.sh Docker E2E test scenarios
Operation and maintenance scripts: committer (atomic commit tool, replacing manual git add/commit), restart-mac.sh (macOS Gateway restart), clawlog.sh (macOS unified log query)

Dependency Profile: 47 Runtime And 22 Development Dependencies

OpenClaw's dependency control is extremely streamlined. The root package.json declares only 47 runtime dependencies and 22 development time dependencies. The version locking of key dependencies is as follows:

Dependencies	Version	Purpose
@mariozechner/pi-agent-core	0.64.0	Agent runtime core
@agentclientprotocol/sdk	0.17.1	ACP Protocol SDK
@modelcontextprotocol/sdk	1.29.0	MCP Protocol SDK
matrix-js-sdk	41.3.0-rc.0	Matrix Channel
playwright-core	1.58.2	Browser Control
sqlite-vec	0.1.9	Vector storage
sharp	^0.34.5	Image processing
hono	4.12.9	HTTP Framework
express	^5.2.1	Compatibility layer
zod	^4.3.6	Runtime verification
ws	^8.20.0	WebSocket
undici	^7.24.6	HTTP client

The most notable development dependency is @typescript/native-preview@7.0.0-dev.20260331.1, the preview of TypeScript's official Go rewrite, which OpenClaw exposes as the pnpm tsgo command. vitest@4.1.2 is paired with @vitest/coverage-v8 for coverage, tsdown@0.21.7 replaces webpack/rollup as the bundler, and oxfmt@0.43.0 and oxlint@1.58.0 replace Prettier and ESLint respectively. The selection logic is clear: swap traditional JavaScript-based tools for native tools written in Rust or Go in exchange for much faster builds, checks, and formatting.

All dependencies with pnpm.patchedDependencies must use exact version numbers (the ^ or ~ prefix is not allowed), and dependency patches require explicit approval. Additionally, the repository explicitly states "Never update Carbon dependencies" - this is a hard rule written into AGENTS.md.

Core Directories

The previous section gave the first-level skeleton of src/. This section walks through each subdirectory in turn, using code structure and dependencies as the thread for explaining how the OpenClaw core is layered.

src/cli/ — CLI command entry and progress rendering

src/cli/ is the entry layer for the entire OpenClaw command line tool. It does not contain any business logic and is only responsible for two things: parsing command line arguments and routing them to concrete implementations in src/commands/, and rendering structured progress feedback in the terminal.

The core of progress feedback is located in src/cli/progress.ts. This module uses two sets of mechanisms simultaneously:

The first is OSC progress sequences (OSC stands for Operating System Command). These terminal escape codes let a terminal show percentage progress outside the text area, for example on its tab, title bar, or taskbar entry, in Windows Terminal, iTerm2, and other terminals that implement the same 9;4 sequence, which originated in ConEmu. progress.ts drives this indicator by writing the \x1b]9;4;1;{percent}\x07 sequence to stdout, so the user can see installation progress in the taskbar even when the terminal window is minimized.

The second set is @clack/prompts, a lightweight interactive terminal UI library. OpenClaw uses it to implement step indicators, multi-select menus, and confirmation prompts in onboard wizards. The spinner and OSC progress of @clack/prompts can work in parallel - the spinner is rendered on the current line of stdout, and the OSC sequence is rendered on the terminal title bar, and the two do not interfere with each other.

// src/cli/progress.ts (simplified core logic)
import { spinner } from '@clack/prompts';

// OSC 9;4 with state 1 reports normal progress as a percentage.
export function emitOscProgress(percent: number): void {
  process.stdout.write(`\x1b]9;4;1;${Math.round(percent)}\x07`);
}

// OSC 9;4 with state 0 removes the progress indicator.
export function clearOscProgress(): void {
  process.stdout.write(`\x1b]9;4;0;\x07`);
}

// Runs a task while updating both the spinner line and the OSC indicator.
export async function withProgress<T>(
  label: string,
  task: (update: (pct: number) => void) => Promise<T>
): Promise<T> {
  const s = spinner();
  s.start(label);
  const result = await task((pct) => {
    emitOscProgress(pct);
    s.message(`${label} (${pct}%)`);
  });
  clearOscProgress();
  s.stop(`${label} ✔`);
  return result;
}

src/commands/ — Subcommands And The Onboard Wizard

Each file in src/commands/ corresponds to a top-level CLI subcommand. File naming follows the {command}.ts pattern, such as start.ts, stop.ts, update.ts, onboard.ts, config.ts, plugin.ts.

The most complex of these is onboard.ts, the first-run wizard. The Onboard wizard runs as follows: detect the system environment (Node.js version, platform, package manager) → select the message channel (Telegram/Discord/Slack, etc.) → enter the channel credentials (Bot Token, etc.) → select the AI Provider (OpenAI/Anthropic/Ollama, etc.) → enter the Provider API Key → write the configuration file ~/.openclaw/config.yaml → run npm install --omit=dev to install the selected channel's extension dependencies. The entire process is driven by @clack/prompts, with spinner and progress bar feedback for each step.

src/gateway/ — Gateway Control Plane

src/gateway/ is the backbone of OpenClaw. It starts a local WebSocket service (listening on ws://127.0.0.1:18789 by default) and acts as the single control plane between all channels, plugins, native clients, and the Control UI.

The directory structure is roughly as follows:

src/gateway/
├── server.ts            # WebSocket server life cycle
├── router.ts            # Protocol message distribution
├── session.ts           # Session management
├── presence.ts          # Online status
├── config.ts            # Runtime configuration hot-reload
├── cron.ts              # Scheduled tasks
├── webhooks.ts          # External webhook access
├── auth.ts              # Authentication model
├── health.ts            # /healthz, /readyz endpoints
├── openai-compat.ts     # /v1/models, /v1/embeddings compatibility layer
└── protocol/
    ├── schema.ts        # Protocol schema aggregation entry
    └── schema/          # Schema definition files split by domain
        ├── sessions.ts
        ├── nodes.ts
        ├── channels.ts
        └── ...

The protocol/ subdirectory is Gateway's type layer. Every WebSocket message is serialized and deserialized against the definitions aggregated and exported by protocol/schema.ts. Files within schema/ are organized by domain (sessions, nodes, channels, and so on), and each file exports the request/response Zod schemas or TypeScript interfaces for its domain. The same schemas feed Swift codegen: a build script generates the Swift structs used by the Gateway client code in the macOS and iOS apps from these TypeScript types.

Session management (session.ts) keeps the in-memory state of every active session: session ID, associated channel, associated Agent, message queue depth, last active time, and so on. Presence (presence.ts) tracks which clients are connected, so the native apps and the Web UI can show in real time which channels are online. Cron (cron.ts) schedules tasks from cron expressions and is used to check channel connection status periodically and to run cleanup tasks. Webhooks (webhooks.ts) registers endpoints for channels that need HTTP callbacks, such as Telegram's webhook mode and the Slack Events API.

src/channels/ — Core Channel Implementations

The core channel code does not live only in src/channels/. OpenClaw spreads it across several first-level directories under src/, mapped as follows:

Channel	Source code location	Underlying dependencies
Telegram	src/telegram/	grammY
Discord	src/discord/	discord.js
Slack	src/slack/	@slack/bolt
Signal	src/signal/	signal-cli (Java child process)
iMessage	src/imessage/	BlueBubbles HTTP API / native imsg
WhatsApp	src/web/	Baileys (WhatsApp Web Protocol)

src/channels/ itself exists as an aggregation layer, defining the Unified Messaging Abstraction interface and routing table that all channels must implement. The file structure inside each channel directory is roughly symmetrical: an adapter file is responsible for mapping the events of the platform SDK into a unified inbound message format, and a sender file is responsible for converting the unified outbound format back to platform-specific API calls.
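The adapter/sender symmetry described above can be sketched as a pair of mapping functions behind one interface. Every name below (InboundMessage, OutboundMessage, ChannelAdapter, and the made-up "fakechat" platform) is hypothetical and is not taken from the OpenClaw source; the sketch only illustrates the shape of the abstraction.

```typescript
// Hypothetical unified message shapes; field names are illustrative only.
interface InboundMessage {
  channel: string;
  accountId: string;
  peerId: string;
  groupId: string | null;
  text: string;
}

interface OutboundMessage {
  peerId: string;
  groupId: string | null;
  text: string;
}

// One adapter per platform: events in, platform-specific API calls out.
interface ChannelAdapter<PlatformEvent, PlatformCall> {
  readonly channel: string;
  toInbound(event: PlatformEvent): InboundMessage;
  toPlatformCall(message: OutboundMessage): PlatformCall;
}

// A made-up platform payload, loosely shaped like a chat-bot update.
interface FakeChatEvent {
  botId: string;
  from: { id: number };
  chat: { id: number; isGroup: boolean };
  body: string;
}

interface FakeSendCall {
  method: "sendMessage";
  chatId: number;
  text: string;
}

const fakeChatAdapter: ChannelAdapter<FakeChatEvent, FakeSendCall> = {
  channel: "fakechat",
  toInbound(event) {
    return {
      channel: "fakechat",
      accountId: event.botId,
      peerId: String(event.from.id),
      groupId: event.chat.isGroup ? String(event.chat.id) : null,
      text: event.body,
    };
  },
  toPlatformCall(message) {
    // Reply into the group when there is one, otherwise to the peer directly.
    const target = message.groupId ?? message.peerId;
    return { method: "sendMessage", chatId: Number(target), text: message.text };
  },
};
```

Because the rest of the system only sees InboundMessage and OutboundMessage in this sketch, adding a platform means writing one more adapter rather than touching routing or the Agent runtime.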

src/routing/ — Message Routing Engine

The message routing engine (src/routing/) is the middle layer between the channel system and the Agent runtime. It distributes inbound messages to the correct Agent instance based on routing rules in the configuration file. Routing dimensions include: channel type, account ID, sender peer ID, group ID, and message content matching mode. In a multi-Agent scenario, the routing engine is responsible for isolating messages from different channels/accounts/groups into different Agent sessions.
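The routing dimensions listed above suggest a rule table in which every field is optional and the first matching rule selects the Agent. The sketch below assumes exactly that; RouteRule, routeMessage, and sessionKey are hypothetical names, and the first-match-wins policy is an assumption rather than documented OpenClaw behavior.

```typescript
// Hypothetical shapes; the five dimensions are the ones named in the article.
interface RoutedMessage {
  channel: string;
  accountId: string;
  peerId: string;
  groupId: string | null;
  text: string;
}

interface RouteRule {
  agentId: string;
  channel?: string;
  accountId?: string;
  peerId?: string;
  groupId?: string;
  /** Use a pattern without the g flag so test() stays stateless. */
  contentPattern?: RegExp;
}

// A rule matches when every dimension it specifies agrees with the message.
function ruleMatches(rule: RouteRule, msg: RoutedMessage): boolean {
  if (rule.channel !== undefined && rule.channel !== msg.channel) return false;
  if (rule.accountId !== undefined && rule.accountId !== msg.accountId) return false;
  if (rule.peerId !== undefined && rule.peerId !== msg.peerId) return false;
  if (rule.groupId !== undefined && rule.groupId !== msg.groupId) return false;
  if (rule.contentPattern !== undefined && !rule.contentPattern.test(msg.text)) return false;
  return true;
}

// Assumption: the first matching rule wins; the fallback handles the rest.
function routeMessage(
  rules: readonly RouteRule[],
  msg: RoutedMessage,
  fallbackAgentId: string
): string {
  const rule = rules.find((candidate) => ruleMatches(candidate, msg));
  return rule ? rule.agentId : fallbackAgentId;
}

// Session key that keeps channels, accounts, and groups (or peers) apart.
function sessionKey(agentId: string, msg: RoutedMessage): string {
  const scope = msg.groupId !== null ? `group:${msg.groupId}` : `peer:${msg.peerId}`;
  return [agentId, msg.channel, msg.accountId, scope].join("|");
}
```

Ordering rules from most specific to least specific keeps the first-match policy predictable, and deriving the session key from the same dimensions is one way to get the isolation between channels, accounts, and groups that the text describes.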

src/plugins/ — Plugin Discovery, Loading, And Registration

src/plugins/ is the runtime host of the plugin system, not the plugins themselves. It contains five core modules:

Discovery: Scan installed npm packages in extensions/ workspace and ~/.openclaw/plugins/ for packages that contain an openclaw.plugin.json manifest file.

Manifest Validation: Use Zod Schema to strictly verify the structure of openclaw.plugin.json. Fields such as id, channel.id, and install.npmSpec in the manifest file must conform to the predefined format.

Loader: Execute dynamic import() on the verified plugin, load its entry module and call the agreed life cycle hook.

Registry: Maintains a global plugin registry that records the type, status, and capability declarations of each loaded plugin. The registry supports hot-plugging at runtime: a newly installed plugin goes through discover → validate → load → register without restarting the Gateway.

Contract Enforcement: Lint rules ensure at build time that plugins import public APIs only through openclaw/plugin-sdk/*. Any plugin that directly references other modules inside src/ is rejected in CI.
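The discover → validate → load → register sequence can be modeled as a small state machine per plugin. The sketch below is an illustration under stated assumptions, not the repository's implementation: createPluginRegistry, PluginRecord, and the state names are hypothetical, and load is synchronous here to keep the example short, whereas a real loader would await a dynamic import().

```typescript
// Hypothetical lifecycle states for one plugin.
type PluginState = "discovered" | "validated" | "loaded" | "registered" | "failed";

interface PluginRecord {
  id: string;
  state: PluginState;
  error: string | null;
}

interface PipelineSteps<Manifest, Plugin> {
  validate(raw: unknown): Manifest; // throws on an invalid manifest
  load(manifest: Manifest): Plugin; // synchronous stand-in for dynamic import()
}

function createPluginRegistry<Manifest, Plugin>(steps: PipelineSteps<Manifest, Plugin>) {
  const records = new Map<string, PluginRecord>();
  const plugins = new Map<string, Plugin>();

  // Runs one discovered candidate through the remaining stages.
  function install(id: string, rawManifest: unknown): PluginRecord {
    const record: PluginRecord = { id, state: "discovered", error: null };
    records.set(id, record);
    try {
      const manifest = steps.validate(rawManifest);
      record.state = "validated";
      const plugin = steps.load(manifest);
      record.state = "loaded";
      plugins.set(id, plugin);
      record.state = "registered";
    } catch (err) {
      // A failure at any stage leaves the plugin unregistered.
      record.state = "failed";
      record.error = err instanceof Error ? err.message : String(err);
    }
    return record;
  }

  return {
    install,
    get: (id: string): Plugin | undefined => plugins.get(id),
    status: (id: string): PluginRecord | undefined => records.get(id),
  };
}

// Usage with toy manifest and plugin types.
const sketchRegistry = createPluginRegistry<{ id: string }, { name: string }>({
  validate(raw) {
    if (typeof raw !== "object" || raw === null || typeof (raw as { id?: unknown }).id !== "string") {
      throw new Error("invalid manifest");
    }
    return { id: (raw as { id: string }).id };
  },
  load: (manifest) => ({ name: `plugin:${manifest.id}` }),
});
```

Since install can be called at any time in this sketch, hot-plugging amounts to running the same pipeline for a newly found package while the host process keeps running.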

src/plugin-sdk/ — The Only Legal Plugin Import Surface

src/plugin-sdk/ is OpenClaw's only public API surface for external extensions. The exports field of package.json declares exactly 230 named export subpaths, each of which is a stable contract. These 230 subpaths are the only legal import sources for plugin development, with no exceptions. The next section analyzes this directory in detail.

src/infra/ — Infrastructure layer

src/infra/ encapsulates the underlying capabilities of interacting with the operating system. Core components include: a local persistence layer based on better-sqlite3 (to store session history, plugin status, user configuration, etc.), and a file locking mechanism based on proper-lockfile - ensuring that no two OpenClaw Gateway instances will operate on the same data directory at the same time on the same machine. The SQLite database file is located by default at ~/.openclaw/data/openclaw.db.

src/media/ — Media Processing Pipeline

src/media/ implements a unified media processing pipeline. When the channel receives an image, audio, video or file message, the pipeline is responsible for: downloading the original media → format detection → transcoding if necessary (such as Opus → WAV for speech to text) → storing in a local cache → generating a reference URL for use by the Agent. The pipeline is designed to be pluggable, and media plugins can register custom processors to handle specific MIME types.
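The pluggable part of the pipeline, where a processor is registered for a MIME type, can be sketched as a lookup table with a wildcard fallback. All names here (MediaItem, MediaProcessor, createMediaPipeline) are hypothetical, and the "type/*" wildcard is an assumption of this sketch rather than a documented OpenClaw feature. Download, caching, and URL generation are left out.

```typescript
// Hypothetical media item passed between pipeline stages.
interface MediaItem {
  mimeType: string;
  bytes: Uint8Array;
}

type MediaProcessor = (item: MediaItem) => MediaItem;

function createMediaPipeline() {
  const processors = new Map<string, MediaProcessor>();

  return {
    // Register for an exact type ("audio/ogg") or a family ("image/*").
    register(mimePattern: string, processor: MediaProcessor): void {
      processors.set(mimePattern.toLowerCase(), processor);
    },
    // Exact matches win over family matches; unknown types pass through.
    process(item: MediaItem): MediaItem {
      const mime = item.mimeType.toLowerCase();
      const slash = mime.indexOf("/");
      const family = slash === -1 ? mime : mime.slice(0, slash) + "/*";
      const processor = processors.get(mime) ?? processors.get(family);
      return processor ? processor(item) : item;
    },
  };
}

// Usage: a stand-in "transcoder" that only relabels the MIME type.
// Real transcoding would re-encode the bytes as well.
const sketchPipeline = createMediaPipeline();
sketchPipeline.register("audio/ogg", (item) => ({ mimeType: "audio/wav", bytes: item.bytes }));
```

Passing unknown types through unchanged keeps the pipeline usable before any media plugin is installed.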

Plugin SDK

OpenClaw's plugin system is centered on src/plugin-sdk/ and exposes 230 precisely named sub-paths to the outside through the exports field of package.json. This is a strictly designed Contract System - it also defines what the plugin can and cannot do.

The 230 Export Subpaths

The exports field format of package.json is as follows:

{
  "exports": {
    "./plugin-sdk/channel-types": "./src/plugin-sdk/channel-types.ts",
    "./plugin-sdk/channel-inbound": "./src/plugin-sdk/channel-inbound.ts",
    "./plugin-sdk/channel-reply-pipeline": "./src/plugin-sdk/channel-reply-pipeline.ts",
    "./plugin-sdk/channel-send-result": "./src/plugin-sdk/channel-send-result.ts",
    "./plugin-sdk/channel-dm-security": "./src/plugin-sdk/channel-dm-security.ts",
    "./plugin-sdk/provider-types": "./src/plugin-sdk/provider-types.ts",
    "./plugin-sdk/provider-registry": "./src/plugin-sdk/provider-registry.ts",
    "./plugin-sdk/memory-core-types": "./src/plugin-sdk/memory-core-types.ts",
    "./plugin-sdk/memory-core-store": "./src/plugin-sdk/memory-core-store.ts",
    "./plugin-sdk/plugin-manifest": "./src/plugin-sdk/plugin-manifest.ts",
    "./plugin-sdk/plugin-lifecycle": "./src/plugin-sdk/plugin-lifecycle.ts",
    "./plugin-sdk/runtime-config": "./src/plugin-sdk/runtime-config.ts",
    "./plugin-sdk/runtime-events": "./src/plugin-sdk/runtime-events.ts",
    "./plugin-sdk/media-types": "./src/plugin-sdk/media-types.ts",
    "./plugin-sdk/media-processor": "./src/plugin-sdk/media-processor.ts",
    "./plugin-sdk/speech-types": "./src/plugin-sdk/speech-types.ts",
    "./plugin-sdk/speech-engine": "./src/plugin-sdk/speech-engine.ts"
    // ... 230 items in total
  }
}

These 230 sub-paths can be divided into the following categories according to prefix:

Prefix	Quantity (approx.)	Responsibilities
channel-*	~45	Channel type definition, inbound/outbound messages, DM security policy, group behavior, chunking policy
provider-*	~35	AI Provider interface, model registration, capability declaration, streaming response protocol
memory-core-*	~20	Memory system core type, storage interface, vector index
plugin-*	~25	Plugin manifest format, life cycle hooks, capability declaration
runtime-*	~40	Runtime configuration, event bus, logs, error types, session context
media-*	~15	Media type, processor interface, transcoding pipeline
speech-*	~10	Speech recognition/synthesis engine interface
Others (tool-*, skill-*, util-*, etc.)	~40	Tool/skill plugin interface, common tool type
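A breakdown like the table above can be reproduced by grouping the keys of the exports map by prefix. The following is a minimal sketch: categoryOf and countByCategory are hypothetical helpers, the prefix list mirrors the table rows, and anything unmatched falls into "other".

```typescript
// Prefixes from the table; "memory-core" is checked like any other prefix.
const SDK_PREFIXES = [
  "memory-core",
  "channel",
  "provider",
  "plugin",
  "runtime",
  "media",
  "speech",
] as const;

// Maps an exports key such as "./plugin-sdk/channel-types" to its category.
function categoryOf(exportKey: string): string {
  const name = exportKey.replace(/^\.\/plugin-sdk\//, "");
  for (const prefix of SDK_PREFIXES) {
    if (name.startsWith(prefix + "-")) return prefix;
  }
  return "other";
}

// Counts how many export keys fall into each category.
function countByCategory(exportKeys: readonly string[]): Map<string, number> {
  const counts = new Map<string, number>();
  for (const key of exportKeys) {
    const category = categoryOf(key);
    counts.set(category, (counts.get(category) ?? 0) + 1);
  }
  return counts;
}

// Usage with a few keys taken from the exports excerpt.
const sampleCounts = countByCategory([
  "./plugin-sdk/channel-types",
  "./plugin-sdk/channel-inbound",
  "./plugin-sdk/memory-core-store",
  "./plugin-sdk/speech-engine",
]);
```

Feeding Object.keys of the real exports field into countByCategory would give exact counts in place of the approximate figures in the table.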

Import Boundary Rules

A core architectural constraint of OpenClaw is that all external extensions (packages in extensions/ workspace and third-party npm packages) can only be imported from openclaw/plugin-sdk/*. Direct references to internal modules in src/ are not allowed, relative paths are not allowed to cross package boundaries, and references to paths not declared in exports are not allowed.

This rule is enforced in CI through four custom lint rules:

Lint rules	Function
lint:extensions:no-plugin-sdk-internal	Prohibit code in extensions/ from importing the internal implementation files of plugin-sdk (non-exports declaration paths)
lint:extensions:no-relative-outside-package	Prohibit code in extensions/ from using relative paths to reference files outside the package
lint:extensions:no-src-outside-plugin-sdk	Prohibit code in extensions/ from directly referencing any module under src/ that is not plugin-sdk
lint:plugins:no-extension-imports	Prohibit src/ core code from back-referencing modules in extensions/ (to prevent reverse dependencies)

Together, these four rules form a strict Dependency Firewall: the boundary between core code and extension code is one-way, controlled, and auditable.
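The three extension-side rules can be approximated by classifying each import specifier found in an extension file. The sketch below is not the repository's rule implementation: checkExtensionImport and its parameters are hypothetical, paths are assumed to be POSIX-style and repository-relative, and the set of declared SDK subpaths is supplied by the caller. The rule names in the verdict are borrowed from the table only as labels.

```typescript
type ImportVerdict =
  | { allowed: true }
  | { allowed: false; rule: string };

const SDK_IMPORT_PREFIX = "openclaw/plugin-sdk/";

// Resolves a relative specifier against a directory, POSIX-style.
function resolveRelative(fromDir: string, specifier: string): string {
  const parts = fromDir.split("/").filter((part) => part !== "");
  for (const segment of specifier.split("/")) {
    if (segment === "" || segment === ".") continue;
    if (segment === "..") parts.pop();
    else parts.push(segment);
  }
  return parts.join("/");
}

function checkExtensionImport(
  specifier: string,
  importerFile: string, // e.g. "extensions/matrix/src/index.ts"
  packageRoot: string, // e.g. "extensions/matrix"
  declaredSdkSubpaths: ReadonlySet<string>
): ImportVerdict {
  if (specifier.startsWith(".")) {
    // Relative imports must stay inside the extension's own package.
    const importerDir = importerFile.slice(0, importerFile.lastIndexOf("/"));
    const target = resolveRelative(importerDir, specifier);
    const inside = target === packageRoot || target.startsWith(packageRoot + "/");
    return inside ? { allowed: true } : { allowed: false, rule: "no-relative-outside-package" };
  }
  if (specifier.startsWith(SDK_IMPORT_PREFIX)) {
    // Only subpaths declared in exports count as public SDK surface.
    const subpath = specifier.slice(SDK_IMPORT_PREFIX.length);
    return declaredSdkSubpaths.has(subpath)
      ? { allowed: true }
      : { allowed: false, rule: "no-plugin-sdk-internal" };
  }
  if (specifier.startsWith("openclaw/")) {
    // Any other path into the core package bypasses the SDK.
    return { allowed: false, rule: "no-src-outside-plugin-sdk" };
  }
  return { allowed: true }; // third-party npm dependency
}
```

The fourth rule, which stops core code from importing extensions, would be the mirror image: scan files under src/ and reject any specifier that resolves into extensions/.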

Five Plugin Types

OpenClaw defines five plugin types, each corresponding to a set of sub-paths in plugin-sdk:

Channel Plugin: Implements a new messaging platform adapter. Complete implementations of channel-inbound and channel-reply-pipeline must be provided. channel.id must be declared in the manifest file.

Provider Plugin: Connect to a new AI model provider. The interfaces defined in provider-types need to be implemented, including model enumeration, Chat Completion stream, Embedding, etc.

Tool Plugin: Adds new callable tools for Agent. Register tool definitions through tool-* subpaths, including JSON Schema parameter descriptions and execution functions.

Skill Plugin: A prepackaged composite capability (such as "search web pages and summarize") that can contain the orchestration logic of multiple tools.

Media Plugin: Register a custom media processor to handle files of specific MIME types.

Plugin Manifest: openclaw.plugin.json

The metadata of each plugin is declared by openclaw.plugin.json in the package root directory:

{
  "id": "openclaw-channel-matrix",
  "version": "2026.4.1",
  "type": "channel",
  "channel": {
    "id": "matrix",
    "displayName": "Matrix",
    "supportsGroups": true,
    "supportsDM": true
  },
  "install": {
    "npmSpec": "@openclaw/channel-matrix@latest"
  },
  "minCoreVersion": "2026.3.1",
  "entrypoint": "./dist/index.js"
}

Key fields: id is a globally unique identifier; channel.id must be provided when type is channel and is used for routing table matching; install.npmSpec specifies the npm package identifier used during installation; minCoreVersion declares the minimum compatible OpenClaw core version.
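The key-field rules can be expressed as a small validator. The article says the real loader validates the manifest with a Zod schema; the sketch below uses hand-written checks instead so that it has no dependencies. checkManifest is a hypothetical name, and which fields are optional is an assumption of this sketch: only id, type, and the conditional channel.id are treated as required.

```typescript
function isPlainObject(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Returns a list of problems; an empty list means the manifest passed.
function checkManifest(raw: unknown): string[] {
  if (!isPlainObject(raw)) return ["manifest must be a JSON object"];
  const errors: string[] = [];

  const id = raw["id"];
  if (typeof id !== "string" || id.length === 0) {
    errors.push("id: expected a non-empty string");
  }

  const type = raw["type"];
  if (typeof type !== "string") errors.push("type: expected a string");

  // channel.id is mandatory only for channel plugins.
  if (type === "channel") {
    const channel = raw["channel"];
    if (!isPlainObject(channel) || typeof channel["id"] !== "string") {
      errors.push('channel.id: required when type is "channel"');
    }
  }

  // Assumption: install and minCoreVersion are checked only when present.
  const install = raw["install"];
  if (install !== undefined && (!isPlainObject(install) || typeof install["npmSpec"] !== "string")) {
    errors.push("install.npmSpec: expected a string");
  }

  const minCore = raw["minCoreVersion"];
  if (minCore !== undefined && (typeof minCore !== "string" || !/^\d{4}\.\d{1,2}\.\d{1,2}$/.test(minCore))) {
    errors.push("minCoreVersion: expected YYYY.M.D");
  }

  return errors;
}
```

Collecting every problem instead of stopping at the first one lets an install command report all manifest issues in a single pass.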

Local Barrel Files: api.ts And runtime-api.ts

There are two important Barrel Files inside src/plugin-sdk/: api.ts and runtime-api.ts.

api.ts aggregates all pure type exports: interface definitions, type aliases, enumerations, and so on. It is a compile-time dependency and contains no runtime code. runtime-api.ts aggregates the modules that need a runtime implementation: factory functions, registries, event emitters, and so on. Splitting the two means that a plugin that needs only type information (for example, pure TypeScript type guards) can depend on api.ts alone without pulling in any runtime code, which keeps it tree-shaking friendly.

Plugin Installation And Dependency Constraints

Plugin installation is performed through npm install --omit=dev, and only production dependencies are installed. Key constraint: The use of workspace:* protocols as dependencies is prohibited in the plugin's package.json - this is because third-party plugins are not in the monorepo workspace context when installed on the user's machine, and workspace:* will fail to resolve. There are special checking scripts in CI to intercept such violations.
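A check of this kind is simple to express over a parsed package.json. The sketch below is an illustration and not the CI script itself: checkPluginDependencies is a hypothetical name, and the second check encodes the earlier statement that openclaw belongs in devDependencies or peerDependencies.

```typescript
// Minimal view of the package.json fields this check needs.
interface PluginPackageJson {
  name: string;
  dependencies?: Record<string, string>;
  devDependencies?: Record<string, string>;
  peerDependencies?: Record<string, string>;
}

// Returns human-readable problems; an empty list means the package passed.
function checkPluginDependencies(pkg: PluginPackageJson): string[] {
  const problems: string[] = [];
  const deps = pkg.dependencies ?? {};

  for (const name of Object.keys(deps).sort()) {
    const range = deps[name] ?? "";
    // npm install on a user's machine cannot resolve the workspace protocol.
    if (range.startsWith("workspace:")) {
      problems.push(`${name}: "${range}" uses the workspace protocol`);
    }
  }

  // The core package should be a dev or peer dependency, not a runtime one.
  if ("openclaw" in deps) {
    problems.push("openclaw: declare the core in devDependencies or peerDependencies");
  }

  return problems;
}
```

Only dependencies is inspected because npm install --omit=dev skips devDependencies, so a workspace:* entry there does not break installation on a user's machine.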

Deprecation of Legacy Provider Compatibility Paths

v2026.3.31 is a Breaking Change version. Previously, a set of legacy subpaths prefixed with provider-compat-* were retained in the plugin-sdk for backward compatibility with earlier Provider interfaces. v2026.3.31 officially removed these paths. Third-party Provider plugins that rely on the old interface must be migrated to the new provider-* subpaths. The migration guide is located at docs/migration/v2026.3.31-provider-compat.md.

Gateway Architecture

Gateway is the core runtime process of OpenClaw. It is not an optional component - all channel messages, Agent dispatch, plugin communication, and native client interactions are routed through the Gateway. To understand Gateway is to understand the full runtime of OpenClaw.

A Single Local Control Plane

Gateway's design philosophy is Single Local Control Plane - there is only one instance of Gateway running on the local machine, which is the communication hub for all components. The startup command openclaw start actually starts the Gateway process. Gateway listens for WebSocket connections on ws://127.0.0.1:18789 (the default port), while providing an HTTP endpoint on the same port.

All components are clients of Gateway: channel adapters (Telegram bot, Discord bot, etc.) internally report inbound messages to Gateway through WebSocket; Agent receives tasks from Gateway during runtime and returns responses; native applications (macOS, iOS, Android) connect to Gateway through WebSocket to obtain real-time status; Control UI (Web interface) is also a WebSocket client.

Typed protocol: protocol/schema

Gateway's WebSocket protocol is fully typed. The protocol definition is located in src/gateway/protocol/schema.ts, which aggregates and re-exports all submodules from the src/gateway/protocol/schema/ directory. Each submodule corresponds to one protocol domain:

// src/gateway/protocol/schema/sessions.ts
import { z } from 'zod';

// Request to change per-session settings; every patch field is optional.
export const SessionPatchRequest = z.object({
  method: z.literal('sessions.patch'),
  params: z.object({
    sessionId: z.string(),
    patch: z.object({
      thinkingLevel: z.enum(['off', 'minimal', 'low', 'medium', 'high', 'xhigh']).optional(),
      activeAgent: z.string().optional(),
      queueMode: z.enum(['sequential', 'parallel']).optional(),
    }),
  }),
});

export const SessionPatchResponse = z.object({
  result: z.object({
    sessionId: z.string(),
    applied: z.record(z.unknown()),
  }),
});

// src/gateway/protocol/schema/nodes.ts
import { z } from 'zod';

export const NodeListRequest = z.object({
  method: z.literal('node.list'),
  params: z.object({
    filter: z.object({
      type: z.enum(['channel', 'agent', 'plugin', 'tool']).optional(),
      status: z.enum(['online', 'offline', 'error']).optional(),
    }).optional(),
  }),
});

export const NodeDescribeRequest = z.object({
  method: z.literal('node.describe'),
  params: z.object({ nodeId: z.string() }),
});

export const NodeInvokeRequest = z.object({
  method: z.literal('node.invoke'),
  params: z.object({
    nodeId: z.string(),
    action: z.string(),
    payload: z.unknown(),
  }),
});

The protocol adopts a JSON-RPC-like request/response pattern. Core methods include:

Method	Purpose
sessions.patch	Modify session parameters (thinking level, active agent, queue mode, etc.)
sessions.list	List all active sessions and their status
node.list	List all registered nodes (channels, agents, plugins, tools)
node.describe	Get detailed information and capability statement of the specified node
node.invoke	Send operation instructions to the specified node (such as requiring the channel to send messages, requiring the Agent to perform tasks)

Swift Codegen

The macOS and iOS native apps need to communicate with Gateway. To keep the TypeScript protocol definitions and the Swift client code consistent, OpenClaw includes a Swift codegen step in the build process. The build script parses the Zod schemas in src/gateway/protocol/schema/ and generates the corresponding Swift Codable structs and enums. The generated code is located in apps/macos/Generated/ and apps/ios/Generated/. A protocol change therefore only requires editing the TypeScript schema; the Swift side is regenerated from it, which removes the risk of a manual sync being missed.

Authentication Model

Gateway supports three authentication modes, arranged in order of priority:

trusted-proxy: Gateway trusts requests from specific proxies (such as Nginx, Cloudflare Tunnel) and identifies them based on the HTTP header injected by the proxy. This is the recommended mode for production environments.

local-direct: When the WebSocket connection comes from 127.0.0.1, skip authentication and authorize directly. This is the default behavior for local development and standalone deployment.

gateway token: A static Token set through the configuration file, carried by the client through the Authorization: Bearer header during the WebSocket handshake. Used for remote access scenarios.

v2026.3.31 introduces an important security change: in trusted-proxy mode, Gateway refuses the connection if it detects multiple clients using the same shared token. Previously, a configuration in which several people shared one token was discouraged but still worked; the new version turns it into a hard error. The reason is that the sessions of different users cannot be told apart when they share a token, which leads to misrouted messages.
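The priority order and the shared-token rule can be sketched as two small functions. This is an illustration under stated assumptions, not Gateway's code: resolveAuthMode, bindToken, AuthConfig, trustedProxyAddresses, and identityHeader are all hypothetical names, and the token comparison is a plain string comparison where production code would want a constant-time one.

```typescript
type AuthMode = "trusted-proxy" | "local-direct" | "gateway-token";

interface HandshakeInfo {
  remoteAddress: string;
  headers: Record<string, string | undefined>; // header names in lower case
}

// Hypothetical configuration shape.
interface AuthConfig {
  trustedProxyAddresses: string[]; // addresses of proxies allowed to assert identity
  identityHeader: string;          // header the proxy injects, e.g. "x-forwarded-user"
  gatewayToken?: string;           // static token for remote access
}

// Try the three modes in the order given in the text; null means "reject".
function resolveAuthMode(req: HandshakeInfo, cfg: AuthConfig): AuthMode | null {
  const fromProxy = cfg.trustedProxyAddresses.includes(req.remoteAddress);
  if (fromProxy && req.headers[cfg.identityHeader]) return "trusted-proxy";
  if (req.remoteAddress === "127.0.0.1") return "local-direct";
  const auth = req.headers["authorization"];
  if (cfg.gatewayToken && auth === `Bearer ${cfg.gatewayToken}`) return "gateway-token";
  return null;
}

// Shared-token rule: a token stays bound to the first identity that presents it.
// Returns false when a different identity tries to reuse the same token.
function bindToken(bindings: Map<string, string>, token: string, identity: string): boolean {
  const owner = bindings.get(token);
  if (owner !== undefined && owner !== identity) return false;
  bindings.set(token, identity);
  return true;
}
```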

OpenAI Compatible Endpoints and Health Checks

Gateway exposes a set of OpenAI compatible endpoints at the HTTP layer:

/v1/models: Returns a list of all available models in the current configuration, in a format compatible with the OpenAI List Models API. This allows any OpenAI API-compatible client (such as Cursor, Continue, etc.) to directly use OpenClaw Gateway as a model provider.

/v1/embeddings: Provides a text embedding interface in a format compatible with the OpenAI Embeddings API. The backend routes requests to whichever Embedding Provider is actually configured (OpenAI, a local Ollama model, etc.).

The health check endpoint follows Kubernetes conventions:

/healthz: Liveness Probe, returns 200 as long as the Gateway process is running.

/readyz: Readiness Probe, returns 200 only when at least one channel connection is successful and the Agent runtime has been initialized. Used by load balancers to determine whether a node can receive traffic.

Control UI and Bridge Protocol

Gateway directly serves a web management interface, the Control UI. The UI is built with Lit 3 (a Web Components framework) and Vite (the build tool), and its source code lives in the ui/ directory. The build output is embedded in Gateway's static resources at release time and can be accessed directly over HTTP (by default at http://127.0.0.1:18789). The Control UI is itself a WebSocket client and keeps a long-lived connection to Gateway for real-time status updates.

The specification document of Bridge Protocol is located at docs/gateway/bridge-protocol.md, which defines the communication convention between native applications and Gateway - including message encoding format, heartbeat mechanism, reconnection strategy, and event subscription model. This document is the core reference for native app developers.

Channel System

OpenClaw supports 24 messaging channels in v2026.4.1. The core engineering challenge of the channel system is: how to abstract 24 messaging platforms with different characteristics and API styles into a unified set of inbound/outbound messaging models while retaining the unique capabilities of each platform.

The 24 Supported Channels

Channel	Underlying implementation	Type
WhatsApp	Baileys (reverse-engineered WhatsApp Web protocol)	Core channel (src/web/)
Telegram	grammY	Core channel (src/telegram/)
Slack	@slack/bolt	Core channel (src/slack/)
Discord	discord.js	Core channel (src/discord/)
Signal	signal-cli (Java child process)	Core channel (src/signal/)
BlueBubbles (iMessage)	BlueBubbles HTTP API	Core channel (src/imessage/), recommended
iMessage (legacy imsg)	Native AppleScript/osascript	Core channel, marked legacy
Google Chat	Google Chat API	Built-in extension
IRC	irc-framework	Built-in extension
Microsoft Teams	Teams SDK (upgraded in v2026.3.28)	Built-in extension
Matrix	matrix-js-sdk + @matrix-org/crypto-wasm	Built-in extension (extensions/)
Feishu	Feishu Open API	Built-in extension
LINE	@line/bot-sdk	Built-in extension
Mattermost	Mattermost REST API + WebSocket	Built-in extension
Nextcloud Talk	Nextcloud Talk API	Built-in extension
Nostr	nostr-tools	Built-in extension
Synology Chat	Synology Chat Webhook	Built-in extension
Tlon	Tlon API	Built-in extension
Twitch	tmi.js	Built-in extension
Zalo	Zalo Official Account API	Built-in extension (extensions/)
Zalo Personal	Zalo Personal API (ZaloUser)	Built-in extension (extensions/)
Voice Call	VoIP/SIP integration	Built-in extension (extensions/)
WeChat	@tencent-weixin/openclaw-weixin (iLink Bot API)	Official partner plugin
WebChat	Gateway built-in WebSocket chat	Core channel

Channel Contract Types

The type contract of the channel system is defined by three core files:

types.plugin.ts: Public types for plugin developers. The interfaces that channel plugins must implement are defined here, including ChannelAdapter (channel adapter), ChannelSender (message sender), and ChannelConfig (channel configuration schema).

types.core.ts: Core internal types, not exported through plugin-sdk. Contains routing table entries, session bindings, and the internal message envelope (Envelope) format.

types.adapters.ts: Adapter helper types, which define the mapping interface from each platform's SDK events to the unified inbound format.

Unified Messaging Abstraction

Unified messaging abstraction is the core design of the channel system. It is defined by three plugin-sdk subpaths:

channel-inbound: Defines a unified structure for all channel inbound messages. Regardless of whether the message comes from WhatsApp, Telegram or Discord, it is converted to the same InboundMessage type after being processed by the channel adapter. This type includes: channelId, peerId (sender ID), groupId (group ID, null for DM), content (text/media/mixed content), replyTo (reference message ID), timestamp, rawEvent (platform raw event, used for channel-specific logic).

channel-reply-pipeline: Defines the processing pipeline through which Agent responses pass. The pipeline stages include: content formatting (Markdown → platform-specific formatting) → long message chunking (per-channel chunking) → media attachment processing → platform API calls.

channel-send-result: Defines the unified structure of message sending results, including the message ID returned by the platform, sending status (success/failure/partial success), and error information.
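As an illustration of the inbound half of this abstraction, the sketch below maps a Telegram-style event onto the fields listed for InboundMessage. The field names come from the description above; everything else is an assumption. TelegramLikeUpdate is a simplified subset of a Telegram update, the content shape covers text only, and toInboundMessage is a hypothetical name rather than the adapter's real function.

```typescript
// Unified inbound shape, using the field names listed in the text.
interface InboundMessage {
  channelId: string;
  peerId: string;           // sender ID
  groupId: string | null;   // null for DMs
  content: { kind: "text"; text: string }; // text only in this sketch
  replyTo: string | null;   // ID of the message being replied to
  timestamp: number;        // milliseconds since the Unix epoch
  rawEvent: unknown;        // original platform event
}

// Simplified, assumed subset of a Telegram update.
interface TelegramLikeUpdate {
  message: {
    message_id: number;
    date: number; // seconds since the Unix epoch
    text: string;
    from: { id: number };
    chat: { id: number; type: "private" | "group" | "supergroup" };
    reply_to_message?: { message_id: number };
  };
}

function toInboundMessage(update: TelegramLikeUpdate): InboundMessage {
  const m = update.message;
  return {
    channelId: "telegram",
    peerId: String(m.from.id),
    groupId: m.chat.type === "private" ? null : String(m.chat.id),
    content: { kind: "text", text: m.text },
    replyTo: m.reply_to_message ? String(m.reply_to_message.message_id) : null,
    timestamp: m.date * 1000,
    rawEvent: update,
  };
}
```

Keeping rawEvent on the unified message is what lets channel-specific logic reach platform details without widening the shared type.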

Group routing: @mention gating and reply tags

In a group, the Agent does not respond to every message by default, since that would flood the group with noise. OpenClaw implements @mention gating: the Agent processes a message only when it contains an @mention of the Bot. This behavior can be overridden per channel or per group by switching the configuration to always mode (respond to all messages).
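The gating rule itself is small. A minimal sketch, assuming the mention appears as @handle in plain text (real platforms often deliver mentions as structured entities instead); shouldRespondInGroup and ActivationMode are hypothetical names:

```typescript
type ActivationMode = "mention" | "always";

// Decide whether a group message should reach the Agent.
function shouldRespondInGroup(text: string, botHandle: string, mode: ActivationMode): boolean {
  if (mode === "always") return true;
  // Escape regex metacharacters in the handle before building the pattern.
  const escaped = botHandle.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  // Match "@handle" at the start or after whitespace, and not as a prefix
  // of a longer handle such as "@handle2".
  const pattern = new RegExp(`(^|\\s)@${escaped}(?![A-Za-z0-9_])`);
  return pattern.test(text);
}
```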

Reply Tags solve another group problem: when multiple messages come in at the same time, the Agent's reply needs to mark which original message it answers. This is implemented in Telegram via reply_to_message_id, in Discord via a message reference, and in Slack via thread_ts. The channel adapter is responsible for mapping the unified replyTo field to the platform-specific reply mechanism.
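The mapping from the unified replyTo field to each platform's mechanism might look like the following. The three parameter names (reply_to_message_id, message_reference, thread_ts) are the platform fields named above; toPlatformReply and the ReplyParams union are hypothetical and only cover these three channels.

```typescript
type ReplyPlatform = "telegram" | "discord" | "slack";

type ReplyParams =
  | { platform: "telegram"; reply_to_message_id: number }
  | { platform: "discord"; message_reference: { message_id: string } }
  | { platform: "slack"; thread_ts: string };

// Translate the unified replyTo ID into the parameter each platform expects.
function toPlatformReply(platform: ReplyPlatform, replyTo: string): ReplyParams {
  switch (platform) {
    case "telegram":
      // Telegram message IDs are integers.
      return { platform, reply_to_message_id: Number(replyTo) };
    case "discord":
      return { platform, message_reference: { message_id: replyTo } };
    case "slack":
      // Slack identifies the parent message by its timestamp string.
      return { platform, thread_ts: replyTo };
  }
}
```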

Per-channel chunking of long messages is another place where platform differences are handled. Telegram limits a single message to 4096 characters, Discord to 2000 characters, and WhatsApp to about 65536 characters. The chunking stage in channel-reply-pipeline splits overly long Agent responses into multiple messages according to the limit of the target channel, while ensuring that structures such as code blocks and Markdown lists are not cut in the middle.
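One way to keep code blocks intact is to split on line boundaries and, when a split lands inside a fenced block, close the fence at the end of one chunk and reopen it at the start of the next. The sketch below shows that idea only; it is not the pipeline's implementation. chunkMessage is a hypothetical name, Markdown lists get no special handling, and a single line longer than the limit is not split further.

```typescript
const FENCE = "```";

function chunkMessage(text: string, limit: number): string[] {
  const chunks: string[] = [];
  let lines: string[] = [];            // lines of the chunk being built
  let size = 0;                        // length of lines.join("\n")
  let payload = 0;                     // lines added since the last flush
  let openFence: string | null = null; // opening fence line while inside a code block

  const flush = (): void => {
    if (payload === 0) return;
    if (openFence !== null) lines.push(FENCE);     // close the block in this chunk
    chunks.push(lines.join("\n"));
    lines = openFence !== null ? [openFence] : []; // and reopen it in the next one
    size = openFence !== null ? openFence.length : 0;
    payload = 0;
  };

  for (const line of text.split("\n")) {
    const isFence = line.startsWith(FENCE);
    // Will we be inside a code block once this line has been added?
    const insideAfter = isFence ? openFence === null : openFence !== null;
    // Keep room for a closing fence in case the next split falls inside the block.
    const reserve = insideAfter ? FENCE.length + 1 : 0;
    const added = (lines.length > 0 ? 1 : 0) + line.length;
    if (size + added + reserve > limit) flush();
    size += (lines.length > 0 ? 1 : 0) + line.length;
    lines.push(line);
    payload += 1;
    if (isFence) openFence = insideAfter ? line : null;
  }
  openFence = null; // input exhausted; an unterminated block is left as written
  flush();
  return chunks;
}
```

With the limits quoted above, a caller would pass 4096 for Telegram or 2000 for Discord as the second argument.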

DM Security Policy

The private messaging (DM) scenario has an independent security model, defined by the channel-dm-security subpath. The core is the dmPolicy configuration item, which supports three modes:

pairing: Users must send a pairing code before activating a DM conversation. The pairing code is generated by the openclaw pair command and is used once. This is the safest mode.

allowlist: Only user IDs/mobile phone numbers listed in the allowFrom configuration can initiate DM conversations.

open: Anyone can start a DM conversation directly. It is only recommended for use in a controlled environment (such as intranet deployment).
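The three modes reduce to a single admission check. A minimal sketch, assuming pairing codes are tracked in memory and that a user pairs by sending the code as the message text; admitDm and DmSecurityState are hypothetical names, while dmPolicy and allowFrom follow the configuration keys named above.

```typescript
type DmPolicy = "pairing" | "allowlist" | "open";

interface DmSecurityState {
  dmPolicy: DmPolicy;
  allowFrom: string[];       // user IDs or phone numbers
  pendingCodes: Set<string>; // issued pairing codes that have not been used yet
  pairedPeers: Set<string>;  // peers that completed pairing
}

// Decide whether a DM from peerId may start or continue a conversation.
function admitDm(state: DmSecurityState, peerId: string, text: string): boolean {
  switch (state.dmPolicy) {
    case "open":
      return true;
    case "allowlist":
      return state.allowFrom.includes(peerId);
    case "pairing": {
      if (state.pairedPeers.has(peerId)) return true;
      // Set.delete returns true only if the code was present, so each code works once.
      if (state.pendingCodes.delete(text.trim())) {
        state.pairedPeers.add(peerId);
        return true;
      }
      return false;
    }
  }
}
```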

Channel-specific highlights

WhatsApp: Based on the Baileys library, which speaks the WhatsApp Web protocol. The first connection requires scanning a QR code to log in; the QR code is rendered as ASCII art in the terminal and as an image in the Control UI. Session credentials are persisted to the local file system, so later starts restore the session automatically.

Telegram: Supports two operating modes - Long Polling (default) and Webhook mode. In Webhook mode, Gateway registers a public HTTPS endpoint to receive Telegram push, which has lower latency but requires an address reachable by the public network (usually implemented through Cloudflare Tunnel or ngrok). The grammY framework provides a complete type-safe encapsulation of the Bot API.

Discord: Supports two interaction modes: native Slash Commands (/ask, /image, etc.) and plain text commands. discord.js provides a rich event model, and OpenClaw leverages its Message Component capabilities to implement interactive buttons and selection menus.

Microsoft Teams: Version v2026.3.28 includes a major upgrade to Teams integration, moving to a new version of the Teams SDK. The new version supports streaming replies (Streaming Replies). The Agent's replies can stream into a Teams conversation in real time with AI annotation tags, making it clear that the message came from AI.

WeChat: Implemented through an official partnership, using the @tencent-weixin/openclaw-weixin package, which talks to the iLink Bot API underneath. Currently only private messages are supported; group chats are not. The v2.x version of the package requires OpenClaw core version ≥ 2026.3.22.

Agent Runtime

OpenClaw's Agent runtime is built on top of the Pi Agent. This is not a self-developed Agent framework, but a deep integration of external libraries @mariozechner/pi-agent-core@0.64.0 and @mariozechner/pi-ai@0.64.0. The Pi ecosystem also includes pi-coding-agent (a special agent for code generation) and pi-tui (terminal UI).

RPC Mode: Tool Streaming and Block Streaming

When Pi Agent runs in RPC mode, it supports two streaming output protocols:

Tool Streaming: When the Agent calls a tool, the execution process and intermediate results of the tool are returned in a streaming manner. For example, when the Agent calls the search tool, each search result is pushed as a stream event, instead of waiting for all results to be returned before outputting them all at once.

Block Streaming: The agent's text response is streamed out in blocks. A "block" can be a paragraph, a block of code, or a list. Block streaming is more suitable for message channel scenarios than token-by-token streaming - the channel adapter can send each block immediately when it is completed, instead of accumulating the entire response and sending it, and also avoids the frequent API calls caused by token-by-token sending.
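To make the difference from token-by-token streaming concrete, here is a sketch of a block assembler: it buffers token deltas and emits a block each time a blank line closes a paragraph, while treating a fenced code block as one unit. This is an illustration of the idea, not Pi Agent's implementation; BlockAssembler and its methods are hypothetical names, and blank lines are assumed to be the only block boundary.

```typescript
class BlockAssembler {
  private buffer = "";

  // Feed one token delta; returns the blocks that this delta completed.
  push(delta: string): string[] {
    this.buffer += delta;
    const done: string[] = [];
    let cut: number;
    while ((cut = this.findBoundary()) !== -1) {
      const block = this.buffer.slice(0, cut).trim();
      this.buffer = this.buffer.slice(cut + 2);
      if (block) done.push(block);
    }
    return done;
  }

  // Call when the model stream ends to release whatever is still buffered.
  end(): string[] {
    const rest = this.buffer.trim();
    this.buffer = "";
    return rest ? [rest] : [];
  }

  // Index of the first blank line that is not inside a code fence, or -1.
  private findBoundary(): number {
    let from = 0;
    for (;;) {
      const idx = this.buffer.indexOf("\n\n", from);
      if (idx === -1) return -1;
      const fences = this.buffer.slice(0, idx).split("```").length - 1;
      if (fences % 2 === 0) return idx; // an even fence count means we are outside a code block
      from = idx + 1;
    }
  }
}
```

A channel adapter would send each returned block as its own message, which is how one API call per block replaces one call per token.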

Session Model

OpenClaw's Session model is key to understanding message routing. Each Agent maintains multiple independent sessions (Session), and the sessions are completely isolated:

DM Session: Conversations with each DM user constitute an independent session. A session is uniquely identified by the (agentId, channelId, peerId) triplet.

Group Session: Each group has an independent session, identified by the (agentId, channelId, groupId) triplet. Group conversations are completely isolated from DM conversations - Agents cannot see the private chat history of the same user in the group, and vice versa.
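The two triplets translate directly into a lookup key. In the sketch below, sessionKey and SessionScope are hypothetical names; the key is serialized with JSON.stringify so that IDs containing separator characters cannot collide, which is a design choice of this example rather than something the source specifies.

```typescript
interface SessionScope {
  agentId: string;
  channelId: string;
  peerId: string;
  groupId: string | null; // null for DMs
}

// DM sessions are keyed by (agentId, channelId, peerId);
// group sessions by (agentId, channelId, groupId).
function sessionKey(s: SessionScope): string {
  return s.groupId === null
    ? JSON.stringify(["dm", s.agentId, s.channelId, s.peerId])
    : JSON.stringify(["group", s.agentId, s.channelId, s.groupId]);
}
```

Because the group key ignores peerId, every member of a group shares one session, while the same user's DM stays under a separate key.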

The activation mode of the session controls when the Agent responds: in mention mode, only @mention triggers a response; in always mode, all messages trigger a response. The default is always for DM conversations and mention for group conversations.

Queue Mode controls the processing strategy of concurrent messages: in sequential mode, messages are processed one by one in strict order of reception; in parallel mode, multiple messages can be processed in parallel (suitable for stateless tool calling scenarios).
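A promise chain is enough to express both modes. This is a minimal sketch under the assumption that each inbound message is handled by an async task; SessionQueue and enqueue are hypothetical names.

```typescript
type QueueMode = "sequential" | "parallel";

class SessionQueue {
  private readonly mode: QueueMode;
  private tail: Promise<unknown> = Promise.resolve();

  constructor(mode: QueueMode) {
    this.mode = mode;
  }

  // Schedule a task according to the queue mode and return its result.
  enqueue<T>(task: () => Promise<T>): Promise<T> {
    if (this.mode === "parallel") return task(); // start immediately
    // Sequential: start only after the previous task has settled,
    // whether it succeeded or failed.
    const run = this.tail.then(task, task);
    this.tail = run.catch(() => undefined); // keep the chain alive after a failure
    return run;
  }
}
```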

Reply-back routing ensures that the Agent's response is sent to the correct channel and conversation. When the Agent triggers a cross-channel operation through a tool call (for example, a user in a Telegram conversation asks the Agent to post a message to a Slack channel), reply-back routing is responsible for delivering the result of the operation back to the Telegram conversation that initiated the request.

Session Tools: Inter-Agent Coordination

Three built-in tools enable Agents to have cross-session/cross-Agent coordination capabilities:

// sessions_list: list all currently active sessions
{
  description: 'List all active sessions with their channel, peer, and status',
  parameters: {
    filter: { type: 'object', properties: {
      channelId: { type: 'string' },
      status: { enum: ['active', 'idle', 'archived'] }
    }}
  }
}

// sessions_history: read the message history of the specified session
{
  description: 'Read message history from a specific session',
  parameters: {
    sessionId: { type: 'string' },
    limit: { type: 'number', default: 50 }
  }
}

// sessions_send: send a message to the specified session (Agent-to-Agent communication)
{
  description: 'Send a message to a specific session (enables agent-to-agent coordination)',
  parameters: {
    sessionId: { type: 'string' },
    content: { type: 'string' }
  }
}

sessions_send is the key to multi-Agent coordination. Agent A can discover Agent B's session through sessions_list and send instructions or queries to Agent B through sessions_send. Agent B's response is returned to Agent A's session context through reply-back routing.

Multi-Agent Routing

OpenClaw supports running multiple Agents in the same instance, and each Agent has independent configuration and session space. Routing rules are defined in the configuration file and support distributing inbound messages to different Agents according to the three dimensions of channel, account, and peer:

# config.yaml: multi-Agent routing example
agents:
  - id: general-assistant
    provider: openai
    model: gpt-4o
    routes:
      - channel: telegram
        account: "@mybot"
      - channel: discord
        account: "bot-token-1"
  - id: coding-helper
    provider: anthropic
    model: claude-sonnet-4-20250514
    routes:
      - channel: slack
        account: "workspace-1"
        peers: ["U12345678"]  # Only messages from these users are routed to this Agent

Each Agent has independent workspace and session storage to achieve complete isolation.
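A matcher over the three dimensions could look like the sketch below. It assumes that an omitted account or peers field matches anything and that the first matching Agent wins; neither rule is confirmed by the source, and pickAgent, routeMatches, and the interfaces are hypothetical names.

```typescript
interface Route {
  channel: string;
  account?: string;
  peers?: string[];
}

interface AgentRouting {
  id: string;
  routes: Route[];
}

interface InboundRef {
  channel: string;
  account: string;
  peer: string;
}

// A route matches when every dimension it specifies agrees with the message.
function routeMatches(route: Route, msg: InboundRef): boolean {
  return (
    route.channel === msg.channel &&
    (route.account === undefined || route.account === msg.account) &&
    (route.peers === undefined || route.peers.includes(msg.peer))
  );
}

// Assumption: Agents are tried in configuration order and the first match wins.
function pickAgent(agents: AgentRouting[], msg: InboundRef): string | null {
  for (const agent of agents) {
    if (agent.routes.some((route) => routeMatches(route, msg))) return agent.id;
  }
  return null;
}
```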

Agent Workspace and injection files

The runtime context for each Agent is provided by the ~/.openclaw/workspace/ directory. Three special files in this directory will be automatically injected into the Agent’s system prompts:

AGENTS.md: Defines the Agent's roles, behavioral guidelines, and constraints. This is the core definition file for the Agent personality.

SOUL.md: More fine-grained personality description - tone, conversation style, knowledge field preferences, etc.

TOOLS.md: Tool usage guide, telling Agent the usage scenarios and best practices of each available tool.

These three files are all in Markdown format and can be edited freely by users. No need to restart after modification - Gateway will check the mtime of the file before processing each session message, and reload it if there is any change.

Session persistence, pruning and compaction

Session history is persisted to ~/.openclaw/agents//sessions/*.jsonl in JSONL (JSON Lines) format. One file per session, one message per line. The JSONL format was chosen carefully: it supports append-only writing (crash safety), supports line-wise incremental reading (memory efficiency), and can be inspected directly with standard text tools.
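The crash-safety argument is easiest to see in code. The sketch below works on strings instead of files so that it stays self-contained (a real implementation would append to the session file on disk). StoredMessage is an assumed record shape, and appendJsonl and parseJsonl are hypothetical names.

```typescript
// Assumed record shape; the actual stored fields are not specified in the text.
interface StoredMessage {
  role: "user" | "assistant";
  content: string;
  timestamp: number;
}

// Append-only write: existing bytes are never rewritten.
function appendJsonl(file: string, msg: StoredMessage): string {
  return file + JSON.stringify(msg) + "\n";
}

// Line-wise read that tolerates a torn final line left by a crash mid-append.
function parseJsonl(file: string): StoredMessage[] {
  const messages: StoredMessage[] = [];
  for (const line of file.split("\n")) {
    if (line.trim() === "") continue;
    try {
      messages.push(JSON.parse(line) as StoredMessage);
    } catch {
      break; // stop at the first corrupt line, which is expected only at the tail
    }
  }
  return messages;
}
```

Since JSON.stringify escapes newlines inside strings, each message is guaranteed to occupy exactly one line.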

Long-running sessions can accumulate large amounts of history, causing context window overflow and increased latency. OpenClaw provides two coping mechanisms:

Session Pruning: Automatically deletes messages older than a configured time window (default 7 days). Pruning is lazy: it is triggered when the session is activated.

Session Compaction: Triggered manually via the /compact command. Compaction calls an AI model to condense the long history into a contextual summary that replaces the original message-by-message record. Compacted session files can shrink by more than 80% while retaining key contextual information.
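Of the two mechanisms, pruning is the one that needs no model call. A sketch of the time-window rule, assuming each stored message carries a millisecond timestamp; pruneSession is a hypothetical name and the 7-day default mirrors the figure quoted above.

```typescript
const DAY_MS = 24 * 60 * 60 * 1000;

// Keep only the messages that fall inside the retention window.
function pruneSession<T extends { timestamp: number }>(
  messages: T[],
  now: number,
  windowMs: number = 7 * DAY_MS,
): T[] {
  const cutoff = now - windowMs;
  return messages.filter((message) => message.timestamp >= cutoff);
}
```

Running this when a session is activated, rather than on a timer, matches the lazy behavior described for pruning.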

Thinking Levels and Idle-stream Timeout

OpenClaw exposes granular control over the "depth of thinking" of AI models. The thinkingLevel parameter supports six levels:

Level	Behavior
off	Disable Extended Thinking and generate responses directly
minimal	Minimum thinking budget
low	Low thinking budget, suitable for simple tasks
medium	Medium thinking budget, default
high	High thinking budget, suitable for complex reasoning
xhigh	Extremely high thinking budget, used in scenarios that require deep reasoning

Thinking Level can be adjusted dynamically at the session level through the sessions.patch protocol method, or a global default can be set in the configuration file. Providers that support extended thinking (such as Anthropic Claude) will adjust the budget limit of thinking tokens based on the level.

The idle-stream timeout introduced in v2026.3.31 solves a practical operations problem: when the model stream produces no new tokens for a long time (for example, because the model server is stuck or the network has dropped), the Agent would otherwise wait without releasing the session lock, and every subsequent message in that session would pile up behind it. The idle-stream timeout sets a limit (default 120 seconds): when the stream delivers no new data within that time, the Agent actively aborts the stream and returns a partial response or an error message. The timeout can be adjusted per Provider in the configuration file; a longer timeout may be needed when using local Ollama models.
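The mechanism can be sketched as a wrapper around any async stream: every chunk races against a timer that is re-armed for each read, so only silence (not total duration) triggers the abort. This is an illustration, not the Agent runtime's code; withIdleTimeout and IdleStreamTimeoutError are hypothetical names.

```typescript
class IdleStreamTimeoutError extends Error {}

// Re-yield chunks from source, failing if no chunk arrives within idleMs.
async function* withIdleTimeout<T>(source: AsyncIterable<T>, idleMs: number): AsyncGenerator<T> {
  const iterator = source[Symbol.asyncIterator]();
  try {
    for (;;) {
      let timer: ReturnType<typeof setTimeout> | undefined;
      const timeout = new Promise<never>((_, reject) => {
        timer = setTimeout(
          () => reject(new IdleStreamTimeoutError(`no data for ${idleMs} ms`)),
          idleMs,
        );
      });
      // Whichever settles first wins; the timer is cleared either way.
      const next = await Promise.race([iterator.next(), timeout]).finally(() => clearTimeout(timer));
      if (next.done) return;
      yield next.value;
    }
  } finally {
    // Ask the underlying stream to release its resources. Not awaited,
    // because a stuck source might never answer.
    void iterator.return?.();
  }
}
```

A caller that catches IdleStreamTimeoutError still holds every chunk received so far, which is what makes returning a partial response possible.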

Memory System

The personalization capabilities of an AI assistant depend on the depth of its memory system. OpenClaw's memory subsystem, memory-core, is the most finely split module in the entire project: it consists of 13 sub-modules, all exported through plugin-sdk. The design goal is clear: all memory data is persisted as local Markdown files and SQLite databases that users can edit directly, version with Git, and use offline.

Division of responsibilities of 13 sub-modules

There are 13 memory-related export paths in plugin-sdk, and each path corresponds to an independent compilation unit:

Export path	Responsibilities
memory-core	Root module: defines the MemoryStore interface, the MemoryEntry type, the TTL policy, and the serialization contract
memory-core-engine-runtime	Engine runtime: binds memory operations to the lifecycle of the current Agent runtime
memory-core-host-engine-embeddings	Embedding engine host: schedules embedding model calls to compute vectors and manages the batch embedding queue
memory-core-host-engine-foundation	Foundation engine host: provides tokenizer binding, vector dimension negotiation, and distance metric selection
memory-core-host-engine-qmd	QMD (Query-Memory-Document) engine: semantically matches user queries against memory documents
memory-core-host-engine-storage	Storage engine host: abstracts the underlying storage backend (SQLite, LanceDB) and provides unified CRUD
memory-core-host-multimodal	Multimodal memory: handles indexing and retrieval of non-text memory items such as images and audio
memory-core-host-query	Query host: builds semantic search queries that combine keyword filtering and vector similarity
memory-core-host-runtime-cli	CLI runtime host: exposes terminal commands such as openclaw memory search
memory-core-host-runtime-core	Core runtime host: memory system initialization, migration, and lifecycle management
memory-core-host-runtime-files	File runtime host: watches Markdown memory files for changes and triggers re-indexing
memory-core-host-secret	Secret host: manages the encryption keys for memory storage and SecretRef resolution
memory-core-host-status	Status host: reports indexing progress, vector count, recent query latency, and other operational metrics

This split method follows OpenClaw's plugin architecture principles: each sub-module can be replaced or disabled independently, and the core system only relies on the interface defined by the memory-core root module and does not directly rely on any specific storage backend.

Local Markdown file persistence

OpenClaw's memory system stores user preferences and long-term context as local Markdown files, located by default in the ~/.openclaw/memory/ directory. Each memory file is standard Markdown with YAML front-matter metadata:

---
type: preference
created: 2026-03-15T08:22:00Z
updated: 2026-04-01T14:30:00Z
tags: [coding-style, language]
---

# Coding Preferences

- Preferred language: TypeScript with strict mode
- Tab width: 2 spaces
- Always use explicit return types
- Prefer functional composition over class inheritance

The core advantages of this design lie in three points: first, users can directly modify the memory content using any text editor without entering the OpenClaw interface; second, memory files can be included in Git version control, and team members can share and synchronize preference configurations; third, memory content is completely available offline and does not rely on any cloud services. The memory-core-host-runtime-files module detects changes in Markdown files through file system monitoring (fs.watch) and automatically triggers the re-indexing process - parsing front-matter, extracting text, calculating embedding vectors, and updating vector storage.
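The first step of re-indexing, separating front-matter from body text, can be sketched without a YAML library for files as simple as the example above. This is deliberately not a full YAML parser: it handles only flat key: value pairs and inline [a, b] lists. parseMemoryFile and MemoryFile are hypothetical names.

```typescript
interface MemoryFile {
  meta: Record<string, string | string[]>;
  body: string;
}

function parseMemoryFile(source: string): MemoryFile {
  // Front-matter is the block between the first two "---" lines.
  const match = /^---\r?\n([\s\S]*?)\r?\n---\r?\n?([\s\S]*)$/.exec(source);
  if (!match) return { meta: {}, body: source };

  const meta: Record<string, string | string[]> = {};
  for (const line of match[1].split(/\r?\n/)) {
    const sep = line.indexOf(":"); // split on the first colon only, so timestamps survive
    if (sep === -1) continue;
    const key = line.slice(0, sep).trim();
    const raw = line.slice(sep + 1).trim();
    meta[key] =
      raw.startsWith("[") && raw.endsWith("]")
        ? raw.slice(1, -1).split(",").map((item) => item.trim()).filter(Boolean)
        : raw;
  }
  return { meta, body: match[2].trim() };
}
```

The body returned here is what would then be chunked and embedded, with the metadata stored alongside each vector.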

Vector storage: sqlite-vec and LanceDB dual backend

Semantic search relies on vector storage. OpenClaw provides two backend options:

sqlite-vec (version 0.1.9) is the default backend. It is a vector search extension for SQLite, declared in package.json as the npm dependency sqlite-vec@0.1.9. sqlite-vec stores vectors as BLOBs inside SQLite and performs exact (brute-force) nearest neighbor search, with optional vector quantization to reduce storage and scan cost. For individual use cases, typically with memory entries on the order of hundreds to thousands, exact KNN is efficient enough, with query latencies in the sub-millisecond range. These properties fit OpenClaw's local-first philosophy well: a single-file database, zero external dependencies, and straightforward backup and migration.

memory-lancedb is the second backend, also exported through plugin-sdk. LanceDB is an embedded vector database built on the Lance columnar format, and it supports IVF-PQ indexes. It is suited to scenarios where memory entries number in the hundreds of thousands. The memory-core-host-engine-storage module isolates the two backends behind a unified storage abstraction layer, so upper-layer code does not need to be aware of the implementation differences underneath:

// memory-core-host-engine-storage abstract interface
export interface VectorStorageBackend {
  insert(entries: MemoryEntry[]): Promise<void>;
  search(query: Float32Array, topK: number, filter?: MemoryFilter): Promise<ScoredEntry[]>;
  delete(ids: string[]): Promise<void>;
  count(): Promise<number>;
  vacuum(): Promise<void>;
}
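To make the contract concrete, here is a toy in-memory backend with the same five methods, doing exact KNN by cosine similarity. It is a sketch for illustration only and stands in for neither sqlite-vec nor LanceDB. The shapes of MemoryEntry, MemoryFilter, and ScoredEntry, and the Promise<void> and Promise<number> return types, are assumptions made so the example is self-contained.

```typescript
// Assumed shapes; the real types live in memory-core.
interface MemoryEntry { id: string; vector: Float32Array; tags: string[] }
interface MemoryFilter { tag?: string }
interface ScoredEntry { entry: MemoryEntry; score: number }

interface VectorStorageBackend {
  insert(entries: MemoryEntry[]): Promise<void>;
  search(query: Float32Array, topK: number, filter?: MemoryFilter): Promise<ScoredEntry[]>;
  delete(ids: string[]): Promise<void>;
  count(): Promise<number>;
  vacuum(): Promise<void>;
}

function cosine(a: Float32Array, b: Float32Array): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return normA === 0 || normB === 0 ? 0 : dot / Math.sqrt(normA * normB);
}

class InMemoryBackend implements VectorStorageBackend {
  private readonly entries = new Map<string, MemoryEntry>();

  async insert(entries: MemoryEntry[]): Promise<void> {
    for (const entry of entries) this.entries.set(entry.id, entry); // upsert by id
  }

  // Exact KNN: score every entry that passes the filter, then keep the best topK.
  async search(query: Float32Array, topK: number, filter?: MemoryFilter): Promise<ScoredEntry[]> {
    const tag = filter?.tag;
    return Array.from(this.entries.values())
      .filter((entry) => tag === undefined || entry.tags.includes(tag))
      .map((entry) => ({ entry, score: cosine(query, entry.vector) }))
      .sort((x, y) => y.score - x.score)
      .slice(0, topK);
  }

  async delete(ids: string[]): Promise<void> {
    for (const id of ids) this.entries.delete(id);
  }

  async count(): Promise<number> {
    return this.entries.size;
  }

  async vacuum(): Promise<void> {
    // Nothing to reclaim for an in-memory map.
  }
}
```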

Embedding pipeline and semantic search

memory-core-host-engine-embeddings manages the complete embedding pipeline. When a memory file is created or modified, this module performs the following steps:

Parse the Markdown file and split the text into paragraph-level chunks, keeping each chunk within 512 tokens
Call the currently configured Embedding Model to compute vectors, using by default the embedding endpoint configured in the provider plugin
Write the vectors to vector storage together with their metadata (source file path, chunk offset, timestamp, tags)
Maintain an incremental index: recompute embeddings only for changed chunks and keep the existing vectors for unmodified chunks

memory-core-host-engine-qmd (the QMD engine) is responsible for semantic matching at query time. QMD stands for Query-Memory-Document, and the engine implements a three-stage retrieval process: first compute the embedding vector for the user query, then run a nearest neighbor search in the vector store to obtain a candidate set, and finally re-rank the candidates with BM25 keyword scoring. The memory-core-host-query module is responsible for constructing query objects, combining conditions such as semantic similarity thresholds, tag filters, and time ranges into a unified query descriptor.
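The third stage can be illustrated with a compact BM25 scorer over an already retrieved candidate set. This is a sketch under several assumptions, not the QMD engine: the tokenizer is a naive lowercase split, document statistics are computed over the candidates only, k1 and b use the common defaults 1.2 and 0.75, and bm25Rerank and Candidate are hypothetical names.

```typescript
interface Candidate {
  id: string;
  text: string;
  similarity: number; // vector similarity from the nearest neighbor stage
}

// Naive tokenizer: lowercase runs of ASCII letters and digits.
const tokenize = (s: string): string[] => s.toLowerCase().match(/[a-z0-9]+/g) ?? [];

// Re-rank candidates by BM25 score, breaking ties by vector similarity.
function bm25Rerank(query: string, candidates: Candidate[], k1 = 1.2, b = 0.75): Candidate[] {
  const docs = candidates.map((c) => tokenize(c.text));
  const avgLen = docs.reduce((sum, d) => sum + d.length, 0) / Math.max(docs.length, 1);
  const terms = Array.from(new Set(tokenize(query)));

  const scored = candidates.map((candidate, i) => {
    const doc = docs[i];
    let score = 0;
    for (const term of terms) {
      const tf = doc.filter((t) => t === term).length;
      if (tf === 0) continue;
      const df = docs.filter((d) => d.includes(term)).length;
      const idf = Math.log(1 + (docs.length - df + 0.5) / (df + 0.5));
      score += (idf * tf * (k1 + 1)) / (tf + k1 * (1 - b + (b * doc.length) / avgLen));
    }
    return { candidate, score };
  });

  return scored
    .sort((x, y) => y.score - x.score || y.candidate.similarity - x.candidate.similarity)
    .map((s) => s.candidate);
}
```

Re-ranking only the candidate set keeps the keyword pass cheap, since BM25 never has to touch the full memory store.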

The memory system is the cornerstone of OpenClaw's personalization capabilities. When the Agent runtime processes each round of dialogue, it retrieves relevant memories through memory-core-engine-runtime and injects them into the system prompt. This process is transparent to the user, but it directly affects how personalized the Agent's responses are: the Agent knows the user's preferred programming language, coding style, usual toolchain, and even project context established in past conversations.

Model Providers

OpenClaw's Model Provider system is the core of its multi-model support capabilities. There are more than 25 provider plugins exported through plugin-sdk, covering mainstream commercial APIs, open source inference engines, and cloud platform gateways. Each provider is an independent plugin and follows a unified registration, authentication and model directory protocol.

Provider Plugin Architecture

Each provider plugin consists of four core files:

File	Responsibilities
provider-entry.ts	Plugin entry point: registers the provider with the plugin registry and declares the supported features (Feature Flags)
provider-auth.ts	Authentication logic: implements the API Key or OAuth flow
provider-catalog-shared.ts	Model catalog: lists all models supported by this provider and their capability tags (text, images, code, etc.)
provider-model-shared.ts	Shared model configuration: defines metadata such as token limits, pricing information, and context window size

Provider plugins are registered via the export path of plugin-sdk. Taking the OpenAI provider as an example, the export path is plugin-sdk/provider-openai, Anthropic is plugin-sdk/provider-anthropic, and so on.

Full provider list

As of v2026.4.1, plugin-sdk exports the following providers:

Provider	Model example	Authentication method
OpenAI	GPT-4o, o3, o4-mini, Codex	API Key / OAuth
Anthropic (Claude)	Claude Sonnet 4, Opus 4	API Key / OAuth
Google (Gemini)	Gemini 2.5 Pro, Flash	API Key
DeepSeek	DeepSeek-V3, DeepSeek-R1	API Key
xAI (Grok)	Grok-3, Grok-3-mini	API Key
Ollama	Any locally deployed GGUF model	None (local)
Mistral	Mistral Large, Codestral	API Key
MiniMax	MiniMax-Text-01, image-01	API Key
Moonshot AI	Kimi	API Key
ModelStudio (Tongyi Qianwen)	Qwen-Max, Qwen-Plus	API Key
Qianfan (Baidu Wenxin)	ERNIE-4.0, ERNIE-Speed	API Key
NVIDIA	Nemotron, Llama 3 NVIDIA	API Key
HuggingFace	Models hosted on the Inference API	API Token
Together	Llama, Mixtral and other open source models	API Key
Venice	Privacy-first inference	API Key
vLLM	Self-hosted vLLM instance	Custom
SGLang	Self-hosted SGLang instance	Custom
BytePlus (Volcano Engine)	Doubao models	API Key
Cloudflare AI Gateway	Workers AI proxy	API Token
Amazon Bedrock	Claude on Bedrock, Titan	AWS IAM
Anthropic Vertex	Claude on Vertex AI	GCP Service Account
Chutes	GPU inference marketplace	API Key
KiloCode	KiloCode model	API Key
Kimi Coding	Kimi code model	API Key
OpenCode / OpenCode Go	Open source code inference	API Key

Authentication system: dual-mode authentication and key management

The authentication subsystem consists of four modules: provider-auth-api-key (API Key authentication), provider-auth-login (OAuth login authentication), provider-auth-result (authentication result encapsulation) and provider-auth-runtime (runtime authentication status management).

Most providers support API Key authentication only, but mainstream providers such as OpenAI and Anthropic support both OAuth login and API Key. In OAuth mode, the user completes authorization in the browser, after which OpenClaw obtains the access token and manages the refresh process automatically. This dual-mode design (Auth Rotation) lets users switch to their own API Key once the free quota is used up, and back again.

Synthetic Auth is implemented through the resolveSyntheticAuth function. When multiple providers share the same underlying credentials (for example, Anthropic Vertex uses GCP credentials instead of the Anthropic native API Key), synthetic authentication converts the underlying credentials into the format expected by the provider. The implementation is located in the authentication runtime module:

// Synthetic auth resolution in provider-auth-runtime
export async function resolveSyntheticAuth(
  provider: ProviderId,
  secretStore: SecretStore
): Promise<ProviderAuthResult> { // type argument is missing in the source; ProviderAuthResult is an assumed name
  const secretRef = getProviderSecretRef(provider);
  const rawCredential = await secretStore.resolve(secretRef);

  // Convert the credential format based on the provider type
  switch (provider) {
    case 'anthropic-vertex':
      return synthesizeVertexAuth(rawCredential as GCPServiceAccount);
    case 'amazon-bedrock':
      return synthesizeBedrockAuth(rawCredential as AWSCredentials);
    default:
      // All other providers use the raw credential as an API key
      return { type: 'api-key', key: rawCredential as string };
  }
}

SecretRef is OpenClaw's credential reference mechanism. Credentials are not stored in clear text in the configuration file; instead, the configuration holds a SecretRef that points into the operating system's keychain (macOS Keychain, Windows Credential Manager, Linux Secret Service). A SecretRef starts with the secretref: prefix followed by colon-separated fields, and the memory-core-host-secret module resolves it to the actual credential value at runtime.

Model failover

Model Failover is OpenClaw's core mechanism to deal with API rate limits and service interruptions (see docs.openclaw.ai/concepts/model-failover for details). When the primary model returns a 429 (Rate Limited) or 5xx error, the system automatically routes the request to a preconfigured alternative model. Failover configuration is defined in the user's settings file:

{
  "models": {
    "primary": "anthropic:claude-sonnet-4-20250514",
    "fallback": [
      "openai:gpt-4o",
      "google:gemini-2.5-pro"
    ],
    "failover": {
      "maxRetries": 2,
      "retryDelayMs": 1000,
      "fallbackOnRateLimit": true,
      "fallbackOnServerError": true
    }
  }
}

The failover logic is implemented at the routing layer (src/routing/) and is transparent to the upper-layer Agent runtime. The routing layer maintains health status and rate limiting windows for each provider, trying alternative models in fallback list order when the primary model is unavailable.
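The fallback order can be illustrated with a small synchronous sketch. The field names follow the settings example above; shouldFallback and routeWithFailover are hypothetical helpers, not code from src/routing/. Treating maxRetries as "retries per model before moving on" is an assumption, and the delay between retries (retryDelayMs) and the per-provider health tracking are left out to keep the example short.

```typescript
// Field names mirror the settings example; the helper names are hypothetical.
interface FailoverConfig {
  maxRetries: number;
  retryDelayMs: number; // carried for completeness; the delay itself is not simulated here
  fallbackOnRateLimit: boolean;
  fallbackOnServerError: boolean;
}

interface ModelsConfig {
  primary: string;
  fallback: string[];
  failover: FailoverConfig;
}

// Decide whether an HTTP status should trigger a retry or a move to the next model.
function shouldFallback(status: number, cfg: FailoverConfig): boolean {
  if (status === 429) return cfg.fallbackOnRateLimit;
  if (status >= 500 && status <= 599) return cfg.fallbackOnServerError;
  return false;
}

// Try the primary model, then each fallback in list order.
// `call` performs one attempt and returns its HTTP status (assumed synchronous for brevity).
function routeWithFailover(
  models: ModelsConfig,
  call: (model: string) => number,
): { model: string; status: number } {
  const candidates = [models.primary, ...models.fallback];
  let last = { model: models.primary, status: 0 };
  for (const model of candidates) {
    for (let attempt = 0; attempt <= models.failover.maxRetries; attempt++) {
      const status = call(model);
      last = { model, status };
      if (!shouldFallback(status, models.failover)) return last;
    }
  }
  return last; // every candidate was exhausted
}
```

Because the caller only sees the final model and status, the Agent runtime above the routing layer does not need to know that a fallback happened.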

v2026.3.28 new features

The v2026.3.28 version introduces three important changes to the provider system:

xAI migration to Responses API: The xAI provider migrated from the traditional Chat Completions API to the Responses API format, while enabling x_search native web search functionality. Grok models can call xAI's search infrastructure directly in the conversation, without the need for an additional layer of tool calls.

MiniMax image generation: The MiniMax provider has added support for the image-01 model, which enables text-to-image generation through MiniMax's image generation API. This feature is registered as a Provider-owned Tool, following the OpenClaw design principle that provider-specific tools and settings belong to the provider plugin, not the core system.

Tongyi Qianwen authentication changes: Qwen’s portal auth mode has been removed and switched to Model Studio API Key authentication. This is a breaking change and existing portal auth users will need to manually migrate to API Key mode.

GitHub Copilot login support

OpenClaw supports GitHub Copilot account login through two export modules, plugin-sdk/github-copilot-login and plugin-sdk/github-copilot-token. Users with Copilot subscriptions can directly use GitHub account authentication to access underlying models (GPT-4o, Claude, etc.) through Copilot's infrastructure without the need to configure each provider's API Key separately. The authentication process reuses GitHub's Device Flow OAuth, and after obtaining the Copilot token, the github-copilot-token module manages the token refresh.

ACP Protocol

Agent Client Protocol (ACP) is a stateful Agent session protocol defined by OpenClaw. The core idea of ACP is to decouple AI Agent interaction from a specific chat interface, allowing it to start and manage stateful Agent work sessions through any communication channel (Discord, iMessage, terminal, etc.). The project relies on @agentclientprotocol/sdk@0.17.1 to provide the core type and client implementation of the protocol.

ACPX: Headless CLI Tool

ACPX (repository openclaw/acpx, 1,834 stars) is a headless ACP CLI client for OpenClaw. It allows users to create, manage, and interact with ACP sessions from the command line, without the need for a graphical interface. Typical usage scenarios for ACPX include Agent automation in CI/CD pipelines, server-side deployment, and script orchestration.

ACP channel binding

ACP sessions can be bound to any chat channel. With the /acp spawn codex --bind here command, the user can create an ACP session in the current channel context. Currently supported bindings include:

Discord: Through Discord Bot channel binding, ACP sessions are mapped to Discord threads
BlueBubbles: iMessage bridge on macOS, ACP sessions access iMessage via BlueBubbles API
iMessage: Direct iMessage binding (macOS/iOS only)

ACP's layering distinguishes three concepts. The Chat Surface is the UI layer the user interacts with, which can be a Discord channel, a terminal window, or a web interface. The ACP Session is a stateful Agent interaction context that maintains conversation history, workspace state, and tool authorization. The Runtime Workspace is the file system sandbox where the Agent actually performs operations. A chat surface can be associated with multiple ACP sessions, and each ACP session is bound to a unique runtime workspace.
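The cardinalities in that layering (one surface to many sessions, one workspace per session) can be made concrete with a small index. AcpSession and SessionIndex are hypothetical names used only for this sketch; they are not types from @agentclientprotocol/sdk.

```typescript
// Hypothetical identifiers; only the three-layer relationship comes from the text.
interface AcpSession {
  id: string;
  surfaceId: string;   // chat surface this session is attached to
  workspaceId: string; // runtime workspace, unique per session
}

class SessionIndex {
  private readonly sessions = new Map<string, AcpSession>();
  private readonly workspaceOwner = new Map<string, string>();

  // Add a session, enforcing that no two sessions share a workspace.
  add(session: AcpSession): void {
    if (this.sessions.has(session.id)) {
      throw new Error(`session already exists: ${session.id}`);
    }
    const owner = this.workspaceOwner.get(session.workspaceId);
    if (owner !== undefined) {
      throw new Error(`workspace ${session.workspaceId} is already bound to ${owner}`);
    }
    this.sessions.set(session.id, session);
    this.workspaceOwner.set(session.workspaceId, session.id);
  }

  // A chat surface may host any number of sessions.
  sessionsOn(surfaceId: string): AcpSession[] {
    return Array.from(this.sessions.values()).filter((s) => s.surfaceId === surfaceId);
  }
}
```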

MCP integration and tool bridging

OpenClaw integrates Model Context Protocol (MCP) and relies on @modelcontextprotocol/sdk@1.29.0. MCP defines a standard communication protocol between AI models and external tools, and OpenClaw exposes external MCP tool servers to the Agent runtime through the MCP bridge layer.

v2026.3.31 introduces a critical security change for ACPX plugin-tools MCP bridging: MCP tools are now off by default and must be explicitly enabled in the configuration. The change comes out of Trust Boundary Hardening: external MCP tool servers may execute arbitrary code, so enabling them by default would expand the attack surface. An example configuration that enables one server:

{
  "mcp": {
    "servers": {
      "filesystem": {
        "command": "npx",
        "args": ["-y", "@modelcontextprotocol/server-filesystem", "/path/to/allowed/dir"],
        "enabled": true
      }
    },
    "trustPolicy": "prompt-per-tool"
  }
}

The trustPolicy supports three levels: prompt-per-tool (user confirmation is required for each tool call), prompt-once (automatic trust after first confirmation), and trust-all (full trust, only recommended for use in controlled environments).
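The three levels reduce to a single decision: does this tool call need a user confirmation first? A minimal sketch follows, assuming that prompt-once remembers confirmations per tool name (the text does not say whether the scope is per tool or per server). needsConfirmation is a hypothetical helper, not an OpenClaw function.

```typescript
type TrustPolicy = "prompt-per-tool" | "prompt-once" | "trust-all";

// Returns true when the user must confirm before the tool call proceeds.
// `confirmed` holds tool names the user has already approved (assumed per-tool scope).
function needsConfirmation(
  policy: TrustPolicy,
  tool: string,
  confirmed: ReadonlySet<string>,
): boolean {
  if (policy === "trust-all") return false;
  if (policy === "prompt-once") return !confirmed.has(tool);
  return true; // prompt-per-tool: confirm every call
}
```

Making prompt-per-tool the fall-through branch keeps the strictest behavior as the default if a new policy value is ever added without updating this function.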

OpenAI apply_patch is enabled by default

For OpenAI and Codex series models, OpenClaw enables the apply_patch tool by default. This is a code editing tool natively supported by OpenAI Codex models: the model returns structured patch instructions through the API, and the OpenClaw runtime applies them to the files. Compared with having the model output complete file contents and then diffing, apply_patch reduces output token consumption and lowers the error rate when editing large files. Its sandbox permissions are aligned with write permissions: in the Docker sandbox of a non-main session, apply_patch's write scope is subject to the same constraints as ordinary file writes.

CLI backend plugin: Claude CLI / Codex CLI / Gemini CLI

v2026.3.31 moves the default inference behavior of the three main CLI backends (Claude CLI, Codex CLI, and Gemini CLI) into their respective bundled plugins. Through the Plugin SDK's cli-backend and cli-runtime export paths, a CLI backend can register custom inference flows, tool exposure, and session management policies. The point of the migration is decoupling: the core no longer hard-codes CLI backend behavior, and third-party plugins can register custom CLI backends through the same interface.

ACP and Agent-to-Agent communication

The deep value of ACP is reflected in the Agent-to-Agent (A2A) communication capability. OpenClaw's Session toolset—sessions_list, sessions_history, sessions_send—allows one Agent session to discover, query, and send messages to another Agent session. sessions_send supports optional reply-back mode (ping-pong communication) and announce steps, allowing structured coordination conversations between Agents.

In multi-Agent deployment scenarios (for example, one Agent is responsible for customer conversations and another Agent is responsible for back-end task execution), A2A communication avoids the complexity of requiring external message queues in traditional architectures. All communication is routed through the Gateway's WebSocket control plane, and agents share the same runtime infrastructure but have isolated session contexts and workspaces.

The new ACP channel binding in v2026.3.31 further extends this capability: /acp spawn codex --bind here can directly bind the current chat surface as a Codex-driven workspace without creating a child thread. This way, users can launch a coding agent directly in a Discord channel, and the agent's output appears directly in the conversation flow.

Media Pipeline

OpenClaw's media processing pipeline lives in src/media/ and handles preprocessing, understanding, and lifecycle management for all non-text content. plugin-sdk exports three core modules: media-runtime (runtime pipeline scheduling), media-understanding (the media content understanding interface), and media-understanding-runtime (runtime binding for the understanding modules). A fourth export, web-media, carries the media logic specific to web channels.

Image processing: sharp pipeline

Image processing depends on sharp@0.34.5, a high-performance Node.js image processing library built on libvips. OpenClaw uses sharp for the following steps:

Resize: Scale the image uploaded by the user to the maximum resolution supported by the model to avoid wasting tokens or exceeding API limits
Format conversion: uniformly convert BMP, TIFF, WebP and other formats to JPEG or PNG to ensure that all providers can receive it
Metadata stripping: remove privacy data such as geographical location and device information from EXIF information
Thumbnail generation: Generate low-resolution previews for UI display

File type detection uses file-type@22.0.0, which determines the file type based on the Magic Number instead of the file extension to prevent malicious file camouflage.
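To show why magic-number detection resists a renamed extension, here is a deliberately simplified sniffing routine. It is a hand-rolled stand-in that covers only four formats; it is not the file-type API, and sniff and matchesClaim are hypothetical names.

```typescript
// Simplified stand-in for what a library such as file-type does; not its API.
type Sniffed = "png" | "jpeg" | "gif" | "pdf" | "unknown";

const SIGNATURES: Array<{ kind: Sniffed; bytes: number[] }> = [
  { kind: "png", bytes: [0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a] },
  { kind: "jpeg", bytes: [0xff, 0xd8, 0xff] },
  { kind: "gif", bytes: [0x47, 0x49, 0x46, 0x38] }, // ASCII "GIF8"
  { kind: "pdf", bytes: [0x25, 0x50, 0x44, 0x46] }, // ASCII "%PDF"
];

// Identify content by its leading bytes, ignoring any filename.
function sniff(data: Uint8Array): Sniffed {
  for (const sig of SIGNATURES) {
    if (data.length >= sig.bytes.length && sig.bytes.every((b, i) => data[i] === b)) {
      return sig.kind;
    }
  }
  return "unknown";
}

// True when the bytes really are the type the uploader claims.
function matchesClaim(claimed: Sniffed, data: Uint8Array): boolean {
  return sniff(data) === claimed;
}
```

A file named photo.png whose bytes begin with "%PDF" fails matchesClaim("png", ...), which is the disguise case the pipeline wants to catch before any decoder runs.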

PDF processing

PDF processing depends on pdfjs-dist@5.6.205 (the npm distribution of Mozilla PDF.js). The processing flow includes text extraction, page rendering to images (for visual understanding of multimodal models), and structured content parsing. For large PDFs, OpenClaw implements a pagination handling strategy—extracting only page ranges relevant to the context of the current session, rather than loading the entire document at once.

Audio and video processing

Audio and video processing pipelines process multimedia files uploaded by users or audio streams captured via voice input. Audio processing includes format conversion (unification to WAV/MP3), sample rate normalization and silence detection. The Transcription hook converts audio input into text and integrates it into the Agent's conversation flow - voice messages are automatically transcribed and processed as text messages, and the Agent can selectively reply in voice or text.

Video processing adopts a key frame extraction strategy: extract key frames from the video at fixed intervals or scene change detection, and send them as image sequences to the multi-modal model for understanding, avoiding the high computational cost of processing the complete video stream.
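The fixed-interval half of that strategy comes down to choosing timestamps. The sketch below covers only that case; scene change detection would need actual frame analysis and is not shown. keyframeTimestamps and its maxFrames cap are assumptions made for illustration, not documented OpenClaw parameters.

```typescript
// Pick sampling timestamps (in seconds) at a fixed interval, capped at maxFrames.
// Timestamps are computed as i * interval to avoid accumulating floating-point error.
function keyframeTimestamps(
  durationSec: number,
  intervalSec: number,
  maxFrames: number,
): number[] {
  if (durationSec <= 0 || intervalSec <= 0 || maxFrames <= 0) return [];
  const stamps: number[] = [];
  for (let i = 0; i < maxFrames && i * intervalSec < durationSec; i++) {
    stamps.push(i * intervalSec);
  }
  return stamps;
}
```

The cap matters because every extracted frame is later billed as an image input by the multimodal model, so a long video should not translate into an unbounded number of frames.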

Size limit and temporary file life cycle

Each channel can configure its own maximum media file size. For example, a configuration covering the Discord, web, and CLI channels:

{
  "channels": {
    "discord": {
      "mediaMaxMb": 25
    },
    "web": {
      "mediaMaxMb": 100
    },
    "cli": {
      "mediaMaxMb": 500
    }
  }
}

Files that exceed the limit are rejected in the pre-processing stage and do not enter the subsequent stages of the pipeline. Temporary files (intermediate products during processing) follow strict life cycle management: each media processing task creates an independent temporary directory, which is cleaned up after the processing is completed regardless of success or failure. The media-runtime module maintains a temporary file registry and performs cleanup-on-exit when the process exits to prevent disk leaks.
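The pre-processing rejection can be sketched as a pure check against the per-channel limit. Two points here are assumptions rather than documented behavior: one megabyte is taken as 1024 * 1024 bytes, and a channel with no configured limit is rejected rather than allowed. checkMediaSize is a hypothetical helper.

```typescript
interface ChannelMediaConfig {
  mediaMaxMb: number;
}

type SizeCheck = { ok: true } | { ok: false; reason: string };

const BYTES_PER_MB = 1024 * 1024; // assumption: binary megabytes

// Reject oversized media before it enters the rest of the pipeline.
function checkMediaSize(
  channels: Record<string, ChannelMediaConfig>,
  channel: string,
  sizeBytes: number,
): SizeCheck {
  const cfg = channels[channel];
  if (cfg === undefined) {
    return { ok: false, reason: `no media limit configured for channel "${channel}"` };
  }
  const limitBytes = cfg.mediaMaxMb * BYTES_PER_MB;
  if (sizeBytes > limitBytes) {
    return { ok: false, reason: `file exceeds ${cfg.mediaMaxMb} MB limit for "${channel}"` };
  }
  return { ok: true };
}
```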

Media understanding and multi-modal access

The two SDK export paths media-understanding and media-understanding-runtime define the interface and runtime implementation of media content understanding. Media understanding is more than just format conversion—it transforms images, documents, and audio into structured input that models can consume. For images, the understanding pipeline extracts text (OCR) from images and identifies objects and scenes; for PDFs, it generates page summaries and structured paragraph indexes; for audio, it outputs timestamped transcripts.

The output format of multimodal understanding follows the requirements of each model provider. OpenAI's GPT-4o and Anthropic's Claude Sonnet 4 accept base64-encoded images embedded in the message body; Google Gemini supports larger media files that can be referenced after uploading through the File API. The responsibility of media-understanding-runtime is to select the optimal encoding and transmission strategy based on the currently active model providers.

Readability reader for media pipelines

OpenClaw integrates @mozilla/readability@0.6.0 (Mozilla's readability extraction library) and linkedom@0.18.12 (a lightweight DOM implementation) for extracting body text from web content. When the Agent uses a browser tool to access a web page, after the original HTML is parsed by linkedom, the Readability algorithm extracts the core text content and strips away noisy elements such as navigation bars, advertisements, and sidebars. The extracted plain text enters the Agent's context window, significantly reducing token consumption compared to injecting original HTML.

Markdown rendering is handled by markdown-it@14.1.1. Before being sent to each channel, the Markdown formatted reply output by the Agent is formatted according to the capabilities of the target channel: Discord natively supports Markdown, Telegram supports some Markdown subsets, WhatsApp uses WhatsApp-style text formatting, and SMS/iMessage is reduced to plain text.

Voice System

OpenClaw's speech system covers the complete link from voice wake-up to speech synthesis. plugin-sdk exports three speech modules: speech (public interface), speech-core (core implementation) and speech-runtime (runtime binding). The voice function is divided into four forms according to the platform and interaction mode.

Voice Wake: macOS/iOS wake word

Voice Wake (see docs.openclaw.ai/nodes/voicewake for details) is the wake word feature for macOS and iOS platforms. The device continuously monitors ambient audio and activates the Agent session after detecting the preset wake word. Wake word detection runs locally on the device and does not send an audio stream to the cloud - consistent with OpenClaw's local-first principle.

Message forwarding after waking up is implemented through VoiceWakeForwarder. After the user's voice is converted into text through local speech recognition, VoiceWakeForwarder calls OpenClaw's CLI interface to pass the text to the Agent:

openclaw-mac agent --message "${text}" --thinking low

The implementation of VoiceWakeForwarder requires special handling of Shell Escaping: the user's voice transcript may contain special Shell characters such as quotation marks, dollar signs, and backtick marks. Direct splicing into the command line may lead to injection risks or parsing errors. The forwarder performs strict shell escaping of the text before passing it on. The --thinking low parameter instructs the Agent to use a low-latency thinking mode, giving priority to response speed rather than reasoning depth, and adapting to the real-time requirements of voice interaction.
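The escaping step can be illustrated with the standard POSIX single-quote technique: wrap the text in single quotes and rewrite each embedded single quote as '\''. This is a TypeScript sketch of the technique only; the real forwarder lives in the native macOS app and its exact implementation is not shown in this note. shellQuote and buildForwardCommand are hypothetical names.

```typescript
// POSIX-style quoting: inside single quotes nothing is special except the quote itself,
// so each ' is closed, emitted as an escaped \', and reopened.
function shellQuote(text: string): string {
  return "'" + text.replace(/'/g, "'\\''") + "'";
}

// Build the forwarding command with the transcript safely quoted.
function buildForwardCommand(transcript: string): string {
  return `openclaw-mac agent --message ${shellQuote(transcript)} --thinking low`;
}
```

With this quoting, a transcript such as $(reboot) or one containing backticks reaches the Agent as literal text. Where the platform allows it, passing the arguments as an array to the process API and skipping the shell entirely avoids the problem altogether.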

Talk Mode: Android continuous voice

Talk Mode (see docs.openclaw.ai/nodes/talk for details) is the Android platform's continuous voice conversation mode. Different from Voice Wake's "wake up → single interaction" mode, Talk Mode maintains a continuously open voice channel - the user and Agent can have multiple rounds of voice conversations without re-awakening each round. Talk Mode uses VAD (Voice Activity Detection) to automatically determine the start and end of the user's speech to achieve a natural conversation rhythm.

Push-to-Talk: macOS overlay

The macOS platform also provides Push-to-Talk mode, which operates as a system-level overlay. The user activates the microphone input by long pressing the shortcut key, and then ends the recording and sends it after releasing it. This mode is suitable for asking quick questions in a desktop workflow without switching to an OpenClaw window. The overlay uses AppKit's NSPanel implementation and is set up to float above all windows.

TTS: ElevenLabs And System Fallback

Speech synthesis (TTS, Text-to-Speech) adopts a two-layer strategy. The preferred solution is ElevenLabs's API, which provides high-quality, low-latency, multi-language speech synthesis. When ElevenLabs is unavailable (the network is offline or the API Key is not configured), the system automatically falls back to the platform's native TTS: macOS uses AVSpeechSynthesizer, iOS uses AVSpeechSynthesizer (same framework), and Android uses android.speech.tts.TextToSpeech.

In addition, OpenClaw integrates node-edge-tts@1.2.10 as a third TTS backend. Edge TTS calls the online TTS service used by the Microsoft Edge browser; it is free and supports multiple languages and voices. It is a practical middle option when there is a network connection but no ElevenLabs subscription.
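One way to read the three backends is as an ordered preference list driven by two facts: whether the device is online and whether an ElevenLabs key is configured. The ordering below is an interpretation of the description above, not confirmed selection logic, and pickTtsBackend and TtsEnvironment are hypothetical names.

```typescript
type TtsBackend = "elevenlabs" | "edge-tts" | "system";

interface TtsEnvironment {
  online: boolean;
  elevenLabsKeyConfigured: boolean;
}

// Prefer ElevenLabs, then Edge TTS when only a network is available,
// and fall back to the platform's native synthesizer when offline.
function pickTtsBackend(env: TtsEnvironment): TtsBackend {
  if (env.online && env.elevenLabsKeyConfigured) return "elevenlabs";
  if (env.online) return "edge-tts";
  return "system";
}
```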

Voice Call plugin and closed-loop testing

The Voice Call plugin is packaged in the extensions/ directory and distributed with OpenClaw as a built-in extension. It implements a complete voice call function - users can have real-time voice conversations with the Agent like making a phone call, and the two-way audio stream is transmitted through WebRTC or the platform's native audio framework.

The quality assurance of voice calls relies on Closed-Loop Testing. The test script is executed through test:voicecall:closedloop npm script, and the process is as follows: automatically generate test text → TTS synthesizes into audio → audio is fed to the voice call pipeline as input → Agent processes and generates a reply → TTS synthesizes reply audio → transcribes the reply audio into text → compares the semantic consistency of the original text and the reply content. This end-to-end closed loop eliminates the uncertainty of manual testing and ensures that every link in the voice pipeline (ASR → Inference → TTS) works properly.

# Run the voice call closed-loop test
pnpm test:voicecall:closedloop

# Test process:
# 1. Generate test prompt
# 2. TTS synthesizes input audio
# 3. Inject audio into the voice call pipeline
# 4. Wait for Agent response
# 5. Capture TTS output audio
# 6. ASR transcribes the output
# 7. Assertion: the output text matches the expected semantics

The voice system as a whole reflects OpenClaw's aim of consistency across devices: the same Agent can receive voice input in four ways (wake word, continuous voice, push-to-talk, or voice call) and produce voice replies in three ways (ElevenLabs, Edge TTS, or the system's native TTS), and every combination is meant to behave the same on each platform. Voice is not an add-on feature but a first-class interaction mode on par with text channels.

Native Multi-Platform Apps

OpenClaw's multi-platform strategy is not a simple WebView wrapper. The three native clients for macOS, iOS, and Android each carry a distinct role: the macOS app is the developer's local console and debugging hub, the iOS app is a lightweight mobile node, and the Android app exposes the widest range of device command families. All three communicate through the Gateway WebSocket protocol, which handles cross-platform node registration, command dispatch, and canvas synchronization.

13.1 macOS App: Menu Bar Hub

The macOS app's source code lives in apps/macos/. It uses a hybrid SwiftUI + AppKit architecture, with a persistent menu bar icon as the entry point. In OpenClaw's internal vocabulary, the macOS app is referred to as makeup (short for "mac app").

The core functions of the application cover the following aspects:

Gateway health monitoring: The menu bar icon reflects the status of the Gateway process in real time, including the number of connections, memory usage and heartbeat delay. The panel that pops up by clicking the icon provides a one-click restart entry. The restart of the Gateway must be performed through the OpenClaw Mac application itself or the scripts/restart-mac.sh script, rather than manually in tmux - the latter will bypass the process monitoring chain and lead to inconsistent status.

Voice wake and Push-to-Talk floating layer: Voice Wake continuously monitors the wake word, and the PTT (Push-to-Talk) overlay resides on the desktop in the form of a translucent floating window. The two together form the macOS native entrance to voice interaction.

WebChat embedding and debugging tool: The embedded WebChat view supports real-time conversations with Gateway, while exposing the debugging panel for viewing message flow, tool call logs and token consumption.

SSH tunnel remote control: macOS applications can connect to remotely deployed Gateway instances through SSH tunnels and control cloud services in the local menu bar.

13.1.1 SwiftUI state management: Observation framework

The state management of macOS applications has been fully migrated to the Observation framework introduced in Swift 5.9. The @Observable macro is used to mark observable types and @Bindable is used to implement property-level two-way binding. The legacy ObservableObject / @StateObject / @Published patterns have been explicitly deprecated — any remaining legacy usage should be migrated to the new framework. The advantage of the Observation framework is more fine-grained dependency tracking: SwiftUI only re-renders the view when the property that is actually read changes, rather than the overall notification mode of ObservableObject.

// Preferred: Observation framework
import Observation

@Observable
final class GatewayMonitor {
    var isConnected = false
    var latencyMs: Int = 0
    var sessionCount: Int = 0
}

// Avoid: deprecated legacy pattern
// class GatewayMonitor: ObservableObject {
//     @Published var isConnected = false
// }

13.1.2 Signature Build and TCC Permissions

macOS apps require signed builds for system permissions to persist across recompilations. Unsigned development builds will trigger a TCC (Transparency, Consent, and Control) permission reset pop-up window after each rebuild. The packaging script is located in scripts/package-mac-app.sh and is responsible for code signing, notarization (Notarization) and DMG encapsulation.

The system capabilities exposed by macOS Node Mode are controlled through TCC permission mapping:

Node command	Function	TCC permissions
system.run	Execute local command and return stdout/stderr/exit code	needsScreenRecording flag
system.notify	Send user notification	notifications
canvas.*	Canvas operation routing	screen-recording
camera.*	Camera capture	camera

13.1.3 Unified log system

The logs of macOS applications are uniformly queried through the scripts/clawlog.sh script. The underlying system uses the Unified Logging system of macOS and supports filtering by subsystem. Common operations:

# Follow all OpenClaw subsystem logs in real time
./scripts/clawlog.sh --follow

# Filter by category
./scripts/clawlog.sh --category networking --tail 100

# View a specific subsystem
./scripts/clawlog.sh --subsystem ai.openclaw.gateway

13.2 iOS App: Mobile Node

The iOS app source lives in apps/ios/ and is a standard Xcode project built with SwiftUI. Unlike the macOS app, the iOS app is designed to run as a remote node of the Gateway: it pairs automatically with Gateway instances on the LAN through Bonjour device discovery and keeps a persistent connection over the Gateway WebSocket.

Core capabilities provided by iOS nodes:

Canvas Surface: Renders Agent-driven canvas content on iOS devices, supporting touch interaction.

Voice Wake forwarding: The voice wake detection results on the iOS side are forwarded to the Gateway through WebSocket to achieve touch-free voice activation on the mobile side.

Talk Mode: The voice interaction mode of long pressing and speaking, the audio stream is directly transmitted to the Gateway for recognition and processing.

Camera Snap/Clip: Supports taking snapshots and collecting short video clips for use by the Agent's visual capabilities.

Screen Recording: Perform screen recording through ReplayKit and send the recording content to the Agent as context.

The version number of iOS applications is maintained in two locations: apps/ios/Sources/Info.plist and apps/ios/Tests/Info.plist. The key fields are CFBundleShortVersionString (display version number) and CFBundleVersion (build number). Both files must be updated simultaneously when publishing.

13.3 Android App: Full Device Node

The Android app lives in apps/android/ and is built with Kotlin and Gradle. Compared with the iOS app, the Android node exposes a richer set of Device Command Families, taking full advantage of the openness of the Android platform.

The app UI is organized into three main tabs:

Tab	Function
Connect	Device pairing entry point; supports Setup Code and manual input
Chat Sessions	Conversation list and chat interface
Voice	Voice interaction control panel

The Android node supports the richest set of device command families among the three platforms:

Command family	Capability
notifications	Read/Send system notifications
location	GPS positioning and geofencing
SMS	Reading and sending text messages
photos	Album access and photo upload
contacts	Read and write address book
calendar	Calendar event management
motion	Accelerometer, gyroscope and other sensor data
app update	Application self-update management

In addition, the Android side also supports Canvas rendering, camera capture and screen recording capabilities.

13.3.1 Android build and test commands

# Unit tests (Play Debug variant)
./gradlew :app:testPlayDebugUnitTest

# Unit tests (ThirdParty Debug variant)
./gradlew :app:testThirdPartyDebugUnitTest

# Kotlin code style check
./gradlew :app:ktlintCheck :benchmark:ktlintCheck

# Release AAB build
bun apps/android/scripts/build-release-aab.ts

The version information is defined in versionName (display version) and versionCode (numeric incremental version) in apps/android/app/build.gradle.kts.

13.4 Cross-Platform Node Protocol

The three native applications communicate with the Gateway through the unified Gateway WebSocket protocol. Core commands related to nodes include:

Command	Direction	Function
node.list	Gateway → Client	Enumerate all connected nodes and their capability statements
node.describe	Gateway → Node	Query the detailed capability description and parameter schema of the specified node
node.invoke	Gateway → Node	Execute the command on the specified node and return the results

The system permissions of the macOS platform are controlled through the TCC framework, covering screen-recording, notifications, camera, and location. Each permission is bound to a specific node command capability, and the application will request authorization the first time it is called.

Session-level privilege escalation is controlled via the /elevated on|off command. When enabled, the current session gains full bash access; when closed, it falls back to the restricted execution surface. This command is independent per session and does not affect other concurrent sessions.
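The per-session scoping can be sketched as a parser plus a set of elevated session ids. Only the /elevated on|off syntax comes from the text; parseElevated and ElevationState are hypothetical names, and how the real Gateway stores this flag is not shown here.

```typescript
// Parse "/elevated on" or "/elevated off"; anything else yields undefined.
function parseElevated(input: string): boolean | undefined {
  const match = /^\/elevated\s+(on|off)$/.exec(input.trim());
  return match ? match[1] === "on" : undefined;
}

class ElevationState {
  private readonly elevated = new Set<string>();

  // Apply the command to one session only.
  // Returns false when the input is not a valid /elevated command.
  apply(sessionId: string, input: string): boolean {
    const on = parseElevated(input);
    if (on === undefined) return false;
    if (on) {
      this.elevated.add(sessionId);
    } else {
      this.elevated.delete(sessionId);
    }
    return true;
  }

  isElevated(sessionId: string): boolean {
    return this.elevated.has(sessionId);
  }
}
```

Keying the state by session id is what keeps one session's elevation from leaking into another session running at the same time.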

13.4.1 v2026.3.31 security enhancement

v2026.3.31 introduces two breaking changes related to node security. First, node commands are no longer enabled automatically once Device Pairing completes; they are exposed to the Agent only after explicit Node Pairing Approval. Device pairing only establishes the WebSocket connection, while node pairing approval confirms that the user intends to expose the device's capabilities.

Second, runs initiated by a node (Node-originated Runs) are restricted to a Reduced Trusted Surface. Even if the node itself has full capabilities, an execution flow triggered from the node side can use only a predefined safe subset of tools.
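Both rules can be expressed as two small gating functions. This is a sketch under stated assumptions: NodeState, exposedCommands, and toolsForRun are hypothetical names, and the contents of the reduced tool set are passed in by the caller because this note does not list them.

```typescript
interface NodeState {
  devicePaired: boolean;         // WebSocket channel established
  nodePairingApproved: boolean;  // user explicitly approved exposing capabilities
  commands: string[];            // commands the node is able to serve
}

// Rule 1: commands are exposed only after both pairing steps.
function exposedCommands(node: NodeState): string[] {
  return node.devicePaired && node.nodePairingApproved ? node.commands : [];
}

type RunOrigin = "user" | "node";

// Rule 2: node-originated runs see only the reduced subset of tools.
function toolsForRun(
  origin: RunOrigin,
  allTools: string[],
  reducedSurface: ReadonlySet<string>,
): string[] {
  if (origin === "user") return allTools;
  return allTools.filter((tool) => reducedSurface.has(tool));
}
```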

Live Canvas

Live Canvas is an Agent-driven visual workspace hosted by the OpenClaw Gateway. Unlike traditional static output, Live Canvas is a persistent interactive surface: the Agent can push content, reset state, execute scripts, and capture snapshots on it. Cross-platform rendering is implemented by the native applications (macOS, iOS, Android), and the canvas control logic is abstracted behind the A2UI (Agent to UI) protocol.

14.1 A2UI Protocol and Construction

A2UI defines the protocol specification for Agent to send control instructions to the UI layer. The A2UI implementation for Canvas host is located in the src/canvas-host/a2ui/ directory. The implementation is packaged as a standalone bundle that is loaded by Gateway at runtime and injected into the Canvas host container.

Bundle build products are version tracked through the hash file src/canvas-host/a2ui/.bundle.hash (automatically generated and should not be edited manually). The build command has two equivalent forms:

# Via pnpm script
pnpm canvas:a2ui:bundle

# Via shell script
scripts/bundle-a2ui.sh

The building of the A2UI bundle is the first step in the overall pnpm build pipeline. The complete build pipeline is:

pnpm build

# Equivalent to:
# 1. pnpm canvas:a2ui:bundle
# 2. tsdown-build.mjs
# 3. runtime-postbuild.mjs

The vendor source code of A2UI is maintained in the vendor/a2ui directory, and the shared encapsulation layer of the native side is located in apps/shared/OpenClawKit/Tools/CanvasA2UI.

14.1.1 Cross-compilation considerations

The A2UI bundle build may fail in a cross-compilation environment. A typical case is building an amd64 target via QEMU on Apple Silicon, where the A2UI build step may crash because QEMU emulates certain instruction sets incompletely. The Dockerfile guards against this: when the A2UI bundle fails to build, a stub file is created in its place so that the overall Docker image build is not interrupted. As a consequence, an image produced by QEMU cross-compilation may not contain full Canvas functionality.

14.2 Canvas operation primitives

Canvas' operation model consists of four core primitives:

Operation	Semantics	Typical uses
canvas.push	Append content (HTML/JS/CSS fragments) to Canvas	Incrementally build UI interface
canvas.reset	Clear Canvas and reinitialize	Switch context or reset state
canvas.eval	Execute arbitrary JavaScript in the Canvas context	Dynamic interaction logic, data visualization
canvas.snapshot	Capture a visual snapshot of the current Canvas	Record status and generate screenshot feedback

The security positioning of canvas.eval deserves a separate note. It is classified as an Operator Control Surface, meaning that its security is the responsibility of the deployment operator (Operator), not of the OpenClaw platform itself. The Agent can execute arbitrary JavaScript through canvas.eval, which gives great flexibility, but it also means the Operator must put corresponding safeguards in place in its own deployment environment.

14.3 Canvas and Node Mode

In the multi-device architecture, Canvas operations are exposed through node mode. All canvas.* calls are routed as node.invoke instructions and sent to the corresponding device-side node for execution. This means the Agent can direct canvas content to render on a specific device (such as the user's iPad or Android phone), enabling cross-device visual workflows.
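To make the routing concrete, the sketch below types the four primitives as a discriminated union and wraps one in a node.invoke envelope. The real A2UI wire format is not shown in this note, so the envelope fields (nodeId, command, params) and the helper names are assumptions.

```typescript
// Illustrative types only; field names are assumptions, not the A2UI wire format.
type CanvasOp =
  | { op: "canvas.push"; content: string }
  | { op: "canvas.reset" }
  | { op: "canvas.eval"; script: string }
  | { op: "canvas.snapshot" };

// Hypothetical envelope for a call routed to a device-side node.
interface NodeInvoke {
  type: "node.invoke";
  nodeId: string;
  command: CanvasOp["op"];
  params: Record<string, string>;
}

// Extract the operation-specific parameters.
function paramsOf(op: CanvasOp): Record<string, string> {
  switch (op.op) {
    case "canvas.push":
      return { content: op.content };
    case "canvas.eval":
      return { script: op.script };
    case "canvas.reset":
    case "canvas.snapshot":
      return {};
  }
}

// Wrap a canvas operation so it targets one specific node.
function toNodeInvoke(nodeId: string, op: CanvasOp): NodeInvoke {
  return { type: "node.invoke", nodeId, command: op.op, params: paramsOf(op) };
}
```

Under these assumptions, toNodeInvoke("ipad", { op: "canvas.push", content: "<h1>Status</h1>" }) yields an envelope addressed to the node named "ipad"; the Agent-side code never needs to know which rendering stack that node uses.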

The three platforms have different implementations of Canvas rendering: macOS uses WebKit views, iOS uses SwiftUI native view layer combined with WebKit rendering, and Android uses Android WebView. But the upper-layer A2UI protocol ensures that the Agent does not need to care about the underlying rendering differences.

A2UI build pipeline and cross-compilation

The A2UI bundle file is src/canvas-host/a2ui/a2ui.bundle.js, and its hash is recorded in src/canvas-host/a2ui/.bundle.hash (generated automatically; do not edit it by hand). The build command is pnpm canvas:a2ui:bundle or scripts/bundle-a2ui.sh. In the complete pnpm build pipeline, the A2UI bundle is built first.

Cross-compilation is a known pain point. When building amd64 images on Apple Silicon, the A2UI bundle may fail because of limitations in the QEMU emulation environment. The Dockerfile degrades gracefully: when the bundle fails, it creates a stub file (containing the comment /* A2UI bundle unavailable in this build */) and removes the vendor/a2ui and apps/shared/OpenClawKit/Tools/CanvasA2UI directories, so that the build is not interrupted by A2UI being unavailable. CI builds run on the native architecture and are therefore unaffected.
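One practical consequence is that a consumer of the image may want to know whether it received the real bundle or the placeholder. The helper below is a hypothetical sketch: only the marker comment comes from the description above, and the assumption that the stub starts with that comment is mine.

```typescript
// Marker text as quoted above; the helper itself is hypothetical.
const STUB_MARKER = "/* A2UI bundle unavailable in this build */";

// Returns true when the bundle source is just the placeholder stub.
// Assumes the stub file begins with the marker comment.
function isStubBundle(bundleSource: string): boolean {
  return bundleSource.trim().startsWith(STUB_MARKER);
}

// Example: decide whether Canvas features should be advertised.
function canvasAvailable(bundleSource: string): boolean {
  return bundleSource.trim().length > 0 && !isStubBundle(bundleSource);
}
```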

Security positioning of Canvas

Canvas's security positioning is key to understanding its design boundaries. canvas.eval lets the Agent execute arbitrary JavaScript in the Canvas, which is functionally equivalent to a browser-side eval(). OpenClaw explicitly categorizes this as an Operator control surface: as with script execution in browser automation tools, security responsibility lies with the deployer rather than the platform. This is consistent with OpenClaw's single-user, local-first architecture: on the user's own device, the Agent already has the same permissions as the user. In multi-tenant or public deployments, however, the Operator must evaluate the risks of canvas.eval and restrict it appropriately.

Lobster

Lobster is the workflow orchestration shell of the OpenClaw ecosystem. It lives in a separate repository, openclaw/lobster (992 stars), and describes itself as an "OpenClaw-native workflow shell": a workflow execution environment designed specifically for OpenClaw.

15.1 Typed JSON Pipeline

Lobster's core abstraction is the Typed JSON Pipeline. Unlike a Unix shell's text pipeline, the data flowing through a Lobster pipeline is JSON with type constraints. Each step declares its input schema and output schema, so Lobster can type-check the pipeline when it is assembled instead of surfacing type mismatches at runtime.

The pipeline architecture is composable: developers can chain OpenClaw's Skills and Tools into multi-step workflows. Each step can be a Skill call, a Tool execution, a piece of custom logic, or a nested sub-pipeline.

The key flow-control mechanism is the Approval Gate. Approval points can be inserted between any two steps in the pipeline; execution pauses there and waits for confirmation from a designated approver (a human or another agent) before moving on. This matters for automated processes that involve sensitive operations. In a deployment pipeline, for example, compilation can be automated while the push to production still requires human approval.
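The assembly-time check and the approval gate can be sketched together in a few lines. This is a simplified, hypothetical model and not Lobster's API: schemas are reduced to names compared by string equality, and the Step type, assemble, and run are invented for illustration.

```typescript
// Hypothetical shapes; schemas are represented by name only for brevity.
type Step =
  | { kind: "task"; name: string; input: string; output: string }
  | { kind: "approval"; name: string };

// Assembly-time check: each task's input schema must match the previous task's output.
function assemble(steps: Step[]): Step[] {
  let current: string | undefined;
  for (const step of steps) {
    if (step.kind === "approval") continue; // gates pass data through unchanged
    if (current !== undefined && step.input !== current) {
      throw new Error(`${step.name}: expects ${step.input}, got ${current}`);
    }
    current = step.output;
  }
  return steps;
}

// Runs tasks in order and stops at the first gate that is not approved.
// Returns the names of the tasks that actually ran.
function run(steps: Step[], approve: (gate: string) => boolean): string[] {
  const done: string[] = [];
  for (const step of steps) {
    if (step.kind === "approval") {
      if (!approve(step.name)) break;
      continue;
    }
    done.push(step.name);
  }
  return done;
}
```

With a compile step, a gate, and a deploy step, a rejected approval leaves only the compile step executed, while a mismatched schema is caught by assemble before anything runs.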

15.2 Terminal Color Palette

Lobster has strict requirements for the visual consistency of terminal output. Color definitions are concentrated in the src/terminal/palette.ts module, which exports a shared terminal color palette. All terminal output, including onboarding flows, config prompts, and TTY UI output, must reference the color constants defined in the palette; hardcoding color values in code is prohibited.

// src/terminal/palette.ts
import chalk from 'chalk';

// Shared palette: all terminal output should reference these constants.
export const palette = {
  primary: chalk.hex('#5B8DEF'),
  success: chalk.hex('#6BCB77'),
  warning: chalk.hex('#FFD93D'),
  error: chalk.hex('#FF6B6B'),
  muted: chalk.gray,
  highlight: chalk.bold.white,
  // ...more color definitions
} as const;

This design ensures Lobster's visual consistency across different terminal emulators and color schemes, while simplifying theme customization.

15.3 Caclawphony: Autonomous Execution of Orchestration

Caclawphony (repo openclaw/caclawphony, 34 stars) is a Symphony system built on top of Lobster. Its core capability is to decompose project-level tasks into mutually isolated autonomous execution units (Isolated Autonomous Execution Runs). Each execution unit has an independent context, toolset and sandbox environment, and multiple units can run in parallel.

Caclawphony suits divide-and-conquer work such as large-scale refactoring and batch code migration. The project manager (human or agent) defines the task decomposition strategy at the top level, and Caclawphony converts it into a set of Lobster pipelines that can run in parallel.

Caclawphony complements the Session tool in the OpenClaw main repository: sessions_send provides point-to-point communication between Agents, while Caclawphony provides a task-level orchestration framework - it cares about "which work units require parallel/serial execution" rather than "how Agent A sends messages to Agent B". The combination of the two forms a complete Agent collaboration stack from a single conversation to complex project execution.

Lobster’s design philosophy

The name Lobster was not chosen arbitrarily. In OpenClaw's vocabulary, the lobster stands for two engineering ideas. First, its claws represent tools: each step in a Lobster pipeline is a tool call that can run independently. Second, its molt represents version evolution: a pipeline can replace its internal implementation while keeping its external interface unchanged.

Another key difference between Lobster pipelines and Unix pipelines is error-handling semantics. In a Unix pipeline, the exit status is that of the last command, so a non-zero exit code from an upstream command goes unnoticed unless set -o pipefail is enabled. Each step of a Lobster pipeline must explicitly declare its error-handling strategy: fail-fast (any error immediately terminates the entire pipeline), retry (retry with exponential backoff), skip (log the error but continue), or fallback (switch to an alternative step). These explicit semantics make failure behavior in a Lobster pipeline predictable in a way that ad hoc shell scripts are not.
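A compact way to see the four strategies side by side is a single step runner that dispatches on the declared policy. The sketch below is hypothetical: the OnError type and function names are illustrative, and it runs synchronously for brevity, whereas a real runner would wait out each backoff delay.

```typescript
// Hypothetical strategy type mirroring the four policies named above.
type OnError =
  | { policy: "fail-fast" }
  | { policy: "retry"; attempts: number; baseMs: number }
  | { policy: "skip" }
  | { policy: "fallback"; alternative: () => string };

// Exponential backoff schedule: baseMs, 2*baseMs, 4*baseMs, ...
function backoffDelays(attempts: number, baseMs: number): number[] {
  return Array.from({ length: attempts }, (_, i) => baseMs * 2 ** i);
}

// Synchronous for brevity; a real runner would sleep for each delay between attempts.
function runStep(action: () => string, onError: OnError): string | undefined {
  try {
    return action();
  } catch (err) {
    switch (onError.policy) {
      case "fail-fast":
        throw err; // abort the whole pipeline
      case "retry":
        for (let i = 0; i < onError.attempts; i++) {
          try {
            return action();
          } catch {
            // keep retrying until the attempts are used up
          }
        }
        throw err;
      case "skip":
        return undefined; // caller logs the error; pipeline continues
      case "fallback":
        return onError.alternative();
    }
  }
}
```

Because the policy is part of the step's declaration, a reviewer can read off what happens on failure without tracing shell options or exit codes.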

15.4 The relationship between Lobster and Cron

Lobster pipelines can be combined with OpenClaw's Cron scheduling system for scheduled automation. Cron decides when to trigger, and Lobster decides what to execute. Typical uses include running a code quality scanning pipeline in the early hours of each morning, generating a weekly project status report, and running a cleanup pipeline some time after a specific event fires. The Cron trigger carries the Lobster pipeline ID as its execution payload, and Gateway's scheduler instantiates the pipeline and starts it at the specified time.

Web UI and Browser Control

Control UI: Lit 3 + Vite

OpenClaw's web management interface, Control UI, is hosted and served directly by the Gateway process, so no separate front-end server is needed. The UI source lives in the ui/ directory and is built with Lit 3 (Google's Web Components library), with Vite as the development server and build tool. The build command is pnpm ui:build, and the output is embedded in Gateway's static resource path.

The choice of Lit over React/Vue/Svelte reflects OpenClaw's engineering preferences: Lit builds on the Web Components standard, needs no virtual DOM runtime, produces a very small bundle, and works naturally with Gateway's native HTTP service. Control UI covers session management, channel status monitoring, configuration editing, Skills management, and Agent interaction. The UI also uses signals for reactivity, implementing fine-grained updates through @lit-labs/signals@0.2.0 and signal-utils@0.21.1.

The UI has its own test pipeline, pnpm test:ui, and a dedicated lint rule, lint:ui:no-raw-window-open, which forbids raw window.open() calls in UI code (the safe wrappers provided by the framework should be used instead).

WebChat: Conversational interface based on Gateway WebSocket

WebChat (see docs.openclaw.ai/web/webchat for details) is a conversational interface embedded in Control UI that directly uses Gateway's WebSocket connection - no independent WebChat port or additional configuration is required. After installing the Gateway, the user can start a conversation with the Agent by visiting http://localhost:18789 in the browser.

The macOS app also embeds WebChat, loading it directly in a WebKit view. Reusing the same interface keeps the conversation experience consistent between the web and macOS.

Browser control tools: Playwright + dedicated Chromium

OpenClaw's browser control tool (see docs.openclaw.ai/tools/browser for details) is one of the most complex modules in the core tool system. It uses playwright-core@1.58.2 to control a dedicated Chromium instance through CDP (Chrome DevTools Protocol). This is not the user's everyday browser but a separate instance managed by OpenClaw with its own browser profile.

The core capabilities of browser control include:

Page snapshots: capture the DOM state and visual rendering of the page so the Agent can analyze its content
Structured actions: click, fill in forms, scroll, navigate. The Agent drives the browser through structured instructions instead of injecting free-form JavaScript
File upload: the Agent can instruct the browser to upload specified files through the file picker
Multi-profile isolation: different browser profiles can be used for different tasks, keeping cookies and login state isolated

Configuration for browser tools is declared via JSON:

{
  "browser": {
    "enabled": true,
    "color": "#FF4500"
  }
}

The color parameter controls the title bar color of the browser window - this is a design detail that allows users to quickly distinguish it from their daily browser through color when the Agent-controlled browser window appears on the screen.

When building a Docker image, you can pass --build-arg OPENCLAW_INSTALL_BROWSER=1 to pre-install Chromium and Xvfb (X Virtual Frame Buffer), which increases the image size by about 300MB, but saves the 60-90 seconds of Playwright installation time each time the container is started. This is especially important for CI/CD scenarios.

Tool system overview

OpenClaw's First-class Tools are a direct extension of the platform's core capabilities, different from third-party tools that are accessed through Skills or MCP. First-class tools are integrated directly into the Gateway and Agent runtimes, with full security policy and sandbox support:

Tool	Capability	Documentation
Browser	Dedicated Chromium control, CDP snapshots, structured actions, file upload	docs.openclaw.ai/tools/browser
Canvas	A2UI-driven visual workspace (push/reset/eval/snapshot)	docs.openclaw.ai/platforms/mac/canvas
Nodes	Device-side operations: camera snap/clip, screen record, location.get, notifications	docs.openclaw.ai/nodes
Cron	Scheduled tasks and automatic triggering	docs.openclaw.ai/automation/cron-jobs
Sessions	sessions_list / sessions_history / sessions_send (inter-Agent communication)	docs.openclaw.ai/concepts/session-tool
Webhooks	Receive external HTTP callbacks and trigger Agent processing	docs.openclaw.ai/automation/webhook
Gmail Pub/Sub	Event-driven handling of incoming Gmail messages	docs.openclaw.ai/automation/gmail-pubsub
Discord/Slack Actions	Platform-native interaction (slash commands, buttons, drop-down menus)	Covered in the channel documentation

In sandbox mode, tool availability is strictly restricted. In the Docker sandbox used for non-main sessions, the tools allowed by default are bash, process, read, write, edit, and the sessions series; the tools disallowed by default are browser, canvas, nodes, cron, discord, and gateway. This two-layer control, a whitelist plus a blacklist, provides safe isolation in multi-tenant scenarios.

Security Model

OpenClaw's security model covers the full path from message entry to execution environment. This chapter starts with access control through DM (Direct Message) pairing, moves on to the execution boundary drawn by sandbox isolation, and ends with security infrastructure and credential management.

16.1 DM Pairing and Access Control

OpenClaw's default DM security policy (dmPolicy) is "pairing". In this mode, any DM session initiated by an unknown sender receives a pairing code, and the user must confirm it through the CLI on the server side to establish trust.

The three DM policy modes compare as follows:

Mode	Behavior	Security level
pairing (default)	Unknown sender receives a pairing code; administrator approval is required	High
allowlist	Only senders on the allowlist can initiate a DM	High
open	Accept all DMs (allowFrom: "*" must also be configured)	Low

Pairing approval is done with a CLI command:

openclaw pairing approve <code>

The allow list for each channel is configured independently via the allowFrom field. For example, channels.telegram.allowFrom and channels.discord.allowFrom control the access lists of Telegram and Discord channels respectively.

Public DM access requires two explicit authorizations: dmPolicy="open" and the "*" wildcard in the allowFrom array. Setting only one of them does not open public access. This double gating is intentional and prevents a single configuration mistake from causing accidental exposure.

openclaw doctor proactively detects and warns about risky or incorrect DM policy configurations, including a missing allowFrom wildcard in open mode and an empty allowlist in allowlist mode.
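The double gate is easy to get wrong in prose, so a small decision function may help. This is a hypothetical sketch: the field names follow the config keys quoted above, but the function, the decision labels, and the behavior of open mode without the wildcard (treated here like an allowlist) are assumptions.

```typescript
// Field names follow the config keys quoted above; the logic is an illustrative sketch.
type DmPolicy = "pairing" | "allowlist" | "open";

interface ChannelDmConfig {
  dmPolicy: DmPolicy;
  allowFrom: string[];
}

type DmDecision = "accept" | "send-pairing-code" | "reject";

function decideDm(cfg: ChannelDmConfig, sender: string, paired: Set<string>): DmDecision {
  const listed = cfg.allowFrom.includes(sender);
  switch (cfg.dmPolicy) {
    case "open":
      // Both gates are required: policy "open" AND the "*" wildcard.
      // Assumption: without the wildcard, only listed senders get through.
      return cfg.allowFrom.includes("*") || listed ? "accept" : "reject";
    case "allowlist":
      // A stray "*" has no effect here; only exact matches count.
      return listed ? "accept" : "reject";
    case "pairing":
      return listed || paired.has(sender) ? "accept" : "send-pairing-code";
  }
}
```

Note that neither setting alone opens the channel: "open" with an empty allowFrom rejects strangers, and a "*" entry under "allowlist" matches no real sender.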

The legacy configuration key name channels.discord.dm.policy has been migrated to channels.discord.dmPolicy. Old formats are still recognized in the current version, but will trigger deprecation warnings.

16.2 Sandbox Isolation

OpenClaw's sandbox policy is configured through agents.defaults.sandbox.mode. The recommended default value is "non-main", which means that non-main sessions (group sessions, channel sessions, etc.) automatically enter a sandbox isolation environment.

This design follows from OpenClaw's single-user assumption: the operator of the main session is the owner of the service, has full host access, and runs tools directly on the host. Non-main sessions, by contrast, originate from external users, so each one executes in its own Docker sandbox container, isolated from the other sessions and from the host machine.

The availability of tools in the sandbox is controlled by both whitelist and blacklist:

Category	Tool list
Sandbox whitelist (allowed use)	bash, process, read, write, edit, sessions_list, sessions_history, sessions_send, sessions_spawn
Sandbox blacklist (use prohibited)	browser, canvas, nodes, cron, discord, gateway

Tools in the blacklist that are called within a sandbox session will return an explicit permission denial error instead of being silently ignored.
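The check itself can be sketched as follows. The two lists are taken from the table above; everything else is hypothetical, including the error class, the function name, and the choice to deny tools that appear in neither list.

```typescript
// Lists as given in the table above.
const SANDBOX_ALLOW = new Set([
  "bash", "process", "read", "write", "edit",
  "sessions_list", "sessions_history", "sessions_send", "sessions_spawn",
]);
const SANDBOX_DENY = new Set(["browser", "canvas", "nodes", "cron", "discord", "gateway"]);

// Hypothetical error type: denial is explicit, never silent.
class ToolDeniedError extends Error {}

function assertToolAllowed(tool: string, isMainSession: boolean): void {
  // Under "non-main" mode the main session runs directly on the host.
  if (isMainSession) return;
  // Assumption: a tool in neither list is denied as well.
  if (SANDBOX_DENY.has(tool) || !SANDBOX_ALLOW.has(tool)) {
    throw new ToolDeniedError(`tool "${tool}" is not permitted in sandboxed sessions`);
  }
}
```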

16.3 Security Infrastructure

OpenClaw's security policy documents live in the separate repository openclaw/trust (35 stars) and are published at trust.openclaw.ai. That repository contains the full Threat Model documentation. Security vulnerability reports are received at security@openclaw.ai.

16.3.1 SSRF Protection

The OpenClaw Plugin SDK exports the ssrf-runtime module for plugins to use when making network requests. The module validates the target address and blocks access to private addresses (RFC 1918), loopback addresses, link-local addresses, and cloud metadata endpoints, thereby preventing SSRF (Server-Side Request Forgery) attacks. All plugin network calls should be routed through this module rather than calling fetch or the http module directly.
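For readers unfamiliar with what such a check involves, here is a minimal IPv4-only sketch of the address classes listed above. It is not the ssrf-runtime API, whose exports are not shown in this note; the function names are invented, and a production check would also resolve hostnames and handle IPv6.

```typescript
// Illustrative IPv4-only check; NOT the ssrf-runtime API.
function parseIPv4(host: string): number[] | null {
  const parts = host.split(".");
  if (parts.length !== 4) return null;
  const octets = parts.map((p) => (/^\d{1,3}$/.test(p) ? Number(p) : NaN));
  return octets.every((o) => o >= 0 && o <= 255) ? octets : null;
}

function isBlockedIPv4(host: string): boolean {
  const ip = parseIPv4(host);
  // Fail closed on anything that is not a plain IPv4 literal.
  if (ip === null) return true;
  const [a, b] = ip as [number, number, number, number];
  return (
    a === 10 ||                          // 10.0.0.0/8 (RFC 1918)
    (a === 172 && b >= 16 && b <= 31) || // 172.16.0.0/12 (RFC 1918)
    (a === 192 && b === 168) ||          // 192.168.0.0/16 (RFC 1918)
    a === 127 ||                         // loopback
    (a === 169 && b === 254)             // link-local, includes the 169.254.169.254 metadata address
  );
}
```

The important design point is that the check runs on the resolved address, not on the URL string; a hostname that resolves to 10.0.0.5 must be blocked just like the literal.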

16.3.2 Position on Prompt Injection

OpenClaw officially declares Prompt Injection to be Out of Scope: it is not treated as a security vulnerability. The position is pragmatic. There is no reliable defense against prompt injection under current LLM architectures, and including it in the vulnerability scope would only create a false promise of security. Accordingly, canvas.eval and browser script execution are classified as Operator control surfaces, and the security boundary is defined by the deployer.

16.4 Plugin installation security

The security control of the plugin installation process has undergone significant strengthening between v2026.3.28 and v2026.3.31.

The before_install hook in the installation process provides an integration point for security scanners. Any external security scanning tool can be registered as a before_install handler to check the plugin code before it is installed.

v2026.3.31 breaking change: the built-in dangerous code detector now applies a Fail Closed policy by default to "critical" findings. Previously, critical findings only generated warnings, which administrators could choose to ignore. Under the new policy, a finding marked critical blocks installation outright. To force installation of a flagged plugin, an explicit override flag must be used:

openclaw plugin install --dangerously-force-unsafe-install

The verbosity of the flag name is intentional: it makes each use deliberate enough to avoid accidents. Skills installations and Plugins installations are subject to the same scanning gate.
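The fail-closed rule reduces to a short decision function. The sketch below is hypothetical: the Finding shape, the severity names other than "critical", and the function name are assumptions; only the blocking behavior and the override flag come from the text above.

```typescript
// Hypothetical finding shape; only the "critical" level is named in the source.
type Severity = "info" | "warning" | "critical";

interface Finding {
  rule: string;
  severity: Severity;
}

// Fail closed: any critical finding blocks installation unless explicitly overridden.
function decideInstall(
  findings: Finding[],
  forceUnsafe: boolean,
): { install: boolean; reason: string } {
  const critical = findings.filter((f) => f.severity === "critical");
  if (critical.length === 0) {
    return { install: true, reason: "no critical findings" };
  }
  if (forceUnsafe) {
    return { install: true, reason: "overridden by --dangerously-force-unsafe-install" };
  }
  return { install: false, reason: `blocked by ${critical.length} critical finding(s)` };
}
```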

16.5 Gateway Authentication Strengthening (v2026.3.31)

v2026.3.31 tightens Gateway's authentication mechanism in several ways:

trusted-proxy mode rejects mixed shared-token configurations. If multiple services are detected sharing the same authentication token, Gateway refuses to start and reports a configuration conflict.

local-direct fallback mode now requires explicit configuration of the token. Previously, connections on the same host could be implicitly authenticated (Implicit Same-host Auth), which was risky in multi-tenant deployment scenarios. The new version removes this implicit trust and all connections must provide a valid token.

Node Pairing Approval becomes a mandatory prerequisite - node commands are not exposed until pairing approval is completed. Node-originated Runs are restricted to a reduced trusted execution surface.

16.6 Credential Management

OpenClaw's credentials are stored in the ~/.openclaw/credentials/ directory. Refreshing credentials for a web service provider means re-running the OAuth flow with the openclaw login command.

Key references in the Provider plugin use SecretRef semantics - only the reference identifier of the key is stored in the configuration file rather than the clear text value, which is resolved by the credential manager to the actual key at runtime. This design ensures that configuration files can be safely brought into version control.
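The idea of storing a reference and resolving it at runtime can be sketched as below. The actual SecretRef format is not shown in this note, so the { secretRef: string } shape, the in-memory store, and the function names are all assumptions for illustration.

```typescript
// Hypothetical shape: the real SecretRef format is not shown in this note.
type SecretRef = { secretRef: string };
type ConfigValue = string | SecretRef;

function isSecretRef(value: ConfigValue): value is SecretRef {
  return typeof value !== "string";
}

// Resolves a config value against a credential store at runtime.
// Plain strings pass through; references are looked up by identifier.
function resolveValue(value: ConfigValue, store: Map<string, string>): string {
  if (!isSecretRef(value)) return value;
  const secret = store.get(value.secretRef);
  if (secret === undefined) {
    throw new Error(`unknown secret reference: ${value.secretRef}`);
  }
  return secret;
}
```

Only the identifier ever appears in the configuration file; the cleartext value exists solely in the credential store and in memory after resolution.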

Basic rules about content security: Never submit real phone numbers, video files, or production environment configuration values to the code repository.

Build And Test

OpenClaw's build and test infrastructure epitomizes its engineering discipline. This chapter breaks down the selection of the build tool chain, the classification of 198 npm scripts, the architectural design of the test infrastructure, and the implementation details of code quality gating.

17.1 Build Toolchain

OpenClaw's build tooling deliberately avoids the mainstream all-in-one webpack/rollup/esbuild setup and instead adopts a more focused combination of tools:

Tools	Version	Responsibilities
tsdown	0.21.7	Packager (bundler), driven by scripts/tsdown-build.mjs
TypeScript	6.0.2	Type checking
@typescript/native-preview	7.0.0-dev.20260331.1	Preview version of TypeScript compiler implemented in Go (pnpm tsgo)
oxfmt	0.43.0	Code formatting (replaces Prettier)
oxlint + oxlint-tsgolint	1.58.0 / 0.18.1	Code inspection (replaces ESLint)
Bun	-	TypeScript executor during development/testing
Node 22+	-	Production runtime (maintain Node + Bun dual path compatibility)
tsx	4.21.0	TypeScript execution based on Node
jiti	2.6.1	Runtime ESM resolution (plugin-sdk alias resolution)

A few points in the tool selection are worth discussing:

tsdown rather than a hand-configured bundler: tsdown is a higher-level bundling abstraction built on Rolldown, and its configuration is more concise than wiring up bundler plugins by hand. The build entry point is scripts/tsdown-build.mjs.

@typescript/native-preview: This is the official experimental rewrite of the TypeScript compiler in Go, invoked through pnpm tsgo. Its type checking is roughly an order of magnitude faster than the standard TypeScript compiler, and OpenClaw uses it as the fast type-checking path in CI.

oxfmt / oxlint: A Rust-based formatting and checking toolchain that replaces the traditional Prettier + ESLint combination. The formatting commands are pnpm format (check) and pnpm format:fix (automatic fix), and the check command is pnpm lint.

Bun + Node dual runtime: Use Bun for faster startup (bun, bunx) during development and testing phases, and use Node 22+ for production deployment to ensure compatibility. Both paths must remain available simultaneously.

17.1.1 Build pipeline and variants

Full build pipeline:

pnpm build
# expands to:
# 1. pnpm canvas:a2ui:bundle → A2UI bundle build
# 2. scripts/tsdown-build.mjs → Main package
# 3. runtime-postbuild.mjs → Runtime post-processing

Three build variants serve different scenarios:

Variant	Command	Purpose
Full build	pnpm build	A2UI bundle + main bundle + post-processing
Docker build	pnpm build:docker	Skips the A2UI bundle (which may fail under QEMU)
Strict smoke test	pnpm build:strict-smoke	Quickly verifies that the build output is basically usable

17.2 198 npm script categories

OpenClaw's package.json contains 198 npm scripts. The number is not padding: it reflects how much automation an engineering system spanning multiple platforms, channels, and plugins requires. The scripts are grouped by responsibility below:

17.2.1 Build Scripts

build, build:docker, build:plugin-sdk:dts, build:strict-smoke: the core build pipeline and its variants, plus type declaration generation for the Plugin SDK.

17.2.2 Checks And Lint Scripts (30+)

pnpm check is the meta-check entry point; it orchestrates tsgo, lint, format, format:check, and about 20 rule-specific check scripts. See the CI architecture discussion in Section 17.5 for details.

17.2.3 Test Scripts (40+)

The test scripts are the largest group: test, test:fast, test:watch, test:coverage, test:e2e, test:live, test:gateway, test:channels, test:extensions, test:contracts, and a series of test:docker:* and test:parallels:* scripts.

17.2.4 Release Scripts

release:check, release:openclaw:npm:check, release:plugins:npm:check: pre-release consistency checks for version numbers, the changelog, and the npm registry.

17.2.5 Platform Scripts

android:*, ios:*, ui:*: shortcuts for building, testing, and linting each platform.

17.2.6 Documentation Scripts

docs:check-links (dead link detection), docs:spellcheck (spell check), docs:check-i18n-glossary (international glossary consistency).

17.2.7 Protocol Scripts

protocol:check (protocol definition consistency check), protocol:gen (generate TypeScript types), protocol:gen:swift (generate Swift types).

17.3 Test Infrastructure

OpenClaw uses Vitest 4.1.2 as the testing framework, and works with @vitest/coverage-v8 to collect code coverage at the V8 engine level. The coverage threshold is uniformly set to 70%, covering the four dimensions of lines, branches, functions, and statements.

One mandatory rule: Vitest may only run with the forks pool. The threads, vmThreads, and vmForks pools are explicitly disallowed. The restriction exists because OpenClaw's tests involve many process-level side effects (subprocess creation, file system operations, network port binding, and so on), and thread-level isolation cannot provide sufficient guarantees.
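Expressed as configuration, the pool rule and the 70% thresholds might look like the fragment below. It is a sketch rather than the repository's actual config: the option names (pool, coverage.provider, coverage.thresholds) are standard Vitest options, but the object is shown as a plain constant so that it stands alone; in a real vitest.config.ts it would be passed to defineConfig from "vitest/config".

```typescript
// Plain object for illustration; a real config would wrap this in defineConfig().
const vitestOptions = {
  test: {
    // Process-level isolation; threads, vmThreads and vmForks are disallowed.
    pool: "forks",
    coverage: {
      provider: "v8",
      // Uniform 70% threshold across all four dimensions.
      thresholds: { lines: 70, branches: 70, functions: 70, statements: 70 },
    },
  },
};
```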

Parallel test orchestration is driven by the test-parallel.mjs script and provides three execution configurations:

Configuration	Parallelism	Purpose
default	50% of CPU cores	Daily development, balancing speed and system responsiveness
serial	1	Debug failing tests and rule out concurrency interference
max	100% of CPU cores	CI environment, maximize throughput
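The mapping from profile to worker count is simple arithmetic. The function below is a hypothetical sketch, not code from test-parallel.mjs; the rounding down and the floor of one worker are assumptions.

```typescript
type Profile = "default" | "serial" | "max";

// Maps an execution profile to a worker count for a given number of CPU cores.
// Assumption: fractions round down, and at least one worker is always used.
function workerCount(profile: Profile, cpuCores: number): number {
  switch (profile) {
    case "serial":
      return 1;
    case "max":
      return Math.max(1, cpuCores);
    case "default":
      return Math.max(1, Math.floor(cpuCores / 2));
  }
}
```

On an 8-core machine this gives 4 workers for default, 1 for serial, and 8 for max.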

17.3.1 Test Layers

The test system covers multiple levels, and each level focuses on verification requirements at different granularities:

Unit and integration tests: pnpm test (full run), pnpm test:fast (excludes slow tests), pnpm test:watch (file watch mode), pnpm test:coverage (with coverage report).

Domain tests: test:channels (channel integration), test:extensions (extension interfaces), test:gateway (Gateway protocol), test:e2e (end-to-end flows), test:live (real API integration).

Contract tests: test:contracts:channels and test:contracts:plugins enforce interface contract consistency for channels and plugins respectively. Contract testing ensures that Channel adapters and Plugins follow their declared interfaces, preventing implementation drift.

Docker E2E Test (8+ scenarios): Perform end-to-end validation in a complete Docker containerized environment. Covered scenarios include:

Scenario	Validation scope
onboard	First-run onboarding flow
plugins	Plugin installation, loading, and execution
MCP channels	MCP protocol channel connectivity
gateway network	Gateway network topology and routing
OpenWebUI	OpenWebUI integration
doctor-switch	doctor diagnosis and configuration switching
qr-import	QR code configuration import
live models	Integration with real model endpoints

Parallels Smoke Test: Perform smoke tests on three virtual machine clients of macOS, Windows, and Linux to verify the availability of basic functions across operating systems.

Performance Test Suite:

Script	What it measures
test:perf:budget	Performance budget check (run time and memory caps)
test:perf:hotspots	Hotspot function profiling
test:perf:imports	Module import time analysis
test:startup:bench	Startup time baseline
test:startup:memory	Startup memory usage

Live test: Enable by setting the environment variable OPENCLAW_LIVE_TEST=1 and execute pnpm test:live. These tests call external services using real API keys and therefore are not run in regular CI but are executed periodically in a dedicated live test environment.

17.4 Code Quality Gating

OpenClaw applies an unusually dense set of code quality gates, detailed below:

File line limit: The check:loc script enforces a per-file limit of roughly 500-700 lines, and files over the limit are flagged for splitting. The limit is a guideline rather than a hard rule, but it is checked automatically, with the goal of preventing huge files that are hard to maintain.

Strict type discipline: @ts-nocheck is prohibited, the any type is to be avoided, and unknown is preferred. zod@4.3.6 is the preferred tool for runtime schema validation at external boundaries (configuration files, webhook payloads, CLI output, API responses).
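The unknown-first discipline looks like this in practice. To keep the sketch free of dependencies it uses a hand-written guard where the project would prefer a zod schema, and the WebhookPayload shape is hypothetical.

```typescript
// Hypothetical payload shape, used only for illustration.
interface WebhookPayload {
  event: string;
  attempt: number;
}

// Boundary input arrives as `unknown` and is narrowed before use.
// A hand-written guard stands in for a zod schema here.
function parseWebhook(raw: unknown): WebhookPayload {
  if (typeof raw !== "object" || raw === null) {
    throw new Error("payload must be an object");
  }
  const obj = raw as Record<string, unknown>;
  const event = obj["event"];
  const attempt = obj["attempt"];
  if (typeof event !== "string") {
    throw new Error("event must be a string");
  }
  if (typeof attempt !== "number" || !Number.isInteger(attempt)) {
    throw new Error("attempt must be an integer");
  }
  return { event, attempt };
}
```

The caller never touches an any value: JSON.parse output is passed straight into parseWebhook, and everything downstream works with the validated type.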

Dynamic import protection: The build system will detect the presence of both static and dynamic imports in the same module and issue an INEFFECTIVE_DYNAMIC_IMPORT warning. This mixed mode will cause tree-shaking to fail - the module has been packaged through static import, and dynamic import will not bring additional on-demand loading benefits, but will increase the complexity of code understanding.

Duplicate code detection: Use jscpd@4.0.8 to scan the src/, extensions/, test/, scripts/ directories for code duplication. Duplicate blocks that exceed the threshold trigger a CI failure.

Drift detection: A series of check scripts monitor the consistency between various definitions and implementations:

Check	Detect content
canon:check	Standard code style consistency
plugin-sdk:api:check	Plugin SDK public API drift detection
config:docs:check	Consistency between the config schema and the documentation
lint:plugins:plugin-sdk-subpaths-exported	Plugin SDK sub-path export integrity

In addition, more than 8 boundary lint rules target specific extensions, ensuring that no extension module reaches into the internal APIs of another extension.

17.5 CI Architecture

OpenClaw's CI uses a two-tier check system that separates local development gates from CI gates:

17.5.1 First layer: pnpm check (local development gate)

pnpm check is the local check that must pass before each commit. The execution order is:

# Execution sequence of pnpm check:
# 1. no-conflict-markers → Detect unresolved merge conflict markers
# 2. host-env-policy:swift → Verify Swift host environment policy
# 3. tsgo → Go version of TypeScript type checking
# 4. lint → oxlint code inspection
# 5. format → oxfmt format verification

The pipeline is executed serially. Failure in any step will terminate subsequent steps and report the error location.
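Serial execution with early termination is straightforward to model. The runner below is a hypothetical sketch, not the actual pnpm check script; each step is reduced to a function returning pass or fail.

```typescript
// Hypothetical step shape: a name plus a function that reports pass/fail.
interface CheckStep {
  name: string;
  run: () => boolean;
}

// Runs steps in order and stops at the first failure, reporting where it stopped.
function runChecks(steps: CheckStep[]): { passed: string[]; failedAt?: string } {
  const passed: string[] = [];
  for (const step of steps) {
    if (!step.run()) {
      return { passed, failedAt: step.name };
    }
    passed.push(step.name);
  }
  return { passed };
}
```

If tsgo fails, for example, lint and format never run, and the result names tsgo as the failing step along with the two checks that passed before it.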

17.5.2 Second layer: check-additional (CI-only gate)

check-additional runs only in the CI environment and includes architectural policy and boundary policy guards. These checks are intentionally left out of the local development loop: they are generally slower and rely on CI-specific context (such as full git history and diff information for all branches), so running them locally would seriously slow down development.

17.5.3 Pre-commit hooks and fast lanes

Pre-commit hooks are managed by prek, and by default the hook runs the full pnpm check pipeline.

For fast iteration, setting the environment variable FAST_COMMIT=1 skips the format and check steps:

# Skip formatting and checks (for when you are verifying code quality by hand)
FAST_COMMIT=1 git commit -m "wip: experimental changes"

Using FAST_COMMIT puts responsibility for code quality on the developer. CI still runs the complete checks, and commits that fail them are blocked at the CI stage.

17.5.4 Main Branch Admission Standard

The landing bar for merging code into main is:

# Landing bar for main:
pnpm check   # type check + lint + format
pnpm test    # full test suite
pnpm build   # build verification (only when the change touches build-affecting surfaces)

The third item, pnpm build, is conditional: it is required only when a change touches build-affecting surfaces. These include, but are not limited to, changes to the tsdown-build.mjs configuration, package.json dependencies, or tsconfig.json, and added or removed module exports. Pure logic changes (function implementation adjustments, bug fixes, and the like) do not trigger the build requirement, which balances CI speed against safety.
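The conditional third item can be expressed as a predicate over the changed paths. The sketch below covers only the examples named in the text; the pattern list, the exportsChanged flag, and both function names are assumptions made for illustration.

```typescript
// Patterns derived from the examples in the text; the real definition is
// "including but not limited to" these.
const BUILD_AFFECTING: RegExp[] = [
  /(^|\/)tsdown-build\.mjs$/,
  /(^|\/)package\.json$/,
  /(^|\/)tsconfig\.json$/,
];

// Decide whether `pnpm build` belongs in the landing bar for a change set.
// `exportsChanged` stands for "module exports were added or removed", which
// cannot be read off file names alone and is supplied by the caller.
function requiresBuild(changedPaths: string[], exportsChanged = false): boolean {
  if (exportsChanged) return true;
  return changedPaths.some((p) => BUILD_AFFECTING.some((re) => re.test(p)));
}

// Commands a change must pass before landing on main.
function landingBar(changedPaths: string[], exportsChanged = false): string[] {
  const commands = ["pnpm check", "pnpm test"];
  if (requiresBuild(changedPaths, exportsChanged)) commands.push("pnpm build");
  return commands;
}

console.log(landingBar(["src/agent/run.ts"]));
console.log(landingBar(["extensions/discord/package.json"]));
```

A pure logic change yields the two-command bar, while a package.json change adds pnpm build.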

Deployment options

OpenClaw's deployment options cover everything from a single developer to an enterprise team. They are presented in order of increasing complexity: npm global installation, Docker containers, Ansible orchestration, Nix declarative configuration, and the Windows system tray. Each option corresponds to a different operational philosophy and security model.

npm global installation (recommended method)

For most developers, a global npm installation is the fastest way to get started with OpenClaw:

npm install -g openclaw@latest
openclaw onboard --install-daemon

The first command installs the OpenClaw CLI and all its dependencies into the global node_modules directory of Node.js. The second command starts the interactive onboarding wizard. The --install-daemon flag tells the wizard to register the system daemon automatically once the wizard completes.

How the daemon is registered depends on the operating system. On macOS, OpenClaw generates a launchd plist file and registers it as a user-level Launch Agent via launchctl load. The key plist entries are:

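<key>Label</key>
<string>ai.openclaw.gateway</string>
<key>ProgramArguments</key>
<array>
  <string>/usr/local/bin/node</string>
  <string>/usr/local/lib/node_modules/openclaw/dist/gateway.js</string>
</array>
<key>RunAtLoad</key>
<true/>
<key>KeepAlive</key>
<true/>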

On Linux, OpenClaw uses a systemd user service: it writes the unit file to ~/.config/systemd/user/openclaw-gateway.service and starts it with systemctl --user enable --now openclaw-gateway. The point of this choice is "user level": no root permissions are required, the service lifecycle is bound to the user session, and the principle of least privilege is respected. Combined with loginctl enable-linger, the service keeps running even when the user is not logged in.

Docker deployment: a closer look at the four-stage build

Docker deployment is OpenClaw's preferred option for production and isolation scenarios. Its Dockerfile uses a multi-stage build with four stages, each designed to keep the final image as small as possible.

Phase 1: ext-deps - extension dependency extraction

The only responsibility of the first stage is to extract all package.json files from the extensions/ directory tree while preserving their directory structure. It is a pure file copy stage; nothing is installed:

FROM node:24-bookworm AS ext-deps
WORKDIR /app
COPY extensions/ extensions/
# Copy only the package.json files into /out, preserving the directory structure
RUN find extensions -name "package.json" -exec sh -c \
    'mkdir -p "/out/$(dirname "$1")" && cp "$1" "/out/$1"' _ {} \;

The purpose of this separation is to take advantage of Docker's layer cache: the dependency installation layers that follow are invalidated only when an extension's package.json changes.

Phase 2: build - compile and build

The build stage runs on the Bun image to speed up dependency installation, uses pnpm as the package manager, and compiles TypeScript with tsdown:

FROM oven/bun:1 AS build
WORKDIR /app
COPY --from=ext-deps /out/extensions ./extensions
COPY package.json pnpm-lock.yaml pnpm-workspace.yaml ./
RUN pnpm install --frozen-lockfile
COPY . .
RUN pnpm run build
RUN cd ui && pnpm run build

The build output has two parts: the TypeScript compilation output in dist/ (generated by tsdown) and the Web Control UI static assets in ui/dist/ (built with Vite).

Phase 3: runtime-assets - runtime asset pruning

The third stage is the most important size optimization step in the build pipeline:

FROM build AS runtime-assets
RUN pnpm prune --prod
RUN find . -name "*.d.ts" -delete \
    && find . -name "*.map" -delete \
    && find . -name "*.ts" ! -name "*.d.ts" -path "*/src/*" -delete

pnpm prune --prod first removes all devDependencies; the find commands then delete TypeScript declaration files (.d.ts), source maps (.map), and TypeScript sources under src/. Only JavaScript runtime code and production dependencies remain.

Phase 4: runtime - final runtime image

The final stage is based on a Node 24 image, and every base image is pinned by SHA256 digest to keep builds reproducible:

# Default variant
FROM node:24-bookworm@sha256:abc123... AS runtime
# Slim variant
# FROM node:24-bookworm-slim@sha256:def456... AS runtime
USER node
WORKDIR /home/node/app
COPY --from=runtime-assets --chown=node:node /app ./
HEALTHCHECK --interval=3m --timeout=10s --start-period=30s \
    CMD curl -f http://localhost:18789/healthz || exit 1
EXPOSE 18789 18790
CMD ["node", "dist/gateway.js"]

OpenClaw provides two image variants, selected through the build argument OPENCLAW_VARIANT:

Variant	Base image	Characteristics	Use case
default	node:24-bookworm	Full Debian toolchain; supports browser installation	Playwright/browser channels required
slim	node:24-bookworm-slim	Minimal system libraries; image about 40% smaller	CLI/API only

The remaining build arguments are:

Parameter name	Default value	Description
OPENCLAW_EXTENSIONS	all	Control which extensions are built, comma separated or all
OPENCLAW_INSTALL_BROWSER	false	Whether Chromium (Playwright) is pre-installed in the image
OPENCLAW_INSTALL_DOCKER_CLI	false	Whether to install Docker CLI (for sandbox function)

On the security side, the final image runs as USER node (uid 1000) rather than root. Two health endpoints are available: /healthz serves as the liveness probe and /readyz as the readiness probe. The health check runs every 3 minutes with a 10-second timeout and a 30-second startup grace period.

docker-compose.yml: dual-service architecture

The official docker-compose.yml defines two services, reflecting OpenClaw's gateway-client separation architecture:

version: "3.9"

services:
  openclaw-gateway:
    image: ghcr.io/openclaw/openclaw:latest
    ports:
      - "18789:18789"
      - "18790:18790"
    volumes:
      - openclaw-data:/home/node/.openclaw
      - /var/run/docker.sock:/var/run/docker.sock
    security_opt:
      - no-new-privileges:true
    cap_drop:
      - NET_RAW
      - NET_ADMIN
    restart: unless-stopped
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:18789/healthz"]
      interval: 3m
      timeout: 10s
      start_period: 30s

  openclaw-cli:
    image: ghcr.io/openclaw/openclaw:latest
    command: ["node", "dist/cli.js"]
    depends_on:
      openclaw-gateway:
        condition: service_healthy
    environment:
      - OPENCLAW_GATEWAY_URL=http://openclaw-gateway:18789

volumes:
  openclaw-data:

Two hardening measures stand out: no-new-privileges prevents processes from gaining privileges through setuid/setgid binaries, and cap_drop removes the NET_RAW and NET_ADMIN capabilities to block raw socket operations and network configuration tampering. The Docker socket mount (/var/run/docker.sock) serves a different purpose: it gives the sandbox integration a Docker-in-Docker (DinD) style capability, allowing Agents to execute commands in isolated containers. Because access to that socket is effectively root-equivalent on the host, the mount is a trade-off rather than a hardening step.

For port mapping, 18789 is the Gateway's main HTTP port, carrying the REST API, WebSocket connections, and the Web Control UI; 18790 is the Bridge port, through which external channel plugins connect to the Gateway over gRPC or WebSocket.

Ansible deployment: openclaw/openclaw-ansible

openclaw/openclaw-ansible (545 stars) provides a complete set of Ansible playbooks that package OpenClaw's Docker deployment as reproducible Infrastructure as Code. Its core features include:

Tailscale VPN integration: The playbook integrates Tailscale by default and binds the Gateway to the Tailscale virtual network interface instead of a public interface. OpenClaw instances are therefore reachable only from inside the tailnet and expose no public ports, which removes the most common path to unauthorized access.

UFW firewall configuration: The playbook configures UFW (Uncomplicated Firewall) rules automatically, allowing only SSH (22) and the port Tailscale needs (41641/UDP) and dropping all other inbound traffic.

Docker isolation: OpenClaw runs in its own Docker network, with data volumes mapped to a specified host path. Mount points, environment variables, and resource limits can be customized through Ansible variables.

Nix deployment: openclaw/nix-openclaw

openclaw/nix-openclaw (611 stars) provides declarative configuration in the form of Nix Flake. For NixOS users or developers using home-manager, this is the deployment method that best fits their workflow:

{
  inputs.openclaw.url = "github:openclaw/nix-openclaw";

  outputs = { self, nixpkgs, openclaw }: {
    nixosConfigurations.myhost = nixpkgs.lib.nixosSystem {
      modules = [
        openclaw.nixosModules.default
        {
          services.openclaw = {
            enable = true;
            gateway.port = 18789;
            extensions = [ "discord" "telegram" "whatsapp" ];
          };
        }
      ];
    };
  };
}

The advantage of Nix deployment is full reproducibility: the same Flake inputs always produce the same system configuration, which removes the "it works on my machine" class of environment differences.

Windows Deployment: System Tray and PowerToys Integration

openclaw/openclaw-windows-node (405 stars) provides native integration for Windows. Its core components include:

System Tray Companion: A lightweight .NET application that lives in the Windows system tray. It manages the lifecycle of the OpenClaw Gateway process (start/stop/restart), displays real-time status, and provides shortcut menu access to the Web Control UI and logs.

PowerToys Command Palette extension: Integrates with the Microsoft PowerToys command palette (Run plugin). Users can send commands to the OpenClaw Agent through the Alt+Space shortcut without switching to a browser or terminal window.

Remote Gateway Configuration (Linux)

Securely exposing services is a core issue when deploying Gateway on remote Linux servers. OpenClaw offers three modes:

Tailscale Serve (tailnet-internal access): tailscale serve maps the Gateway port to Tailscale's HTTPS proxy, which is reachable only from devices inside the tailnet. Used together with the Gateway's --tailscale serve mode, certificate configuration and port mapping are handled automatically.

Tailscale Funnel (public HTTPS exposure): tailscale funnel exposes the service to the public internet through Tailscale's global edge network and obtains HTTPS certificates automatically. It suits scenarios that need external webhook callbacks (such as a Telegram bot).

SSH Tunnel: The most traditional but flexible option is to forward the remote Gateway ports to the local machine over SSH:

ssh -L 18789:localhost:18789 -L 18790:localhost:18790 user@remote-host

The Gateway's bind mode is controlled by the --bind parameter: loopback (default) listens only on 127.0.0.1, and lan listens on 0.0.0.0 to accept LAN connections. The Tailscale mode is set through the --tailscale parameter: off (default), serve, or funnel.
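The two parameters can be modeled as small closed sets with defaults. The sketch below mirrors the values and defaults described above; the option-object shape and the names GatewayListenOptions, listenAddress, and parseListenOptions are assumptions for illustration, not the Gateway's actual option parser.

```typescript
type BindMode = "loopback" | "lan";
type TailscaleMode = "off" | "serve" | "funnel";

interface GatewayListenOptions {
  bind: BindMode;
  tailscale: TailscaleMode;
}

function isBindMode(v: string): v is BindMode {
  return v === "loopback" || v === "lan";
}

function isTailscaleMode(v: string): v is TailscaleMode {
  return v === "off" || v === "serve" || v === "funnel";
}

// Translate the bind mode into the address the server listens on.
function listenAddress(bind: BindMode): string {
  return bind === "lan" ? "0.0.0.0" : "127.0.0.1";
}

// Parse `--bind` / `--tailscale` style values, falling back to the
// documented defaults and rejecting anything outside the documented sets.
function parseListenOptions(raw: { bind?: string; tailscale?: string }): GatewayListenOptions {
  const bind = raw.bind ?? "loopback";
  const tailscale = raw.tailscale ?? "off";
  if (!isBindMode(bind)) throw new Error(`unknown bind mode: ${bind}`);
  if (!isTailscaleMode(tailscale)) throw new Error(`unknown tailscale mode: ${tailscale}`);
  return { bind, tailscale };
}

const lanOptions = parseListenOptions({ bind: "lan" });
console.log(listenAddress(lanOptions.bind), lanOptions.tailscale);
```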

Onboarding wizard: openclaw onboard

openclaw onboard is OpenClaw's interactive initialization command. It guides the user through every configuration step from a blank machine to a usable setup:

Step 1: Gateway configuration. Select the bind mode (loopback/lan), port, and Tailscale mode. If a running Gateway instance is detected, the wizard asks whether to reuse it.

Step 2: Workspace configuration. Create the default workspace directory (~/.openclaw/workspace), configure the LLM provider keys (OpenAI API key, Anthropic API key, and so on), and set the default model.

Step 3: Channel configuration. Enable the channels the user selects (Discord bot token, Telegram bot token, WhatsApp mobile number, and so on) and verify that the credentials are valid.

Step 4: Skills configuration. The wizard recommends popular Skills to install and prompts the user to browse ClawHub for more.

When the --install-daemon flag is present, the system daemon is registered and started automatically after the wizard ends. If something goes wrong, the openclaw doctor command runs a comprehensive health check: it verifies the Node.js version, checks port usage, tests LLM connections, validates configuration file syntax, and prints a diagnostic report with repair suggestions.

Ecosystem Overview

ClawHub: Official Skill Registry

ClawHub (openclaw/clawhub, 7,214 stars) is OpenClaw's official Skill registry and distribution platform. Its role resembles npm for Node.js or crates.io for Rust: a centralized package registry, except that it distributes AI Agent capability modules.

Installing a Skill takes a single command:

clawhub install weather-forecast
clawhub install code-review --version 2.1.0

ClawHub's Agent integration is its main differentiator: when the Agent needs a capability during a conversation that is not currently installed, it can search ClawHub automatically and install the Skill after user confirmation. Under the hood, the Agent calls the built-in clawhub_search tool function, which sends a request to api.clawhub.com/v1/search and returns the matching Skills with their security ratings.

The clawhub.com website provides a visual marketplace that supports browsing by category, keyword search, and viewing installation statistics and community ratings. Each Skill page shows the rendered SKILL.md, the dependency graph, and the version history.

Skills Archive and Community List

openclaw/skills (3,622 stars) is the version archive for all Skills and keeps a complete snapshot of every version of every Skill. Even if a Skill author deletes a version, deployed instances can still pull it from the archive.

VoltAgent/awesome-openclaw-skills (43,292 stars) is a community-maintained curated list of more than 5,400 community-verified Skills. Its star count reflects how active the OpenClaw Skill ecosystem is: the star count of an "awesome list" is usually a good indicator of the size of the ecosystem behind it.

Skills system: three-tier structure

OpenClaw's Skills fall into three tiers based on how they are distributed:

Tier	Storage location	Installation method	Update strategy
Bundled Skills	skills/ directory, distributed with the npm package	Included automatically	Updated with each OpenClaw release
Managed Skills	~/.openclaw/managed/skills/	clawhub install	clawhub update
Workspace Skills	~/.openclaw/workspace/skills//	Created manually	Managed by the user

Loading priority increases from top to bottom: a Workspace Skill overrides a Managed or Bundled Skill of the same name, which gives users maximum room for customization.
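The override rule amounts to a merge keyed by Skill name, applied in tier order. The following is a minimal sketch under that reading; SkillEntry, TIER_ORDER, and resolveSkills are hypothetical names, and the sample paths are illustrative.

```typescript
type SkillTier = "bundled" | "managed" | "workspace";

interface SkillEntry {
  skillName: string;
  tier: SkillTier;
  path: string;
}

// Later tiers win: bundled < managed < workspace, matching the table order.
const TIER_ORDER: SkillTier[] = ["bundled", "managed", "workspace"];

// Collapse all discovered Skills to one entry per name, keeping the entry
// from the highest-priority tier.
function resolveSkills(found: SkillEntry[]): SkillEntry[] {
  const byName = new Map<string, SkillEntry>();
  for (const tier of TIER_ORDER) {
    for (const entry of found) {
      if (entry.tier === tier) byName.set(entry.skillName, entry);
    }
  }
  const resolved: SkillEntry[] = [];
  byName.forEach((entry) => resolved.push(entry));
  return resolved;
}

const discovered: SkillEntry[] = [
  { skillName: "code-review", tier: "workspace", path: "~/.openclaw/workspace/skills/code-review" },
  { skillName: "code-review", tier: "bundled", path: "skills/code-review" },
  { skillName: "weather-forecast", tier: "managed", path: "~/.openclaw/managed/skills/weather-forecast" },
];
console.log(resolveSkills(discovered).map((s) => `${s.skillName} <- ${s.tier}`));
```

Here the workspace copy of code-review shadows the bundled one regardless of discovery order.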

Skill Development: AgentSkills Specification

Each Skill is defined by a SKILL.md file, which is the core vehicle for the AgentSkills specification. SKILL.md is in Markdown format with special YAML Front Matter:

---
version: 2.1.0
description: Automated code review with multi-language support
triggers:
  - pattern: "review {file_path}"
  - pattern: "check code quality"
permissions:
  - filesystem:read
  - git:read
before_install: scripts/check-deps.sh
---

# Code Review Skill

## Instructions

You are a senior code reviewer. When triggered, analyze the provided
file for bugs, style issues, and potential improvements.

## Tools

### review_file

Analyze a single file and return findings.

- `file_path` (string, required): Path to the file to review
- `severity` (string, optional): Minimum severity to report (info|warn|error)

The before_install field names a hook script that runs during installation. The script verifies system dependencies (for example, checking the Python version or confirming that specific binaries exist) and blocks the installation by exiting with a non-zero code. The value of this mechanism is that it gives Skill authors a declared pre-install checkpoint, which keeps a Skill from being installed in an environment that cannot run it.
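Two pieces of this flow are easy to illustrate: separating the front matter from the Markdown body, and gating installation on the hook's exit status. The sketch below is an assumption-laden illustration, not ClawHub's installer; splitSkillMd and mayInstall are hypothetical names, and YAML parsing itself is left to a YAML library.

```typescript
interface SkillDocument {
  frontMatter: string; // raw YAML between the --- fences
  body: string;        // Markdown instructions and tool docs
}

// Split a SKILL.md string at its leading `---` fences.
function splitSkillMd(source: string): SkillDocument {
  const lines = source.split(/\r?\n/);
  if (lines[0] !== "---") return { frontMatter: "", body: source };
  const end = lines.indexOf("---", 1);
  if (end === -1) throw new Error("unterminated front matter");
  return {
    frontMatter: lines.slice(1, end).join("\n"),
    body: lines.slice(end + 1).join("\n"),
  };
}

// Gate installation on the hook's exit status: 0 allows, anything else blocks.
// null models "no before_install declared" (an assumption of this sketch).
function mayInstall(hookExitCode: number | null): boolean {
  return hookExitCode === null || hookExitCode === 0;
}

const sampleSkillMd =
  "---\nversion: 2.1.0\nbefore_install: scripts/check-deps.sh\n---\n# Code Review Skill\n";
const parsedSkill = splitSkillMd(sampleSkillMd);
console.log(parsedSkill.frontMatter);
console.log(mayInstall(1)); // false: a non-zero exit blocks installation
```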

Official sub-repository matrix

The OpenClaw organization maintains a set of sub-repositories with different roles:

Repository	Stars	Purpose
openclaw/acpx	1,834	Headless ACP CLI (Agent Client Protocol command-line client)
openclaw/lobster	992	Workflow shell for interactive task orchestration
openclaw/nix-openclaw	611	Nix Flake declarative deployment
openclaw/openclaw-ansible	545	Ansible playbook automated deployment
openclaw/openclaw-windows-node	405	Windows system tray + PowerToys integration
openclaw/openclaw.ai	250	Official website source code
openclaw/community	92	Community governance documents and Discord moderation policies
openclaw/trust	35	Security policy, vulnerability disclosure process, audit reports
openclaw/caclawphony	34	Symphony autonomous execution framework for long-running unattended Agent tasks

Community competitors and alternatives

OpenClaw does not exist in isolation. The following projects compete with or complement it along different dimensions:

HKUDS/nanobot (37,216 stars) positions itself as an "ultra-lightweight OpenClaw alternative": it drops the complexity of multiple channels and the plugin system and focuses on fast response in a single CLI scenario. Its main selling points are sub-second cold start and very low memory usage, which suits resource-constrained edge devices.

chatgpt-on-wechat/CowAgent (42,673 stars) centers on the WeChat ecosystem and offers deep integration with WeCom (enterprise WeChat), official accounts, mini programs, and more. In the Chinese market, WeChat's penetration makes it a natural entry point for AI Agents, and CowAgent's coverage of this vertical exceeds OpenClaw's.

AstrBot (28,373 stars) is an IM chatbot framework that supports the mainstream Chinese instant messaging platforms such as QQ, Feishu, and DingTalk. Where OpenClaw aims for platform independence, AstrBot goes deep into the Chinese ecosystem and offers API designs closer to the habits of Chinese developers.

Documentation and Internationalization

OpenClaw's documentation is built on Mintlify and deployed at docs.openclaw.ai. The Chinese localized version is located at docs.openclaw.ai/zh-CN.

The implementation of the internationalization (i18n) pipeline deserves attention. Translation is driven by the scripts/docs-i18n script, which reads glossary.zh-CN.json as its glossary so that proper nouns are translated consistently (for example, "Gateway" always maps to one fixed Chinese term, while "Skill" stays in English). The translation memory is stored in zh-CN.tm.jsonl in JSON Lines format, with each line pairing a source text with its translation. The file plays a role similar to a TMX file in traditional localization tools: when the source text changes, the i18n script first looks for an exact or fuzzy match in the translation memory, which avoids retranslating content that already exists.
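The lookup order described here (exact match first, then fuzzy match) can be sketched in a few lines. The field names source and target, the 0.75 threshold, and the word-level Jaccard similarity are all assumptions of this sketch; the text does not say which fuzzy metric the script uses.

```typescript
interface TmEntry {
  source: string; // assumed field name for the source text
  target: string; // assumed field name for the translation
}

// Parse JSON Lines: one JSON object per non-empty line.
function parseTm(jsonl: string): TmEntry[] {
  return jsonl
    .split("\n")
    .filter((line) => line.trim() !== "")
    .map((line) => JSON.parse(line) as TmEntry);
}

// Word-level Jaccard similarity in [0, 1]; a stand-in fuzzy metric.
function similarity(a: string, b: string): number {
  const wordsA = new Set(a.toLowerCase().split(/\s+/).filter(Boolean));
  const wordsB = new Set(b.toLowerCase().split(/\s+/).filter(Boolean));
  let shared = 0;
  wordsA.forEach((w) => { if (wordsB.has(w)) shared += 1; });
  const union = wordsA.size + wordsB.size - shared;
  return union === 0 ? 1 : shared / union;
}

type TmHit =
  | { kind: "exact" | "fuzzy"; entry: TmEntry; score: number }
  | { kind: "miss" };

// Prefer an exact match; otherwise return the best fuzzy match above the threshold.
function lookupTm(tm: TmEntry[], text: string, threshold = 0.75): TmHit {
  let best: TmEntry | null = null;
  let bestScore = 0;
  for (const entry of tm) {
    if (entry.source === text) return { kind: "exact", entry, score: 1 };
    const score = similarity(entry.source, text);
    if (score > bestScore) { best = entry; bestScore = score; }
  }
  return best !== null && bestScore >= threshold
    ? { kind: "fuzzy", entry: best, score: bestScore }
    : { kind: "miss" };
}

const sampleTm = parseTm(
  '{"source":"Start the Gateway control plane","target":"启动 Gateway 控制平面"}\n',
);
console.log(lookupTm(sampleTm, "Start the Gateway control plane now").kind);
```

A fuzzy hit would typically be handed to the translator as a suggestion rather than reused verbatim, while a miss triggers a fresh translation.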

The main forum for community communication is Discord (discord.gg/clawd), which is also the main channel for obtaining real-time development progress and direct communication with maintainers.

Competitors and Outlook

Multi-dimensional comparison with competitors

The following table compares OpenClaw with current mainstream AI Agent frameworks along several dimensions:

Dimension	OpenClaw	Manus	AutoGen	LangChain	OpenHands
License	MIT	Closed-source SaaS	MIT (CC-BY-4.0 docs)	MIT	MIT
Deployment	Local-first + Docker + Ansible	Cloud only	Local/cloud	Local/cloud	Docker container
Main language	TypeScript	Undisclosed	Python	Python	Python
Channels	Discord/Telegram/WhatsApp/Slack/Web/SMS and more (15+)	Web only	No native channels	No native channels	Web UI
Voice support	Native Realtime API	Yes	None	Community extension	None
Plugin system	Skills + Plugin SDK + ClawHub	Built-in tools	Tool registration	Tools/chains/agents	Sandbox tools
Memory system	SQLite + vector + knowledge graph	Cloud conversation history	In-memory state	Multiple memory types	Conversation history
Security model	Three-layer sandbox + permission DSL + audit log	Platform-managed	No built-in sandbox	No built-in sandbox	Docker sandbox
Community size	343K stars, 20K+ commits	N/A (closed source)	42K stars	105K stars	55K stars

OpenClaw's differentiated positioning

Three core differentiators emerge from the comparison:

Local-First: All data is stored on the user's device by default. The Gateway process runs locally or on a user-controlled server, and LLM API calls go out directly from the user's device without passing through any third-party relay server. This design matters most to enterprise users with data compliance obligations: regulations such as GDPR and HIPAA restrict where and how personal data may be processed and transferred, and a local-first architecture makes those requirements easier to meet.

Multi-Channel Native: Instead of wrapping a single interface in per-channel adapters, OpenClaw treats each channel as a first-class citizen at the architectural level. Every channel has its own message formatter, permission model, and user identity mapping.

Fully open under MIT: There is no Enterprise edition and no closed-source component that holds back core functionality. All features are free for all users. As a business model this leans heavily on sponsors and community contributions, but it is very effective at building trust.

Sponsor strategy analysis

Why would OpenAI, NVIDIA, and Vercel sponsor a local-first "contender"? The apparent contradiction has a clear strategic logic behind it:

OpenAI: Every Agent call that OpenClaw routes to an OpenAI model consumes OpenAI API tokens. Local-first does not mean no cloud models. On the contrary, OpenClaw is a large distribution channel for OpenAI's APIs: every OpenClaw user is a potential paying API customer who is likely to call the API far more often than the average ChatGPT user. Sponsoring OpenClaw is an ecosystem lock-in strategy: once developers get used to the OpenClaw + GPT-4 workflow, the cost of switching to other models rises significantly.

NVIDIA: Local inference is one of OpenClaw's long-term directions. When users start running open source LLMs locally, demand for GPU compute translates directly into hardware sales for NVIDIA. Sponsoring OpenClaw cultivates market demand for local inference.

Vercel: OpenClaw's Web Control UI, documentation site, and ClawHub marketplace can all be deployed on the Vercel platform. Sponsoring open source projects is Vercel's standard way of expanding its developer tool ecosystem, consistent with its backing of Next.js and Turborepo.

Challenges and Limitations

Objectively assessed, OpenClaw faces the following structural challenges:

16,843 Open Issues: As of April 2026, the repository had more than 16,000 open issues. That reflects extremely high user participation, but it also means the maintenance team faces heavy triage pressure. If a large share of issues remain unanswered for too long, community trust will erode.

Node.js barrier to entry: For non-JavaScript developers, installing and maintaining a Node.js environment can be a barrier in itself. AutoGen and LangChain, from the Python ecosystem, have a natural advantage here, because Python's installation and environment management are more familiar to data scientists and researchers.

Single maintainer risk (Bus Factor): steipete contributed 14,756 commits, about 73% of all commits (20,000+). The gap to the second-largest contributor (vincentkoc, 1,690 commits) is huge. The project is therefore highly dependent on one person, with a bus factor close to 1: if the core maintainer cannot continue for any reason, the project's survival would be seriously threatened.

Aggressive refactoring pace: Almost every release contains breaking changes. Frequent changes to plugin APIs make community Skills expensive to maintain, since a Skill can become obsolete within weeks because of an upstream API change. This tension between "rapid iteration" and "stable platform" is currently OpenClaw's biggest architectural risk.

Future directions

TypeScript native compiler: OpenClaw is tracking @typescript/native-preview 7.0.0-dev (a TypeScript native compiler implemented in Go). The compiler promises compilation speedups of more than 10x, which would significantly improve the OpenClaw development experience and CI/CD pipeline efficiency. The repository already has an experimental branch that adapts to the new compiler's features.

Plugin SDK stabilization: The current plugin system carries several legacy paths, including old import styles that are deprecated but not yet deleted. The core goal of SDK stabilization is to settle on a long-lived API surface, mark all old paths as deprecated, and remove them in subsequent versions.

WeChat official integration: Cooperation with Tencent will bring official support for WeChat channels, instead of the current indirect integration that relies on third-party libraries. This is crucial for penetration in the Chinese market - WeChat has more than 1.3 billion monthly active users, and official channels mean a more stable API and lower risk of account suspension.

Enterprise adoption path: The combination of Ansible Playbooks + Docker containers + sandbox isolation has paved the way for enterprise deployments. The next step is to complete RBAC (role-based access control), audit log export (Audit Log Export) to the SIEM system, and SSO (single sign-on) integration.

Closing thoughts: a representative of the local-first AI Agent paradigm

OpenClaw represents a clear technical philosophy: AI Agents should run on user-controlled infrastructure, and data should not leave the user's trust boundary. This position runs against the industry's current "everything in the cloud" trend, and it is precisely this contrarian stance that gives the project its distinct value.

From an implementation standpoint, OpenClaw has already built channel abstraction, plugin isolation, a three-layer sandbox, vector-backed memory, and native apps across platforms inside a TypeScript monorepo. The maturity of the codebase and the architecture is far ahead of what its four-month age would suggest. That said, the concentration of work around a single maintainer, the pace of aggressive refactoring, and the growing issue backlog still create real risk.

For developers, OpenClaw is among the most complete open source local-first AI Agent platforms available. For enterprises, its combination of Docker + Ansible + sandbox provides an auditable, isolated, and reproducible deployment path. For the AI industry, it shows that "local first" is not a compromise but an architectural paradigm that can compete head-on with cloud SaaS products on functional completeness.

Reference resources

Resource	Link
GitHub main repository	github.com/openclaw/openclaw
Official website	openclaw.ai
English documentation	docs.openclaw.ai
Chinese documentation	docs.openclaw.ai/zh-CN
Discord community	discord.gg/clawd
ClawHub marketplace	clawhub.com
ClawHub source code	github.com/openclaw/clawhub
Star growth curve	star-history.com/#openclaw/openclaw
DeepWiki analysis	deepwiki.com/openclaw/openclaw
Security trust center	trust.openclaw.ai
Security contact email	security@openclaw.ai

Official sub-repository navigation

Repository	Stars	Description
openclaw/openclaw	343,696	Main repository: CLI, Gateway, Agent runtime, Plugin SDK
openclaw/clawhub	7,214	Official Skill directory platform
openclaw/skills	3,622	ClawHub archive of all Skill versions
openclaw/acpx	1,834	Headless ACP CLI: stateful Agent Client Protocol sessions
openclaw/lobster	992	Lobster workflow shell: typed JSON pipelines + approval gates
openclaw/nix-openclaw	611	Nix declarative packaging support
openclaw/openclaw-ansible	545	Ansible automated deployment (Tailscale + UFW + Docker)
openclaw/openclaw-windows-node	405	Windows system tray + PowerToys Command Palette extension
openclaw/openclaw.ai	250	Official website source code
openclaw/trust	35	Security trust policy and threat model
openclaw/caclawphony	34	Symphony: project tasks → isolated, autonomous execution

CLI command quick reference

Command	Purpose
openclaw onboard	Interactive guided installation (Gateway + channels + Skills)
openclaw gateway run	Start the Gateway control plane
openclaw agent --message "..."	Send a message to the Agent
openclaw channels status --probe	Check the connection status of all channels
openclaw channels login	Log in to a channel (for example, WhatsApp QR code scanning)
openclaw pairing approve	Approve a DM pairing request
openclaw doctor	Diagnose configuration issues and security risks
openclaw config set	Modify configuration items
openclaw update --channel	Switch release channel and update
openclaw message send --to	Send a message to a specified target
openclaw gateway status	View Gateway running status
openclaw nodes list	List connected device nodes
clawhub install	Install a Skill from ClawHub

In-chat command quick reference

The following commands can be sent directly in conversations on WhatsApp, Telegram, Slack, Discord, Teams, WebChat, and more:

Command	Function
/status	View current session status (model + token usage + cost)
/new or /reset	Reset the session
/compact	Compress session context (generates a summary)
/think	Set thinking level: off|minimal|low|medium|high|xhigh
/verbose on|off	Control verbose output
/usage off|tokens|full	Display usage statistics after each reply
/restart	Restart the Gateway (owner only in groups)
/activation mention|always	Switch group activation mode
/elevated on|off	Toggle elevated bash access
/approve	Approve pending tool execution or plugin operations
/acp spawn codex --bind here	Create an ACP workspace in the current session
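Commands with a closed set of arguments, such as /think and /verbose, lend themselves to a small validating parser. The sketch below handles only a subset of the table; ChatCommand and parseChatCommand are hypothetical names, and treating invalid arguments as "unknown" is an assumption about behavior, not something the source specifies.

```typescript
const THINK_LEVELS = ["off", "minimal", "low", "medium", "high", "xhigh"] as const;
type ThinkLevel = (typeof THINK_LEVELS)[number];

type ChatCommand =
  | { kind: "think"; level: ThinkLevel }
  | { kind: "verbose"; on: boolean }
  | { kind: "reset" }
  | { kind: "unknown"; raw: string };

// Parse a subset of the in-chat commands. Anything else, including plain
// chat text and invalid arguments, is returned as "unknown".
function parseChatCommand(message: string): ChatCommand {
  const [head = "", arg = ""] = message.trim().split(/\s+/);
  switch (head) {
    case "/new":
    case "/reset":
      return { kind: "reset" };
    case "/think": {
      const level = THINK_LEVELS.find((l) => l === arg);
      return level ? { kind: "think", level } : { kind: "unknown", raw: message };
    }
    case "/verbose":
      if (arg === "on" || arg === "off") return { kind: "verbose", on: arg === "on" };
      return { kind: "unknown", raw: message };
    default:
      return { kind: "unknown", raw: message };
  }
}

console.log(parseChatCommand("/think high"));
console.log(parseChatCommand("/think extreme").kind);
```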

Data declaration

All technical details in this article are derived from the following primary data and do not rely on any secondary community information:

GitHub API (api.github.com/repos/openclaw/openclaw): repository metadata, Stars/Forks/Issues statistics, contributor ranking, and release descriptions
AGENTS.md (repository root directory, 35,263 bytes): architectural specifications, module boundaries, build guidelines, testing strategies, release guidelines
package.json (repository root directory): 233 exports entries (including 230 plugin-sdk sub-paths), 47 runtime dependencies, 22 development dependencies, 198 npm scripts
Dockerfile and docker-compose.yml: complete containerized build and deployment configuration
README.md (retrieved through the GitHub API and base64-decoded): official feature list, channel support, installation guide, security model
GitHub Releases API (v2026.3.31 and v2026.3.28 release notes): breaking changes and new feature details

Minimal working configuration

Running openclaw onboard generates a minimal configuration file at ~/.openclaw/openclaw.json. When configuring manually, the smallest working JSON is:

{
  "agent": {
    "model": "anthropic/claude-sonnet-4-6"
  }
}

You only need to specify a model to start the Gateway. The agent uses this model for all inference tasks. More complex configurations can declare multi-model failover, channel access, security policies, sandbox mode, and Skills:

{
  "agent": {
    "model": "openai/gpt-5.2",
    "fallbackModels": ["anthropic/claude-sonnet-4-6", "google/gemini-2.5-flash"],
    "thinkingLevel": "medium"
  },
  "agents": {
    "defaults": {
      "workspace": "~/.openclaw/workspace",
      "sandbox": {
        "mode": "non-main"
      }
    }
  },
  "channels": {
    "telegram": {
      "botToken": "123456:ABCDEF",
      "dmPolicy": "pairing",
      "allowFrom": []
    },
    "discord": {
      "token": "your-discord-bot-token",
      "dmPolicy": "pairing"
    },
    "whatsapp": {
      "allowFrom": ["+1234567890"]
    }
  },
  "browser": {
    "enabled": true
  },
  "gateway": {
    "mode": "local",
    "auth": {
      "mode": "token"
    }
  }
}

See docs.openclaw.ai/gateway/configuration for the complete configuration reference. Configuration files are parsed as JSON5, so comments and trailing commas are allowed; the choice favors human readability.
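
To make the "provider/model" strings and the fallback list concrete, here is a minimal sketch of how a consumer of this configuration might turn the agent block into an ordered list of models to try. Everything in it is an assumption for illustration: the AgentConfig and ModelRef types, the parseModelRef and modelCandidates helpers, and the "primary first, then fallbacks in listed order" semantics are not taken from the OpenClaw source.

```typescript
// Hypothetical types: only the fields that appear in the JSON examples are modeled.
interface AgentConfig {
  model: string;
  fallbackModels?: string[];
  thinkingLevel?: string;
}

interface ModelRef {
  provider: string;
  model: string;
}

// Split a "provider/model" reference on its first slash.
function parseModelRef(ref: string): ModelRef {
  const slash = ref.indexOf("/");
  if (slash <= 0 || slash === ref.length - 1) {
    throw new Error(`expected "provider/model", got "${ref}"`);
  }
  return { provider: ref.slice(0, slash), model: ref.slice(slash + 1) };
}

// Assumed failover order: the primary model, then each fallback as listed,
// skipping duplicates so the same model is never retried twice.
function modelCandidates(agent: AgentConfig): ModelRef[] {
  const seen = new Set<string>();
  const candidates: ModelRef[] = [];
  for (const ref of [agent.model, ...(agent.fallbackModels ?? [])]) {
    if (seen.has(ref)) continue;
    seen.add(ref);
    candidates.push(parseModelRef(ref));
  }
  return candidates;
}
```

With the extended configuration above, modelCandidates would yield the openai entry first, followed by the anthropic and google entries. Validating the reference format at load time surfaces a typo such as a missing provider prefix before the first inference call rather than during it.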

WeChat access: Tencent's official plugin

For users in China, WeChat access is a high-priority requirement. OpenClaw's WeChat support is provided by the npm package @tencent-weixin/openclaw-weixin, published officially by Tencent and built on the iLink Bot API. This is a landmark integration: it marks Tencent's official participation in the open source AI agent ecosystem.

Installation and activation process:

# Install the WeChat plugin
openclaw plugins install "@tencent-weixin/openclaw-weixin"

# Scan the QR code to log in
openclaw channels login --channel openclaw-weixin

The WeChat integration currently supports private chats only; group chats are not supported. The v2.x plugin requires OpenClaw 2026.3.22 or later. Users also need to enable the "WeChat ClawBot plugin" in the WeChat client (Me → Settings → Plugins). Tencent is rolling this feature out gradually, so it may not be visible on every account yet.
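
The minimum-version requirement is easy to get wrong with plain string comparison, because "2026.3.9" sorts after "2026.3.22" as text. The sketch below compares the dotted components numerically. It assumes OpenClaw versions are purely numeric dotted strings with an optional leading "v" (as in v2026.4.1); parseVersion and meetsMinimum are hypothetical helpers, not part of the OpenClaw CLI or the plugin.

```typescript
// Parse a dotted numeric version such as "2026.3.22" or "v2026.4.1".
// Assumption: no pre-release suffixes; anything else is rejected.
function parseVersion(version: string): number[] {
  const parts = version.replace(/^v/, "").split(".");
  if (!parts.every((part) => /^\d+$/.test(part))) {
    throw new Error(`unsupported version string: ${version}`);
  }
  return parts.map((part) => Number(part));
}

// True when `current` is greater than or equal to `minimum`,
// comparing component by component and treating missing components as 0.
function meetsMinimum(current: string, minimum: string): boolean {
  const a = parseVersion(current);
  const b = parseVersion(minimum);
  const length = Math.max(a.length, b.length);
  for (let i = 0; i < length; i++) {
    const x = a[i] ?? 0;
    const y = b[i] ?? 0;
    if (x !== y) return x > y;
  }
  return true;
}

// Example: does an installed version satisfy the plugin's stated minimum?
const WEIXIN_PLUGIN_MINIMUM = "2026.3.22";
console.log(meetsMinimum("v2026.4.1", WEIXIN_PLUGIN_MINIMUM));
```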

Because this path goes through an official npm package rather than reverse engineering, it avoids the account-ban risk faced by projects such as chatgpt-on-wechat and is more stable and compliant. The limited feature set (private chat only) also reflects Tencent's caution in opening up the WeChat ecosystem.
