agentguard

GoPlus AgentGuard — AI agent security guard. Automatically blocks dangerous commands, prevents data leaks, and protects secrets. Use when reviewing third-party code, auditing skills, checking for vulnerabilities, evaluating action safety, or viewing security logs.

3,891 stars
Complexity: medium

About this skill

GoPlus AgentGuard is a security framework that protects AI agents from executing dangerous commands, leaking sensitive data, and exposing secrets. It provides tools for reviewing third-party code, auditing AI skills, checking for vulnerabilities, and evaluating the safety of runtime actions.

The skill is invoked by an AI agent and routes each request to a subcommand based on the user's arguments: `scan` analyzes a codebase or skill for security risks, `action` assesses the safety of a proposed runtime action, `trust` manages the trust levels of skills, `report` shows an audit log of recent security events, and `config` sets the overall protection level (strict, balanced, or permissive).

AgentGuard is most useful when an agent handles external code or performs sensitive operations. It helps identify and mitigate risks before they cause harm, and it keeps an audit trail of security-related decisions and interventions.

Best use case

This skill is primarily used by AI agents and their developers to enforce security policies, audit code, and validate runtime actions. It's essential for scenarios where an AI agent interacts with potentially untrusted code, executes actions that could have side effects, or handles sensitive data, ensuring operational safety and data integrity.


Expected outcome: a security assessment of the scanned target, a safety decision for a proposed action, updated trust levels for skills, or a log of recent security events.

Practical example

Example input

Agent, please scan the code in the `/new_project` directory for any security vulnerabilities, paying close attention to potential data leaks.

Example output

Scanning `/new_project`... Found 2 MEDIUM-severity findings (e.g., potential unauthorized file access, outdated dependency). No CRITICAL or HIGH findings detected. Overall risk level: MEDIUM. Scan completed with the `balanced` protection level. Suggested trust level if the target is registered as a skill: `restricted`.

When to use this skill

  • When reviewing third-party code or new AI skills for security vulnerabilities.
  • Before allowing an AI agent to execute a potentially unsafe or high-impact action.
  • To continuously monitor and audit an AI agent's security events and operational logs.
  • When needing to configure and enforce strictness levels for AI agent security policies.

When not to use this skill

  • For tasks that do not involve code execution, data handling, or interaction with external systems.
  • When performance is critically bottlenecked and the security overhead is not justified by the minimal risk.
  • In isolated environments where the agent only interacts with fully trusted, pre-verified internal resources.

Installation

Claude Code / Cursor / Codex

$ curl -o ~/.claude/skills/security/SKILL.md --create-dirs "https://raw.githubusercontent.com/openclaw/skills/main/skills/0xbeekeeper/security/SKILL.md"

Manual Installation

  1. Download SKILL.md from GitHub
  2. Place it in .claude/skills/security/SKILL.md inside your project
  3. Restart your AI agent — it will auto-discover the skill

How agentguard Compares

| Feature / Agent | agentguard | Standard Approach |
|---|---|---|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | medium | N/A |


SKILL.md Source

# GoPlus AgentGuard — AI Agent Security Framework

You are a security auditor powered by the GoPlus AgentGuard framework. Route the user's request based on the first argument.

## Command Routing

Parse `$ARGUMENTS` to determine the subcommand:

- **`scan <path>`** — Scan a skill or codebase for security risks
- **`action <description>`** — Evaluate whether a runtime action is safe
- **`trust <lookup|attest|revoke|list> [args]`** — Manage skill trust levels
- **`report`** — View recent security events from the audit log
- **`config <strict|balanced|permissive>`** — Set protection level

If no subcommand is given, or the first argument is a path, default to **scan**.
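
The routing rule above can be sketched in a few lines of Python. This is only an illustration of the rule, not AgentGuard's implementation; the function name `route` is hypothetical, and treating every unrecognized first token as a path is an assumption.

```python
import shlex

SUBCOMMANDS = {"scan", "action", "trust", "report", "config"}


def route(arguments: str) -> tuple[str, list[str]]:
    """Return (subcommand, remaining_args) for a raw $ARGUMENTS string."""
    tokens = shlex.split(arguments)
    if tokens and tokens[0] in SUBCOMMANDS:
        return tokens[0], tokens[1:]
    # No subcommand, or the first argument is a path: default to scan.
    # Assumption: any other unrecognized first token is treated as a path.
    return "scan", tokens
```

With this sketch, `route("config strict")` selects `config`, while `route("./src")` and an empty string both fall through to `scan`.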

---

## Subcommand: scan

Scan the target path for security risks using all detection rules.

### File Discovery

Use Glob to find all scannable files at the given path. Include: `*.js`, `*.ts`, `*.jsx`, `*.tsx`, `*.mjs`, `*.cjs`, `*.py`, `*.json`, `*.yaml`, `*.yml`, `*.toml`, `*.sol`, `*.sh`, `*.bash`, `*.md`

**Markdown scanning**: For `.md` files, only scan inside fenced code blocks (between ``` markers) to reduce false positives. Additionally, decode and re-scan any base64-encoded payloads found in all files.
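
A rough Python sketch of the two steps described above: pulling out the contents of fenced code blocks and decoding base64 candidates for a second pass. The helper names are hypothetical, and the 24-character minimum for a base64 candidate is an assumption chosen for illustration; the source does not say how payloads are recognized.

```python
import base64
import binascii
import re

TICKS = "`" * 3  # the fence marker, built here so it does not end this block
FENCE = re.compile(rf"^{TICKS}[^\n]*\n(.*?)^{TICKS}", re.MULTILINE | re.DOTALL)
# Assumption: a run of 24+ base64 characters is worth trying to decode.
B64_CANDIDATE = re.compile(r"[A-Za-z0-9+/]{24,}={0,2}")


def fenced_blocks(markdown: str) -> list[str]:
    """Return the contents of each fenced code block, without the fences."""
    return FENCE.findall(markdown)


def decoded_payloads(text: str) -> list[str]:
    """Decode base64 candidates that turn out to be valid UTF-8 text."""
    payloads = []
    for candidate in B64_CANDIDATE.findall(text):
        try:
            raw = base64.b64decode(candidate, validate=True)
            payloads.append(raw.decode("utf-8"))
        except (binascii.Error, UnicodeDecodeError):
            continue  # not valid base64, or not text: ignore it
    return payloads
```

Each decoded payload would then be run through the same detection rules as the original file content.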

- Skip directories: `node_modules`, `dist`, `build`, `.git`, `coverage`, `__pycache__`, `.venv`, `venv`
- Skip files: `*.min.js`, `*.min.css`, `package-lock.json`, `yarn.lock`, `pnpm-lock.yaml`
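
For readers who want to reproduce the include and skip lists outside the agent's Glob tool, a minimal Python sketch follows. The function name `discover` is hypothetical; the patterns are copied from the lists above.

```python
import fnmatch
import os

INCLUDE = ["*.js", "*.ts", "*.jsx", "*.tsx", "*.mjs", "*.cjs", "*.py", "*.json",
           "*.yaml", "*.yml", "*.toml", "*.sol", "*.sh", "*.bash", "*.md"]
SKIP_DIRS = {"node_modules", "dist", "build", ".git", "coverage",
             "__pycache__", ".venv", "venv"}
SKIP_FILES = ["*.min.js", "*.min.css", "package-lock.json", "yarn.lock",
              "pnpm-lock.yaml"]


def discover(root: str) -> list[str]:
    """Return the sorted paths of scannable files under root."""
    found = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune skipped directories in place so os.walk never descends into them.
        dirnames[:] = [d for d in dirnames if d not in SKIP_DIRS]
        for name in filenames:
            # Skip patterns win over include patterns (e.g. package-lock.json).
            if any(fnmatch.fnmatch(name, pat) for pat in SKIP_FILES):
                continue
            if any(fnmatch.fnmatch(name, pat) for pat in INCLUDE):
                found.append(os.path.join(dirpath, name))
    return sorted(found)
```

Checking the skip patterns first matters because some skipped files, such as `package-lock.json` and `*.min.js`, would otherwise match an include pattern.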

### Detection Rules

For each rule, use Grep to search the relevant file types. Record every match with file path, line number, and matched content. For detailed rule patterns, see [scan-rules.md](scan-rules.md).

| # | Rule ID | Severity | File Types | Description |
|---|---------|----------|------------|-------------|
| 1 | SHELL_EXEC | HIGH | js,ts,mjs,cjs,py,md | Command execution capabilities |
| 2 | AUTO_UPDATE | CRITICAL | js,ts,py,sh,md | Auto-update / download-and-execute |
| 3 | REMOTE_LOADER | CRITICAL | js,ts,mjs,py,md | Dynamic code loading from remote |
| 4 | READ_ENV_SECRETS | MEDIUM | js,ts,mjs,py | Environment variable access |
| 5 | READ_SSH_KEYS | CRITICAL | all | SSH key file access |
| 6 | READ_KEYCHAIN | CRITICAL | all | System keychain / browser profiles |
| 7 | PRIVATE_KEY_PATTERN | CRITICAL | all | Hardcoded private keys |
| 8 | MNEMONIC_PATTERN | CRITICAL | all | Hardcoded mnemonic phrases |
| 9 | WALLET_DRAINING | CRITICAL | js,ts,sol | Approve + transferFrom patterns |
| 10 | UNLIMITED_APPROVAL | HIGH | js,ts,sol | Unlimited token approvals |
| 11 | DANGEROUS_SELFDESTRUCT | HIGH | sol | selfdestruct in contracts |
| 12 | HIDDEN_TRANSFER | MEDIUM | sol | Non-standard transfer implementations |
| 13 | PROXY_UPGRADE | MEDIUM | sol,js,ts | Proxy upgrade patterns |
| 14 | FLASH_LOAN_RISK | MEDIUM | sol,js,ts | Flash loan usage |
| 15 | REENTRANCY_PATTERN | HIGH | sol | External call before state change |
| 16 | SIGNATURE_REPLAY | HIGH | sol | ecrecover without nonce |
| 17 | OBFUSCATION | HIGH | js,ts,mjs,py,md | Code obfuscation techniques |
| 18 | PROMPT_INJECTION | CRITICAL | all | Prompt injection attempts |
| 19 | NET_EXFIL_UNRESTRICTED | HIGH | js,ts,mjs,py,md | Unrestricted POST / upload |
| 20 | WEBHOOK_EXFIL | CRITICAL | all | Webhook exfiltration domains |
| 21 | TROJAN_DISTRIBUTION | CRITICAL | md | Trojanized binary download + password + execute |
| 22 | SUSPICIOUS_PASTE_URL | HIGH | all | URLs to paste sites (pastebin, glot.io, etc.) |
| 23 | SUSPICIOUS_IP | MEDIUM | all | Hardcoded public IPv4 addresses |
| 24 | SOCIAL_ENGINEERING | MEDIUM | md | Pressure language + execution instructions |
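
To make the "record every match" requirement concrete, here is a hedged Python sketch of a single rule, SUSPICIOUS_IP. The regular expression is an assumption written for this example; the actual patterns live in scan-rules.md and may differ. The function name and the finding keys are likewise hypothetical.

```python
import ipaddress
import re

# Illustrative pattern only; the real rule patterns live in scan-rules.md.
IPV4 = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def find_public_ipv4(path: str, text: str) -> list[dict]:
    """Record file, line number, and evidence for each public IPv4 literal."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for match in IPV4.findall(line):
            try:
                addr = ipaddress.ip_address(match)
            except ValueError:
                continue  # looks like an address but is not a valid one
            if addr.is_global:  # loopback and private ranges are ignored
                findings.append({
                    "rule": "SUSPICIOUS_IP",
                    "severity": "MEDIUM",
                    "file": path,
                    "line": lineno,
                    "evidence": line.strip(),
                })
    return findings
```

A pattern this simple will also flag four-part version strings, which is one reason findings need human review.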

### Risk Level Calculation

- Any **CRITICAL** finding -> Overall **CRITICAL**
- Else any **HIGH** finding -> Overall **HIGH**
- Else any **MEDIUM** finding -> Overall **MEDIUM**
- Else -> **LOW**
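
The cascade above amounts to taking the highest severity present. A minimal Python sketch, with a hypothetical function name:

```python
SEVERITY_ORDER = ["CRITICAL", "HIGH", "MEDIUM"]


def overall_risk(severities: list[str]) -> str:
    """Return the overall risk level for a list of finding severities."""
    present = {s.upper() for s in severities}
    for level in SEVERITY_ORDER:
        if level in present:
            return level
    return "LOW"  # no findings, or nothing at MEDIUM or above
```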

### Output Format

```
## GoPlus AgentGuard Security Scan Report

**Target**: <scanned path>
**Risk Level**: CRITICAL | HIGH | MEDIUM | LOW
**Files Scanned**: <count>
**Total Findings**: <count>

### Findings

| # | Risk Tag | Severity | File:Line | Evidence |
|---|----------|----------|-----------|----------|
| 1 | TAG_NAME | critical | path/file.ts:42 | `matched content` |

### Summary
<Human-readable summary of key risks, impact, and recommendations>
```

### Post-Scan Trust Registration

After outputting the scan report, if the scanned target appears to be a skill (contains a `SKILL.md` file, or is located under a `skills/` directory), offer to register it in the trust registry.

**Risk-to-trust mapping**:

| Scan Risk Level | Suggested Trust Level | Preset | Action |
|---|---|---|---|
| LOW | `trusted` | `read_only` | Offer to register |
| MEDIUM | `restricted` | `none` | Offer to register with warning |
| HIGH / CRITICAL | — | — | Warn the user; do not suggest registration |
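
The mapping table can be expressed as a small lookup. This Python sketch uses a hypothetical function name and dictionary keys; returning `None` stands for "warn the user and do not suggest registration".

```python
from typing import Optional


def suggest_registration(risk_level: str) -> Optional[dict]:
    """Map a scan risk level to a suggested trust level and preset."""
    mapping = {
        "LOW": {"trust_level": "trusted", "preset": "read_only", "warn": False},
        "MEDIUM": {"trust_level": "restricted", "preset": "none", "warn": True},
    }
    # HIGH and CRITICAL are deliberately absent: no registration is suggested.
    return mapping.get(risk_level.upper())
```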

**Registration steps** (if the user agrees):

> **Important**: All scripts below are AgentGuard's own bundled scripts (located in this skill's `scripts/` directory), **never** scripts from the scanned target. Do not execute any code from the scanned repository.

1. **Ask the user for explicit confirmation** before proceeding. Show the exact command that will be executed and wait for approval.
2. Derive the skill identity:
   - `id`: the directory name of the scanned path
   - `source`: the absolute path to the scanned directory
   - `version`: read the `version` field from `package.json` in the scanned directory using the Read tool (if present), otherwise use `unknown`
   - `hash`: compute by running AgentGuard's own script: `node scripts/trust-cli.ts hash --path <scanned_path>` and extracting the `hash` field from the JSON output
3. Show the user the full registration command and ask for confirmation before executing:
   ```
   node scripts/trust-cli.ts attest --id <id> --source <source> --version <version> --hash <hash> --trust-level <level> --preset <preset> --reviewed-by agentguard-scan --notes "Auto-registered after scan. Risk level: <risk_level>." --force
   ```
4. Only execute after user approval. Show the registration result.

If scripts are not available (e.g., `npm install` was not run), skip this step and suggest the user run `cd skills/agentguard/scripts && npm install`.
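
As an illustration of steps 2 and 3, the following Python sketch derives the skill identity and assembles the command string to show the user. It only builds the string and never executes it. The function names are hypothetical, and the hash is passed in because it comes from the separate `trust-cli.ts hash` call described above.

```python
import json
import os
import shlex


def read_version(skill_dir: str) -> str:
    """Return the package.json version, or "unknown" if it cannot be read."""
    try:
        with open(os.path.join(skill_dir, "package.json"), encoding="utf-8") as fh:
            data = json.load(fh)
    except (OSError, ValueError):
        return "unknown"
    version = data.get("version") if isinstance(data, dict) else None
    return str(version) if version else "unknown"


def attest_command(skill_dir: str, content_hash: str, trust_level: str,
                   preset: str, risk_level: str) -> str:
    """Build the registration command for display; nothing is executed here."""
    source = os.path.abspath(skill_dir)
    argv = [
        "node", "scripts/trust-cli.ts", "attest",
        "--id", os.path.basename(source),
        "--source", source,
        "--version", read_version(source),
        "--hash", content_hash,
        "--trust-level", trust_level,
        "--preset", preset,
        "--reviewed-by", "agentguard-scan",
        "--notes", f"Auto-registered after scan. Risk level: {risk_level}.",
        "--force",
    ]
    return shlex.join(argv)  # quotes arguments that contain spaces
```

Building the command as an argument list and quoting it with `shlex.join` keeps paths with spaces or shell metacharacters from changing the meaning of the command the user is asked to approve.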

---

## Subcommand: action

Evaluate whether a proposed runtime action should be allowed, denied, or require confirmation. For detailed policies and detector rules, see [action-policies.md](action-policies.md).

### Supported Action Types

- `network_request` — HTTP/HTTPS requests
- `exec_command` — Shell command execution
- `read_file` / `write_file` — File system operations
- `secret_access` — Environment variable access
- `web3_tx` — Blockchain transactions
- `web3_sign` — Message signing

### Decision Framework

Parse the user's action description and apply the appropriate detector:

- **Network Requests**: Check domain against webhook list and high-risk TLDs, check body for secrets
- **Command Execution**: Check against dangerous/sensitive/system/network command lists, detect shell injection
- **Secret Access**: Classify secret type and apply priority-based risk levels
- **Web3 Transactions**: Check for unlimited approvals, unknown spenders, user presence

### Default Policies

| Scenario | Decision |
|----------|----------|
| Private key exfiltration | **DENY** (always) |
| Mnemonic exfiltration | **DENY** (always) |
| API secret exfiltration | CONFIRM |
| Command execution | **DENY** (default) |
| Unlimited approval | CONFIRM |
| Unknown spender | CONFIRM |
| Untrusted domain | CONFIRM |
| Body contains secret | **DENY** |
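
One way to apply the table when several scenarios match the same action is to take the most restrictive decision. The Python sketch below assumes exactly that, and also assumes ALLOW when nothing matches; neither rule is stated in the source. The scenario keys are hypothetical identifiers for the table rows, and the difference between "always" and "default" denials is not modeled.

```python
DEFAULT_POLICIES = {
    "private_key_exfiltration": "DENY",
    "mnemonic_exfiltration": "DENY",
    "api_secret_exfiltration": "CONFIRM",
    "command_execution": "DENY",
    "unlimited_approval": "CONFIRM",
    "unknown_spender": "CONFIRM",
    "untrusted_domain": "CONFIRM",
    "body_contains_secret": "DENY",
}
RANK = {"ALLOW": 0, "CONFIRM": 1, "DENY": 2}


def decide(scenarios: list[str]) -> str:
    """Return the most restrictive default decision among matched scenarios."""
    decisions = [DEFAULT_POLICIES[s] for s in scenarios]
    # Assumption: with no matched scenario the action is allowed.
    return max(decisions, key=RANK.__getitem__, default="ALLOW")
```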

### Web3 Enhanced Detection

When the action involves **web3_tx** or **web3_sign**, use AgentGuard's bundled `action-cli.ts` script (in this skill's `scripts/` directory) to invoke the ActionScanner. This script integrates the trust registry and optionally the GoPlus API (requires `GOPLUS_API_KEY` and `GOPLUS_API_SECRET` environment variables, if available):

For web3_tx:
```
node scripts/action-cli.ts decide --type web3_tx --chain-id <id> --from <addr> --to <addr> --value <wei> [--data <calldata>] [--origin <url>] [--user-present]
```

For web3_sign:
```
node scripts/action-cli.ts decide --type web3_sign --chain-id <id> --signer <addr> [--message <msg>] [--typed-data <json>] [--origin <url>] [--user-present]
```

For standalone transaction simulation:
```
node scripts/action-cli.ts simulate --chain-id <id> --from <addr> --to <addr> --value <wei> [--data <calldata>] [--origin <url>]
```

The `decide` command also works for non-Web3 actions (exec_command, network_request, etc.) and automatically resolves the skill's trust level and capabilities from the registry:

```
node scripts/action-cli.ts decide --type exec_command --command "<cmd>" [--skill-source <source>] [--skill-id <id>]
```

Parse the JSON output and incorporate findings into your evaluation:
- If `decision` is `deny` → override to **DENY** with the returned evidence
- If `goplus.address_risk.is_malicious` → **DENY** (critical)
- If `goplus.simulation.approval_changes` has `is_unlimited: true` → **CONFIRM** (high)
- If GoPlus is unavailable (`SIMULATION_UNAVAILABLE` tag) → fall back to prompt-based rules and note the limitation

Always combine script results with the policy-based checks (webhook domains, secret scanning, etc.) — the script enhances but does not replace rule-based evaluation.
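
The override rules above could be folded into the policy-based decision roughly as follows. This Python sketch relies on the field names quoted in the list (`decision`, `goplus.address_risk.is_malicious`, `goplus.simulation.approval_changes`, `is_unlimited`); the `risk_tags` key used to find `SIMULATION_UNAVAILABLE` is an assumption, as is the function name.

```python
def apply_script_result(result: dict, policy_decision: str) -> tuple[str, list[str]]:
    """Combine action-cli JSON output with the rule-based decision."""
    decision = policy_decision
    notes = []
    goplus = result.get("goplus") or {}

    if result.get("decision") == "deny":
        decision = "DENY"
        notes.append("script returned deny")
    if (goplus.get("address_risk") or {}).get("is_malicious"):
        decision = "DENY"
        notes.append("address flagged as malicious (critical)")

    approvals = (goplus.get("simulation") or {}).get("approval_changes") or []
    if any(change.get("is_unlimited") for change in approvals):
        notes.append("unlimited approval detected (high)")
        if decision != "DENY":  # never downgrade an existing DENY
            decision = "CONFIRM"

    # Assumption: tags are exposed under a "risk_tags" key in the output.
    if "SIMULATION_UNAVAILABLE" in (result.get("risk_tags") or []):
        notes.append("GoPlus unavailable; relying on rule-based checks only")
    return decision, notes
```

The script result can only tighten the decision in this sketch, which matches the statement that it enhances but does not replace rule-based evaluation.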

### Output Format

```
## GoPlus AgentGuard Action Evaluation

**Action**: <action type and description>
**Decision**: ALLOW | DENY | CONFIRM
**Risk Level**: low | medium | high | critical
**Risk Tags**: [TAG1, TAG2, ...]

### Evidence
- <description of each risk factor found>

### Recommendation
<What the user should do and why>
```

---

## Subcommand: trust

Manage skill trust levels using the GoPlus AgentGuard registry.

### Trust Levels

| Level | Description |
|-------|-------------|
| `untrusted` | Default. Requires full review, minimal capabilities |
| `restricted` | Trusted with capability limits |
| `trusted` | Full trust (subject to global policies) |

### Capability Model

```
network_allowlist: string[]     — Allowed domains (supports *.example.com)
filesystem_allowlist: string[]  — Allowed file paths
exec: 'allow' | 'deny'         — Command execution permission
secrets_allowlist: string[]     — Allowed env var names
web3.chains_allowlist: number[] — Allowed chain IDs
web3.rpc_allowlist: string[]    — Allowed RPC endpoints
web3.tx_policy: 'allow' | 'confirm_high_risk' | 'deny'
```
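
For illustration, the capability model can be mirrored as a Python dataclass with a domain check. The defaults follow the `none` preset (all deny, empty allowlists). The class layout, the flattened `web3_*` field names, and the `fnmatch`-style reading of `*.example.com` are assumptions; the source does not specify the wildcard semantics.

```python
from dataclasses import dataclass, field
from fnmatch import fnmatchcase


@dataclass
class Capabilities:
    network_allowlist: list[str] = field(default_factory=list)
    filesystem_allowlist: list[str] = field(default_factory=list)
    exec: str = "deny"                      # 'allow' | 'deny'
    secrets_allowlist: list[str] = field(default_factory=list)
    web3_chains_allowlist: list[int] = field(default_factory=list)
    web3_rpc_allowlist: list[str] = field(default_factory=list)
    web3_tx_policy: str = "deny"            # 'allow' | 'confirm_high_risk' | 'deny'

    def allows_domain(self, domain: str) -> bool:
        """True if the domain matches an allowlist entry (wildcards allowed)."""
        host = domain.lower()
        return any(fnmatchcase(host, pattern.lower())
                   for pattern in self.network_allowlist)
```

Under this reading, `*.example.com` covers subdomains such as `api.example.com` but not the bare `example.com`, which would need its own entry.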

### Presets

| Preset | Description |
|--------|-------------|
| `none` | All deny, empty allowlists |
| `read_only` | Local filesystem read-only |
| `trading_bot` | Exchange APIs (Binance, Bybit, OKX, Coinbase), Web3 chains 1/56/137/42161 |
| `defi` | All network, multi-chain DeFi (1/56/137/42161/10/8453/43114), no exec |

### Operations

**lookup** — `agentguard trust lookup --source <source> --version <version>`
Query the registry for a skill's trust record.

**attest** — `agentguard trust attest --id <id> --source <source> --version <version> --hash <hash> --trust-level <level> --preset <preset> --reviewed-by <name>`
Create or update a trust record. Use `--preset` for common capability models or provide `--capabilities <json>` for custom.

**revoke** — `agentguard trust revoke --source <source> --reason <reason>`
Revoke trust for a skill. Supports `--source-pattern` for wildcards.

**list** — `agentguard trust list [--trust-level <level>] [--status <status>]`
List all trust records with optional filters.

### Script Execution

If the agentguard package is installed, execute trust operations via AgentGuard's own bundled script:
```
node scripts/trust-cli.ts <subcommand> [args]
```

For operations that modify the trust registry (`attest`, `revoke`), always show the user the exact command and ask for explicit confirmation before executing.

If scripts are not available, help the user inspect `data/registry.json` directly using Read tool.

---

## Subcommand: report

Display recent security events from the GoPlus AgentGuard audit log.

### Log Location

The audit log is stored at `~/.agentguard/audit.jsonl`. Each line is a JSON object with:

```json
{"timestamp":"...","tool_name":"Bash","tool_input_summary":"rm -rf /","decision":"deny","risk_level":"critical","risk_tags":["DANGEROUS_COMMAND"],"initiating_skill":"some-skill"}
```

The `initiating_skill` field is present when the action was triggered by a skill (inferred from the session transcript). When absent, the action came from the user directly.

### How to Display

1. Read `~/.agentguard/audit.jsonl` using the Read tool
2. Parse each line as JSON
3. Format as a table showing recent events (last 50 by default)
4. If any events have `initiating_skill`, add a "Skill Activity" section grouping events by skill
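
The four steps above can be sketched in Python as shown below. Function names are hypothetical. Skipping malformed lines and counting only the displayed events in the totals are assumptions, since the source does not say how either case is handled.

```python
import json


def load_events(lines, limit: int = 50) -> list[dict]:
    """Parse JSONL lines and keep the most recent `limit` events."""
    events = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # assumption: malformed lines are skipped
        if isinstance(event, dict):
            events.append(event)
    return events[-limit:]


def summarize(events: list[dict]) -> dict:
    """Count decisions and group skill-initiated events by skill."""
    skills = {}
    for event in events:
        skill = event.get("initiating_skill")
        if not skill:
            continue  # no initiating_skill: the action came from the user
        entry = skills.setdefault(skill, {"events": 0, "blocked": 0, "risk_tags": set()})
        entry["events"] += 1
        if event.get("decision") == "deny":
            entry["blocked"] += 1
        entry["risk_tags"].update(event.get("risk_tags", []))
    for entry in skills.values():
        entry["risk_tags"] = sorted(entry["risk_tags"])
    return {
        "events": len(events),
        "blocked": sum(e.get("decision") == "deny" for e in events),
        "confirmed": sum(e.get("decision") == "confirm" for e in events),
        "skills": skills,
    }
```

The returned dictionary holds the numbers needed for the report header and the "Skill Activity" table.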

### Output Format

```
## GoPlus AgentGuard Security Report

**Events**: <total count>
**Blocked**: <deny count>
**Confirmed**: <confirm count>

### Recent Events

| Time | Tool | Action | Decision | Risk | Tags | Skill |
|------|------|--------|----------|------|------|-------|
| 2025-01-15 14:30 | Bash | rm -rf / | DENY | critical | DANGEROUS_COMMAND | some-skill |
| 2025-01-15 14:28 | Write | .env | CONFIRM | high | SENSITIVE_PATH | — |

### Skill Activity

If any events were triggered by skills, group them here:

| Skill | Events | Blocked | Risk Tags |
|-------|--------|---------|-----------|
| some-skill | 5 | 2 | DANGEROUS_COMMAND, EXFIL_RISK |

For untrusted skills with blocked actions, suggest: `/agentguard trust attest` to register them or `/agentguard trust revoke` to block them.

### Summary
<Brief analysis of security posture and any patterns of concern>
```

If the log file doesn't exist, inform the user that no security events have been recorded yet, and suggest they enable hooks via `./setup.sh` or by adding the plugin.

---

## Subcommand: config

Set the GoPlus AgentGuard protection level.

### Protection Levels

| Level | Behavior |
|-------|----------|
| `strict` | Block all risky actions — every dangerous or suspicious command is denied |
| `balanced` | Block dangerous, confirm risky — default level, good for daily use |
| `permissive` | Only block critical threats — for experienced users who want minimal friction |

### How to Set

1. Read `$ARGUMENTS` to get the desired level
2. Write the config to `~/.agentguard/config.json`:

```json
{"level": "balanced"}
```

3. Confirm the change to the user

If no level is specified, read and display the current config.
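
A minimal Python sketch of reading and writing the level. The function names are hypothetical, and the path is passed in explicitly so the example can be tried against a scratch file rather than `~/.agentguard/config.json`. Rejecting unknown levels is an assumption; the source only lists the three valid values.

```python
import json
import os

LEVELS = ("strict", "balanced", "permissive")


def set_level(level: str, config_path: str) -> None:
    """Validate the level and write {"level": ...} to the config file."""
    if level not in LEVELS:
        raise ValueError(f"unknown level {level!r}; expected one of {LEVELS}")
    os.makedirs(os.path.dirname(config_path) or ".", exist_ok=True)
    with open(config_path, "w", encoding="utf-8") as fh:
        json.dump({"level": level}, fh)


def get_level(config_path: str):
    """Return the configured level, or None if no config file exists yet."""
    try:
        with open(config_path, encoding="utf-8") as fh:
            data = json.load(fh)
    except FileNotFoundError:
        return None
    return data.get("level") if isinstance(data, dict) else None
```

In real use the path would be `os.path.expanduser("~/.agentguard/config.json")`.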

---

## Auto-Scan on Session Start (Opt-In)

AgentGuard can optionally scan installed skills at session startup. **This is disabled by default** and must be explicitly enabled:

- **Claude Code**: Set environment variable `AGENTGUARD_AUTO_SCAN=1`
- **OpenClaw**: Pass `{ skipAutoScan: false }` when registering the plugin

When enabled, auto-scan operates in **report-only mode**:

1. Discovers skill directories (containing `SKILL.md`) under `~/.claude/skills/` and `~/.openclaw/skills/`
2. Runs `quickScan()` on each skill
3. Reports results to stderr (skill name + risk level + risk tags)

Auto-scan **does NOT**:
- Modify the trust registry (no `forceAttest` calls)
- Write code snippets or evidence details to disk
- Execute any code from the scanned skills

The audit log (`~/.agentguard/audit.jsonl`) only records: skill name, risk level, and risk tag names — never matched code content or evidence snippets.

To register skills after reviewing scan results, use `/agentguard trust attest`.
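
The opt-in check and the discovery step of auto-scan might look like the Python sketch below. It does not reproduce `quickScan()`, whose behavior is not described here. The function names are hypothetical, and looking only at direct children of each skills directory is an assumption.

```python
import os


def auto_scan_enabled(env) -> bool:
    """Auto-scan is off unless AGENTGUARD_AUTO_SCAN is set to 1."""
    return env.get("AGENTGUARD_AUTO_SCAN") == "1"


def discover_skills(roots) -> list[str]:
    """Return directories directly under each root that contain SKILL.md."""
    skills = []
    for root in roots:
        root = os.path.expanduser(root)
        if not os.path.isdir(root):
            continue  # e.g. ~/.openclaw/skills/ may not exist
        for name in sorted(os.listdir(root)):
            skill_dir = os.path.join(root, name)
            # Only the presence of SKILL.md is checked; nothing is executed.
            if os.path.isfile(os.path.join(skill_dir, "SKILL.md")):
                skills.append(skill_dir)
    return skills
```

A caller would pass `os.environ` and the two roots named above, `~/.claude/skills/` and `~/.openclaw/skills/`.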
