arc-shield
Output sanitization for agent responses - prevents accidental secret leaks
Best use case
arc-shield is best used when you need a repeatable AI agent workflow instead of a one-off prompt.
Output sanitization for agent responses - prevents accidental secret leaks
Teams using arc-shield should expect a more consistent output, faster repeated execution, less prompt rewriting.
When to use this skill
- You want a reusable workflow that can be run more than once with consistent structure.
When not to use this skill
- You only need a quick one-off answer and do not need a reusable workflow.
- You cannot install or maintain the underlying files, dependencies, or repository context.
Installation
Claude Code / Cursor / Codex
Manual Installation
- Download SKILL.md from GitHub
- Place it in
.claude/skills/arc-shield/SKILL.mdinside your project - Restart your AI agent — it will auto-discover the skill
How arc-shield Compares
| Feature / Agent | arc-shield | Standard Approach |
|---|---|---|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |
Frequently Asked Questions
What does this skill do?
Output sanitization for agent responses - prevents accidental secret leaks
Where can I find the source code?
You can find the source code on GitHub using the link provided at the top of the page.
Related Guides
AI Agents for Coding
Browse AI agent skills for coding, debugging, testing, refactoring, code review, and developer workflows across Claude, Cursor, and Codex.
Best AI Skills for Claude
Explore the best AI skills for Claude and Claude Code across coding, research, workflow automation, documentation, and agent operations.
AI Agents for Marketing
Discover AI agents for marketing workflows, from SEO and content production to campaign research, outreach, and analytics.
SKILL.md Source
# arc-shield
**Output sanitization for agent responses.** Scans ALL outbound messages for leaked secrets, tokens, keys, passwords, and PII before they leave the agent.
⚠️ **This is NOT an input scanner** — `clawdefender` already handles that. This is an **OUTPUT filter** for catching things your agent accidentally includes in its own responses.
## Why You Need This
Agents have access to sensitive data: 1Password vaults, environment variables, config files, wallet keys. Sometimes they accidentally include these in responses when:
- Debugging and showing full command output
- Copying file contents that contain secrets
- Generating code examples with real credentials
- Summarizing logs that include tokens
Arc-shield catches these leaks before they reach Discord, Signal, X, or any external channel.
## What It Detects
### 🔴 CRITICAL (blocks in `--strict` mode)
- **API Keys & Tokens**: 1Password (`ops_*`), GitHub (`ghp_*`), OpenAI (`sk-*`), Stripe, AWS, Bearer tokens
- **Passwords**: Assignments like `password=...` or `passwd: ...`
- **Private Keys**: Ethereum (0x + 64 hex), SSH keys, PGP blocks
- **Wallet Mnemonics**: 12/24 word recovery phrases
- **PII**: Social Security Numbers, credit card numbers
- **Platform Tokens**: Slack, Telegram, Discord
### 🟠 HIGH (warns loudly)
- **High-entropy strings**: Shannon entropy > 4.5 for strings > 16 chars (catches novel secret patterns)
- **Credit cards**: 16-digit card numbers
- **Base64 credentials**: Long base64 strings that look like tokens
### 🟡 WARN (informational)
- **Secret file paths**: `~/.secrets/*`, paths containing "password", "token", "key"
- **Environment variables**: `ENV_VAR=secret_value` exports
- **Database URLs**: Connection strings with credentials
## Installation
```bash
cd ~/.openclaw/workspace/skills
git clone <arc-shield-repo> arc-shield
chmod +x arc-shield/scripts/*.sh arc-shield/scripts/*.py
```
Or download as a skill bundle.
## Usage
### Command-line
```bash
# Scan agent output before sending
agent-response.txt | arc-shield.sh
# Block if critical secrets found (use before external messaging)
echo "Message text" | arc-shield.sh --strict || echo "BLOCKED"
# Redact secrets and return sanitized text
cat response.txt | arc-shield.sh --redact
# Full report
arc-shield.sh --report < conversation.log
# Python version with entropy detection
cat message.txt | output-guard.py --strict
```
### Integration with OpenClaw Agents
#### Pre-send hook (recommended)
Add to your messaging skill or wrapper:
```bash
#!/bin/bash
# send-message.sh wrapper
MESSAGE="$1"
CHANNEL="$2"
# Sanitize output
SANITIZED=$(echo "$MESSAGE" | arc-shield.sh --strict --redact)
EXIT_CODE=$?
if [[ $EXIT_CODE -eq 1 ]]; then
echo "ERROR: Message contains critical secrets and was blocked." >&2
exit 1
fi
# Send sanitized message
openclaw message send --channel "$CHANNEL" "$SANITIZED"
```
#### Manual pipe
Before any external message:
```bash
# Generate response
RESPONSE=$(agent-generate-response)
# Sanitize
CLEAN=$(echo "$RESPONSE" | arc-shield.sh --redact)
# Send
signal send "$CLEAN"
```
### Testing
```bash
cd skills/arc-shield/tests
./run-tests.sh
```
Includes test cases for:
- Real leaked patterns (1Password tokens, Instagram passwords, wallet mnemonics)
- False positive prevention (normal URLs, email addresses, file paths)
- Redaction accuracy
- Strict mode blocking
## Configuration
Patterns are defined in `config/patterns.conf`:
```conf
CRITICAL|GitHub PAT|ghp_[a-zA-Z0-9]{36,}
CRITICAL|OpenAI Key|sk-[a-zA-Z0-9]{20,}
WARN|Secret Path|~\/\.secrets\/[^\s]*
```
Edit to add custom patterns or adjust severity levels.
## Modes
| Mode | Behavior | Exit Code | Use Case |
|------|----------|-----------|----------|
| Default | Pass through + warnings to stderr | 0 | Development, logging |
| `--strict` | Block on CRITICAL findings | 1 if critical | Production outbound messages |
| `--redact` | Replace secrets with `[REDACTED:TYPE]` | 0 | Safe logging, auditing |
| `--report` | Analysis only, no pass-through | 0 | Auditing conversations |
## Entropy Detection
The Python version (`output-guard.py`) includes Shannon entropy analysis to catch secrets that don't match regex patterns:
```python
# Detects high-entropy strings like:
kJ8nM2pQ5rT9vWxY3zA6bC4dE7fG1hI0 # Novel API key format
Zm9vOmJhcg== # Base64 credentials
```
Threshold: **4.5 bits** (configurable with `--entropy-threshold`)
## Performance
- **Bash version**: ~10ms for typical message (< 1KB)
- **Python version**: ~50ms with entropy analysis
- **Zero external dependencies**: bash + Python stdlib only
Fast enough to run on every outbound message without noticeable delay.
## Real-World Catches
From our own agent sessions:
```bash
# 1Password token
"ops_eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9..."
# Instagram password in debug output
"instagram login: user@example.com / MyInsT@Gr4mP4ss!"
# Wallet mnemonic in file listing
"cat ~/.secrets/wallet-recovery-phrase.txt
abandon ability able about above absent absorb abstract..."
# GitHub PAT in git config
"[remote "origin"]
url = https://ghp_abc123:@github.com/user/repo"
```
All blocked by arc-shield before reaching external channels.
## Best Practices
1. **Always use `--strict` for external messages** (Discord, Signal, X, email)
2. **Use `--redact` for logs** you want to review later
3. **Run tests after adding custom patterns** to check for false positives
4. **Pipe through both** bash and Python versions for maximum coverage:
```bash
message | arc-shield.sh --strict | output-guard.py --strict
```
5. **Don't rely on this alone** — educate your agent to avoid including secrets in the first place (see AGENTS.md output sanitization directive)
## Limitations
- **Context-free**: Can't distinguish between "here's my password: X" (bad) and "set your password to X" (instruction)
- **No semantic understanding**: Won't catch "my token is in the previous message"
- **Pattern-based**: New secret formats require pattern updates
Use in combination with agent instructions and careful prompt engineering.
## Integration Example
Full OpenClaw agent integration:
```bash
# In your agent's message wrapper
send_external_message() {
local message="$1"
local channel="$2"
# Pre-flight sanitization
if ! echo "$message" | arc-shield.sh --strict > /dev/null 2>&1; then
echo "ERROR: Message blocked by arc-shield (contains secrets)" >&2
return 1
fi
# Double-check with entropy detection
if ! echo "$message" | output-guard.py --strict > /dev/null 2>&1; then
echo "ERROR: High-entropy secret detected" >&2
return 1
fi
# Safe to send
openclaw message send --channel "$channel" "$message"
}
```
## Troubleshooting
**False positives on normal text:**
- Adjust entropy threshold: `output-guard.py --entropy-threshold 5.0`
- Edit `config/patterns.conf` to refine regex patterns
- Add exceptions to the pattern file
**Secrets not detected:**
- Check pattern file for coverage
- Run with `--report` to see what's being scanned
- Test with `tests/run-tests.sh` using your sample
- Consider lowering entropy threshold (but watch for false positives)
**Performance issues:**
- Use bash version only (skip entropy detection)
- Limit input size with `head -c 10000`
- Run in background: `arc-shield.sh --report &`
## Contributing
Add new patterns to `config/patterns.conf` following the format:
```
SEVERITY|Category Name|regex_pattern
```
Test with `tests/run-tests.sh` before deploying.
## License
MIT — use freely, protect your secrets.
---
**Remember**: Arc-shield is your safety net, not your strategy. Train your agent to never include secrets in responses. This tool catches mistakes, not malice.Related Skills
agentshield
Trust Infrastructure for AI Agents - Like SSL/TLS for agent-to-agent communication. 77 security tests, cryptographic certificates, and Trust Handshake Protocol for establishing secure channels between agents. Explicit whitelist sanitization + dry-run mode for transparency.
ggshield-scanner
Detect 500+ types of hardcoded secrets (API keys, credentials, tokens) before they leak into git. Wraps GitGuardian's ggshield CLI.
shieldapi
ShieldAPI — x402 Security Intelligence for AI Agents. 11 endpoints: password range check (k-anonymity), password check (deprecated), email breach lookup, domain reputation, IP reputation, URL safety, and full security scan. Pay-per-request with USDC micropayments ($0.001-$0.01). No account, no API key. Demo mode available.
ClawSkillShield 🛡️
**Local-first security scanner for OpenClaw/ClawHub skills.**
---
name: article-factory-wechat
humanizer
Remove signs of AI-generated writing from text. Use when editing or reviewing text to make it sound more natural and human-written. Based on Wikipedia's comprehensive "Signs of AI writing" guide. Detects and fixes patterns including: inflated symbolism, promotional language, superficial -ing analyses, vague attributions, em dash overuse, rule of three, AI vocabulary words, negative parallelisms, and excessive conjunctive phrases.
find-skills
Helps users discover and install agent skills when they ask questions like "how do I do X", "find a skill for X", "is there a skill that can...", or express interest in extending capabilities. This skill should be used when the user is looking for functionality that might exist as an installable skill.
tavily-search
Use Tavily API for real-time web search and content extraction. Use when: user needs real-time web search results, research, or current information from the web. Requires Tavily API key.
baidu-search
Search the web using Baidu AI Search Engine (BDSE). Use for live information, documentation, or research topics.
agent-autonomy-kit
Stop waiting for prompts. Keep working.
Meeting Prep
Never walk into a meeting unprepared again. Your agent researches all attendees before calendar events—pulling LinkedIn profiles, recent company news, mutual connections, and conversation starters. Generates a briefing doc with talking points, icebreakers, and context so you show up informed and confident. Triggered automatically before meetings or on-demand. Configure research depth, advance timing, and output format. Walking into meetings blind is amateur hour—missed connections, generic small talk, zero leverage. Use when setting up meeting intelligence, researching specific attendees, generating pre-meeting briefs, or automating your prep workflow.
self-improvement
Captures learnings, errors, and corrections to enable continuous improvement. Use when: (1) A command or operation fails unexpectedly, (2) User corrects Claude ('No, that's wrong...', 'Actually...'), (3) User requests a capability that doesn't exist, (4) An external API or tool fails, (5) Claude realizes its knowledge is outdated or incorrect, (6) A better approach is discovered for a recurring task. Also review learnings before major tasks.