cairn-ai-pentest
AI-automated penetration testing and general problem-solving system that achieved unique AK (All Killed) in Tencent Cloud Hackathon intelligent penetration challenge
Best use case
cairn-ai-pentest is best used when you need a repeatable AI agent workflow instead of a one-off prompt.
Teams using cairn-ai-pentest should expect more consistent output, faster repeated execution, and less prompt rewriting.
When to use this skill
- You want a reusable workflow that can be run more than once with consistent structure.
When not to use this skill
- You only need a quick one-off answer and do not need a reusable workflow.
- You cannot install or maintain the underlying files, dependencies, or repository context.
Installation
Claude Code / Cursor / Codex
Manual Installation
- Download SKILL.md from GitHub
- Place it in .claude/skills/cairn-ai-pentest/SKILL.md inside your project
- Restart your AI agent; it will auto-discover the skill
How cairn-ai-pentest Compares
| Feature / Agent | cairn-ai-pentest | Standard Approach |
|---|---|---|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |
Frequently Asked Questions
What does this skill do?
It documents Cairn, an AI-automated penetration testing and general problem-solving system that achieved the unique AK (All Killed) result in the Tencent Cloud Hackathon intelligent penetration challenge.
Where can I find the source code?
You can find the source code on GitHub using the link provided at the top of the page.
SKILL.md Source
# Cairn AI Automated Penetration Testing System
> Skill by [ara.so](https://ara.so) — Daily 2026 Skills collection.
Cairn is an AI-driven automated penetration testing and general problem-solving framework developed by the Bytex@起零衍迹实验室 team. It achieved the unique "AK" (All Killed / full score) result in the 2nd TCH Tencent Cloud Hackathon Intelligent Penetration Challenge, placing 4th online. The system uses LLM-based agents to autonomously reason about, plan, and execute multi-step security testing tasks.
---
## What Cairn Does
- **Autonomous AI Agent Loop**: Iteratively reasons about a target, selects tools, executes commands, and interprets results
- **Penetration Testing Automation**: Web vulnerability discovery, exploitation, CTF-style challenge solving
- **General Problem Solving**: Extensible to non-security tasks via tool/plugin architecture
- **Multi-step Planning**: Breaks complex objectives into subtasks with memory and context management
- **Tool Integration**: Wraps common pentest tools (nmap, sqlmap, curl, custom scripts) as callable agent actions
---
## Project Status
> ⚠️ Code is still being organized and is expected to be open-sourced soon. The examples below reflect the architecture described in the competition writeup and visible repository structure.
Follow the writeup for architecture details: https://mp.weixin.qq.com/s/DlpEH7bVr0xi0VawPJs3XA
---
## Installation
```bash
# Clone the repository
git clone https://github.com/oritera/Cairn.git
cd Cairn
# Install Python dependencies (expected)
pip install -r requirements.txt
# Or with uv (modern Python tooling)
uv sync
```
### Environment Configuration
Create a `.env` file in the project root:
```env
# LLM Provider (OpenAI-compatible endpoint)
OPENAI_API_KEY=your_api_key_here
OPENAI_BASE_URL=https://api.openai.com/v1
MODEL_NAME=gpt-4o
# OR use a local/alternative provider
# OPENAI_BASE_URL=https://api.deepseek.com/v1
# MODEL_NAME=deepseek-chat
# Agent configuration
MAX_ITERATIONS=30
TIMEOUT_PER_STEP=60
# Target scope (safety guard)
TARGET_SCOPE=192.168.1.0/24
# Logging
LOG_LEVEL=INFO
LOG_FILE=./logs/cairn.log
```
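Cairn's actual scope enforcement has not been published, so the following is only a minimal sketch of how a `TARGET_SCOPE` guard could work, using Python's standard `ipaddress` module. The helper names `parse_scope` and `in_scope` are hypothetical, as is the assumption that the variable may hold several comma-separated CIDR blocks.

```python
import ipaddress
import os
from urllib.parse import urlparse


def parse_scope(raw: str) -> list:
    """Split a comma-separated TARGET_SCOPE value into network objects."""
    return [
        ipaddress.ip_network(part.strip(), strict=False)
        for part in raw.split(",")
        if part.strip()
    ]


def in_scope(target: str, scope: list) -> bool:
    """Return True if the target host is an IP literal inside a scoped network."""
    host = urlparse(target).hostname if "://" in target else target
    try:
        address = ipaddress.ip_address(host)
    except (ValueError, TypeError):
        # Hostnames would need DNS resolution first; this sketch rejects them.
        return False
    return any(address in network for network in scope)


# Hypothetical usage: read the scope from the environment, as in the .env file
scope = parse_scope(os.environ.get("TARGET_SCOPE", "192.168.1.0/24"))
print(in_scope("http://192.168.1.100", scope))
```

Rejecting anything that is not an IP literal is a deliberately conservative choice: a hostname can resolve to an out-of-scope address, so a real guard would need to resolve it before checking.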
---
## Core Architecture
Cairn follows a **ReAct (Reasoning + Acting)** agent pattern:
```
User Goal
    │
    ▼
┌─────────────────────────────┐
│         Agent Loop          │
│  ┌────────────────────────┐ │
│  │  Think (LLM Reasoning) │ │
│  └──────────┬─────────────┘ │
│             │               │
│  ┌──────────▼─────────────┐ │
│  │  Act (Tool Selection)  │ │
│  └──────────┬─────────────┘ │
│             │               │
│  ┌──────────▼─────────────┐ │
│  │  Observe (Parse Result)│ │
│  └──────────┬─────────────┘ │
│             │               │
│      (loop until done)      │
└─────────────────────────────┘
    │
    ▼
Final Answer / Exploit / Report
```
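The Think / Act / Observe cycle in the diagram can be illustrated with a few lines of Python. This is a generic ReAct loop under stated assumptions, not Cairn's implementation: `react_loop`, the JSON step format, and the scripted stand-in for the LLM are all hypothetical and exist only so the sketch runs without a model endpoint.

```python
import json


def react_loop(goal, llm, tools, max_iterations=30):
    """Run Think -> Act -> Observe until the model returns a final answer."""
    history = [f"Goal: {goal}"]
    for _ in range(max_iterations):
        step = json.loads(llm(history))                 # Think: ask the model
        history.append(f"Thought: {step['thought']}")
        if "final_answer" in step:
            return step["final_answer"]
        tool = tools[step["action"]]                    # Act: pick the tool
        observation = tool(**step["action_input"])
        history.append(f"Observation: {observation}")   # Observe: record result
    return "Stopped: iteration limit reached"


def scripted_llm(history):
    """Stand-in for a real LLM call: probe once, then finish."""
    if not any(line.startswith("Observation:") for line in history):
        return json.dumps({
            "thought": "Probe the target first.",
            "action": "echo",
            "action_input": {"text": "port 80 open"},
        })
    return json.dumps({
        "thought": "Enough information.",
        "final_answer": "port 80 is open",
    })


print(react_loop("Enumerate services", scripted_llm, {"echo": lambda text: text}))
```

The `max_iterations` bound plays the same role as the `MAX_ITERATIONS` setting above: it stops the loop if the model never produces a final answer.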
---
## Key Usage Patterns
### 1. Basic Agent Invocation (Expected CLI)
```bash
# Run against a CTF challenge or target
python cairn.py --target "http://192.168.1.100" --goal "Find and exploit SQL injection to retrieve admin credentials"
# With custom model
python cairn.py --target "http://challenge.example.com" \
--goal "Solve this web CTF challenge and get the flag" \
--model gpt-4o \
--max-iterations 25
# Dry run (plan only, no execution)
python cairn.py --target "http://192.168.1.100" \
--goal "Enumerate all open services" \
--dry-run
```
### 2. Python API Usage (Expected)
```python
import os

from cairn import CairnAgent
from cairn.tools import ToolRegistry
from cairn.config import CairnConfig

# Initialize configuration
config = CairnConfig(
    model_name="gpt-4o",
    api_key=os.environ["OPENAI_API_KEY"],
    base_url=os.environ.get("OPENAI_BASE_URL", "https://api.openai.com/v1"),
    max_iterations=30,
    target_scope=["192.168.1.0/24"],
)

# Build tool registry
tools = ToolRegistry()
tools.register_defaults()  # nmap, curl, sqlmap, ffuf, etc.

# Create and run agent
agent = CairnAgent(config=config, tools=tools)
result = agent.run(
    target="http://192.168.1.100",
    goal="Find all web vulnerabilities and attempt exploitation",
)
print(result.summary)
print(result.findings)
```
### 3. Custom Tool Registration
```python
import subprocess

from cairn.tools import Tool, ToolResult


class CustomExploitTool(Tool):
    name = "custom_exploit"
    description = "Exploits a specific vulnerability in target application"

    def execute(self, target: str, payload: str, **kwargs) -> ToolResult:
        # Pass arguments as a list (no shell) so the payload cannot inject shell commands
        cmd = ["python", "exploit.py", "--target", target, "--payload", payload]
        output = subprocess.run(cmd, capture_output=True, text=True)
        return ToolResult(
            success=output.returncode == 0,
            output=output.stdout,
            error=output.stderr,
        )


# Register with agent (reuses `config`, `tools`, and `CairnAgent` from the previous example)
tools.register(CustomExploitTool())
agent = CairnAgent(config=config, tools=tools)
```
### 4. Multi-Phase Penetration Test
```python
from cairn import CairnAgent, Phase
from cairn.pipeline import PentestPipeline

# `agent` is the CairnAgent instance created in the earlier examples
pipeline = PentestPipeline(agent=agent)

# Define phases
pipeline.add_phase(Phase(
    name="reconnaissance",
    goal="Enumerate all open ports and services on {target}",
))
pipeline.add_phase(Phase(
    name="vulnerability_scan",
    goal="Based on discovered services, identify exploitable vulnerabilities",
    depends_on=["reconnaissance"],
))
pipeline.add_phase(Phase(
    name="exploitation",
    goal="Exploit identified vulnerabilities and achieve {objective}",
    depends_on=["vulnerability_scan"],
))

# Run full pipeline
report = pipeline.run(
    target="192.168.1.100",
    objective="obtain root shell or read /flag",
)
report.save("./reports/pentest_report.json")
```
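How `depends_on` is resolved inside Cairn is not documented. One plausible approach, sketched here with the standard-library `graphlib` module (Python 3.9+), is a topological sort over the phase dependencies. The `execution_order` helper and the plain-dict representation are assumptions made for illustration.

```python
from graphlib import TopologicalSorter

# Each phase maps to the phases it depends on, mirroring `depends_on` above
phases = {
    "reconnaissance": [],
    "vulnerability_scan": ["reconnaissance"],
    "exploitation": ["vulnerability_scan"],
}


def execution_order(dependencies):
    """Return phase names ordered so every phase follows its dependencies."""
    return list(TopologicalSorter(dependencies).static_order())


print(execution_order(phases))
# → ['reconnaissance', 'vulnerability_scan', 'exploitation']
```

A circular dependency makes `static_order()` raise `graphlib.CycleError`, which gives a pipeline an early, explicit failure instead of a hang.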
---
## Tool Integration Examples
### Built-in Tool Wrappers (Expected)
```python
# nmap integration
from cairn.tools.network import NmapTool
nmap = NmapTool()
result = nmap.execute(target="192.168.1.100", flags="-sV -sC -p-")
# Returns structured service enumeration data
# HTTP request tool
from cairn.tools.web import HTTPTool
http = HTTPTool()
result = http.execute(
    url="http://target.com/login",
    method="POST",
    data={"username": "admin' OR '1'='1", "password": "x"},
    follow_redirects=True,
)
# Command execution tool (sandboxed)
from cairn.tools.shell import ShellTool
shell = ShellTool(allowed_commands=["curl", "nmap", "sqlmap", "ffuf"])
result = shell.execute(command="sqlmap -u 'http://target.com/?id=1' --dbs --batch")
```
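The `allowed_commands` argument suggests an allowlist check before execution. The sketch below shows one way such a check could be written with the standard `shlex` module; `parse_allowed` is a hypothetical helper, not part of Cairn's API.

```python
import shlex

ALLOWED_COMMANDS = {"curl", "nmap", "sqlmap", "ffuf"}


def parse_allowed(command, allowed=ALLOWED_COMMANDS):
    """Split a command line and return argv only if the program is allowlisted."""
    argv = shlex.split(command)
    if not argv or argv[0] not in allowed:
        raise PermissionError(f"command not allowed: {command!r}")
    return argv


argv = parse_allowed("sqlmap -u 'http://target.com/?id=1' --dbs --batch")
print(argv)
```

The returned list is meant to be passed to `subprocess.run(argv)` without `shell=True`, so characters such as `;` or `&&` reach the program as literal arguments rather than starting a second command.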
---
## Agent Memory and Context
```python
from cairn.memory import AgentMemory

# Memory persists findings across agent steps
memory = AgentMemory(
    short_term_limit=20,     # Recent observations in context
    long_term_enabled=True,  # Summarize older context
    facts_store=True,        # Extract and index key facts
)
agent = CairnAgent(config=config, tools=tools, memory=memory)

# Access collected facts after run
for finding in agent.memory.findings:
    print(f"[{finding.severity}] {finding.description}")
    print(f" Evidence: {finding.evidence}")
    print(f" Recommendation: {finding.remediation}")
```
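To make the effect of `short_term_limit` concrete, here is a minimal sketch of a bounded observation buffer built on `collections.deque`. The `ShortTermMemory` class is hypothetical, and it only counts evicted observations; it does not summarize them, which is what the `long_term_enabled` option above describes.

```python
from collections import deque


class ShortTermMemory:
    """Keep only the most recent observations for the prompt context."""

    def __init__(self, limit=20):
        self.recent = deque(maxlen=limit)
        self.evicted_count = 0

    def add(self, observation):
        # A full deque drops its oldest entry on append; count that eviction
        if len(self.recent) == self.recent.maxlen:
            self.evicted_count += 1
        self.recent.append(observation)

    def context(self):
        """Render the retained observations, noting how many were dropped."""
        header = []
        if self.evicted_count:
            header.append(f"[{self.evicted_count} older observations omitted]")
        return "\n".join(header + list(self.recent))


memory_sketch = ShortTermMemory(limit=2)
for note in ["port 22 open", "port 80 open", "login form found"]:
    memory_sketch.add(note)
print(memory_sketch.context())
```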
---
## Configuration Reference
```python
# cairn/config.py (expected structure)
import os
from dataclasses import dataclass, field


@dataclass
class CairnConfig:
    # LLM settings
    model_name: str = "gpt-4o"
    api_key: str = field(default_factory=lambda: os.environ["OPENAI_API_KEY"])
    base_url: str = "https://api.openai.com/v1"
    temperature: float = 0.1            # Low temp for consistent tool use
    max_tokens: int = 4096

    # Agent behavior
    max_iterations: int = 30            # Hard stop on runaway loops
    timeout_per_step: int = 60          # Seconds per tool execution
    verbose: bool = False

    # Safety
    target_scope: list[str] = field(default_factory=list)
    dry_run: bool = False               # Plan without executing
    require_confirmation: bool = False  # Interactive approval per step

    # Output
    report_format: str = "json"         # json | markdown | html
    report_path: str = "./reports"
```
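A setting like `timeout_per_step` could be enforced with the `timeout` parameter of `subprocess.run`. The sketch below assumes tools run as child processes and returns a plain dict as the result. `run_with_timeout` and that result shape are illustrative choices, not Cairn's API.

```python
import subprocess
import sys


def run_with_timeout(argv, timeout_per_step=60):
    """Run a command and report a failure if it exceeds the time budget."""
    try:
        completed = subprocess.run(
            argv, capture_output=True, text=True, timeout=timeout_per_step
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising TimeoutExpired
        return {
            "success": False,
            "output": "",
            "error": f"timed out after {timeout_per_step}s",
        }
    return {
        "success": completed.returncode == 0,
        "output": completed.stdout,
        "error": completed.stderr,
    }


# Uses the current Python interpreter as a harmless stand-in for a scan tool
print(run_with_timeout([sys.executable, "-c", "print('ok')"], timeout_per_step=10))
```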
---
## Prompt Engineering Patterns
Cairn uses structured system prompts for reliable tool invocation:
```python
# Example system prompt structure (inferred from competition writeup)
SYSTEM_PROMPT = """You are an expert penetration tester AI agent.
## Objective
{goal}
## Target
{target}
## Available Tools
{tool_descriptions}
## Rules
1. Always reason step-by-step before acting
2. Stay within scope: {scope}
3. Prefer non-destructive enumeration before exploitation
4. Document every finding with evidence
## Response Format
Thought: <your reasoning>
Action: <tool_name>
Action Input: <tool parameters as JSON>
After receiving Observation, continue until you reach a Final Answer.
"""
```
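A reply in the Thought / Action / Action Input format has to be parsed before a tool can be called. The following is a sketch of such a parser using `re` and `json`. `STEP_PATTERN` and `parse_step` are hypothetical names, and the sketch assumes the model emits exactly one action per reply.

```python
import json
import re

STEP_PATTERN = re.compile(
    r"Thought:\s*(?P<thought>.*?)\s*"
    r"Action:\s*(?P<action>\S+)\s*"
    r"Action Input:\s*(?P<action_input>\{.*\})",
    re.DOTALL,
)


def parse_step(reply):
    """Extract the thought, tool name, and JSON parameters from a model reply."""
    match = STEP_PATTERN.search(reply)
    if match is None:
        raise ValueError("reply does not follow the Thought/Action/Action Input format")
    return {
        "thought": match.group("thought"),
        "action": match.group("action"),
        "action_input": json.loads(match.group("action_input")),
    }


reply = (
    "Thought: The login form may be injectable.\n"
    "Action: http_request\n"
    'Action Input: {"url": "http://192.168.1.100/login", "method": "POST"}'
)
print(parse_step(reply))
```

Raising on a malformed reply lets the agent loop feed the error back to the model as an observation instead of calling a tool with partial parameters.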
---
## CTF / Challenge Mode
```bash
# Optimized for CTF flag capture
python cairn.py \
--mode ctf \
--target "http://ctf-challenge.com:8080" \
--goal "Find the hidden flag in format FLAG{...}" \
--model gpt-4o \
--max-iterations 50 \
--verbose
# With flag pattern matching
python cairn.py \
--mode ctf \
--target "http://target.com" \
--flag-pattern "CTF\{[a-zA-Z0-9_]+\}" \
--auto-submit
```
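The `--flag-pattern` value is an ordinary regular expression. A short sketch of how tool output could be scanned with it is shown below; `extract_flags` is a hypothetical helper and the sample response text is invented for the example.

```python
import re

# Same pattern as the --flag-pattern example above
FLAG_PATTERN = re.compile(r"CTF\{[a-zA-Z0-9_]+\}")


def extract_flags(text):
    """Return unique flags in order of first appearance."""
    return list(dict.fromkeys(FLAG_PATTERN.findall(text)))


sample = "<p>CTF{sql_1nj3ct10n}</p> retry: CTF{sql_1nj3ct10n} and CTF{second_flag}"
print(extract_flags(sample))
# → ['CTF{sql_1nj3ct10n}', 'CTF{second_flag}']
```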
---
## Logging and Debugging
```python
import logging
from cairn import CairnAgent
# Enable detailed agent trace logging
logging.basicConfig(level=logging.DEBUG)
agent = CairnAgent(config=config, tools=tools, verbose=True)
# Each step is logged:
# [THINK] Analyzing login form for injection points...
# [ACT] Calling tool: http_request
# [INPUT] {"url": "...", "method": "POST", "data": {...}}
# [OBS] Response 200, contains "Invalid credentials"
# [THINK] Response suggests valid injection point, trying UNION...
```
---
## Troubleshooting
| Issue | Cause | Fix |
|-------|-------|-----|
| Agent loops without progress | Goal too vague or tools failing silently | Add `--max-iterations 15`, use `--verbose` to inspect loop |
| Tool execution timeout | Slow network or heavy scan | Increase `TIMEOUT_PER_STEP` in config |
| LLM refuses tool call | Safety filter on model provider | Use a less restrictive model endpoint or rephrase goal |
| Out of context window | Long agent history | Reduce `short_term_limit` or enable memory summarization |
| Scope violation error | Target not in allowed scope | Add target CIDR to `TARGET_SCOPE` in `.env` |
| Empty findings report | Agent completed but found nothing | Check target accessibility, increase iterations |
---
## Responsible Use
Cairn is licensed under **AGPL-3.0**. Usage must comply with:
- ✅ Authorized penetration tests with written permission
- ✅ CTF competitions and intentionally vulnerable lab environments
- ✅ Personal security research on systems you own
- ❌ Unauthorized access to systems you don't own
- ❌ Commercial use without a separate commercial license
Contact the maintainer at the repository for commercial licensing inquiries.
---
## Resources
- **Repository**: https://github.com/oritera/Cairn
- **Competition Writeup**: https://mp.weixin.qq.com/s/DlpEH7bVr0xi0VawPJs3XA
- **License**: AGPL-3.0
- **Team**: Bytex @ 起零衍迹实验室