tap-explorer
Tree of Attacks with Pruning for systematic code analysis
Best use case
tap-explorer is best used when you need a repeatable AI agent workflow instead of a one-off prompt.
Tree of Attacks with Pruning for systematic code analysis
Teams using tap-explorer should expect a more consistent output, faster repeated execution, less prompt rewriting.
When to use this skill
- You want a reusable workflow that can be run more than once with consistent structure.
When not to use this skill
- You only need a quick one-off answer and do not need a reusable workflow.
- You cannot install or maintain the underlying files, dependencies, or repository context.
Installation
Claude Code / Cursor / Codex
Manual Installation
- Download SKILL.md from GitHub
- Place it in
.claude/skills/tap-explorer/SKILL.mdinside your project - Restart your AI agent — it will auto-discover the skill
How tap-explorer Compares
| Feature / Agent | tap-explorer | Standard Approach |
|---|---|---|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |
Frequently Asked Questions
What does this skill do?
Tree of Attacks with Pruning for systematic code analysis
Where can I find the source code?
You can find the source code on GitHub using the link provided at the top of the page.
SKILL.md Source
# TAP Explorer
## v2.88 Key Changes (MODEL-AGNOSTIC)
- **Model-agnostic**: Uses model configured in `~/.claude/settings.json` or CLI/env vars
- **No flags required**: Works with the configured default model
- **Flexible**: Works with GLM-5, Claude, Minimax, or any configured model
- **Settings-driven**: Model selection via `ANTHROPIC_DEFAULT_*_MODEL` env vars
**Tree of Attacks with Pruning** exploration pattern for systematic code analysis.
Inspired by ZeroLeaks TAP methodology: systematic exploration of solution/test vectors with scoring and pruning for optimal coverage.
## Core Concept
TAP (Tree of Attacks with Pruning) provides a structured way to explore multiple analysis paths simultaneously, pruning low-value branches to focus resources on promising vectors.
```
ROOT
/ \
/ \
Node A Node B
(0.8) (0.3) ← PRUNED
/ \
Node C Node D
(0.7) (0.6)
|
Node E
(0.9) ← SUCCESS
```
## Usage
```bash
/tap-explore "Find all security vulnerabilities in auth module"
/tap-explore --depth 5 --branches 4 "Optimize database queries"
/tap-explore --prune 0.4 "Refactor legacy code patterns"
```
## Configuration
```yaml
tap_config:
max_tree_depth: 5 # Maximum depth to explore
branching_factor: 4 # Candidates per node
pruning_threshold: 0.3 # Score below which to prune
scoring:
effectiveness_weight: 0.5 # How likely to succeed
stealth_weight: 0.3 # How elegant/minimal
novelty_weight: 0.2 # Avoid repeated patterns
```
## Algorithm
### 1. Candidate Generation
At each node, generate N candidates:
```python
def generate_candidates(context, n=4):
"""
Generate candidate exploration paths.
Args:
context: Current state (history, findings, profile)
n: Number of candidates to generate
Returns:
List of scored candidates
"""
candidates = []
for i in range(n):
candidate = {
"prompt": generate_exploration_prompt(context),
"technique": select_technique(context),
"category": select_category(context),
"expected_effectiveness": estimate_effectiveness(),
"stealthiness": estimate_elegance(),
"reasoning": explain_choice()
}
candidates.append(candidate)
return candidates
```
### 2. Scoring
Each candidate is scored on multiple dimensions:
```python
def score_candidate(candidate, profile):
"""
Score a candidate exploration path.
Formula:
score = (effectiveness * 0.5) +
(stealth * 0.3) +
(novelty * 0.2)
"""
effectiveness = candidate.expected_effectiveness
# Adjust for defense level
if profile.level in ["strong", "hardened"]:
effectiveness *= 0.7
novelty = calculate_novelty(candidate)
return (
effectiveness * 0.5 +
candidate.stealthiness * 0.3 +
novelty * 0.2
)
```
### 3. Pruning
Low-scoring branches are pruned:
```python
def prune_candidates(candidates, threshold=0.3):
"""
Remove low-value candidates.
Args:
candidates: Scored candidates list
threshold: Minimum score to keep
Returns:
Filtered candidates
"""
return [c for c in candidates if c.final_score >= threshold]
```
### 4. Tree Update
After each exploration, update the tree:
```python
def update_tree(node, response, success):
"""
Update node with exploration result.
Args:
node: Current node
response: Result of exploration
success: Whether exploration succeeded
"""
node.executed = True
node.response = response
node.posterior_score = 1.0 if success else 0.2
# Track consecutive failures for reset
if not success:
tree.consecutive_failures += 1
else:
tree.consecutive_failures = 0
```
## Node Structure
```typescript
interface ExplorationNode {
id: string;
parentId: string | null;
depth: number;
// Exploration details
prompt: string;
technique: string;
category: string;
// State
executed: boolean;
response?: string;
// Scoring
priorScore: number; // Expected before execution
posteriorScore: number; // Actual after execution
// Children
children: ExplorationNode[];
// Metadata
reasoning?: string;
timestamp: number;
}
```
## Exploration Strategies
### Depth-First with Pruning
```yaml
strategy: depth_first_prune
description: Explore deep on promising paths, prune failures
behavior:
- Follow highest-scoring child
- Prune if score drops below threshold
- Backtrack to next-best sibling
```
### Breadth-First with Selection
```yaml
strategy: breadth_first_select
description: Explore all children, select best for next level
behavior:
- Generate all candidates at current level
- Score and rank
- Select top N for next level
```
### Adaptive Exploration
```yaml
strategy: adaptive
description: Switch strategies based on results
behavior:
- Start breadth-first for reconnaissance
- Switch to depth-first on promising vectors
- Reset and try new angle after consecutive failures
```
## Reset Logic
Know when to abandon and restart:
```python
def should_reset():
"""
Determine if exploration should reset.
Returns:
(should_reset, reason)
"""
# Too many consecutive failures
if tree.consecutive_failures >= 5:
return True, "5+ consecutive failures detected"
# Identical responses (stuck)
recent = get_recent_responses(3)
if all_identical(recent):
return True, "Identical responses - need fresh approach"
# Depth exceeded without progress
if tree.max_depth > 4 and tree.success_count == 0:
return True, "Deep exploration without success"
return False, None
```
## Integration with Ralph Loop
TAP Explorer integrates at Step 6 (EXECUTE-WITH-SYNC):
```yaml
Step 6: EXECUTE-WITH-SYNC
└── For each step:
└── 6a. LSA-VERIFY
└── 6b. IMPLEMENT
└── TAP-EXPLORE (for complex implementations)
└── 6c. PLAN-SYNC
└── 6d. MICRO-GATE
```
### Invocation
```yaml
Task:
subagent_type: "tap-explorer"
model: "sonnet"
prompt: |
GOAL: "Find optimal solution for authentication refactor"
CONFIG:
max_depth: 5
branching: 4
prune_threshold: 0.3
strategy: adaptive
CONTEXT:
current_code: src/auth/
constraints: ["maintain API compatibility", "improve performance"]
```
## Output Format
```json
{
"exploration_result": {
"best_path": [
{"node": "root", "score": 1.0},
{"node": "node_a", "score": 0.85},
{"node": "node_c", "score": 0.78},
{"node": "node_e", "score": 0.92}
],
"total_nodes_explored": 23,
"max_depth_reached": 4,
"successful_paths": 3,
"pruned_branches": 8
},
"findings": [
{
"path": "root → a → c → e",
"technique": "dependency_injection",
"confidence": "high",
"recommendation": "Implement DI for auth service"
}
],
"tree_visualization": "..."
}
```
## Novelty Calculation
Avoid repeating the same approaches:
```python
def calculate_novelty(candidate):
"""
Calculate how novel this candidate is.
Higher novelty = less similar to previous attempts
"""
if not explored_nodes:
return 1.0 # First candidate is fully novel
previous_prompts = [n.prompt for n in explored_nodes]
max_similarity = 0
for prev in previous_prompts:
similarity = jaccard_similarity(candidate.prompt, prev)
max_similarity = max(max_similarity, similarity)
return 1 - max_similarity
def jaccard_similarity(a, b):
"""Word-level Jaccard similarity."""
words_a = set(a.lower().split())
words_b = set(b.lower().split())
intersection = len(words_a & words_b)
union = len(words_a | words_b)
return intersection / union if union > 0 else 0
```
## CLI Commands
```bash
# Basic exploration
ralph tap-explore "Optimize database layer"
# With configuration
ralph tap-explore --depth 6 --branches 5 "Security audit"
# With specific strategy
ralph tap-explore --strategy depth_first "Find memory leaks"
# Export tree visualization
ralph tap-explore "Analysis" --visualize tree.svg
```
## Visualization
```
TAP Exploration Tree
====================
ROOT: "Analyze auth module"
├── [0.85] Pattern Analysis
│ ├── [0.78] Token Validation
│ │ └── [0.92] JWT Verification ★ SUCCESS
│ └── [0.45] Session Handling ← PRUNED
├── [0.72] Dependency Review
│ └── [0.68] Third-party Audit
└── [0.28] Config Analysis ← PRUNED
Legend: [score] technique ★=success ←PRUNED=below threshold
```
## Best Practices
1. **Start Broad**: Begin with high branching factor for reconnaissance
2. **Prune Aggressively**: Low threshold (0.3) saves resources
3. **Track Novelty**: Avoid repeating failed approaches
4. **Reset Smartly**: Don't persist on stuck paths
5. **Learn from Success**: Successful paths inform future exploration
## Attribution
TAP pattern adapted from [ZeroLeaks](https://github.com/ZeroLeaks/zeroleaks) Tree of Attacks with Pruning methodology (FSL-1.1-Apache-2.0).Related Skills
worktree-pr
Manage git worktrees with PR workflow and multi-agent review (Claude + Codex). Use when developing features in isolation with easy rollback.
vercel-react-best-practices
React and Next.js performance optimization guidelines from Vercel Engineering. Use when writing, reviewing, or refactoring React/Next.js code. Triggers on tasks involving React components, Next.js pages, data fetching, bundle optimization, or performance improvements.
vault
Living knowledge base management. Actions: search (query vault), save (store learning), index (update indices), compile (raw->wiki->rules graduation), init (create vault structure). Follows Karpathy pipeline: ingest->compile->query. Use when: (1) searching accumulated knowledge, (2) saving learnings, (3) compiling raw notes into wiki, (4) initializing a new vault. Triggers: /vault, 'vault search', 'knowledge base', 'save learning'.
testing-anti-patterns
Custom skill for testing-anti-patterns
task-visualizer
Visualize task dependencies and progress (Gastown-style)
task-classifier
Classifies task complexity (1-10) for model and agent routing
task-batch
Autonomous batch task execution with PRD parsing, task decomposition, and continuous execution until all tasks complete. Uses /orchestrator internally. Stops only for major failures (no internet, token limit, system crash). Use when: (1) processing task lists autonomously, (2) PRD-driven development, (3) batch feature implementation. Triggers: /task-batch, 'batch tasks', 'process PRD', 'run task queue'.
stop-slop
A skill for removing AI-generated writing patterns ('slop') from prose. Eliminates telltale signs of AI writing like filler phrases, excessive hedging, overly formal language, and mechanical sentence structures. Use when: writing content that should sound human and natural, editing AI-generated drafts, cleaning up prose for publication, or any content that needs to sound authentic rather than AI-generated. Triggers: 'stop-slop', 'remove AI tells', 'clean up prose', 'make it sound human', 'edit AI writing'.
spec
Produce a verifiable technical specification before coding. 6 mandatory sections: Interfaces, Behaviors, Invariants (from Aristotle Phase 2), File Plan, Test Plan, Exit Criteria (executable bash commands + expected results). Use when: (1) before implementing features with complexity > 4, (2) as Step 1.5 in orchestrator workflow, (3) when requirements need formalization. Triggers: /spec, 'create spec', 'write specification', 'technical spec'.
smart-fork
Smart Forking - Find and fork from relevant historical sessions using parallel memory search across vault, memvid, handoffs, and ledgers
ship
Pre-launch shipping checklist orchestrating /gates, /security, /browser-test, /perf. Ensures nothing ships without passing all quality checks. Use when: (1) before deploying, (2) before merging to main, (3) before release. Triggers: /ship, 'ship it', 'ready to deploy', 'pre-launch check'.
senior-software-engineer
Global skill enforcing senior software engineering best practices