tap-explorer

Tree of Attacks with Pruning for systematic code analysis

Best use case

tap-explorer is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Teams using tap-explorer should expect more consistent output, faster repeated execution, and less prompt rewriting.

When to use this skill

  • You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

  • You only need a quick one-off answer and do not need a reusable workflow.
  • You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$ curl -o ~/.claude/skills/tap-explorer/SKILL.md --create-dirs "https://raw.githubusercontent.com/alfredolopez80/multi-agent-ralph-loop/main/.claude/skills/tap-explorer/skill.md"

Manual Installation

  1. Download SKILL.md from GitHub
  2. Place it in .claude/skills/tap-explorer/SKILL.md inside your project
  3. Restart your AI agent — it will auto-discover the skill

How tap-explorer Compares

| Feature / Agent | tap-explorer | Standard Approach |
| --- | --- | --- |
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |

Frequently Asked Questions

What does this skill do?

Tree of Attacks with Pruning for systematic code analysis

SKILL.md Source

# TAP Explorer

## v2.88 Key Changes (MODEL-AGNOSTIC)

- **Model-agnostic**: Uses model configured in `~/.claude/settings.json` or CLI/env vars
- **No flags required**: Works with the configured default model
- **Flexible**: Works with GLM-5, Claude, Minimax, or any configured model
- **Settings-driven**: Model selection via `ANTHROPIC_DEFAULT_*_MODEL` env vars

**Tree of Attacks with Pruning** exploration pattern for systematic code analysis.

Inspired by ZeroLeaks TAP methodology: systematic exploration of solution/test vectors with scoring and pruning for optimal coverage.

## Core Concept

TAP (Tree of Attacks with Pruning) provides a structured way to explore multiple analysis paths simultaneously, pruning low-value branches to focus resources on promising vectors.

```
                    ROOT
                   /    \
                  /      \
            Node A        Node B
            (0.8)          (0.2) ← PRUNED
          /      \
     Node C      Node D
     (0.7)       (0.6)
       |
    Node E
    (0.9) ← SUCCESS
```

## Usage

```bash
/tap-explore "Find all security vulnerabilities in auth module"
/tap-explore --depth 5 --branches 4 "Optimize database queries"
/tap-explore --prune 0.4 "Refactor legacy code patterns"
```

## Configuration

```yaml
tap_config:
  max_tree_depth: 5       # Maximum depth to explore
  branching_factor: 4     # Candidates per node
  pruning_threshold: 0.3  # Score below which to prune

  scoring:
    effectiveness_weight: 0.5  # How likely to succeed
    stealth_weight: 0.3        # How elegant/minimal
    novelty_weight: 0.2        # Avoid repeated patterns
```
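The three scoring weights above sum to 1.0, which keeps the combined score in [0, 1] as long as each dimension is in [0, 1]. A minimal sketch, assuming a hypothetical `TapConfig` dataclass (not part of the skill) that mirrors the YAML keys and checks that invariant. The `max_nodes` helper is also an illustrative assumption; it shows the worst-case tree size when nothing is pruned.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class TapConfig:
    """Hypothetical mirror of the tap_config YAML (illustrative only)."""
    max_tree_depth: int = 5
    branching_factor: int = 4
    pruning_threshold: float = 0.3
    effectiveness_weight: float = 0.5
    stealth_weight: float = 0.3
    novelty_weight: float = 0.2

    def __post_init__(self):
        total = self.effectiveness_weight + self.stealth_weight + self.novelty_weight
        # Weights that sum to 1.0 keep the combined score within [0, 1].
        if not math.isclose(total, 1.0):
            raise ValueError(f"scoring weights sum to {total}, expected 1.0")
        if not 0.0 <= self.pruning_threshold <= 1.0:
            raise ValueError("pruning_threshold must be between 0 and 1")

    def max_nodes(self) -> int:
        """Upper bound on tree size with no pruning (root counted at depth 0)."""
        return sum(self.branching_factor ** d for d in range(self.max_tree_depth + 1))


config = TapConfig()
print(config.max_nodes())  # 1 + 4 + 16 + 64 + 256 + 1024
```

With the defaults, an unpruned tree can reach 1365 nodes, which is why the pruning threshold matters for cost.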

## Algorithm

### 1. Candidate Generation

At each node, generate N candidates:

```python
def generate_candidates(context, n=4):
    """
    Generate candidate exploration paths.

    Args:
        context: Current state (history, findings, profile)
        n: Number of candidates to generate

    Returns:
        List of candidate dicts, not yet scored
    """
    candidates = []

    for _ in range(n):
        candidate = {
            "prompt": generate_exploration_prompt(context),
            "technique": select_technique(context),
            "category": select_category(context),
            "expected_effectiveness": estimate_effectiveness(),
            "stealthiness": estimate_elegance(),
            "reasoning": explain_choice()
        }
        candidates.append(candidate)

    return candidates
```

### 2. Scoring

Each candidate is scored on multiple dimensions:

```python
def score_candidate(candidate, profile):
    """
    Score a candidate exploration path.

    Formula:
    score = (effectiveness * 0.5) +
            (stealth * 0.3) +
            (novelty * 0.2)
    """
    # Candidates are dicts (see generate_candidates), so use key access
    effectiveness = candidate["expected_effectiveness"]

    # Discount expected effectiveness against strong or hardened targets
    if profile.level in ["strong", "hardened"]:
        effectiveness *= 0.7

    novelty = calculate_novelty(candidate)

    return (
        effectiveness * 0.5 +
        candidate["stealthiness"] * 0.3 +
        novelty * 0.2
    )
```
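To make the formula concrete, here is a worked example with explicit inputs. `weighted_score` is a hypothetical standalone helper, not the skill's own function; it takes the three dimensions directly and stands in for the profile check with a boolean flag.

```python
def weighted_score(effectiveness, stealth, novelty, hardened=False):
    """Combine the three dimensions with the 0.5 / 0.3 / 0.2 weights."""
    if hardened:
        # Same 0.7 discount the scoring step applies to strong/hardened profiles
        effectiveness *= 0.7
    return effectiveness * 0.5 + stealth * 0.3 + novelty * 0.2


# effectiveness=0.8, stealth=0.6, novelty=1.0
baseline = weighted_score(0.8, 0.6, 1.0)                 # 0.40 + 0.18 + 0.20
hardened = weighted_score(0.8, 0.6, 1.0, hardened=True)  # 0.28 + 0.18 + 0.20
print(round(baseline, 2), round(hardened, 2))
```

The same candidate scores 0.78 against a normal profile and 0.66 against a hardened one, because only the effectiveness term is discounted.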

### 3. Pruning

Low-scoring branches are pruned:

```python
def prune_candidates(candidates, threshold=0.3):
    """
    Remove low-value candidates.

    Args:
        candidates: Scored candidates list
        threshold: Minimum score to keep

    Returns:
        Filtered candidates
    """
    # Candidates are dicts; "final_score" is assumed to hold the score_candidate() result
    return [c for c in candidates if c["final_score"] >= threshold]
```
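A short usage sketch of the threshold behavior, assuming candidates are dicts that carry a `final_score` key. The sample techniques and scores are made up for illustration. A score exactly at the threshold is kept, and raising the threshold prunes more.

```python
def prune(scored, threshold=0.3):
    """Keep candidates whose final_score is at or above the threshold."""
    return [c for c in scored if c["final_score"] >= threshold]


# Hypothetical scored candidates
scored = [
    {"technique": "pattern_analysis", "final_score": 0.85},
    {"technique": "dependency_review", "final_score": 0.30},  # exactly at threshold
    {"technique": "config_analysis", "final_score": 0.28},
]

kept = prune(scored)                     # default threshold 0.3
stricter = prune(scored, threshold=0.4)  # higher threshold, fewer survivors
print([c["technique"] for c in kept])
print([c["technique"] for c in stricter])
```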

### 4. Tree Update

After each exploration, update the tree:

```python
def update_tree(tree, node, response, success):
    """
    Update node with exploration result.

    Args:
        tree: Exploration tree holding the consecutive_failures counter
        node: Current node
        response: Result of exploration
        success: Whether exploration succeeded
    """
    node.executed = True
    node.response = response
    node.posterior_score = 1.0 if success else 0.2

    # Track consecutive failures for reset
    if success:
        tree.consecutive_failures = 0
    else:
        tree.consecutive_failures += 1
```

## Node Structure

```typescript
interface ExplorationNode {
  id: string;
  parentId: string | null;
  depth: number;

  // Exploration details
  prompt: string;
  technique: string;
  category: string;

  // State
  executed: boolean;
  response?: string;

  // Scoring
  priorScore: number;      // Expected before execution
  posteriorScore: number;  // Actual after execution

  // Children
  children: ExplorationNode[];

  // Metadata
  reasoning?: string;
  timestamp: number;
}
```
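For readers following the Python snippets, the same shape can be sketched as a dataclass. This is an illustrative mirror of the interface, not code from the skill: the snake_case field names and the `add_child` helper are assumptions.

```python
import time
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ExplorationNode:
    """Hypothetical Python mirror of the ExplorationNode interface."""
    id: str
    parent_id: Optional[str]
    depth: int
    prompt: str
    technique: str
    category: str
    executed: bool = False
    response: Optional[str] = None
    prior_score: float = 0.0      # expected before execution
    posterior_score: float = 0.0  # actual after execution
    children: List["ExplorationNode"] = field(default_factory=list)
    reasoning: Optional[str] = None
    timestamp: float = field(default_factory=time.time)


def add_child(parent, node_id, prompt, technique, category, prior_score):
    """Attach a new child one level below its parent and return it."""
    child = ExplorationNode(
        id=node_id,
        parent_id=parent.id,
        depth=parent.depth + 1,
        prompt=prompt,
        technique=technique,
        category=category,
        prior_score=prior_score,
    )
    parent.children.append(child)
    return child


root = ExplorationNode("root", None, 0, "Analyze auth module", "none", "root")
child = add_child(root, "node_a", "Check token validation", "pattern_analysis", "security", 0.8)
print(child.depth, child.parent_id)
```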

## Exploration Strategies

### Depth-First with Pruning
```yaml
strategy: depth_first_prune
description: Explore deep on promising paths, prune failures
behavior:
  - Follow highest-scoring child
  - Prune if score drops below threshold
  - Backtrack to next-best sibling
```
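The three behaviors can be sketched as a small recursive search. This is a minimal illustration under stated assumptions, not the skill's implementation: nodes are plain dicts with hypothetical `name`, `score`, `success`, and `children` keys, and the `visited` list exists only to show the order of exploration.

```python
def depth_first_prune(node, threshold=0.3, path=(), visited=None):
    """Return the first root-to-success path, or None if every branch fails."""
    if visited is not None:
        visited.append(node["name"])  # record visit order for inspection
    path = path + (node["name"],)
    if node.get("success"):
        return list(path)
    # Highest-scoring child first; remaining siblings are the backtrack order.
    ranked = sorted(node.get("children", []), key=lambda c: c["score"], reverse=True)
    for child in ranked:
        if child["score"] < threshold:
            continue  # pruned: never visited
        found = depth_first_prune(child, threshold, path, visited)
        if found is not None:
            return found
    return None


tree = {
    "name": "root", "score": 1.0, "children": [
        {"name": "a", "score": 0.8, "children": [
            {"name": "c", "score": 0.7, "children": []},  # dead end
        ]},
        {"name": "b", "score": 0.5, "children": [
            {"name": "d", "score": 0.6, "success": True, "children": []},
        ]},
        {"name": "z", "score": 0.2, "children": []},      # below threshold
    ],
}

visited = []
result = depth_first_prune(tree, visited=visited)
print(result)   # path to the first success
print(visited)  # order in which nodes were explored
```

Here the search follows `a` first because it scores highest, hits a dead end at `c`, then backtracks to sibling `b`; `z` is skipped entirely.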

### Breadth-First with Selection
```yaml
strategy: breadth_first_select
description: Explore all children, select best for next level
behavior:
  - Generate all candidates at current level
  - Score and rank
  - Select top N for next level
```
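Level-by-level selection of the top N resembles a beam search. A minimal sketch, assuming hypothetical `expand` and `score` callables supplied by the caller; the node names and scores below are illustrative values, not output from the skill.

```python
def breadth_first_select(root, expand, score, top_n=2, max_depth=3):
    """Expand a whole level, rank it, and carry only the top_n nodes forward."""
    frontier = [root]
    levels = []
    for _ in range(max_depth):
        candidates = [child for node in frontier for child in expand(node)]
        if not candidates:
            break  # nothing left to explore
        ranked = sorted(candidates, key=score, reverse=True)
        frontier = ranked[:top_n]
        levels.append(frontier)
    return levels


# Hypothetical tree and scores
children = {
    "root": ["pattern_analysis", "dependency_review", "config_analysis"],
    "pattern_analysis": ["token_validation", "session_handling"],
    "dependency_review": ["third_party_audit"],
}
scores = {
    "pattern_analysis": 0.85, "dependency_review": 0.72, "config_analysis": 0.28,
    "token_validation": 0.78, "session_handling": 0.45, "third_party_audit": 0.68,
}

levels = breadth_first_select(
    "root",
    expand=lambda node: children.get(node, []),
    score=lambda node: scores[node],
)
print(levels)
```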

### Adaptive Exploration
```yaml
strategy: adaptive
description: Switch strategies based on results
behavior:
  - Start breadth-first for reconnaissance
  - Switch to depth-first on promising vectors
  - Reset and try new angle after consecutive failures
```
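One possible reading of the switching rule, sketched as a pure function. The `promising` cutoff of 0.7 is an assumption chosen for illustration, the failure limit of 5 is borrowed from the Reset Logic section, and treating "reset" as a return to breadth-first is also an assumption.

```python
def next_strategy(current, best_score, consecutive_failures,
                  promising=0.7, max_failures=5):
    """Pick the strategy for the next round (hypothetical helper)."""
    if consecutive_failures >= max_failures:
        # Stuck: go back to broad reconnaissance from a new angle
        return "breadth_first_select"
    if current == "breadth_first_select" and best_score >= promising:
        # A promising vector appeared: commit to it
        return "depth_first_prune"
    return current


print(next_strategy("breadth_first_select", best_score=0.85, consecutive_failures=0))
print(next_strategy("depth_first_prune", best_score=0.90, consecutive_failures=5))
```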

## Reset Logic

Know when to abandon and restart:

```python
def should_reset(tree):
    """
    Determine if exploration should reset.

    Args:
        tree: Exploration tree with consecutive_failures, max_depth, success_count

    Returns:
        (should_reset, reason)
    """
    # Too many consecutive failures
    if tree.consecutive_failures >= 5:
        return True, "5+ consecutive failures detected"

    # Identical responses (stuck); require a full window of 3 before comparing
    recent = get_recent_responses(3)
    if len(recent) == 3 and all_identical(recent):
        return True, "Identical responses - need fresh approach"

    # Depth exceeded without progress
    if tree.max_depth > 4 and tree.success_count == 0:
        return True, "Deep exploration without success"

    return False, None
```
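The consecutive-failure rule is easiest to see on a sequence of outcomes. The sketch below is self-contained and illustrative: `record_outcome` and `first_reset_step` are hypothetical helpers, and the tree is a bare `SimpleNamespace` carrying only the counter.

```python
from types import SimpleNamespace

MAX_CONSECUTIVE_FAILURES = 5  # same limit as the first reset condition


def record_outcome(tree, success):
    """A success clears the streak; a failure extends it."""
    tree.consecutive_failures = 0 if success else tree.consecutive_failures + 1


def first_reset_step(outcomes):
    """Return the 0-based step at which a reset would trigger, or None."""
    tree = SimpleNamespace(consecutive_failures=0)
    for step, success in enumerate(outcomes):
        record_outcome(tree, success)
        if tree.consecutive_failures >= MAX_CONSECUTIVE_FAILURES:
            return step
    return None


# Two failures, one success (streak cleared), then five failures in a row
outcomes = [False, False, True, False, False, False, False, False]
print(first_reset_step(outcomes))
```

The early failures do not count toward the reset because the success at step 2 clears the streak; the reset triggers only once five failures occur back to back.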

## Integration with Ralph Loop

TAP Explorer integrates at Step 6 (EXECUTE-WITH-SYNC):

```yaml
Step 6: EXECUTE-WITH-SYNC
  └── For each step:
      └── 6a. LSA-VERIFY
      └── 6b. IMPLEMENT
          └── TAP-EXPLORE (for complex implementations)
      └── 6c. PLAN-SYNC
      └── 6d. MICRO-GATE
```

### Invocation

```yaml
Task:
  subagent_type: "tap-explorer"
  model: "sonnet"
  prompt: |
    GOAL: "Find optimal solution for authentication refactor"
    CONFIG:
      max_depth: 5
      branching: 4
      prune_threshold: 0.3
      strategy: adaptive

    CONTEXT:
      current_code: src/auth/
      constraints: ["maintain API compatibility", "improve performance"]
```

## Output Format

```json
{
  "exploration_result": {
    "best_path": [
      {"node": "root", "score": 1.0},
      {"node": "node_a", "score": 0.85},
      {"node": "node_c", "score": 0.78},
      {"node": "node_e", "score": 0.92}
    ],
    "total_nodes_explored": 23,
    "max_depth_reached": 4,
    "successful_paths": 3,
    "pruned_branches": 8
  },
  "findings": [
    {
      "path": "root → a → c → e",
      "technique": "dependency_injection",
      "confidence": "high",
      "recommendation": "Implement DI for auth service"
    }
  ],
  "tree_visualization": "..."
}
```
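A consumer of this output might reduce it to a short summary. The sketch below assumes the JSON shape shown above and parses an abbreviated copy of it; `summarize` and its return keys are hypothetical.

```python
import json

# Abbreviated copy of the example output
raw = """
{
  "exploration_result": {
    "best_path": [
      {"node": "root", "score": 1.0},
      {"node": "node_a", "score": 0.85},
      {"node": "node_c", "score": 0.78},
      {"node": "node_e", "score": 0.92}
    ],
    "total_nodes_explored": 23,
    "pruned_branches": 8
  },
  "findings": [
    {"technique": "dependency_injection", "confidence": "high",
     "recommendation": "Implement DI for auth service"}
  ]
}
"""


def summarize(result):
    """Extract the best path, its weakest non-root step, and high-confidence advice."""
    best_path = result["exploration_result"]["best_path"]
    weakest = min(best_path[1:], key=lambda step: step["score"])
    return {
        "path": " -> ".join(step["node"] for step in best_path),
        "weakest_step": weakest["node"],
        "recommendations": [
            f["recommendation"] for f in result["findings"] if f["confidence"] == "high"
        ],
    }


summary = summarize(json.loads(raw))
print(summary["path"])
```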

## Novelty Calculation

Avoid repeating the same approaches:

```python
def calculate_novelty(candidate):
    """
    Calculate how novel this candidate is.

    Higher novelty = less similar to previous attempts
    """
    # explored_nodes: module-level list of nodes already executed
    if not explored_nodes:
        return 1.0  # First candidate is fully novel

    # Candidates are dicts (see generate_candidates); explored nodes expose .prompt
    max_similarity = max(
        jaccard_similarity(candidate["prompt"], node.prompt)
        for node in explored_nodes
    )

    return 1 - max_similarity


def jaccard_similarity(a, b):
    """Word-level Jaccard similarity."""
    words_a = set(a.lower().split())
    words_b = set(b.lower().split())

    intersection = len(words_a & words_b)
    union = len(words_a | words_b)

    return intersection / union if union > 0 else 0
```
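A worked example of the word-level Jaccard measure, using made-up prompts. The `novelty` helper here takes the previous prompts as an explicit argument, which is an assumption made to keep the sketch self-contained.

```python
def jaccard_similarity(a, b):
    """Word-level Jaccard similarity: shared words over all distinct words."""
    words_a, words_b = set(a.lower().split()), set(b.lower().split())
    union = words_a | words_b
    return len(words_a & words_b) / len(union) if union else 0.0


def novelty(prompt, previous_prompts):
    """1 minus the highest similarity to any earlier prompt."""
    if not previous_prompts:
        return 1.0
    return 1 - max(jaccard_similarity(prompt, prev) for prev in previous_prompts)


previous = ["analyze token validation in auth module"]

# Shares 4 of 8 distinct words (analyze, in, auth, module) with the earlier prompt
partial = novelty("analyze session handling in auth module", previous)
repeat = novelty("analyze token validation in auth module", previous)
fresh = novelty("review third-party dependencies", previous)
print(partial, repeat, fresh)
```

An exact repeat scores 0.0, a prompt with no shared words scores 1.0, and the partially overlapping prompt lands at 0.5.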

## CLI Commands

```bash
# Basic exploration
ralph tap-explore "Optimize database layer"

# With configuration
ralph tap-explore --depth 6 --branches 5 "Security audit"

# With specific strategy
ralph tap-explore --strategy depth_first "Find memory leaks"

# Export tree visualization
ralph tap-explore "Analysis" --visualize tree.svg
```

## Visualization

```
TAP Exploration Tree
====================

ROOT: "Analyze auth module"
├── [0.85] Pattern Analysis
│   ├── [0.78] Token Validation
│   │   └── [0.92] JWT Verification ★ SUCCESS
│   └── [0.45] Session Handling ← PRUNED
├── [0.72] Dependency Review
│   └── [0.68] Third-party Audit
└── [0.28] Config Analysis ← PRUNED

Legend: [score] technique  ★=success  ←PRUNED=below threshold
```
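A tree in this style can be produced with a short recursive renderer. This is an illustrative sketch rather than the skill's visualizer: the dict keys (`label`, `score`, `success`, `children`) and the 0.3 default threshold used to mark pruned nodes are assumptions.

```python
def render_tree(root, threshold=0.3):
    """Render a nested dict tree as text with box-drawing connectors."""
    lines = [f'ROOT: "{root["label"]}"']

    def walk(children, prefix):
        for i, child in enumerate(children):
            last = i == len(children) - 1
            branch = "└── " if last else "├── "
            if child.get("success"):
                mark = " ★ SUCCESS"
            elif child["score"] < threshold:
                mark = " ← PRUNED"
            else:
                mark = ""
            lines.append(f'{prefix}{branch}[{child["score"]:.2f}] {child["label"]}{mark}')
            # Keep a vertical bar only while more siblings follow at this level
            walk(child.get("children", []), prefix + ("    " if last else "│   "))

    walk(root.get("children", []), "")
    return "\n".join(lines)


tree = {
    "label": "Analyze auth module",
    "children": [
        {"label": "Pattern Analysis", "score": 0.85, "children": [
            {"label": "Token Validation", "score": 0.78, "children": [
                {"label": "JWT Verification", "score": 0.92, "success": True},
            ]},
        ]},
        {"label": "Config Analysis", "score": 0.28},
    ],
}

rendered = render_tree(tree)
print(rendered)
```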

## Best Practices

1. **Start Broad**: Begin with high branching factor for reconnaissance
2. **Prune Deliberately**: The default threshold (0.3) drops only clearly weak branches; raise it to prune more aggressively and save resources
3. **Track Novelty**: Avoid repeating failed approaches
4. **Reset Smartly**: Don't persist on stuck paths
5. **Learn from Success**: Successful paths inform future exploration

## Attribution

TAP pattern adapted from [ZeroLeaks](https://github.com/ZeroLeaks/zeroleaks) Tree of Attacks with Pruning methodology (FSL-1.1-Apache-2.0).
