autonomous-builder

Full-stack autonomous software development agent: handles design, implementation, testing, and deployment end-to-end. Auto-activates for project creation, feature development, bug fixing, code refactoring, or when user requests 'build', 'create', 'implement', 'develop', 'fix', or 'refactor' any software project.
1,174 stars
By foryourhealth111-pixel
Best use case

autonomous-builder is best used when you need a repeatable AI agent workflow instead of a one-off prompt.
Teams using autonomous-builder can expect more consistent output, faster repeated execution, and less prompt rewriting.
When to use this skill

You want a reusable workflow that can be run more than once with consistent structure.
When not to use this skill

You only need a quick one-off answer and do not need a reusable workflow.
You cannot install or maintain the underlying files, dependencies, or repository context.
Installation

Claude Code / Cursor / Codex
$ curl -o ~/.claude/skills/autonomous-builder/SKILL.md --create-dirs "https://raw.githubusercontent.com/foryourhealth111-pixel/Vibe-Skills/main/bundled/skills/autonomous-builder/SKILL.md"
Manual Installation
Download SKILL.md from GitHub
Place it in .claude/skills/autonomous-builder/SKILL.md inside your project
Restart your AI agent — it will auto-discover the skill
How autonomous-builder Compares

Feature / Agent	autonomous-builder	Standard Approach
Platform Support	Not specified	Limited / Varies
Context Awareness	High	Baseline
Installation Complexity	Unknown	N/A
SKILL.md Source

# Autonomous Builder

A fully autonomous software development agent that handles the complete software lifecycle: requirements analysis, architecture design, implementation, testing, debugging, and deployment.

## Architecture Pattern: Two-Agent Model

**Based on Anthropic's official claude-quickstarts architecture**

```
┌─────────────────────────────────────────────────────────────────┐
│                 TWO-AGENT ARCHITECTURE                           │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│  SESSION 1: INITIALIZER AGENT                                   │
│  ┌─────────────────────────────────────────────────────────┐    │
│  │ • Read requirements / spec                               │    │
│  │ • Create project structure                               │    │
│  │ • Generate feature_list.json (200+ tests)                │    │
│  │ • Initialize Git repository                              │    │
│  │ • ✨ Prompt for GitHub URL (optional)                    │    │
│  │ • ✨ Create README.md & PLANNING.md                      │    │
│  │ • Commit initial state                                   │    │
│  │ • ✨ Push to GitHub & create issues                      │    │
│  └─────────────────────────────────────────────────────────┘    │
│                              │                                   │
│                    feature_list.json                             │
│                    (Single Source of Truth)                      │
│                              │                                   │
│  SESSIONS 2+: BUILDER AGENT (fresh context each session)        │
│  ┌─────────────────────────────────────────────────────────┐    │
│  │ Step 1: Get Context (pwd, ls, git log, progress)         │    │
│  │ Step 2: Start/verify server                              │    │
│  │ Step 3: Verify previous tests (regression check)         │    │
│  │ Step 4: Select next "passes": false feature              │    │
│  │ Step 5: Implement feature                                │    │
│  │ Step 6: Browser automation test                          │    │
│  │ Step 7: Update feature_list.json                         │    │
│  │ Step 8: Generate workflow report                         │    │
│  │ Step 9: Git commit + GitHub push                         │    │
│  │ Step 10: Update progress notes                           │    │
│  │ Step 11: Clean exit (auto-continue in 3s)                │    │
│  └─────────────────────────────────────────────────────────┘    │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘
```

**Key Design Principles (Official Pattern):**
1. **Fresh Context Per Session** - Each session starts with a brand-new context window
2. **File-Based State Persistence** - Progress is tracked in feature_list.json, not in context
3. **Git Commit as State Anchor** - Atomic progress units with easy rollback
4. **Browser Automation Testing** - Act like a human user and verify through the UI
5. **Auto-Continue with Delay** - 3-second delay between sessions

## Core Philosophy

**The Autonomous Development Loop:**

```
PLAN -> BUILD -> TEST -> DEBUG -> DEPLOY -> (REPEAT)
  |                                    |
  +------------------------------------+
```

**Key Principles:**
1. **Self-Sufficient**: No user intervention required during execution
2. **State-Persistent**: Recovers from interruptions via `.builder/` state files
3. **Multi-Language**: Auto-detects and adapts to project technology stack
4. **Incremental**: Completes one feature at a time, commits progress
5. **Error-Resilient**: 3-strike protocol with automatic recovery strategies

## When to Use This Skill

Trigger when:
- User requests "build", "create", "implement", "develop" any software
- User wants to "fix" or "debug" a project
- User needs "refactoring" or "modernization"
- Starting a new project from requirements
- Continuing an incomplete project

## Not For / Boundaries

- **Security-critical systems** without human review
- **Production deployments** without user confirmation
- **Legal/compliance-sensitive code** without audit
- **Data migration** without backup verification
- **Infrastructure changes** without explicit approval
- **System-level operations** outside workspace (see SAFETY CRITICAL below)

**Required inputs (ask if missing):**
1. Project requirements or specification
2. Target platform/environment (web, CLI, mobile, etc.)
3. Preferred language/framework (or auto-detect)

**Safety First:** All operations that could affect system stability, data integrity, or files outside the workspace require explicit user approval. See **SAFETY CRITICAL** section below for details.

## Quick Reference

### Session Continuity (Auto-Resume)

**⚠️ Critical for Unattended Long-Running Operation**

```
AUTO-RESUME PROTOCOL:
┌─────────────────────────────────────────────────────────────────┐
│  Session Start                                                  │
│       │                                                         │
│       ▼                                                         │
│  Check .builder/state.json exists?                              │
│       │                                                         │
│       ├─ NO → Initialize new project                            │
│       │                                                         │
│       └─ YES → Resume from saved state:                         │
│              1. Read current_phase                               │
│              2. Read current_feature                             │
│              3. Read pending_features[]                          │
│              4. Continue from last checkpoint                    │
│                                                                 │
│  After each feature completion:                                 │
│       │                                                         │
│       ▼                                                         │
│  More pending features?                                         │
│       │                                                         │
│       ├─ YES → Auto-start next feature (NO user input needed)   │
│       │                                                         │
│       └─ NO → All complete! Generate report                     │
└─────────────────────────────────────────────────────────────────┘
```
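
The resume decision in the diagram above can be sketched in a few lines of Python. This is a minimal illustration, not the skill's actual implementation: the function names `resume_or_initialize` and `next_step` are hypothetical, and the only things taken from this document are the `.builder/state.json` location and the `current_phase`, `current_feature`, and `pending_features` fields.

```python
import json
from pathlib import Path


def resume_or_initialize(builder_dir: str) -> dict:
    """Decide how a session should start (hypothetical helper)."""
    state_file = Path(builder_dir) / "state.json"
    if not state_file.exists():
        # No saved state: the protocol initializes a new project.
        return {"action": "initialize"}

    # Saved state found: read the fields the protocol resumes from.
    state = json.loads(state_file.read_text(encoding="utf-8"))
    return {
        "action": "resume",
        "phase": state.get("current_phase"),
        "feature": state.get("current_feature"),
        "pending": state.get("pending_features", []),
    }


def next_step(pending_features: list) -> str:
    """After a feature completes: continue if work remains, else report."""
    return "auto_start_next" if pending_features else "generate_report"
```

Keeping the decision in a pure function makes it easy to test without starting a real session.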

**Auto-Continue Rules:**

| Condition | Action | User Input Required |
|-----------|--------|---------------------|
| Feature completed, more pending | Auto-start next | **NO** |
| Error recovered successfully | Continue current | **NO** |
| 3-strike error failed | Skip and continue | **NO** (unless critical) |
| Loop detected & resolved | Resume from checkpoint | **NO** |
| All features complete | Generate final report | **NO** |

**State Persistence After Each Operation:**

```json
{
  "auto_continue": true,
  "resume_token": "feat-003-phase-implement",
  "next_action": "Continue implementing feat-003",
  "features_remaining": 3,
  "estimated_completion": "2026-02-14T18:00:00Z"
}
```

### Automatic Task Queue

```python
# After completing a feature, automatically proceed.
# ProjectState, save_checkpoint, select_next_feature, save_state, log_progress,
# ContinueAction, CompleteAction, and generate_final_report are builder
# internals that are not defined in this document.
from datetime import datetime, timezone


def on_feature_complete(feature_id: str, state: "ProjectState"):
    """Called when a feature is marked complete."""

    # 1. Save checkpoint
    save_checkpoint(state, feature_id)

    # 2. Update feature status
    state.features[feature_id].status = "completed"
    state.features[feature_id].completed_at = datetime.now(timezone.utc)

    # 3. Check for pending features
    #    (assumes state.features maps feature ID -> feature object)
    pending = [f for f in state.features.values() if f.status == "pending"]

    if pending:
        # 4. Auto-select next feature (NO user input)
        next_feature = select_next_feature(pending, state)
        state.current_feature = next_feature.id
        state.current_phase = "implement"

        # 5. Save state immediately
        save_state(state)

        # 6. LOG and CONTINUE (not ask user)
        log_progress(f"Auto-continuing to {next_feature.name}")
        return ContinueAction(feature=next_feature)
    else:
        # All complete!
        return CompleteAction(report=generate_final_report(state))
```

**Resume Message on Session Start:**

```markdown
## 🔄 Session Resume Detected

**Previous Session**: Session #5
**Last Activity**: 2 hours ago
**Current Feature**: feat-003 (User Authentication)
**Phase**: implement (60% complete)

**Pending Features**: 3 remaining
- feat-004: API Rate Limiting
- feat-005: Email Notifications
- feat-006: Final Documentation

**Auto-Continuing**: Resuming feat-003 implementation...

[Proceeding without user input - type "pause" to stop]
```

### Directory Structure

```
.builder/
├── state.json           # Current project state
├── features.json        # Feature list with status
├── architecture.md      # Design decisions
├── progress.md          # Session log
├── errors.json          # Error history and resolutions
├── checkpoints/         # Recovery checkpoints
├── auto-continue.{sh,bat,ps1}  # Auto-restart script (auto-generated)
└── supervisor.json      # Self-supervision config
```

### Skill Scheduling & Auto-Dispatch

**⚠️ Enables automatic skill discovery and invocation**

```markdown
ON PROJECT INITIALIZATION:

1. Check for Claude_Skills_中文指南.md in workspace root
2. If found:
   - Read and parse skill catalog
   - Store available skills in state.json
3. For each feature:
   - Analyze feature requirements
   - Match against skill catalog
   - Add recommended_skills to feature definition

DURING IMPLEMENTATION:

1. Before each implementation step:
   - Check step's invoke_skill field
   - Or analyze step for skill match

2. Auto-invoke skill:
   - Use Skill tool with skill: {skill_name}
   - Execute with skill's guidance
   - Continue with enhanced capabilities

3. Log skill usage to state.json
```

**Task-to-Skill Mapping (Auto-applied):**

| Task Type | Auto-Invoked Skills |
|-----------|---------------------|
| Code review | `code-reviewer`, `code-review-excellence` |
| Data analysis | `exploratory-data-analysis`, `statistical-analysis` |
| Visualization | `data-artist`, `matplotlib`, `plotly` |
| ML training | `senior-ml-engineer`, `pytorch-lightning` |
| ML evaluation | `evaluating-machine-learning-models`, `shap` |
| Scientific writing | `scientific-writing`, `scientific-schematics` |
| Debugging | `debugging-strategies`, `error-resolver` |
| Documentation | `docs-write`, `writing-docs` |
| Architecture | `architecture-patterns` |
| Bioinformatics | `biopython`, `bioservices`, `gget` |
| Drug discovery | `torchdrug`, `rdkit`, `uniprot-database` |

**Feature with Skill Planning:**

```json
{
  "id": "feat-001",
  "name": "Data Analysis Module",
  "recommended_skills": [
    {"skill": "exploratory-data-analysis", "phase": "implementation"},
    {"skill": "data-artist", "phase": "implementation"}
  ],
  "skill_dispatch_schedule": [
    {"step": 1, "action": "Explore data", "invoke_skill": "exploratory-data-analysis"},
    {"step": 2, "action": "Create charts", "invoke_skill": "data-artist"}
  ]
}
```
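
As a rough sketch of how a `skill_dispatch_schedule` like the one above might be consumed, the snippet below turns it into an ordered list of skill invocations. The `dispatch_plan` function is hypothetical and only illustrates the data flow; the actual invocation happens through the Skill tool.

```python
import json

# The feature definition from the example above.
FEATURE_JSON = """
{
  "id": "feat-001",
  "name": "Data Analysis Module",
  "recommended_skills": [
    {"skill": "exploratory-data-analysis", "phase": "implementation"},
    {"skill": "data-artist", "phase": "implementation"}
  ],
  "skill_dispatch_schedule": [
    {"step": 1, "action": "Explore data", "invoke_skill": "exploratory-data-analysis"},
    {"step": 2, "action": "Create charts", "invoke_skill": "data-artist"}
  ]
}
"""


def dispatch_plan(feature: dict) -> list:
    """Return (step, action, skill) tuples ordered by step number."""
    schedule = feature.get("skill_dispatch_schedule", [])
    return [
        (entry["step"], entry["action"], entry["invoke_skill"])
        for entry in sorted(schedule, key=lambda e: e["step"])
        if entry.get("invoke_skill")  # steps without a skill are skipped
    ]


feature = json.loads(FEATURE_JSON)
for step, action, skill in dispatch_plan(feature):
    print(f"step {step}: {action} -> invoke {skill}")
```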

**Setup**: Place `Claude_Skills_中文指南.md` in workspace root. Skills will be auto-discovered and dispatched.

### MCP Auto-Integration & Human-like Computer Control

**⚠️ Enables browser automation, desktop control, and seamless tool invocation**

```markdown
ON SESSION START:

1. DISCOVER MCP servers
   - Run /mcp to list configured servers
   - Parse available tools from each server
   - Build capability map

2. CHECK critical capabilities:
   - browser_automation (puppeteer)
   - code_execution (ide)
   - desktop_control (desktop) - optional

3. AUTO-INSTALL missing servers if needed:
   - For web projects: puppeteer
   - For desktop apps: desktop
   - For database work: sqlite/postgres

4. UPDATE state.json → mcp_integration
```

**MCP Capability Matrix:**

| Capability | MCP Server | What It Enables |
|------------|------------|-----------------|
| Browser automation | puppeteer | Navigate, click, type, screenshot |
| Desktop control | desktop | Mouse, keyboard, screen capture |
| Code execution | ide | Run Python, get diagnostics |
| Database | sqlite/postgres | Query, insert, manage data |
| Web search | brave-search | Research, documentation lookup |
| HTTP requests | fetch | API testing, web fetching |

**Auto-Tool Selection:**

```
Task Pattern                    → MCP Tool
─────────────────────────────────────────────
"open website/url"              → mcp__puppeteer_navigate
"click button/element"          → mcp__puppeteer_click
"fill form/type text"           → mcp__puppeteer_type
"take screenshot"               → mcp__puppeteer_screenshot
"run JavaScript"                → mcp__puppeteer_evaluate
"control mouse"                 → mcp__desktop_mouse_move
"press key/hotkey"              → mcp__desktop_hotkey
"execute Python"                → mcp__ide__executeCode
```

**Example: Automated Web Testing**

```markdown
## E2E Test Flow (Automatic)

1. mcp__puppeteer_navigate → "https://myapp.com"
2. mcp__puppeteer_screenshot → capture initial state
3. mcp__puppeteer_fill → "#username", "testuser"
4. mcp__puppeteer_click → "#submit"
5. mcp__puppeteer_wait → ".dashboard"
6. mcp__puppeteer_evaluate → verify page state
7. mcp__puppeteer_screenshot → capture result
```

**Custom MCP Server Creation:**

When no existing MCP server fits the task, autonomous-builder can:
1. Identify requirement
2. Design custom MCP server
3. Write server code to `.builder/mcp-servers/`
4. Register with `claude mcp add`
5. Use immediately

### Auto-Restart & Self-Supervision

**⚠️ Enables true unattended long-running operation**

```markdown
ON PROJECT INITIALIZATION:
1. Create .builder/ directory
2. Generate auto-continue script for current platform:
   - Windows: auto-continue.ps1
   - Linux/macOS: auto-continue.sh
3. Create supervisor.json with monitoring config
4. Script runs in background, monitors session health
```

**Auto-Generated Supervisor Script:**

```bash
#!/bin/bash
# .builder/auto-continue.sh - Auto-generated by autonomous-builder

PROJECT_DIR="/path/to/project"
BUILDER_DIR="$PROJECT_DIR/.builder"
STATE_FILE="$BUILDER_DIR/state.json"
SUPERVISOR_CONFIG="$BUILDER_DIR/supervisor.json"

# Self-supervision loop
while true; do
    # Check if project is complete
    if [ -f "$STATE_FILE" ]; then
        STATUS=$(grep -o '"status"[[:space:]]*:[[:space:]]*"[^"]*"' "$STATE_FILE" | head -1 | cut -d'"' -f4)
        if [ "$STATUS" = "completed" ]; then
            echo "[$(date)] Project completed. Exiting supervisor."
            exit 0
        fi
    fi

    # Check last activity (if no activity for 5 min, restart)
    LAST_ACTIVITY=$(grep -o '"last_activity"[[:space:]]*:[[:space:]]*"[^"]*"' "$STATE_FILE" 2>/dev/null | cut -d'"' -f4)
    if [ -n "$LAST_ACTIVITY" ]; then
        # Parse and check timeout...
        # If timeout exceeded, trigger new session
        :  # no-op placeholder: a then-branch containing only comments is a bash syntax error
    fi

    # Start/resume Claude session with permission bypass for unattended operation
    # WARNING: --dangerously-skip-permissions bypasses all user confirmations
    # NOTE: check the --skill and --project flags against `claude --help` for your installed version
    echo "[$(date)] Starting Claude session..."
    claude --skill autonomous-builder --project "$PROJECT_DIR" --dangerously-skip-permissions

    # Log session end
    echo "[$(date)] Session ended. Checking state..."

    # Wait before restart (configurable)
    sleep 5
done
```

**⚠️ Security Warning:** `--dangerously-skip-permissions` bypasses ALL user confirmations. Use only in trusted, isolated environments. Ensure workspace isolation and safety protocols are properly configured.

**Supervisor Configuration:**

```json
{
  "supervisor_version": "1.0",
  "project_path": "/path/to/project",
  "enabled": true,

  "monitoring": {
    "check_interval_seconds": 60,
    "session_timeout_seconds": 300,
    "max_restart_attempts": 10,
    "restart_cooldown_seconds": 5
  },

  "health_checks": {
    "progress_stall_threshold": 600,
    "error_rate_threshold": 0.5,
    "context_usage_warning": 0.8
  },

  "notifications": {
    "on_completion": true,
    "on_error_spike": true,
    "on_stall": true,
    "log_file": ".builder/supervisor.log"
  },

  "statistics": {
    "total_sessions": 0,
    "total_restarts": 0,
    "total_runtime_seconds": 0,
    "last_restart_time": null
  }
}
```
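
The supervisor script leaves the timeout check as a comment. One way to express that check, using the `last_activity` timestamp from `state.json` and the `session_timeout_seconds` and `max_restart_attempts` values from `supervisor.json`, is sketched below. The function names `is_stalled` and `may_restart` are assumptions for illustration, and treating a timestamp without a timezone as UTC is also an assumption.

```python
from datetime import datetime, timezone
from typing import Optional


def is_stalled(last_activity: str, timeout_seconds: int,
               now: Optional[datetime] = None) -> bool:
    """True when last_activity (ISO 8601) is older than timeout_seconds."""
    # datetime.fromisoformat rejects a trailing "Z" before Python 3.11,
    # so normalize it to an explicit UTC offset first.
    parsed = datetime.fromisoformat(last_activity.replace("Z", "+00:00"))
    if parsed.tzinfo is None:
        parsed = parsed.replace(tzinfo=timezone.utc)  # assumption: naive means UTC
    current = now or datetime.now(timezone.utc)
    return (current - parsed).total_seconds() > timeout_seconds


def may_restart(total_restarts: int, max_restart_attempts: int) -> bool:
    """Stop restarting once the configured attempt budget is used up."""
    return total_restarts < max_restart_attempts
```

Passing `now` explicitly keeps the check deterministic in tests; a supervisor would normally omit it.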

### Core Workflow Phases

| Phase | Actions | Output |
|-------|---------|--------|
| INITIALIZE | Check state, parse requirements | state.json, features.json |
| DESIGN | Detect tech stack, choose architecture | architecture.md |
| IMPLEMENT | Write code per feature | Source files |
| TEST | Run unit/integration/E2E | Test results |
| DEBUG | Apply 3-strike protocol | Fixes or escalation |
| DEPLOY | Build, document, archive | Final deliverables |

### State File Schema

```json
{
  "project_name": "string",
  "current_phase": "init|design|implement|test|deploy",
  "current_feature": "feature-id",
  "tech_stack": {
    "language": "string",
    "framework": "string",
    "runtime": "string"
  },
  "completed_features": ["feat-001"],
  "pending_features": ["feat-002"],
  "session_count": 0,
  "last_activity": "ISO-8601-timestamp"
}
```
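
A state file that follows this schema can be sanity-checked before a session resumes from it. The sketch below is one possible validator, not part of the skill itself: `validate_state` is a hypothetical name, and the rule that a feature may not be both completed and pending is an assumption that the schema implies but does not state.

```python
REQUIRED_KEYS = {
    "project_name", "current_phase", "current_feature", "tech_stack",
    "completed_features", "pending_features", "session_count", "last_activity",
}
VALID_PHASES = {"init", "design", "implement", "test", "deploy"}


def validate_state(state: dict) -> list:
    """Return human-readable problems; an empty list means the state looks valid."""
    problems = [f"missing key: {key}" for key in sorted(REQUIRED_KEYS - state.keys())]

    phase = state.get("current_phase")
    if phase is not None and phase not in VALID_PHASES:
        problems.append(f"unknown phase: {phase}")

    # Assumption: a feature ID should not appear in both lists at once.
    completed = set(state.get("completed_features", []))
    pending = set(state.get("pending_features", []))
    for feature_id in sorted(completed & pending):
        problems.append(f"feature is both completed and pending: {feature_id}")

    return problems
```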

### 3-Strike Error Recovery

```
STRIKE 1: Direct Fix
  - Analyze error type and root cause
  - Apply known solution pattern
  - Run tests to verify

STRIKE 2: Alternative Approach
  - Try different library/algorithm
  - Simplify implementation
  - Use different design pattern

STRIKE 3: Architecture Rethink
  - Question design assumptions
  - Research alternatives
  - Consider partial implementation

AFTER 3 STRIKES: Save checkpoint, request user guidance
```
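
The escalation order above amounts to trying up to three progressively broader strategies and stopping at the first one that works. A minimal sketch follows, assuming each strategy is a callable that returns `True` on success; `run_with_strikes` is a hypothetical name, and the real protocol also records each attempt in `errors.json`.

```python
from typing import Callable, List, Optional


def run_with_strikes(strategies: List[Callable[[], bool]]) -> Optional[int]:
    """Try up to three strategies in order.

    Returns the 1-based strike number that succeeded, or None when all
    strikes fail and the caller should checkpoint and ask the user.
    """
    for strike, attempt in enumerate(strategies[:3], start=1):
        try:
            if attempt():
                return strike
        except Exception as exc:
            # A raised exception counts as a failed strike.
            print(f"strike {strike} raised {exc!r}")
    return None
```

For example, `run_with_strikes([direct_fix, alternative_approach, architecture_rethink])` would call the three (hypothetical) strategy functions in the order listed above.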

### Loop Prevention (Anti-Infinite-Loop)

**⚠️ Critical: Prevents token waste in unattended operation**

```
DETECTION RULES:
┌─────────────────────────────────────────────────────────────────┐
│  Condition                    │ Threshold │ Action              │
├─────────────────────────────────────────────────────────────────┤
│  Same error repeated          │ 3 times   │ ESCALATE immediately│
│  Same file modified           │ 5 times   │ STOP, review approach│
│  Same command executed        │ 3 times   │ Try alternative     │
│  No progress in N operations  │ 10 ops    │ PAUSE, reassess     │
│  Single session too long      │ 50 turns  │ Checkpoint & pause  │
└─────────────────────────────────────────────────────────────────┘
```

**Loop Detection Algorithm:**

```python
class LoopDetector:
    MAX_SAME_ERROR = 3        # Same error appears 3 times
    MAX_SAME_FILE_EDIT = 5    # Same file edited 5 times
    MAX_SAME_COMMAND = 3      # Same command run 3 times
    MAX_NO_PROGRESS = 10      # No feature completed in 10 ops
    MAX_SESSION_TURNS = 50    # Maximum turns per session

    def check_loop(self, state):
        # Check 1: Same error repeating
        if self.count_same_error(state.errors) >= self.MAX_SAME_ERROR:
            return LoopAlert("SAME_ERROR_LOOP", "Escalate to user")

        # Check 2: Same file being edited repeatedly
        if self.count_same_file_edits(state.recent_edits) >= self.MAX_SAME_FILE_EDIT:
            return LoopAlert("FILE_EDIT_LOOP", "Review approach")

        # Check 3: Same command executing repeatedly
        if self.count_same_commands(state.recent_commands) >= self.MAX_SAME_COMMAND:
            return LoopAlert("COMMAND_LOOP", "Try alternative")

        # Check 4: No progress indicator
        if self.count_operations_without_progress(state) >= self.MAX_NO_PROGRESS:
            return LoopAlert("NO_PROGRESS", "Reassess strategy")

        # Check 5: Session too long
        if state.session_turns >= self.MAX_SESSION_TURNS:
            return LoopAlert("SESSION_LIMIT", "Create checkpoint and pause")

        return None  # No loop detected
```
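
The class above calls counting helpers that are not shown. A self-contained sketch of the same five checks is given below, under the assumption that errors, edits, and commands are tracked as plain lists of strings. `SessionState`, `most_common_count`, and this version of `check_loop` are illustrative names, not the skill's actual types.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class SessionState:
    """Hypothetical shape of the tracked history."""
    errors: List[str] = field(default_factory=list)
    recent_edits: List[str] = field(default_factory=list)
    recent_commands: List[str] = field(default_factory=list)
    ops_since_progress: int = 0
    session_turns: int = 0


def most_common_count(items: List[str]) -> int:
    """How many times the most frequent item occurs (0 for an empty list)."""
    return Counter(items).most_common(1)[0][1] if items else 0


def check_loop(state: SessionState) -> Optional[str]:
    """Return the first alert whose threshold is reached, else None."""
    checks = [
        (most_common_count(state.errors) >= 3, "SAME_ERROR_LOOP"),
        (most_common_count(state.recent_edits) >= 5, "FILE_EDIT_LOOP"),
        (most_common_count(state.recent_commands) >= 3, "COMMAND_LOOP"),
        (state.ops_since_progress >= 10, "NO_PROGRESS"),
        (state.session_turns >= 50, "SESSION_LIMIT"),
    ]
    for triggered, alert in checks:
        if triggered:
            return alert
    return None
```

The thresholds mirror the constants in the class above; checks run in the same order, so a repeated error takes priority over a session-length alert.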

**When Loop Detected - Escalation Protocol:**

```markdown
## LOOP ALERT: [Type]

**Detected Pattern**: [What repeated]
**Occurrences**: [Count] times
**Time Spent**: [Duration]
**Token Estimate**: [Approximate tokens used]

**Actions Taken**:
1. Stopped current operation
2. Saved checkpoint to .builder/checkpoints/
3. Logged loop pattern to .builder/loop-log.json

**Status**: PAUSED - Awaiting user input

**Options**:
A) Skip this feature and continue with next
B) Accept partial implementation
C) Provide additional context/guidance
D) Abort and generate report
```

**Loop State Tracking:**

```json
{
  "loop_detection": {
    "error_history": [
      {"error_hash": "abc123", "count": 2, "first_seen": "...", "last_seen": "..."}
    ],
    "file_edit_history": [
      {"file": "src/app.py", "edit_count": 3, "last_edit": "..."}
    ],
    "command_history": [
      {"command": "npm test", "run_count": 2, "last_run": "..."}
    ],
    "progress_check": {
      "operations_since_last_feature": 5,
      "last_completed_feature": "feat-002",
      "last_completion_time": "..."
    },
    "session_metrics": {
      "start_time": "...",
      "turn_count": 25,
      "tokens_estimated": 50000
    }
  }
}
```

**Mandatory Break Points:**

```
After every 20 operations:
  └─ Check progress: Did any feature advance?
      ├─ YES: Continue
      └─ NO: Pause and reassess

After every 10 minutes:
  └─ Review: Are we making meaningful progress?
      ├─ YES: Continue
      └─ NO: Checkpoint and evaluate

On same error 2nd occurrence:
  └─ Warning: Same error detected, trying different approach
  └─ Log: Record pattern for analysis

On same error 3rd occurrence:
  └─ STOP: Loop detected, escalate to user
  └─ Save: Create checkpoint before pause
```

### File Writing Strategy

For files > 500 lines, write in segments:
```python
SEGMENT_SIZE = 200  # lines per segment

# First segment: create file
write_file(path, first_segment)

# Subsequent segments: append
edit_file(path, append=next_segment)
```
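
`write_file` and `edit_file` above stand in for the agent's file tools. The same idea using only the Python standard library might look like the sketch below; `write_in_segments` is a hypothetical helper, and applying segmentation to every file regardless of length is a simplification.

```python
from pathlib import Path
from typing import List

SEGMENT_SIZE = 200  # lines per segment, as above


def write_in_segments(path: str, lines: List[str],
                      segment_size: int = SEGMENT_SIZE) -> int:
    """Write lines to path one segment at a time; return the segment count."""
    target = Path(path)
    segments = 0
    for start in range(0, len(lines), segment_size):
        chunk = lines[start:start + segment_size]
        # The first segment creates (or truncates) the file; later ones append.
        mode = "w" if start == 0 else "a"
        with target.open(mode, encoding="utf-8") as handle:
            handle.writelines(line + "\n" for line in chunk)
        segments += 1
    return segments
```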

### Technology Stack Detection

```python
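from pathlib import Path

def detect_tech_stack(project_path):
    indicators = {
        'python': ['requirements.txt', 'pyproject.toml', '*.py'],
        'nodejs': ['package.json', '*.ts', '*.js'],
        'rust': ['Cargo.toml', '*.rs'],
        'go': ['go.mod', '*.go'],
    }
    root = Path(project_path)
    # One simple heuristic: score each stack by how many patterns match at the project root
    scores = {
        stack: sum(1 for pattern in patterns if any(root.glob(pattern)))
        for stack, patterns in indicators.items()
    }
    # Return the best-scoring stack as the primary one, or None when nothing matches
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None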

## Rules & Constraints

### MUST (Non-negotiable)

- Create `.builder/` directory before any work
- Update `state.json` after EVERY tool operation
- Log ALL errors to `errors.json` with resolution attempts
- Commit checkpoint after each feature completion
- Use segmented writes for files > 500 lines
- Run tests before marking feature complete

### SHOULD (Strong recommendations)

- Follow existing project conventions
- Use conventional commit messages
- Create meaningful tests (not just coverage)
- Document non-obvious decisions in `architecture.md`
- Prefer simpler solutions over clever ones

### NEVER (Explicit prohibitions)

- Delete user files without explicit permission
- Overwrite existing code without backup
- Commit secrets or credentials
- Skip error handling
- Make network calls without timeout
- Create infinite loops without escape conditions

### SAFETY CRITICAL (System Protection - HIGHEST PRIORITY)

**⚠️ These rules take precedence over ALL other operations. When in doubt, STOP and ASK.**

**Operations requiring explicit user confirmation:**

| Operation Type | Examples | Required Action |
|---------------|----------|-----------------|
| Files outside workspace | `C:\Windows\`, `/etc/`, `/usr/bin/` | STOP, warn user, get explicit approval |
| System configuration | Registry edits, `/etc/hosts`, environment variables | STOP, explain risk, get approval |
| Destructive operations | `rm -rf`, `format`, `DROP DATABASE` | STOP, show impact, get approval |
| Network/firewall changes | Port binding, firewall rules | STOP, explain scope, get approval |
| Package installation | `npm install -g`, `pip install --system` | Warn about system-wide changes |

**Pre-execution safety checks:**

```markdown
Before ANY operation, verify:

1. IS TARGET INSIDE WORKSPACE?
   ✅ Path starts with project root -> Proceed
   ⚠️ Path outside workspace -> STOP and confirm

2. IS OPERATION DESTRUCTIVE?
   ✅ Read/Write/Create in workspace -> Proceed
   ⚠️ Delete/Format/Truncate -> STOP and confirm

3. IS OPERATION SYSTEM-WIDE?
   ✅ Project-local operation -> Proceed
   ⚠️ Global install/System config -> STOP and confirm

4. COULD DATA BE LOST?
   ✅ New file creation -> Proceed
   ⚠️ Overwrite/Delete existing -> STOP and backup first
```
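
Check 1 is easy to get wrong with a plain string-prefix comparison, because `..` segments and sibling directories with a shared prefix can slip through. A sketch that resolves paths first is shown below. The helper names `is_inside_workspace` and `needs_confirmation` and the `DESTRUCTIVE` operation set are assumptions made for illustration.

```python
from pathlib import Path

# Assumed operation labels, taken from check 2 above.
DESTRUCTIVE = {"delete", "format", "truncate"}


def is_inside_workspace(target: str, workspace: str) -> bool:
    """True when target resolves to the workspace root or a path beneath it."""
    root = Path(workspace).resolve()
    # Relative targets are interpreted against the workspace; absolute
    # targets replace it. resolve() collapses ".." and follows symlinks.
    resolved = (root / target).resolve()
    return resolved == root or root in resolved.parents


def needs_confirmation(target: str, workspace: str, operation: str) -> bool:
    """Stop and confirm for destructive operations or paths outside the workspace."""
    return operation in DESTRUCTIVE or not is_inside_workspace(target, workspace)
```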

**Protected paths (NEVER modify without explicit approval):**

```
System directories:
- Windows: C:\Windows\, C:\Program Files\, C:\Program Files (x86)\
- Linux: /etc/, /usr/, /var/, /root/, /home/ (other users)
- macOS: /System/, /Library/, /Applications/

User data outside workspace:
- Desktop, Documents, Downloads (outside project)
- Any path containing "backup", "archive", "important"
- Database files not in project directory
- Configuration files: .bashrc, .zshrc, .gitconfig (global)
```

**Safe operation protocol:**

```
IF operation touches files outside workspace:
  1. STOP execution immediately
  2. Display warning to user:
     "⚠️ SAFETY ALERT: This operation affects files outside the workspace"
     - Target path: [full path]
     - Operation type: [read/write/delete]
     - Potential impact: [description]
  3. Ask for explicit confirmation:
     "Do you want to proceed? This action cannot be undone."
  4. If user declines -> Abort and suggest alternatives
  5. If user approves -> Log the approval and proceed cautiously

IF operation could cause data loss:
  1. Create backup before proceeding
  2. Log the operation to .builder/safety-log.json
  3. Provide rollback instructions
```

**Data safety principles:**

1. **Preserve user data** - Never delete/overwrite without explicit consent
2. **Backup before destructive ops** - Create .backup/ if needed
3. **Workspace isolation** - All operations confined to project directory
4. **Fail-safe defaults** - When uncertain, choose the safer option
5. **Audit trail** - Log all potentially dangerous operations

## MCP Integration

### Puppeteer (Web Testing)

```markdown
## E2E Test Pattern
1. Launch browser: mcp__puppeteer_navigate
2. Interact: mcp__puppeteer_click, mcp__puppeteer_type
3. Verify: mcp__puppeteer_evaluate, mcp__puppeteer_screenshot
4. Cleanup: mcp__puppeteer_close
```

### IDE Tools (Code Execution)

```markdown
## Code Execution Pattern
1. Write code to file
2. Execute: mcp__ide__executeCode
3. Check diagnostics: mcp__ide__getDiagnostics
4. Fix errors and retry
```

## Workflow Reporting

### Overview

Autonomous-builder now generates comprehensive workflow reports that document the entire development process, including user prompts, decisions, errors, and solutions.

**Features**:
- Automatic workflow logging during feature implementation
- Unified report template compatible with commit-with-reflection
- Detailed recording of user prompts and AI decisions
- Integration with knowledge-steward for experience extraction
- Chinese-only reports (`zh-CN`) for readability

### Configuration

**Project-level configuration** (`.claude-workflows.yaml`):
```yaml
version: "1.0"
enabled: true

reporting:
  language: "zh-CN"
  detail_level: "detailed"
  output_dir: "docs/workflows"

skills:
  autonomous-builder:
    workflow_reporting: true
```

**Builder-level configuration** (`.builder/config.yaml`):
```yaml
workflow_reporting:
  enabled: true
  use_unified_template: true
  language: "zh-CN"
  detail_level: "detailed"
  record_all_tools: true
  record_decisions: true
```

### Workflow Log Structure

During feature implementation, autonomous-builder maintains a detailed log in `.builder/workflow-log.json`:

```json
{
  "session_id": "session-2026-02-15-001",
  "feature_id": "feat-003",
  "start_time": "2026-02-15T14:00:00Z",
  "end_time": "2026-02-15T14:45:00Z",
  "user_prompts": [
    {
      "timestamp": "2026-02-15T14:00:00Z",
      "prompt": "Implement user authentication",
      "context": "The user wants to add JWT token validation"
    }
  ],
  "workflow_steps": [
    {
      "step": 1,
      "action": "Analyze requirements",
      "tool": "Read",
      "files": ["server/auth.ts"],
      "duration_seconds": 120
    }
  ],
  "decisions": [
    {
      "point": "Choose an authentication scheme",
      "options": ["JWT", "Session", "OAuth"],
      "chosen": "JWT",
      "reason": "Stateless, well suited to APIs"
    }
  ],
  "errors": [
    {
      "type": "TypeError",
      "message": "Cannot read property 'userId'",
      "solution": "Update the User interface definition",
      "attempts": 2
    }
      "attempts": 2
    }
  ]
}
```
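
Summary counts of steps, decisions, and errors can be derived directly from a log with this structure. The sketch below shows one way to do it; `summarize_log` is a hypothetical helper, and counting debugging iterations as the sum of each error's `attempts` is an assumption rather than documented behavior.

```python
def summarize_log(log: dict) -> dict:
    """Derive summary counts from a workflow log dictionary."""
    errors = log.get("errors", [])
    return {
        "workflow_steps": len(log.get("workflow_steps", [])),
        "decision_points": len(log.get("decisions", [])),
        "errors_encountered": len(errors),
        # Assumption: one debugging iteration per recorded attempt.
        "debug_iterations": sum(error.get("attempts", 0) for error in errors),
    }
```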

### Report Generation (Step 8)

After completing feature implementation and testing, autonomous-builder generates a workflow report:

1. **Read workflow log**: Load `.builder/workflow-log.json`
2. **Load template**: Use unified template from `docs/workflows/templates/unified-template.md`
3. **Fill template**: Populate all 12 sections with session data
4. **Save report**: Write to `docs/workflows/YYYY-MM/DD_workflow_[category]_[desc].md`
5. **Update index**: Regenerate `docs/workflows/INDEX.md`
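
The file name pattern in step 4 can be built from a date, a category, and a short description. The following is a sketch only: `report_path` is a hypothetical function, and the slug rule (lowercase, with runs of non-alphanumeric characters replaced by hyphens) is an assumption that the pattern itself does not specify.

```python
import re
from datetime import date


def report_path(day: date, category: str, description: str,
                output_dir: str = "docs/workflows") -> str:
    """Build a path of the form output_dir/YYYY-MM/DD_workflow_[category]_[desc].md."""
    # Assumed slug rule: lowercase ASCII letters and digits joined by hyphens.
    slug = re.sub(r"[^a-z0-9]+", "-", description.lower()).strip("-")
    return f"{output_dir}/{day:%Y-%m}/{day:%d}_workflow_{category}_{slug}.md"


print(report_path(date(2026, 2, 15), "feature", "User Auth"))
# docs/workflows/2026-02/15_workflow_feature_user-auth.md
```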

### Report Structure

The generated report includes 12 sections:

1. **概述** - Summary of the work
2. **用户需求与提示词** - User requirements and key prompts
3. **工作流记录** - Detailed workflow steps, decisions, and tools used
4. **修改内容** - Files modified and main changes
5. **遇到的错误** - Errors encountered with details
6. **根本原因分析** - Root cause analysis
7. **调试过程** - Debugging steps and iterations
8. **经验总结** - Key insights and prevention strategies
9. **知识提炼** - Reusable patterns and anti-patterns
10. **测试与验证** - Test cases and verification steps
11. **参考资料** - Related documentation and resources
12. **指标** - Metrics (errors, iterations, success rate, etc.)

### Updated Commit Message Format

Commits now reference the workflow report:

```
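feat: implement user authentication

Added JWT token validation and a user login API endpoint.

Workflow steps: 8
Decision points: 3
Errors encountered: 2
Debugging iterations: 4

See the workflow report: docs/workflows/2026-02/15_workflow_feature_user-auth.md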

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
```

### Integration with knowledge-steward

Workflow reports can be analyzed by knowledge-steward to:
- Extract effective prompts and interaction patterns
- Identify reusable architectural patterns
- Build a knowledge base of common errors and solutions
- Generate experience summaries and best practices

See `references/workflow-recording.md` for detailed implementation guide.

## GitHub Integration

### Overview

Autonomous-builder integrates with GitHub for remote repository management, issue tracking, and release automation.

**Features**:
- Automatic push after each feature completion
- GitHub Issues tracking for features
- Release tags at milestones (25%, 50%, 75%, 100%)
- Version rollback support via GitHub history

### Prerequisites

**GitHub CLI (gh)**:
```bash
# Windows
winget install GitHub.cli

# macOS
brew install gh

# Linux (Debian/Ubuntu; other distributions use their own package manager)
sudo apt install gh
```

**Authentication**:
```bash
gh auth login
gh auth status  # Verify
```

### Workflow Integration

**Initializer Agent (Session 1)**:
1. Prompt for GitHub repository URL (optional)
2. Verify `gh auth status`
3. Set up remote: `git remote add origin <url>`
4. Create README.md and PLANNING.md
5. Initial commit and push to GitHub
6. Create GitHub issues for all features

**Builder Agent (Sessions 2+)**:
1. Implement feature
2. Commit with issue reference: `Closes #N`
3. Push to GitHub: `git push origin main`
4. Update GitHub issue (auto-closed via commit)
5. Check milestone and create release tag if needed

### Commit Message Format

```
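feat: implement user authentication

Added JWT token validation and a user login API endpoint.

Workflow steps: 8
Decisions: 3
Errors encountered: 2
Debugging iterations: 4

See workflow report: docs/workflows/2026-02/15_workflow_feature_user-auth.md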

Closes #123

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
```

### Release Tags

Automatic tags created at milestones:
- **25% completion**: v0.1.0 (Foundation)
- **50% completion**: v0.2.0 (Core Features)
- **75% completion**: v0.3.0 (Advanced Features)
- **100% completion**: v1.0.0 (Release)
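
The milestone thresholds above can be expressed as a simple lookup. The sketch below is one possible reading, assuming completion is measured as completed features over total features; `milestone_tag` is a hypothetical helper, and checking whether the tag already exists is left out.

```python
from typing import Optional

# Thresholds and tags taken from the list above, highest first.
MILESTONES = [(100, "v1.0.0"), (75, "v0.3.0"), (50, "v0.2.0"), (25, "v0.1.0")]


def milestone_tag(completed: int, total: int) -> Optional[str]:
    """Return the tag of the highest milestone reached, or None below 25%."""
    if total <= 0:
        return None
    for threshold, tag in MILESTONES:
        # Integer comparison avoids floating-point rounding at the boundaries.
        if completed * 100 >= threshold * total:
            return tag
    return None


print(milestone_tag(3, 5))  # 60% complete, so the 50% milestone tag applies
```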

### Error Handling

- **Network failures**: 3 retries with 5s delay, then queue for next session
- **Auth failures**: Disable GitHub integration, continue with local commits
- **Push conflicts**: Auto-pull with rebase and retry
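
The network-failure policy could look roughly like the following. This is a sketch under stated assumptions: "3 retries" is read as one initial attempt plus up to three retries, a network failure surfaces as `ConnectionError`, and the caller queues the push for the next session when `False` is returned. `push_with_retry` is a hypothetical name.

```python
import time
from typing import Callable


def push_with_retry(
    push: Callable[[], None],
    retries: int = 3,
    delay_seconds: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Try `push`, retrying on ConnectionError; return False if every attempt fails."""
    for attempt in range(retries + 1):  # one initial attempt plus `retries` retries
        try:
            push()
            return True
        except ConnectionError:
            if attempt < retries:
                sleep(delay_seconds)  # wait before the next retry
    return False  # caller is expected to queue the push for the next session


# Demonstration with a push that fails twice and then succeeds.
calls = {"count": 0}


def flaky_push() -> None:
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("simulated network failure")


delays = []
ok = push_with_retry(flaky_push, sleep=delays.append)  # record delays instead of sleeping
print(ok, delays)
```

Passing `sleep` as a parameter keeps the sketch testable without real waiting.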

### Disabling GitHub

Leave the repository URL empty during initialization, or set `github.enabled` to `false` in `.builder/state.json`.

### Rollback

```bash
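# Roll back to the previous feature.
# Warning: this rewrites history on main; coordinate with collaborators first.
git log --oneline
git reset --hard <commit_hash>
git push --force-with-lease origin main  # refuses to overwrite remote commits you have not fetched
gh issue reopen <issue_number>

# Roll back to a release tag on a new branch
git checkout -b rollback-to-v0.1.0 v0.1.0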
```

**See**: `references/github-integration.md` for comprehensive documentation.

## Examples

### Example 1: New Project Creation

**Input**: "Build a REST API for task management with Python FastAPI"

**Steps**:
1. Initialize `.builder/` with state.json
2. Analyze requirements -> Generate features.json:
   ```json
   {
     "features": [
       {"id": "feat-001", "name": "Project Setup", "status": "pending"},
       {"id": "feat-002", "name": "Database Models", "status": "pending"},
       {"id": "feat-003", "name": "CRUD Endpoints", "status": "pending"},
       {"id": "feat-004", "name": "Authentication", "status": "pending"},
       {"id": "feat-005", "name": "API Tests", "status": "pending"}
     ]
   }
   ```
3. Create architecture.md with FastAPI patterns
4. Implement feature by feature
5. Test each feature before moving to next
6. Generate final documentation
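
Steps 4 and 5 amount to repeatedly picking the next pending feature from features.json. A minimal sketch, assuming features are processed in file order and that a finished feature is marked `"complete"`; `next_pending` is a hypothetical helper.

```python
from typing import Optional

# Shortened copy of the features.json structure shown above.
features = {
    "features": [
        {"id": "feat-001", "name": "Project Setup", "status": "pending"},
        {"id": "feat-002", "name": "Database Models", "status": "pending"},
        {"id": "feat-003", "name": "CRUD Endpoints", "status": "pending"},
    ]
}


def next_pending(data: dict) -> Optional[dict]:
    """Return the first feature whose status is "pending", or None if none remain."""
    for feature in data["features"]:
        if feature["status"] == "pending":
            return feature
    return None


current = next_pending(features)
print(current["id"])              # feat-001
current["status"] = "complete"    # mark done once its tests pass
print(next_pending(features)["id"])  # feat-002
```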

### Example 2: Resume Interrupted Project

**Input**: User starts new session, `.builder/state.json` exists

**Steps**:
1. Read state.json -> Get current phase and feature
2. Read features.json -> Get feature status
3. Resume from last checkpoint
4. Continue implementation

### Example 3: Bug Fix Request

**Input**: "Fix the authentication bug in my FastAPI app"

**Steps**:
1. Detect existing project structure
2. Read relevant code files
3. Identify bug using error-resolver patterns
4. Apply fix with 3-strike protocol
5. Run tests to verify fix
6. Update state and commit

## References

### Official Architecture Patterns (Anthropic claude-quickstarts)

- `references/two-agent-architecture.md`: **CRITICAL** - Two-Agent pattern for long-running tasks, fresh context per session
- `references/think-tool.md`: **CRITICAL** - Think Tool for complex reasoning before action
- `references/multi-layer-security.md`: **CRITICAL** - Defense in depth security architecture

### Core Capabilities

- `references/safety-protocols.md`: **CRITICAL** - System protection and safe operation protocols
- `references/loop-prevention.md`: **CRITICAL** - Anti-infinite-loop detection and token management
- `references/session-continuity.md`: **CRITICAL** - Auto-resume and continuous operation across sessions
- `references/skill-scheduling.md`: **CRITICAL** - Automatic skill discovery, planning, and dispatch
- `references/mcp-auto-integration.md`: **CRITICAL** - MCP auto-discovery, installation, and human-like computer control
- `references/github-integration.md`: **NEW** - GitHub integration for remote push, issue tracking, and release automation

### Implementation Guides

- `references/index.md`: Navigation for all reference docs
- `references/architecture-patterns.md`: Clean Architecture, Hexagonal, DDD
- `references/multi-language.md`: Language-specific patterns (Python, Node.js, Go, Rust)
- `references/error-recovery.md`: Detailed error handling strategies
- `references/mcp-integration.md`: MCP tool usage guide
- `references/testing-patterns.md`: Unit, integration, E2E testing

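## Intelligent Plugin Discovery and Automatic Use (ToolSearch Auto-Discovery)

### Core Principle

When executing tasks, autonomous-builder **must proactively use ToolSearch** to dynamically discover and invoke available MCP plugin tools. This upgrades the existing MCP Auto-Integration from static configuration to dynamic discovery at runtime.

### Automatic Discovery at Session Start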

```
ON SESSION START (Step 0 - runs before Step 1):

1. Use ToolSearch to probe for all available plugins:
   - ToolSearch("+playwright") → browser automation tools
   - ToolSearch("+github") → GitHub operation tools
   - ToolSearch("+serena") → semantic code analysis tools
   - ToolSearch("context7") → documentation lookup tools
   - ToolSearch("getDiagnostics") → IDE diagnostics tools
   - ToolSearch("executeCode") → code execution tools

2. Build a capability matrix and store it in .builder/state.json:
   {
     "discovered_plugins": {
       "playwright": true/false,
       "github_mcp": true/false,
       "serena": true/false,
       "context7": true/false,
       "ide_diagnostics": true/false,
       "ide_execute": true/false
     },
     "last_discovery": "ISO-8601-timestamp"
   }

3. Adjust the workflow strategy based on the discovered plugins
```
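
The capability matrix from Step 0 could be assembled along these lines. ToolSearch is an agent tool rather than a Python function, so this sketch takes a stand-in `search` callable that maps a query to a list of tool names; that callable, `build_capability_matrix`, and the fake tool names are all assumptions made for illustration.

```python
from datetime import datetime, timezone
from typing import Callable, Dict, List

# state.json keys and probe queries copied from the Step 0 listing above.
PROBES: Dict[str, str] = {
    "playwright": "+playwright",
    "github_mcp": "+github",
    "serena": "+serena",
    "context7": "context7",
    "ide_diagnostics": "getDiagnostics",
    "ide_execute": "executeCode",
}


def build_capability_matrix(search: Callable[[str], List[str]]) -> dict:
    """Run every probe and record whether it returned at least one tool."""
    return {
        "discovered_plugins": {key: bool(search(query)) for key, query in PROBES.items()},
        "last_discovery": datetime.now(timezone.utc).isoformat(),
    }


def fake_search(query: str) -> List[str]:
    """Stand-in for ToolSearch: pretends only two plugins are installed."""
    installed = {"+playwright": ["fake_navigate_tool"], "context7": ["fake_docs_tool"]}
    return installed.get(query, [])


matrix = build_capability_matrix(fake_search)
print(matrix["discovered_plugins"])
```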

### Plugin Invocation by Step

| Builder Step | ToolSearch Query | Purpose |
|-------------|----------------|------|
| Step 1: Get Context | `ToolSearch("+serena get_symbols_overview")` | Semantic analysis of code structure; more precise than ls/grep |
| Step 2: Start Server | `ToolSearch("+playwright navigate")` | Verify the server with Playwright instead of Puppeteer |
| Step 3: Regression Check | `ToolSearch("getDiagnostics")` | IDE diagnostics for type errors and lint issues |
| Step 4: Select Feature | `ToolSearch("context7")` | Look up library documentation to inform implementation decisions |
| Step 5: Implement | `ToolSearch("+serena find_symbol")` | Pinpoint the code symbols that need to change |
| Step 5: Implement | `ToolSearch("+serena replace_symbol_body")` | Symbol-level code editing |
| Step 6: Browser Test | `ToolSearch("+playwright snapshot")` | Capture a page snapshot for UI verification |
| Step 6: Browser Test | `ToolSearch("+playwright click")` | Simulate user interaction |
| Step 7: Update Status | `ToolSearch("+github update_issue")` | Update GitHub issue status |
| Step 8: Report | `ToolSearch("+github create_or_update_file")` | Push the report directly to GitHub |
| Step 9: Git Push | `ToolSearch("+github push_files")` | Push code through MCP |

### Plugin Selection During Implementation

```
DURING FEATURE IMPLEMENTATION:

1. Code analysis:
   IF serena is available:
     → ToolSearch("+serena find_symbol") to locate the target symbol
     → ToolSearch("+serena find_referencing_symbols") to analyze the impact
     → ToolSearch("+serena get_symbols_overview") to understand the file structure
   ELSE:
     → Fall back to Grep + Read

2. Code editing:
   IF serena is available:
     → ToolSearch("+serena replace_symbol_body") to replace a symbol precisely
     → ToolSearch("+serena insert_after_symbol") to insert new code
   ELSE:
     → Fall back to the Edit tool

3. Testing:
   IF playwright is available:
     → ToolSearch("+playwright navigate") to open the application
     → ToolSearch("+playwright snapshot") to capture the page state
     → ToolSearch("+playwright click") to simulate interaction
     → ToolSearch("+playwright browser_evaluate") to run JS checks
   ELSE IF puppeteer is available:
     → Use the puppeteer MCP tools
   ELSE:
     → Fall back to running test commands via Bash

4. Documentation lookup:
   IF context7 is available:
     → ToolSearch("context7") to query library documentation
     → Retrieve current API usage and best practices
   ELSE:
     → Use WebSearch/WebFetch

5. Code quality checks:
   IF ide_diagnostics is available:
     → ToolSearch("getDiagnostics") to fetch diagnostics
     → Fix all errors and warnings before committing
   ELSE:
     → Run the linter/type-checker via Bash
```
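
The fallback rules above follow one pattern: try the preferred plugin, then the next, and end on a built-in tool. A minimal sketch of that pattern follows, assuming the `discovered_plugins` flags from state.json as input. The phase names, the `BUILT_IN` labels, and `choose_tool` are hypothetical, and `puppeteer` is treated as an optional flag even though it is not listed in the capability matrix.

```python
from typing import Dict, List

# Preference order per phase, mirroring the pseudocode above.
FALLBACK_CHAINS: Dict[str, List[str]] = {
    "code_analysis": ["serena", "grep_read"],
    "code_editing": ["serena", "edit"],
    "testing": ["playwright", "puppeteer", "bash"],
    "docs_lookup": ["context7", "web_search"],
    "quality_check": ["ide_diagnostics", "bash_linter"],
}

# Built-in tools that need no plugin; every chain ends with one of them.
BUILT_IN = {"grep_read", "edit", "bash", "web_search", "bash_linter"}


def choose_tool(phase: str, plugins: Dict[str, bool]) -> str:
    """Return the first usable option in the phase's fallback chain."""
    for candidate in FALLBACK_CHAINS[phase]:
        if candidate in BUILT_IN or plugins.get(candidate, False):
            return candidate
    raise ValueError(f"no usable tool for phase {phase!r}")


plugins = {"playwright": False, "puppeteer": True, "serena": True}
print(choose_tool("testing", plugins))      # puppeteer
print(choose_tool("docs_lookup", plugins))  # web_search
```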

### Relationship to the Existing MCP Auto-Integration

```
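Old approach (static):
  ON SESSION START → run /mcp → parse the tool list → hard-code tool names

New approach (dynamic ToolSearch):
  ON NEED → ToolSearch(keyword) → discover tools → use immediately

Advantages:
  - No need to know tool names in advance
  - Adapts automatically to each environment's plugin configuration
  - Loads on demand, reducing context usage
  - Keyword search is more flexible than exact names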
```

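### Notes

- Tools returned by ToolSearch are **immediately available**; no additional select is needed
- Once a keyword search has loaded a tool, do **not** load it again with `select:`
- Prefer MCP tools over Bash commands
- If ToolSearch finds no relevant tool, fall back to the original approach
- Cache plugin discovery results in state.json to avoid repeated probing
- Re-probe once at the start of each new session (plugin configuration may have changed)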

## Maintenance

- Sources: Anthropic agent patterns, claude-skills best practices
- Last updated: 2026-02-16
- Version: 2.0 (added ToolSearch intelligent plugin discovery)
- Known limits: Cannot handle hardware-dependent code, GPU computing without setup

## Quality Gate

Before marking project complete:

1. [ ] All features in features.json have status "complete"
2. [ ] All tests pass (check features.json test counts)
3. [ ] No uncommitted changes
4. [ ] Documentation generated
5. [ ] State archived to `.builder/archive/`
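
Checks 1 and 3 are easy to automate. The sketch below is illustrative only: it assumes features.json has the structure shown in Example 1 and that the caller passes in the output of `git status --porcelain`, which is empty when the working tree is clean. `quality_gate` is a hypothetical helper name.

```python
from typing import List


def quality_gate(features: dict, git_status_porcelain: str) -> List[str]:
    """Return the list of failed checks; an empty list means both checks pass."""
    failures = []
    # Check 1: every feature must have status "complete".
    incomplete = [f["id"] for f in features["features"] if f["status"] != "complete"]
    if incomplete:
        failures.append("incomplete features: " + ", ".join(incomplete))
    # Check 3: any porcelain output means there are uncommitted changes.
    if git_status_porcelain.strip():
        failures.append("uncommitted changes")
    return failures


example = {
    "features": [
        {"id": "feat-001", "status": "complete"},
        {"id": "feat-002", "status": "pending"},
    ]
}
print(quality_gate(example, " M server/auth.ts\n"))
```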