issue-garbage-collector

Two-phase cleanup of duplicate and outdated issue files in docs/issues/. Phase 1 uses a Python script for fast pattern matching. Phase 2 uses claude -p for semantic analysis of the suspects only.

465 stars

by phodal

Best use case

issue-garbage-collector is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Teams using issue-garbage-collector can expect more consistent output, faster repeated execution, and less prompt rewriting.

When to use this skill

You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

You only need a quick one-off answer and do not need a reusable workflow.
You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$ curl -o ~/.claude/skills/issue-garbage-collector/SKILL.md --create-dirs "https://raw.githubusercontent.com/phodal/routa/main/.claude/skills/issue-garbage-collector/SKILL.md"

Manual Installation

Download SKILL.md from GitHub
Place it in .claude/skills/issue-garbage-collector/SKILL.md inside your project
Restart your AI agent — it will auto-discover the skill

How issue-garbage-collector Compares

| Feature / Agent | issue-garbage-collector | Standard Approach |
|-----------------|-------------------------|-------------------|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |

Frequently Asked Questions

What does this skill do?

It performs a two-phase cleanup of duplicate and outdated issue files in docs/issues/. Phase 1 uses a Python script for fast pattern matching. Phase 2 uses claude -p for semantic analysis of the suspects only.

Where can I find the source code?

The source is in the phodal/routa repository on GitHub, at .claude/skills/issue-garbage-collector/SKILL.md.

SKILL.md Source

## Quick Start

```bash
# Phase 1: Run Python scanner (fast, free)
python3 .github/scripts/issue-scanner.py

# Phase 1: Get suspects only (for Phase 2 input)
python3 .github/scripts/issue-scanner.py --suspects-only

# Phase 1: JSON output (for automation)
python3 .github/scripts/issue-scanner.py --json

# Phase 1: Validation check (CI integration, exit 1 if errors)
python3 .github/scripts/issue-scanner.py --check
```

---

## Harness Integration

- Repo-defined entry: `docs/harness/automations.yml` contains `issue-gc-review`
- Harness surface: `settings/harness` → `Cleanup & Correction`
- Data source: the Harness automation view reads suspect data from `python3 .github/scripts/issue-scanner.py --suspects-only`
- Intended usage: review pending duplicate / stale / open-check suspects in Harness first, then decide whether to run the cleanup workflow below

---

## Two-Phase Strategy (Cost Optimization)

**Problem**: Running deep AI analysis on every issue is expensive.

**Solution**: Two-phase approach:
1. **Phase 1 (Fast/Free)** — Python script for pattern matching
2. **Phase 2 (Deep/Expensive)** — `claude -p` only on suspects

```
┌─────────────────────────────────────────────────────────┐
│  All Issues (N files)                                   │
│  ┌───────────────────────────────────────────────────┐  │
│  │ Phase 1: Python Scanner                           │  │
│  │ (.github/scripts/issue-scanner.py)                │  │
│  │ - Filename keyword extraction                     │  │
│  │ - YAML front-matter validation                    │  │
│  │ - Same area + keyword overlap detection           │  │
│  │ - Age-based staleness check                       │  │
│  │ → Output: Suspect list (M files, M << N)          │  │
│  └───────────────────────────────────────────────────┘  │
│                         ↓                               │
│  ┌───────────────────────────────────────────────────┐  │
│  │ Phase 2: Deep Analysis (claude -p, only M files)  │  │
│  │ - Content similarity                              │  │
│  │ - Semantic duplicate detection                    │  │
│  │ - Merge recommendations                           │  │
│  └───────────────────────────────────────────────────┘  │
└─────────────────────────────────────────────────────────┘
```
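
The flow in the diagram can be sketched as a small driver. This is a minimal sketch, not the repository's implementation: it assumes `--suspects-only` prints a JSON array of duplicate records in the shape shown under "JSON Output for Automation", and the prompt wording in `build_prompt` is hypothetical. The command runner is injectable so the flow can be exercised without invoking the scanner or `claude`.

```python
import json
import subprocess
from typing import Callable, List, Sequence

SCANNER_CMD = ["python3", ".github/scripts/issue-scanner.py", "--suspects-only"]


def run_command(cmd: Sequence[str]) -> str:
    """Run a command and return its stdout (raises if it exits non-zero)."""
    result = subprocess.run(list(cmd), capture_output=True, text=True, check=True)
    return result.stdout


def build_prompt(suspect: dict) -> str:
    """Hypothetical prompt wording for one duplicate pair."""
    return (
        f"Compare docs/issues/{suspect['file_a']} and docs/issues/{suspect['file_b']}. "
        f"The scanner flagged them because: {suspect['reason']}. "
        "Say whether they share a root cause, are related, or are distinct."
    )


def two_phase(run: Callable[[Sequence[str]], str] = run_command) -> List[str]:
    """Phase 1 filters cheaply; Phase 2 calls claude -p once per duplicate suspect."""
    suspects = json.loads(run(SCANNER_CMD))
    answers = []
    for suspect in suspects:
        if suspect.get("type") != "duplicate":
            continue  # assumption: other suspect types use a different record shape
        answers.append(run(["claude", "-p", build_prompt(suspect)]))
    return answers
```

The number of `claude -p` calls equals the number of duplicate suspects, not the number of issue files, which is the point of the split.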

---

## Phase 1: Python Scanner

Run `python3 .github/scripts/issue-scanner.py` to get:

### 1.1 Formatted Table View

```
====================================================================================================
📋 ISSUE SCANNER REPORT
====================================================================================================

📊 ISSUE TABLE:
----------------------------------------------------------------------------------------------------
Status       Sev  Date         Area               Title
----------------------------------------------------------------------------------------------------
✅ resolv     🟠    2026-03-02   background-worker  HMR causes the in-memory sessionToTask Map to be lost
🔴 open       🟡    2026-03-04   ui                 Task Execute button disabled
...
----------------------------------------------------------------------------------------------------
Total: 12 issues

📈 SUMMARY BY STATUS:
  🔴 open: 5
  ✅ resolved: 7
```

### 1.2 Validation Errors

If any issue has malformed front-matter, the scanner reports:

```
❌ VALIDATION ERRORS (need AI fix):
------------------------------------------------------------
  2026-03-08-broken-issue.md:
    - Missing required field: area
    - Invalid status: pending (valid: ['open', 'investigating', 'resolved', 'wontfix', 'duplicate'])
```

**Action**: Ask AI to fix the file:
```bash
claude -p "Fix the front-matter in docs/issues/2026-03-08-broken-issue.md. Add missing 'area' field and change status to a valid value."
```
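
For orientation, the kind of check that produces these errors might look like the sketch below. It is an illustration under assumptions, not the scanner's code: only `area` and `status` are confirmed as validated fields by the sample report, the real required-field list is probably longer, and the front-matter parser here handles only flat `key: value` pairs rather than full YAML.

```python
from typing import Dict, List

# Status values copied from the sample error message above.
VALID_STATUSES = ["open", "investigating", "resolved", "wontfix", "duplicate"]
# Assumption: the real scanner likely requires more fields than these two.
REQUIRED_FIELDS = ("status", "area")


def parse_front_matter(text: str) -> Dict[str, str]:
    """Parse flat `key: value` pairs between the first two `---` lines."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    fields: Dict[str, str] = {}
    for line in lines[1:]:
        if line.strip() == "---":
            break
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip().strip("'\"")
    return fields


def validate(text: str) -> List[str]:
    """Return human-readable errors; an empty list means the file passed."""
    fields = parse_front_matter(text)
    errors = [
        f"Missing required field: {name}"
        for name in REQUIRED_FIELDS
        if not fields.get(name)
    ]
    status = fields.get("status")
    if status and status not in VALID_STATUSES:
        errors.append(f"Invalid status: {status} (valid: {VALID_STATUSES})")
    return errors
```

A check like this is cheap enough to run on every file, which is why malformed front-matter is caught in Phase 1 and only the fix is delegated to the AI.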

### 1.3 Suspect Detection

The scanner automatically detects:

| Type | Detection Rule | Example |
|------|----------------|---------|
| **Duplicate** | Same area + ≥2 common keywords | `hmr-task` vs `task-hmr-recovery` |
| **Stale** | `open` > 30 days | Issue from 2026-01-15 still open |
| **Stale** | `investigating` > 14 days | Stuck investigation |
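
The rules in the table can be sketched in a few lines. This is a minimal sketch under assumptions: the real scanner's keyword extraction is not shown in this document, so the stop-word list and the date-prefix handling below are hypothetical. The thresholds (≥2 shared keywords, 30 and 14 days) come from the table.

```python
from datetime import date
from typing import Optional, Set

# Hypothetical stop words; the real scanner's list is not shown.
STOP_WORDS = {"the", "a", "to", "and", "of", "in"}
# Age limits in days, taken from the table above.
STALE_LIMITS = {"open": 30, "investigating": 14}


def filename_keywords(filename: str) -> Set[str]:
    """Drop the YYYY-MM-DD prefix and the extension, then split the slug on hyphens."""
    stem = filename.rsplit(".", 1)[0]
    parts = stem.split("-")
    if len(parts) > 3 and all(p.isdigit() for p in parts[:3]):
        parts = parts[3:]
    return {p for p in parts if p and p not in STOP_WORDS}


def is_duplicate_suspect(file_a: str, area_a: str, file_b: str, area_b: str) -> bool:
    """Same area and at least two shared filename keywords."""
    common = filename_keywords(file_a) & filename_keywords(file_b)
    return area_a == area_b and len(common) >= 2


def stale_reason(status: str, created: date, today: date) -> Optional[str]:
    """Return a reason string if the issue exceeded its status's age limit."""
    limit = STALE_LIMITS.get(status)
    age = (today - created).days
    if limit is not None and age > limit:
        return f"{status.capitalize()} for {age} days (>{limit})"
    return None
```

Because the rule only looks at filenames, areas, and dates, it produces false positives by design; Phase 2 exists to sort those out.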

Output:
```
⚠️  SUSPECTS (need Phase 2 deep analysis):
------------------------------------------------------------

  🔗 Potential Duplicates:
    - 2026-03-02-hmr-resets-session-to-task-map.md
      ↔ 2026-03-08-background-task-hmr-recovery.md
      Reason: Same area 'background-worker', keywords: {'task', 'hmr'}

  ⏰ Stale Issues:
    - 2026-02-01-old-bug.md: Open for 35 days (>30)
```

### 1.4 JSON Output for Automation

```bash
# Get suspects as JSON for scripting
python3 .github/scripts/issue-scanner.py --suspects-only
```

Output:
```json
[
  {
    "file_a": "2026-03-02-hmr-resets-session-to-task-map.md",
    "file_b": "2026-03-08-background-task-hmr-recovery.md",
    "reason": "Same area 'background-worker', keywords: {'task', 'hmr'}",
    "type": "duplicate"
  }
]
```
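
A script consuming this output might group records by `type` before deciding what to review. The sketch below assumes every record carries a `type` field and that duplicate records have `file_a` and `file_b`, as in the sample above; the shape of other record types is not shown in this document, so they are only grouped, not interpreted.

```python
import json
from collections import defaultdict
from typing import Dict, List, Tuple

# Sample record copied from the output above.
SAMPLE = """[
  {
    "file_a": "2026-03-02-hmr-resets-session-to-task-map.md",
    "file_b": "2026-03-08-background-task-hmr-recovery.md",
    "reason": "Same area 'background-worker', keywords: {'task', 'hmr'}",
    "type": "duplicate"
  }
]"""


def group_by_type(raw_json: str) -> Dict[str, List[dict]]:
    """Group scanner records by their "type" field."""
    grouped: Dict[str, List[dict]] = defaultdict(list)
    for record in json.loads(raw_json):
        grouped[record.get("type", "unknown")].append(record)
    return dict(grouped)


def duplicate_pairs(raw_json: str) -> List[Tuple[str, str]]:
    """Return (file_a, file_b) for every duplicate suspect."""
    duplicates = group_by_type(raw_json).get("duplicate", [])
    return [(r["file_a"], r["file_b"]) for r in duplicates]


if __name__ == "__main__":
    for file_a, file_b in duplicate_pairs(SAMPLE):
        print(f"{file_a} <-> {file_b}")
```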

---

## Phase 2: Automatic Deep Analysis

**IMPORTANT**: After Phase 1, proceed automatically to Phase 2 without asking. Do NOT ask "Would you like me to proceed?" — just do it.

### Execution Flow

1. Run `python3 .github/scripts/issue-scanner.py`
2. For each suspect found, **automatically** perform deep analysis
3. For each non-destructive action needed, **execute immediately** (update status, add cross-references)
4. Only ask for confirmation on **destructive actions** (delete, merge)
5. Report final summary when done

### For Each Suspect Type

**Duplicates** — Read both files, compare content:
- If same root cause → Merge (keep newer, add context from older)
- If related but different → Add `related_issues` cross-reference
- If distinct → Skip (false positive)

**Open Issues** — Check if resolved:
- Read the issue, check `Relevant Files` in codebase
- If code shows fix → `python3 .github/scripts/issue-scanner.py --resolve <file>`
- If still broken → Leave as open
- If unclear → Leave as open, add comment in issue

**Stale Issues** (open > 30 days):
- Check if code still exists
- If fixed → Resolve
- If code removed → Close with `--close`
- If still relevant → Create GitHub issue for tracking

### Quick Update Commands

Use the scanner's update commands for fast changes:

```bash
# Resolve issues (status: open → resolved)
python3 .github/scripts/issue-scanner.py --resolve file1.md file2.md

# Close issues (status: open → wontfix)
python3 .github/scripts/issue-scanner.py --close file.md

# Generic field update
python3 .github/scripts/issue-scanner.py --set severity high --files file.md
```

### Safety Rules

1. **Never delete `_template.md`**
2. **Never delete issues with `status: investigating`** — active work
3. **Ask for confirmation** only for: delete, merge
4. **Auto-execute** for: status updates, adding cross-references
5. **Preserve knowledge** — resolved issues are valuable
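
If these rules were enforced in code rather than by prompt, a guard could look like the sketch below. The action names are hypothetical labels introduced for illustration, and treating any unlisted action as needing confirmation is a conservative assumption rather than something the rules state.

```python
from typing import Optional

# Hypothetical action names for the auto-executed operations in rule 4.
AUTO_ACTIONS = {"update_status", "add_cross_reference"}
PROTECTED_FILES = {"_template.md"}


def needs_confirmation(action: str) -> bool:
    """Delete and merge need confirmation; so does anything not explicitly auto-approved."""
    return action not in AUTO_ACTIONS


def delete_blocked_reason(filename: str, status: str) -> Optional[str]:
    """Return why a delete must be refused, or None if it may proceed to confirmation."""
    if filename in PROTECTED_FILES:
        return "template file must never be deleted"
    if status == "investigating":
        return "issue is under active investigation"
    return None
```

A delete that passes `delete_blocked_reason` still goes through `needs_confirmation`, so the two checks stack rather than replace each other.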

---

## Periodic Maintenance

| Frequency | Action |
|-----------|--------|
| After adding issues | Run `python3 .github/scripts/issue-scanner.py` |
| Weekly (active dev) | Full scan + Phase 2 on suspects |
| Monthly (stable) | Full scan + triage all open issues |

---

## Cost Optimization

| Approach | Deep Analysis | Cost |
|----------|---------------|------|
| Naive (all) | N files | 💰💰💰💰💰 |
| Two-phase | ~M suspects (M << N) | 💰 |

**Savings**: Phase 2 cost scales with M instead of N, so if Phase 1 flags about one issue in ten, deep-analysis cost drops by roughly 90%.
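
The savings figure depends on how many files Phase 1 flags. As a rough worked example, assuming every deep analysis costs about the same and Phase 1 itself is free, the fraction saved is 1 - M/N. The issue counts used below are illustrative, not measurements.

```python
def deep_analysis_savings(total_issues: int, suspects: int) -> float:
    """Fraction of deep-analysis calls avoided, assuming equal cost per file."""
    if total_issues <= 0:
        raise ValueError("total_issues must be positive")
    if not 0 <= suspects <= total_issues:
        raise ValueError("suspects must be between 0 and total_issues")
    return 1 - suspects / total_issues


if __name__ == "__main__":
    # Illustrative numbers: 120 issue files, 12 flagged by Phase 1.
    print(f"{deep_analysis_savings(120, 12):.0%} of deep-analysis calls avoided")
```

With few suspects the savings approach 100%; if Phase 1 flags most files, the two-phase split buys very little.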

Related Skills

All of the skills below are from phodal/routa.

issue-enricher

Transforms rough requirements into well-structured GitHub issues. Use when the user provides a vague idea, feature request, or problem description and wants to create a GitHub issue. Analyzes codebase, explores solution approaches, researches relevant libraries, and generates actionable issues using `gh` CLI.

spreadsheets

Use this skill for spreadsheet creation, editing, analysis, formatting, formula modeling, charting, or workbook review. Triggers include requests to create or modify an .xlsx file, build a model or tracker, format a workbook, add formulas or charts, or prepare a shareable spreadsheet deliverable.

slide

Use this skill as reference material when creating or editing presentation slide decks.

pdf

Use this skill for PDF generation, conversion, inspection, extraction, editing, form filling, OCR, redaction, or render comparison. Triggers include requests to create a PDF, convert Markdown or HTML or LaTeX or DOCX or PPTX to PDF, extract text or tables or images, fill or inspect forms, OCR scans, compare revisions, or redact content.

docx

Use this skill for creating, editing, and reviewing DOCX files, including generation, formatting, content controls, tracked changes, comments, accessibility checks, redaction, rendering, and diff-based QA workflows.

pr-verify

Comprehensive PR verification skill. Analyzes PR body requirements, reviews comments, checks CI status, and performs E2E testing. Use when a PR is ready for final verification before merge.

playwright-cli

Automates browser interactions for web testing, form filling, screenshots, and data extraction. Use when the user needs to navigate websites, interact with web pages, fill forms, take screenshots, test web applications, or extract information from web pages.

evolution-architecture-review

Multi-agent architecture evolvability review for this repository. Use when the user wants to analyze current architecture quality, evolvability, fitness functions, coupling, boundary clarity, delivery flow, or phased evolution strategy. Designed to be invoked from Claude Code with prompts like `/evolution-architecture-review analyze the current architecture evolvability`.

slack

Interact with Slack workspaces using browser automation. Use when the user needs to check unread channels, navigate Slack, send messages, extract data, find information, search conversations, or automate any Slack task. Triggers include "check my Slack", "what channels have unreads", "send a message to", "search Slack for", "extract from Slack", "find who said", or any task requiring programmatic Slack interaction.

electron

Automate Electron desktop apps (VS Code, Slack, Discord, Figma, Notion, Spotify, etc.) using agent-browser via Chrome DevTools Protocol. Use when the user needs to interact with an Electron app, automate a desktop app, connect to a running app, control a native app, or test an Electron application. Triggers include "automate Slack app", "control VS Code", "interact with Discord app", "test this Electron app", "connect to desktop app", or any task requiring automation of a native Electron application.

dogfood

Systematically explore and test a web application to find bugs, UX issues, and other problems. Use when asked to "dogfood", "QA", "exploratory test", "find issues", "bug hunt", "test this app/site/platform", or review the quality of a web application. Produces a structured report with full reproduction evidence (step-by-step screenshots, repro videos, and detailed repro steps for every issue) so findings can be handed directly to the responsible teams.

agent-browser

Browser automation CLI for AI agents. Use when the user needs to interact with websites, including navigating pages, filling forms, clicking buttons, taking screenshots, extracting data, testing web apps, or automating any browser task. Triggers include requests to "open a website", "fill out a form", "click a button", "take a screenshot", "scrape data from a page", "test this web app", "login to a site", "automate browser actions", or any task requiring programmatic web interaction.