About this skill
This skill lets an AI agent manage its own or a team's historical analytical pipeline executions. Each "run" is a self-contained directory, typically under `working/runs/`, that encapsulates its working files, outputs, and pipeline state. The agent can list all previous runs, view details for the latest run or a specific run by ID, compare two runs side by side, and clean up older runs to reclaim storage.

Users apply this skill to review what analyses have been performed, track the status of past computations (completed, failed, running), and understand how a project has evolved. For instance, a data analyst could revisit a previous quarterly revenue analysis or debug why an activation funnel deep-dive failed.

By providing structured access to historical execution data, the skill supports quick auditing of past work, debugging through inspection of specific run states, and resource management through automated cleanup. These capabilities keep the working environment clean and organized for complex, iterative analytical tasks.
Best use case
The primary use case is for AI agents and human users to effectively manage and audit the historical execution of analytical or computational pipelines. This benefits data scientists, AI developers, and researchers who need a structured way to track, inspect, compare, and clean up previous attempts or successful runs, ensuring accountability, reproducibility, and efficient resource utilization within their projects.
Purpose
A user can expect to receive structured lists or detailed reports about past pipeline runs, side-by-side comparisons of two runs, or confirmation of successful cleanup, all presented clearly by the AI agent.
Practical example
Example input
/runs list
Example output
Pipeline Runs (working/runs/)
| # | Date | Dataset | Title | Status | Agents |
|---|------------|-----------|--------------------------|-----------|--------|
| 1 | 2026-02-23 | acme-analytics | why-revenue-dropped-q3 | completed | 14/14 |
| 2 | 2026-02-21 | acme-analytics | activation-funnel-deep | failed | 8/14 |
| 3 | 2026-02-19 | hero | churn-by-segment | completed | 14/14 |
3 runs found. Use `/runs {#}` or `/runs {date_dataset_title}` for details.
When to use this skill
- When you need to see a list of all past analytical pipeline runs.
- To inspect the details, status, or outputs of a specific historical analysis.
- To compare the results or states of two different pipeline executions side-by-side.
- When you want to clean up old, irrelevant pipeline runs to free up disk space.
When not to use this skill
- When initiating or executing a *new* analytical pipeline.
- To directly manipulate or edit files *within* an active, ongoing pipeline run.
- For general file system operations unrelated to pipeline run management.
- When deep content analysis of a run's output is required beyond comparison or status checks.
Installation
Claude Code / Cursor / Codex
Manual Installation
- Download SKILL.md from GitHub
- Place it in `.claude/skills/runs/SKILL.md` inside your project
- Restart your AI agent; it will auto-discover the skill
How Skill: Runs Compares
| Feature / Agent | Skill: Runs | Standard Approach |
|---|---|---|
| Platform Support | Claude | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | easy | N/A |
Frequently Asked Questions
What does this skill do?
It lets an AI agent browse, inspect, compare, and clean up past pipeline runs, each stored as a self-contained directory under `working/runs/` with its own working files, outputs, and pipeline state.
Which AI agents support this skill?
This skill is designed for Claude.
How difficult is it to install?
The installation complexity is rated as easy. You can find the installation instructions above.
Where can I find the source code?
You can find the source code on GitHub using the link provided at the top of the page.
Related Guides
AI Agents for Coding
Browse AI agent skills for coding, debugging, testing, refactoring, code review, and developer workflows across Claude, Cursor, and Codex.
Best AI Skills for Claude
Explore the best AI skills for Claude and Claude Code across coding, research, workflow automation, documentation, and agent operations.
ChatGPT vs Claude for Agent Skills
Compare ChatGPT and Claude for AI agent skills across coding, writing, research, and reusable workflow execution.
SKILL.md Source
# Skill: Runs
## Purpose
Browse, inspect, compare, and clean up past pipeline runs. Each run is a
self-contained directory under `working/runs/` with its own working files,
outputs, and pipeline state.
## When to Use
- User says `/runs`, `/runs list`, `/runs latest`, `/runs clean`, or `/runs compare`
- When the user wants to see what analyses have been executed
## Invocation
- `/runs` or `/runs list` -- list all past runs
- `/runs latest` -- show details of the most recent run
- `/runs {id}` -- show details of a specific run (partial match supported)
- `/runs clean` -- remove runs older than 30 days (confirmation required)
- `/runs compare {id1} {id2}` -- compare two runs side by side
## Instructions
### Step 1: Scan Run Directory
Read `working/runs/` directory. Each subdirectory is a run, named:
```
{YYYY-MM-DD}_{DATASET}_{SHORT_TITLE}/
```
For each run directory, read `pipeline_state.json` to extract:
- `pipeline_id` -- timestamp identifier
- `dataset` -- dataset name
- `question` -- the business question
- `status` -- `completed`, `failed`, `paused`, or `running`
- `run_dir` -- full path
- `started_at`, `completed_at` -- timing
- `steps` -- agent status map (to compute agent counts)
If `pipeline_state.json` is missing, infer status as `unknown` and derive
date/dataset from the directory name.
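Step 1 could be sketched in Python roughly as follows. This is a minimal illustration, assuming `pipeline_state.json` has the fields listed above and that values in the `steps` map are plain status strings; the real state file may carry more structure.

```python
import json
from pathlib import Path

def scan_runs(runs_dir="working/runs"):
    """Build a summary record per run directory.

    Assumes the directory naming scheme {YYYY-MM-DD}_{DATASET}_{SHORT_TITLE}
    and that steps values are status strings -- both are assumptions from
    the spec above, not a guaranteed schema.
    """
    runs = []
    root = Path(runs_dir)
    if not root.is_dir():
        return runs
    for run_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        state_file = run_dir / "pipeline_state.json"
        try:
            state = json.loads(state_file.read_text())
        except (OSError, json.JSONDecodeError):
            # Missing or corrupted state: infer what we can from the name.
            date, _, rest = run_dir.name.partition("_")
            dataset, _, title = rest.partition("_")
            runs.append({"dir": run_dir.name, "date": date,
                         "dataset": dataset, "title": title,
                         "status": "unknown", "agents": None})
            continue
        steps = state.get("steps", {})
        done = sum(1 for s in steps.values() if s == "completed")
        runs.append({"dir": run_dir.name,
                     "date": run_dir.name.split("_", 1)[0],
                     "dataset": state.get("dataset"),
                     "title": run_dir.name.split("_", 2)[-1],
                     "status": state.get("status", "unknown"),
                     "agents": f"{done}/{len(steps)}"})
    return runs
```

Note the fallback branch: a run with a missing or unreadable state file still appears in the listing, with `status: unknown`, as the edge-case rules require.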
### Step 2: Execute Command
**List (`/runs` or `/runs list`):**
Display a table sorted by date descending:
```
Pipeline Runs (working/runs/)
| # | Date | Dataset | Title | Status | Agents |
|---|------------|-----------|--------------------------|-----------|--------|
| 1 | 2026-02-23 | acme-analytics | why-revenue-dropped-q3 | completed | 14/14 |
| 2 | 2026-02-21 | acme-analytics | activation-funnel-deep | failed | 8/14 |
| 3 | 2026-02-19 | hero | churn-by-segment | completed | 14/14 |
3 runs found. Use `/runs {#}` or `/runs {date_dataset_title}` for details.
```
The `Agents` column shows `{completed}/{total}` from the step map.
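Rendering the list view is then mostly string formatting. The sketch below assumes run records shaped like `{date, dataset, title, status, agents}` (a hypothetical shape chosen for illustration) and sorts lexicographically on the ISO date, which is equivalent to chronological order.

```python
def render_list(runs):
    """Render the list view: newest first, Agents as completed/total."""
    rows = sorted(runs, key=lambda r: r["date"], reverse=True)
    lines = ["Pipeline Runs (working/runs/)",
             "| # | Date | Dataset | Title | Status | Agents |",
             "|---|------|---------|-------|--------|--------|"]
    for i, r in enumerate(rows, 1):
        lines.append(f"| {i} | {r['date']} | {r['dataset']} | {r['title']} "
                     f"| {r['status']} | {r['agents'] or '-'} |")
    lines.append(f"{len(rows)} runs found. Use `/runs {{#}}` or "
                 "`/runs {date_dataset_title}` for details.")
    return "\n".join(lines)
```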
**Latest (`/runs latest`):**
Read `working/latest` symlink target. Display the detail view (same as `/runs {id}`).
**Detail (`/runs {id}`):**
Match `{id}` against run directory names (supports partial match -- e.g.,
`/runs acme-analytics` matches the most recent acme-analytics run). Display:
```
Run: {directory_name}
Status: {status}
Dataset: {dataset}
Question: {question}
Started: {started_at}
Completed: {completed_at} ({duration})
Agent Status:
completed: {list}
failed: {list with errors}
skipped: {list}
pending: {list}
Output Files:
- {RUN_DIR}/outputs/{file1}
- {RUN_DIR}/outputs/{file2}
...
Confidence: {grade from validation if available}
```
If the run has a validation report, extract and show the confidence grade.
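The partial-match rule (e.g. `/runs acme-analytics` resolving to the most recent matching run) can be sketched as below. Because directory names start with an ISO date, picking the lexicographically last match picks the newest; returning all matches lets the caller ask the user to disambiguate when the match is genuinely ambiguous.

```python
def resolve_run_id(query, run_names):
    """Resolve a user-supplied id against run directory names.

    Returns (best_match, all_matches). best_match is the newest matching
    run; if all_matches has more than one entry, the caller may instead
    list them and ask the user to be more specific.
    """
    matches = sorted(n for n in run_names if query in n)
    if not matches:
        return None, []
    # YYYY-MM-DD prefixes make lexicographic order chronological.
    return matches[-1], matches
```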
**Clean (`/runs clean`):**
1. Identify runs older than 30 days (based on directory date prefix)
2. List them and ask for confirmation:
```
Found {N} runs older than 30 days:
- {dir1} (completed, {date})
- {dir2} (failed, {date})
Delete these runs? This cannot be undone. [y/N]
```
3. On confirmation, remove the directories
4. If `working/latest` pointed to a deleted run, remove the symlink
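The age check in step 1 can be sketched as follows, judging age purely by the `YYYY-MM-DD` directory prefix as the spec says. Directories whose prefix does not parse as a date are skipped rather than deleted, which errs on the safe side.

```python
from datetime import date, timedelta

def stale_runs(run_names, today=None, max_age_days=30):
    """Select run directories older than max_age_days by name prefix."""
    today = today or date.today()
    cutoff = today - timedelta(days=max_age_days)
    stale = []
    for name in run_names:
        try:
            run_date = date.fromisoformat(name.split("_", 1)[0])
        except ValueError:
            continue  # unparseable prefix: never auto-delete
        if run_date < cutoff:
            stale.append(name)
    return stale
```

Actual deletion should still go through the confirmation prompt shown above; this helper only selects candidates.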
**Compare (`/runs compare {id1} {id2}`):**
Load `pipeline_state.json` and key output files from both runs. Display:
```
Comparing Runs:
A: {dir1}
B: {dir2}
| Dimension | Run A | Run B |
|--------------------|--------------------|--------------------|
| Date | {date_a} | {date_b} |
| Dataset | {dataset_a} | {dataset_b} |
| Status | {status_a} | {status_b} |
| Agents completed | {count_a} | {count_b} |
| Confidence grade | {grade_a} | {grade_b} |
| Charts generated | {chart_count_a} | {chart_count_b} |
| Key findings | {finding_count_a} | {finding_count_b} |
| Duration | {duration_a} | {duration_b} |
```
If both runs analyzed the same dataset, also compare:
- Top 3 findings from each (extracted from analysis reports)
- Any metrics that differ significantly
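Building the comparison table reduces to zipping two summary dicts dimension by dimension. The sketch below covers only the scalar rows; keys like `grade` and `duration` are illustrative names, and missing values render as `-`. Finding-level comparison (top 3 findings, diverging metrics) would require reading the analysis reports and is not shown.

```python
def compare_rows(a, b):
    """Render the scalar rows of the compare view from two run summaries."""
    dims = [("Date", "date"), ("Dataset", "dataset"), ("Status", "status"),
            ("Agents completed", "agents"), ("Confidence grade", "grade"),
            ("Duration", "duration")]
    lines = ["| Dimension | Run A | Run B |", "|---|---|---|"]
    for label, key in dims:
        lines.append(f"| {label} | {a.get(key, '-')} | {b.get(key, '-')} |")
    return "\n".join(lines)
```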
## Edge Cases
- **No runs directory:** Report "No pipeline runs found. Use `/run-pipeline` to start one."
- **Empty runs directory:** Same message as above
- **Corrupted pipeline_state.json:** Show run with `status: unknown`, note the error
- **Partial match ambiguity:** If multiple runs match, list them and ask user to be more specific
- **Legacy runs (no run directory):** Note: "Found legacy `working/pipeline_state.json` -- not in per-run format. Use `/run-pipeline` to create a tracked run."
Related Skills
ai-pair
AI Pair Collaboration Skill. Coordinate multiple AI models to work together: one creates (Author/Developer), two others review (Codex + Gemini). Works for code, articles, video scripts, and any creative task. Trigger: /ai-pair, ai pair, dev-team, content-team, team-stop
review
Daily and weekly review workflows. USE WHEN user says "morning routine", "evening routine", "weekly review", "start my day", "end of day".
Beads Issue Tracking Skill
> **Attribution**: [Beads](https://github.com/steveyegge/beads) by [Steve Yegge](https://github.com/steveyegge)
prd
Generate a Product Requirements Document (PRD) for a new feature. Use when planning a feature, starting a new project, or when asked to create a PRD. Triggers on: create a prd, write prd for, plan this feature, requirements for, spec out.
Claude-Zeroclaw SKILL
reprompter
Transform messy prompts into structured, effective prompts — single, multi-agent, or reverse-engineered from great outputs. Use when: "reprompt", "reprompt this", "clean up this prompt", "structure my prompt", rough text needing XML tags, "reprompter teams", "repromptverse", "run with quality", "smart run", "smart agents", "campaign swarm", "engineering swarm", "ops swarm", "research swarm", multi-agent tasks, audits, parallel work, "reverse reprompt", "reprompt from example", "learn from this", "extract prompt from", "prompt dna", "prompt genome", reverse-engineering prompts from exemplar outputs. Don't use for simple Q&A, pure chat, or immediate execution-only tasks (see "Don't Use When" section). Outputs: structured XML/Markdown prompt, before/after quality score, optional team brief + per-agent sub-prompts, Agent Cards, Extraction Card (reverse mode). Target quality score: Single ≥ 7/10; Repromptverse per-agent ≥ 8/10; Reverse ≥ 7/10.
session-pack
Automatically organizes Memory and Handoff notes when a session ends. /pack
execute
Main entry point for hierarchical task execution. Orchestrates layer-by-layer implementation of PRD tasks with parallel worktree execution.
textum
Textum PRD→Scaffold→Story workflow for Codex with low-noise outputs and gate checks.
sdd
This skill should be used when users want guidance on Spec-Driven Development methodology using GitHub's Spec-Kit. Guide users through executable specification workflows for both new projects (greenfield) and existing codebases (brownfield). After any SDD command generates artifacts, automatically provide structured 10-point summaries with feature status tracking, enabling natural language feature management and keeping users engaged throughout the process.
nonstop
Enables an autonomous work mode for AI agents, allowing continuous operation without user interruption. It includes a pre-flight risk assessment and intelligent blocker handling to maximize productivity.
superbuild
Use when executing implementation plans phase-by-phase with strict enforcement of quality gates, tests, and Definition of Done. Triggers on "build this plan", "execute plan", "implement phases", or when user provides a plan document to execute.