# Skill: Runs

## Purpose

This skill lets an AI agent manage its own or a team's historical analytical pipeline executions. Each "run" is a self-contained directory, typically located under `working/runs/`, encapsulating its working files, outputs, and pipeline state. The agent can list all previous runs, view details for the latest or a specific run by ID, compare two runs side by side, and clean up older runs to manage storage.

Users apply this skill to review which analyses have been performed, track the status of past computations (completed, failed, running), and follow the evolution of a project. For instance, a data analyst could revisit a previous quarterly revenue analysis or debug why a certain activation funnel deep-dive failed.

By providing structured access to historical execution data, the skill supports quick auditing of past work, facilitates debugging by examining specific run states, and aids resource management through automated cleanup. This keeps the working environment clean and organized for complex, iterative analytical tasks.

## Best use case

The primary use case is for AI agents and human users to effectively manage and audit the historical execution of analytical or computational pipelines. This benefits data scientists, AI developers, and researchers who need a structured way to track, inspect, compare, and clean up previous attempts or successful runs, ensuring accountability, reproducibility, and efficient resource utilization within their projects.

## Expected output

A user can expect to receive structured lists or detailed reports about past pipeline runs, side-by-side comparisons of two runs, or confirmation of successful cleanup, all presented clearly by the AI agent.

## Practical example

### Example input

/runs list

### Example output

Pipeline Runs (working/runs/)

| # | Date       | Dataset        | Title                  | Status    | Agents |
|---|------------|----------------|------------------------|-----------|--------|
| 1 | 2026-02-23 | acme-analytics | why-revenue-dropped-q3 | completed | 14/14  |
| 2 | 2026-02-21 | acme-analytics | activation-funnel-deep | failed    | 8/14   |
| 3 | 2026-02-19 | hero           | churn-by-segment       | completed | 14/14  |

3 runs found. Use `/runs {#}` or `/runs {date_dataset_title}` for details.

## When to use this skill

  • When you need to see a list of all past analytical pipeline runs.
  • To inspect the details, status, or outputs of a specific historical analysis.
  • To compare the results or states of two different pipeline executions side-by-side.
  • When you want to clean up old, irrelevant pipeline runs to free up disk space.

## When not to use this skill

  • When initiating or executing a *new* analytical pipeline.
  • To directly manipulate or edit files *within* an active, ongoing pipeline run.
  • For general file system operations unrelated to pipeline run management.
  • When deep content analysis of a run's output is required beyond comparison or status checks.

## Installation

### Claude Code / Cursor / Codex

curl -o ~/.claude/skills/runs/SKILL.md --create-dirs "https://raw.githubusercontent.com/ai-analyst-lab/ai-analyst/main/.claude/skills/runs/skill.md"

### Manual Installation

  1. Download SKILL.md from GitHub
  2. Place it in .claude/skills/runs/SKILL.md inside your project
  3. Restart your AI agent — it will auto-discover the skill

## How Skill: Runs Compares

| Feature / Agent         | Skill: Runs | Standard Approach |
|-------------------------|-------------|-------------------|
| Platform Support        | Claude      | Limited / Varies  |
| Context Awareness       | High        | Baseline          |
| Installation Complexity | Easy        | N/A               |

## Frequently Asked Questions

### What does this skill do?

It lets an AI agent browse, inspect, compare, and clean up past analytical pipeline runs, each stored as a self-contained directory under `working/runs/`.

### Which AI agents support this skill?

This skill is designed for Claude.

### How difficult is it to install?

The installation complexity is rated as easy. You can find the installation instructions above.

### Where can I find the source code?

The source code lives in the ai-analyst-lab/ai-analyst repository on GitHub (see the installation URL above).

## SKILL.md Source

# Skill: Runs

## Purpose
Browse, inspect, compare, and clean up past pipeline runs. Each run is a
self-contained directory under `working/runs/` with its own working files,
outputs, and pipeline state.

## When to Use
- User says `/runs`, `/runs list`, `/runs latest`, `/runs clean`, or `/runs compare`
- When the user wants to see what analyses have been executed

## Invocation
- `/runs` or `/runs list` -- list all past runs
- `/runs latest` -- show details of the most recent run
- `/runs {id}` -- show details of a specific run (partial match supported)
- `/runs clean` -- remove runs older than 30 days (confirmation required)
- `/runs compare {id1} {id2}` -- compare two runs side by side

## Instructions

### Step 1: Scan Run Directory

Read `working/runs/` directory. Each subdirectory is a run, named:
```
{YYYY-MM-DD}_{DATASET}_{SHORT_TITLE}/
```

For each run directory, read `pipeline_state.json` to extract:
- `pipeline_id` -- timestamp identifier
- `dataset` -- dataset name
- `question` -- the business question
- `status` -- `completed`, `failed`, `paused`, or `running`
- `run_dir` -- full path
- `started_at`, `completed_at` -- timing
- `steps` -- agent status map (to compute agent counts)

If `pipeline_state.json` is missing, infer status as `unknown` and derive
date/dataset from the directory name.
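As a reference, the scan step above can be sketched in Python; `scan_runs` and the summary-dict shape are illustrative, not part of the skill:

```python
import json
from pathlib import Path

def scan_runs(runs_dir=Path("working/runs")):
    """Collect one summary dict per run directory, newest first."""
    runs = []
    # The {YYYY-MM-DD}_... prefix makes lexicographic order chronological.
    for d in sorted(runs_dir.iterdir(), reverse=True):
        if not d.is_dir():
            continue
        try:
            state = json.loads((d / "pipeline_state.json").read_text())
        except (FileNotFoundError, json.JSONDecodeError):
            # Missing or corrupted state: infer from {YYYY-MM-DD}_{DATASET}_{SHORT_TITLE}
            date, _, rest = d.name.partition("_")
            dataset, _, title = rest.partition("_")
            state = {"status": "unknown", "dataset": dataset,
                     "question": title, "started_at": date}
        runs.append({"dir": d.name, **state})
    return runs
```

The fallback branch implements the `status: unknown` rule above.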

### Step 2: Execute Command

**List (`/runs` or `/runs list`):**

Display a table sorted by date descending:

```
Pipeline Runs (working/runs/)

| # | Date       | Dataset        | Title                  | Status    | Agents |
|---|------------|----------------|------------------------|-----------|--------|
| 1 | 2026-02-23 | acme-analytics | why-revenue-dropped-q3 | completed | 14/14  |
| 2 | 2026-02-21 | acme-analytics | activation-funnel-deep | failed    | 8/14   |
| 3 | 2026-02-19 | hero           | churn-by-segment       | completed | 14/14  |

3 runs found. Use `/runs {#}` or `/runs {date_dataset_title}` for details.
```

The `Agents` column shows `{completed}/{total}` from the step map.
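A minimal sketch of that computation, assuming the `steps` map has the shape `{agent_name: status}` (the real schema in `pipeline_state.json` may nest objects):

```python
def agents_cell(steps):
    """Render the Agents column as {completed}/{total} from a step-status map."""
    done = sum(1 for status in steps.values() if status == "completed")
    return f"{done}/{len(steps)}"
```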

**Latest (`/runs latest`):**

Read `working/latest` symlink target. Display the detail view (same as `/runs {id}`).

**Detail (`/runs {id}`):**

Match `{id}` against run directory names (supports partial match -- e.g.,
`/runs acme-analytics` matches the most recent acme-analytics run). Display:

```
Run: {directory_name}
Status: {status}
Dataset: {dataset}
Question: {question}
Started: {started_at}
Completed: {completed_at} ({duration})

Agent Status:
  completed: {list}
  failed: {list with errors}
  skipped: {list}
  pending: {list}

Output Files:
  - {RUN_DIR}/outputs/{file1}
  - {RUN_DIR}/outputs/{file2}
  ...

Confidence: {grade from validation if available}
```

If the run has a validation report, extract and show the confidence grade.
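The partial-match rule can be sketched as follows (`resolve_run` is an illustrative name; the newest-first ordering again relies on the date prefix of the directory names):

```python
def resolve_run(query, run_names):
    """Resolve a /runs {id} query against run directory names.

    An exact match wins outright; otherwise every name containing the
    query is a candidate, returned newest-first. The caller shows the
    first hit, or lists all candidates when the match is ambiguous.
    """
    if query in run_names:
        return [query]
    return sorted((n for n in run_names if query in n), reverse=True)
```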

**Clean (`/runs clean`):**

1. Identify runs older than 30 days (based on directory date prefix)
2. List them and ask for confirmation:
   ```
   Found {N} runs older than 30 days:
     - {dir1} (completed, {date})
     - {dir2} (failed, {date})

   Delete these runs? This cannot be undone. [y/N]
   ```
3. On confirmation, remove the directories
4. If `working/latest` pointed to a deleted run, remove the symlink
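The age filter in step 1 can be sketched like this, using only the directory date prefix as specified (`stale_runs` is an illustrative name):

```python
from datetime import date, timedelta

def stale_runs(run_names, today, max_age_days=30):
    """Pick run directories whose YYYY-MM-DD prefix is older than max_age_days."""
    cutoff = today - timedelta(days=max_age_days)
    stale = []
    for name in run_names:
        try:
            run_date = date.fromisoformat(name[:10])
        except ValueError:
            continue  # not in per-run naming format; leave it untouched
        if run_date < cutoff:
            stale.append(name)
    return stale
```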

**Compare (`/runs compare {id1} {id2}`):**

Load `pipeline_state.json` and key output files from both runs. Display:

```
Comparing Runs:
  A: {dir1}
  B: {dir2}

| Dimension          | Run A              | Run B              |
|--------------------|--------------------|--------------------|
| Date               | {date_a}           | {date_b}           |
| Dataset            | {dataset_a}        | {dataset_b}        |
| Status             | {status_a}         | {status_b}         |
| Agents completed   | {count_a}          | {count_b}          |
| Confidence grade   | {grade_a}          | {grade_b}          |
| Charts generated   | {chart_count_a}    | {chart_count_b}    |
| Key findings       | {finding_count_a}  | {finding_count_b}  |
| Duration           | {duration_a}       | {duration_b}       |
```

If both runs analyzed the same dataset, also compare:
- Top 3 findings from each (extracted from analysis reports)
- Any metrics that differ significantly
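The fixed rows of the table above can be assembled from the two state files roughly like this (a sketch; `compare_rows` and the `'?'` placeholder for missing fields are illustrative):

```python
def compare_rows(state_a, state_b):
    """Build the first comparison rows from two pipeline_state.json dicts.

    Field names mirror Step 1; a missing key renders as '?'.
    """
    def done(state):
        return sum(1 for s in state.get("steps", {}).values() if s == "completed")

    rows = []
    for label, key in [("Date", "started_at"), ("Dataset", "dataset"), ("Status", "status")]:
        rows.append((label, state_a.get(key, "?"), state_b.get(key, "?")))
    rows.append(("Agents completed", done(state_a), done(state_b)))
    return rows
```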

## Edge Cases
- **No runs directory:** Report "No pipeline runs found. Use `/run-pipeline` to start one."
- **Empty runs directory:** Same message as above
- **Corrupted pipeline_state.json:** Show run with `status: unknown`, note the error
- **Partial match ambiguity:** If multiple runs match, list them and ask user to be more specific
- **Legacy runs (no run directory):** Note: "Found legacy `working/pipeline_state.json` -- not in per-run format. Use `/run-pipeline` to create a tracked run."

## Related Skills

### ai-pair (169 stars) — axtonliu/ai-pair

AI Pair Collaboration Skill. Coordinate multiple AI models to work together: one creates (Author/Developer), two others review (Codex + Gemini). Works for code, articles, video scripts, and any creative task. Trigger: /ai-pair, ai pair, dev-team, content-team, team-stop

Workflow & Productivity · Claude

### review (167 stars) — ArtemXTech/claude-code-obsidian-starter

Daily and weekly review workflows. USE WHEN user says "morning routine", "evening routine", "weekly review", "start my day", "end of day".

Workflow & Productivity · Claude

### Beads Issue Tracking Skill (164 stars) — maslennikov-ig/claude-code-orchestrator-kit

> **Attribution**: [Beads](https://github.com/steveyegge/beads) by [Steve Yegge](https://github.com/steveyegge)

Workflow & Productivity · Claude

### prd (160 stars) — react-native-vibe-code/react-native-vibe-code-sdk

Generate a Product Requirements Document (PRD) for a new feature. Use when planning a feature, starting a new project, or when asked to create a PRD. Triggers on: create a prd, write prd for, plan this feature, requirements for, spec out.

Workflow & Productivity · Claude

### Claude-Zeroclaw SKILL (109 stars) — Crestdrasnip/Claude-Zeroclaw

Workflow & Productivity · Claude

### reprompter (97 stars) — AytuncYildizli/reprompter

Transform messy prompts into structured, effective prompts — single, multi-agent, or reverse-engineered from great outputs. Use when: "reprompt", "reprompt this", "clean up this prompt", "structure my prompt", rough text needing XML tags, "reprompter teams", "repromptverse", "run with quality", "smart run", "smart agents", "campaign swarm", "engineering swarm", "ops swarm", "research swarm", multi-agent tasks, audits, parallel work, "reverse reprompt", "reprompt from example", "learn from this", "extract prompt from", "prompt dna", "prompt genome", reverse-engineering prompts from exemplar outputs. Don't use for simple Q&A, pure chat, or immediate execution-only tasks (see "Don't Use When" section). Outputs: structured XML/Markdown prompt, before/after quality score, optional team brief + per-agent sub-prompts, Agent Cards, Extraction Card (reverse mode). Target quality score: Single ≥ 7/10; Repromptverse per-agent ≥ 8/10; Reverse ≥ 7/10.

Workflow & Productivity · Claude · Codex

### session-pack (89 stars) — ten-builder/ten-builder

Automatically organizes Memory and Handoff at session end. /pack

Workflow & Productivity · Claude

### execute (51 stars) — vinzenz/prd-breakdown-execute

Main entry point for hierarchical task execution. Orchestrates layer-by-layer implementation of PRD tasks with parallel worktree execution.

Workflow & Productivity · Claude

### textum (43 stars) — snakeying/Textum

Textum PRD→Scaffold→Story workflow for Codex with low-noise outputs and gate checks.

Workflow & Productivity · Claude

### sdd (40 stars) — SpillwaveSolutions/agent_rulez

This skill should be used when users want guidance on Spec-Driven Development methodology using GitHub's Spec-Kit. Guide users through executable specification workflows for both new projects (greenfield) and existing codebases (brownfield). After any SDD command generates artifacts, automatically provide structured 10-point summaries with feature status tracking, enabling natural language feature management and keeping users engaged throughout the process.

Workflow & Productivity · Claude

### nonstop (40 stars) — andylizf/nonstop

Enables an autonomous work mode for AI agents, allowing continuous operation without user interruption. It includes a pre-flight risk assessment and intelligent blocker handling to maximize productivity.

Workflow & Productivity · Claude

### superbuild (36 stars) — asteroid-belt/skulto

Use when executing implementation plans phase-by-phase with strict enforcement of quality gates, tests, and Definition of Done. Triggers on "build this plan", "execute plan", "implement phases", or when user provides a plan document to execute.

Workflow & Productivity · Claude