incident-postmortem

A full pipeline where an agent team collaborates to generate incident postmortem reports. Systematically performs timeline reconstruction, root cause analysis, impact assessment, and remediation planning. Use this skill for requests like 'write an incident postmortem', 'post-incident analysis report', 'create an incident report', 'incident report', 'root cause analysis', 'RCA report', 'organize incident timeline', 'establish remediation measures', and other incident analysis tasks. Note: real-time incident response (on-call), monitoring system setup, and alert configuration are outside the scope of this skill.

495 stars

by revfactory


Best use case

incident-postmortem is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Teams using incident-postmortem should expect more consistent output, faster repeated execution, and less prompt rewriting.

When to use this skill

You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

You only need a quick one-off answer and do not need a reusable workflow.
You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$ curl -o ~/.claude/skills/incident-postmortem/SKILL.md --create-dirs "https://raw.githubusercontent.com/revfactory/harness-100/main/en/25-incident-postmortem/.claude/skills/incident-postmortem/skill.md"

Manual Installation

Download SKILL.md from GitHub
Place it in .claude/skills/incident-postmortem/SKILL.md inside your project
Restart your AI agent — it will auto-discover the skill

How incident-postmortem Compares

Feature / Agent	incident-postmortem	Standard Approach
Platform Support	Not specified	Limited / Varies
Context Awareness	High	Baseline
Installation Complexity	Unknown	N/A

Frequently Asked Questions

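What does this skill do?

It runs an agent team through timeline reconstruction, root cause analysis, impact assessment, and remediation planning, then produces an integrated incident postmortem report.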

Where can I find the source code?

You can find the source code on GitHub using the link provided at the top of the page.

SKILL.md Source

# Incident Postmortem — Incident Post-Analysis Pipeline

An agent team collaborates to perform timeline reconstruction -> root cause analysis -> impact assessment -> remediation planning -> report generation.

## Execution Mode

**Agent Team**: 5 members communicate directly via SendMessage and cross-validate each other's deliverables.

## Agent Composition

| Agent | File | Role | Type |
|-------|------|------|------|
| timeline-reconstructor | `.claude/agents/timeline-reconstructor.md` | Event collection, chronological ordering, gap identification | general-purpose |
| root-cause-investigator | `.claude/agents/root-cause-investigator.md` | 5 Whys, Fishbone, Fault Tree | general-purpose |
| impact-assessor | `.claude/agents/impact-assessor.md` | User/revenue/SLA/reputation impact assessment | general-purpose |
| remediation-planner | `.claude/agents/remediation-planner.md` | Short/mid/long-term countermeasures, action items | general-purpose |
| postmortem-reviewer | `.claude/agents/postmortem-reviewer.md` | Cross-validation, blameless culture verification | general-purpose |

## Workflow

### Phase 1: Preparation (Performed directly by Orchestrator)

1. Extract from user input:
    - **Incident Description**: What happened and when
    - **Evidence** (optional): Logs, metric screenshots, chat records, alert records
    - **Impact Information** (optional): Number of affected users, services, duration
    - **Actions Taken** (optional): Emergency measures already performed
2. Create `_workspace/` directory at the project root
3. Organize input and save to `_workspace/00_input.md`
4. If the user supplied existing files, copy them to `_workspace/` and skip the corresponding agent's step
5. Determine **execution mode** based on the scope of the request (see "Modes by Task Scale" below)
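
The preparation steps above can be sketched in Python as follows. This is a minimal illustration, not the skill's actual implementation: the function name `prepare_workspace`, its parameters, the `# Incident Input` heading, and the `(not provided)` placeholder are all assumptions made for the example. Only the `_workspace/` directory and the `00_input.md` file name come from the steps themselves.

```python
from pathlib import Path


def prepare_workspace(project_root, description, evidence=None, impact=None, actions=None):
    """Create _workspace/ at the project root and save the organized input."""
    workspace = Path(project_root) / "_workspace"
    workspace.mkdir(parents=True, exist_ok=True)

    # The four fields extracted from user input in step 1.
    sections = [
        ("Incident Description", description),
        ("Evidence", evidence),
        ("Impact Information", impact),
        ("Actions Taken", actions),
    ]
    lines = ["# Incident Input", ""]
    for title, body in sections:
        lines.append(f"## {title}")
        # Hypothetical placeholder: optional fields are marked, never guessed.
        lines.append(body if body else "(not provided)")
        lines.append("")

    input_path = workspace / "00_input.md"
    input_path.write_text("\n".join(lines), encoding="utf-8")
    return input_path
```

Calling `prepare_workspace(".", "Payment service down for 30 minutes")` would write `_workspace/00_input.md` with the description filled in and the three optional sections marked as not provided.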

### Phase 2: Team Assembly and Execution

| Order | Task | Assignee | Dependencies | Deliverable |
|-------|------|----------|-------------|-------------|
| 1 | Timeline Reconstruction | reconstructor | None | `_workspace/01_timeline.md` |
| 2a | Root Cause Analysis | investigator | Task 1 | `_workspace/02_root_cause.md` |
| 2b | Impact Assessment | assessor | Task 1 | `_workspace/03_impact_assessment.md` |
| 3 | Remediation Planning | planner | Tasks 2a, 2b | `_workspace/04_remediation_plan.md` |
| 4 | Final Review | reviewer | Tasks 1-3 | `_workspace/05_review_report.md` |

Tasks 2a (root cause) and 2b (impact) can be **executed in parallel**.

**Inter-agent Communication Flow:**
- reconstructor completes -> delivers timeline and trigger candidates to investigator; delivers incident duration and metrics to assessor
- investigator completes -> delivers root cause and contributing factors to planner
- assessor completes -> delivers impact magnitude and SLA violation status to planner
- planner completes -> delivers full countermeasures to reviewer
- reviewer cross-validates all deliverables. Requests fixes for RED Must Fix items (up to 2 times)
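
The dependency column of the Phase 2 table is a small directed acyclic graph. As a sketch (assuming Python 3.9+ for the standard-library `graphlib` module), the tasks can be grouped into waves, where every task in a wave has all of its dependencies satisfied and can run in parallel. The names `DEPENDENCIES` and `execution_waves` are illustrative and not part of the skill.

```python
from graphlib import TopologicalSorter

# Task -> set of tasks it depends on, copied from the Phase 2 table.
DEPENDENCIES = {
    "1": set(),
    "2a": {"1"},
    "2b": {"1"},
    "3": {"2a", "2b"},
    "4": {"1", "2a", "2b", "3"},
}


def execution_waves(dependencies):
    """Group tasks into waves; tasks in the same wave can run in parallel."""
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()
    waves = []
    while sorter.is_active():
        # Everything returned here has no unfinished dependencies.
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        sorter.done(*ready)
    return waves


print(execution_waves(DEPENDENCIES))
# [['1'], ['2a', '2b'], ['3'], ['4']]
```

The second wave contains both `2a` and `2b`, which matches the note that root cause analysis and impact assessment can run in parallel once the timeline exists.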

### Phase 3: Integrated Report Generation

1. Generate `_workspace/postmortem_report.md` integrating all deliverables
2. Report structure: Summary -> Timeline -> Root Cause -> Impact -> Remediation -> What Went Well -> Lessons Learned
3. Deliver the final report to the user
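
One way to picture the integration step is a function that stitches the Phase 2 deliverables into the section order listed above. This is a sketch under assumptions: the skill does not say where the Summary, What Went Well, and Lessons Learned text comes from, so here they are passed in as plain strings, and the `[Omitted: ...]` marker for a missing deliverable is a hypothetical convention. `build_report` and `SECTION_SOURCES` are illustrative names.

```python
from pathlib import Path

# Report section -> deliverable file, following the Phase 2 table.
SECTION_SOURCES = [
    ("Timeline", "01_timeline.md"),
    ("Root Cause", "02_root_cause.md"),
    ("Impact", "03_impact_assessment.md"),
    ("Remediation", "04_remediation_plan.md"),
]


def build_report(workspace, summary, went_well, lessons):
    """Assemble postmortem_report.md in the documented section order."""
    workspace = Path(workspace)
    parts = ["# Incident Postmortem", "", "## Summary", summary, ""]
    for title, filename in SECTION_SOURCES:
        source = workspace / filename
        parts.append(f"## {title}")
        if source.exists():
            parts.append(source.read_text(encoding="utf-8").strip())
        else:
            # Hypothetical marker so a missing deliverable stays visible.
            parts.append(f"[Omitted: {filename} was not produced]")
        parts.append("")
    parts += ["## What Went Well", went_well, "", "## Lessons Learned", lessons, ""]

    report = workspace / "postmortem_report.md"
    report.write_text("\n".join(parts), encoding="utf-8")
    return report
```

Keeping the section order in a single list makes it easy to check the output against the documented structure.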

## Modes by Task Scale

| User Request Pattern | Execution Mode | Deployed Agents |
|---------------------|----------------|-----------------|
| "Write a postmortem report" | **Full Pipeline** | All 5 agents |
| "Organize the incident timeline" | **Timeline Mode** | reconstructor + reviewer |
| "Analyze the root cause" | **RCA Mode** | reconstructor + investigator + reviewer |
| "Just create remediation measures" (cause analysis exists) | **Remediation Mode** | planner + reviewer |
| "Review this postmortem" | **Review Mode** | reviewer only |

**Leveraging Existing Files**: If the user provides existing timelines, cause analyses, etc., copy the files to the appropriate location in `_workspace/` and skip the corresponding agent's step.
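
The mode table and the existing-file rule can be combined into a simple lookup, sketched below. The mode keys (`"full"`, `"timeline"`, and so on) and the function `agents_to_run` are hypothetical names for this example. The agent names and deliverable file names are taken from the tables in this document. Recognizing which mode a user request maps to is left to the orchestrator and is not modeled here.

```python
# Execution mode -> deployed agents, from the "Modes by Task Scale" table.
MODE_AGENTS = {
    "full": [
        "timeline-reconstructor",
        "root-cause-investigator",
        "impact-assessor",
        "remediation-planner",
        "postmortem-reviewer",
    ],
    "timeline": ["timeline-reconstructor", "postmortem-reviewer"],
    "rca": ["timeline-reconstructor", "root-cause-investigator", "postmortem-reviewer"],
    "remediation": ["remediation-planner", "postmortem-reviewer"],
    "review": ["postmortem-reviewer"],
}

# Agent -> deliverable it produces, from the Phase 2 table.
AGENT_DELIVERABLE = {
    "timeline-reconstructor": "01_timeline.md",
    "root-cause-investigator": "02_root_cause.md",
    "impact-assessor": "03_impact_assessment.md",
    "remediation-planner": "04_remediation_plan.md",
    "postmortem-reviewer": "05_review_report.md",
}


def agents_to_run(mode, existing_files=()):
    """Return the agents for a mode, skipping any whose deliverable already exists."""
    existing = set(existing_files)
    return [a for a in MODE_AGENTS[mode] if AGENT_DELIVERABLE[a] not in existing]
```

For example, `agents_to_run("rca", ["01_timeline.md"])` returns `["root-cause-investigator", "postmortem-reviewer"]`, because the user-provided timeline replaces the reconstructor's step.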

## Data Transfer Protocol

| Strategy | Method | Purpose |
|----------|--------|---------|
| File-based | `_workspace/` directory | Store and share main deliverables |
| Message-based | SendMessage | Real-time delivery of key information, fix requests |
| Task-based | TaskCreate/TaskUpdate | Progress tracking, dependency management |

## Error Handling

| Error Type | Strategy |
|-----------|----------|
| Insufficient incident information | Ask user additional questions, tag uncertain parts with "[Unconfirmed]" |
| Logs/metrics inaccessible | Reconstruct from verbal accounts, tag with "[Verbal account-based]" |
| Agent failure | Retry once -> if it fails again, proceed without that deliverable and note the omission in the review |
| RED found in review | Request fix from relevant agent -> rework -> re-verify (up to 2 times) |
| Blaming language found | Reviewer immediately requests fix — blameless culture is an absolute principle |
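
Two of the strategies above are bounded loops: an agent failure is retried once, and RED review findings trigger at most two fix rounds. The sketch below shows that control flow only. The callables `run_agent`, `review`, and `request_fix` are hypothetical stand-ins for however the orchestrator actually invokes agents, and the function names are illustrative.

```python
def run_with_retry(agent_name, run_agent, max_retries=1):
    """Run an agent; on failure retry once, then give up and describe the omission."""
    last_error = None
    for _ in range(max_retries + 1):
        try:
            return run_agent(agent_name), None
        except Exception as exc:  # treat any exception as an agent failure
            last_error = exc
    # The caller proceeds without this deliverable and passes the note to the review.
    note = f"{agent_name} deliverable omitted after {max_retries + 1} attempts: {last_error}"
    return None, note


def review_loop(review, request_fix, max_fix_rounds=2):
    """Request fixes for RED items and re-verify, for at most two rounds."""
    red_items = review()
    rounds = 0
    while red_items and rounds < max_fix_rounds:
        request_fix(red_items)
        rounds += 1
        red_items = review()  # re-verify after the rework
    return red_items, rounds
```

Any RED items still present after the final round are returned to the caller rather than silently dropped, so they can be surfaced in the final report.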

## Test Scenarios

### Normal Flow
**Prompt**: "Yesterday at 2 PM the payment service was down for 30 minutes. It happened right after a deployment and was recovered by rollback. Create a postmortem report."
**Expected Result**:
- Timeline: Deployment -> incident start -> detection -> response -> rollback -> recovery in chronological order
- Root Cause: Specific defect in deployed code, contributing factors like no canary deployment
- Impact: User count, revenue loss, SLA impact estimates
- Remediation: SMART action items like canary deployment adoption, auto-rollback, alert improvements
- Integrated Report: Complete postmortem ready for executive reporting

### Existing File Utilization Flow
**Prompt**: "Review this postmortem report" + report attached
**Expected Result**:
- Copy existing report to `_workspace/`
- Execute in review mode
- Verify consistency, completeness, and blameless culture adherence
- Provide improvement suggestions

### Error Flow
**Prompt**: "The API server was slow this morning. Analyze the cause."
**Expected Result**:
- Collect incident details through additional questions (time, impact, actions, etc.)
- Execute RCA mode with collected information
- Clearly mark uncertain parts

## Agent Extension Skills

| Skill | Path | Enhanced Agent | Role |
|-------|------|---------------|------|
| rca-methodology | `.claude/skills/rca-methodology/skill.md` | root-cause-investigator | 5 Whys, Fishbone, Fault Tree, change analysis, cognitive bias prevention |
| sla-impact-calculator | `.claude/skills/sla-impact-calculator/skill.md` | impact-assessor | SLA/SLO/SLI framework, error budgets, revenue loss estimation, severity levels |
