qcsd-production-swarm

Use when assessing post-release production health with DORA metrics, root cause analysis, defect prediction, or cross-phase feedback loops in the QCSD Production phase.

Best use case

qcsd-production-swarm is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Teams using qcsd-production-swarm should expect more consistent output, faster repeated execution, and less prompt rewriting.

When to use this skill

  • You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

  • You only need a quick one-off answer and do not need a reusable workflow.
  • You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$ curl -o ~/.claude/skills/qcsd-production-swarm/SKILL.md --create-dirs "https://raw.githubusercontent.com/proffesor-for-testing/agentic-qe/main/.claude/skills/qcsd-production-swarm/SKILL.md"

Manual Installation

  1. Download SKILL.md from GitHub
  2. Place it in .claude/skills/qcsd-production-swarm/SKILL.md inside your project
  3. Restart your AI agent — it will auto-discover the skill

How qcsd-production-swarm Compares

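| Feature / Agent | qcsd-production-swarm | Standard Approach |
|-----------------|-----------------------|-------------------|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |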

Frequently Asked Questions

What does this skill do?

It assesses post-release production health using DORA metrics, root cause analysis, defect prediction, and cross-phase feedback loops in the QCSD Production phase.

Where can I find the source code?

The source code is on GitHub in the proffesor-for-testing/agentic-qe repository, under .claude/skills/qcsd-production-swarm/.

SKILL.md Source

# QCSD Production Swarm v1.0

Post-release production health assessment and QCSD feedback loop closure.

---

## Overview

The Production Swarm assesses release health in the live production environment using
DORA metrics, incident RCA, defect prediction, and cross-phase feedback loops. It renders
a HEALTHY / DEGRADED / CRITICAL decision and is the only QCSD phase with dual
responsibility: assessing current production health AND closing the feedback loop
back to Ideation and Refinement phases.

### QCSD Phase Positioning

| Phase | Swarm | Decision | When |
|-------|-------|----------|------|
| Ideation | qcsd-ideation-swarm | GO / CONDITIONAL / NO-GO | PI/Sprint Planning |
| Refinement | qcsd-refinement-swarm | READY / CONDITIONAL / NOT-READY | Sprint Refinement |
| Development | qcsd-development-swarm | SHIP / CONDITIONAL / HOLD | During Sprint |
| Verification | qcsd-cicd-swarm | RELEASE / REMEDIATE / BLOCK | Pre-Release / CI-CD |
| **Production** | **qcsd-production-swarm** | **HEALTHY / DEGRADED / CRITICAL** | **Post-Release** |

### Parameters

- `TELEMETRY_DATA`: Path to production telemetry, incident reports, and DORA metrics (required)
- `RELEASE_ID`: Release identifier for tracking (optional)
- `OUTPUT_FOLDER`: Where to save reports (default: `${PROJECT_ROOT}/Agentic QCSD/production/`)
- `SLA_DEFINITIONS`: Path to SLA/SLO target definitions (optional)
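
The parameters above could be validated before the first step runs. The following is a minimal sketch, not part of the skill itself: `SwarmParams`, `build_params`, and the `project_root` argument are hypothetical names introduced for illustration. Only the four parameter names and the default output folder come from the list above.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Optional


@dataclass(frozen=True)
class SwarmParams:
    """Hypothetical container for the four documented parameters."""
    telemetry_data: Path                    # TELEMETRY_DATA (required)
    project_root: Path                      # assumption: used only to resolve the default folder
    release_id: Optional[str] = None        # RELEASE_ID (optional)
    sla_definitions: Optional[Path] = None  # SLA_DEFINITIONS (optional)
    output_folder: Optional[Path] = None    # OUTPUT_FOLDER (optional, default below)

    def resolved_output_folder(self) -> Path:
        # Documented default: ${PROJECT_ROOT}/Agentic QCSD/production/
        if self.output_folder is not None:
            return self.output_folder
        return self.project_root / "Agentic QCSD" / "production"


def build_params(raw: dict, project_root: Path) -> SwarmParams:
    """Validate a raw mapping keyed by the documented parameter names."""
    if not raw.get("TELEMETRY_DATA"):
        raise ValueError("TELEMETRY_DATA is required")
    sla = raw.get("SLA_DEFINITIONS")
    out = raw.get("OUTPUT_FOLDER")
    return SwarmParams(
        telemetry_data=Path(raw["TELEMETRY_DATA"]),
        project_root=project_root,
        release_id=raw.get("RELEASE_ID"),
        sla_definitions=Path(sla) if sla else None,
        output_folder=Path(out) if out else None,
    )
```

Failing fast on a missing `TELEMETRY_DATA` keeps the error at the entry point rather than inside an agent that cannot find its input.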

---

## ENFORCEMENT RULES - READ FIRST

| Rule | Enforcement |
|------|-------------|
| **E1** | You MUST spawn ALL THREE core agents in Step 2. No exceptions. |
| **E2** | You MUST put all parallel Task calls in a SINGLE message. |
| **E3** | You MUST STOP and WAIT after each batch. No proceeding early. |
| **E4** | You MUST spawn conditional agents if flags are TRUE. No skipping. |
| **E5** | You MUST apply HEALTHY/DEGRADED/CRITICAL logic exactly as specified in Step 5. |
| **E6** | You MUST generate the full report structure. No abbreviated versions. |
| **E7** | Each agent MUST read its reference files before analysis. |
| **E8** | You MUST run BOTH feedback agents in Step 8 SEQUENTIALLY. Always. Both agents. |
| **E9** | You MUST execute Step 7 learning persistence. No skipping. |

**PROHIBITED BEHAVIORS:**
- Summarizing instead of spawning agents
- Skipping agents "for brevity"
- Proceeding before background tasks complete
- Providing your own analysis instead of spawning specialists
- Omitting report sections or using placeholder text
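
Rules E1 to E4 and E8 describe a batch-then-wait shape: launch a whole batch together, wait for all of it, and run the two feedback agents one after the other. The sketch below illustrates that shape only, using `asyncio` in place of the Task tool. `run_agent` is a hypothetical stand-in for spawning a specialist, and the intermediate steps (decision synthesis, report generation, learning persistence) are omitted.

```python
import asyncio
from typing import Dict, List

CORE_AGENTS = ["qe-metrics-optimizer", "qe-defect-predictor", "qe-root-cause-analyzer"]
FEEDBACK_AGENTS = ["qe-learning-coordinator", "qe-transfer-specialist"]


async def run_agent(name: str) -> str:
    """Hypothetical stand-in for spawning one specialist agent."""
    await asyncio.sleep(0)  # placeholder for the real agent work
    return f"{name}: done"


async def run_batch(names: List[str]) -> List[str]:
    # E2/E3: start every agent in the batch together, then wait for all results.
    return list(await asyncio.gather(*(run_agent(n) for n in names)))


async def run_swarm(conditional_agents: List[str]) -> Dict[str, List[str]]:
    results = {"batch1": await run_batch(CORE_AGENTS)}       # E1: all three core agents
    results["batch2"] = await run_batch(conditional_agents)  # E4: every flagged agent
    feedback = []
    for name in FEEDBACK_AGENTS:                              # E8: strictly one after the other
        feedback.append(await run_agent(name))
    results["batch3"] = feedback
    return results


if __name__ == "__main__":
    print(asyncio.run(run_swarm(["qe-chaos-engineer"])))
```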

---

## Step Execution Protocol

This skill uses a micro-file step architecture. Each step is a self-contained file
loaded one at a time to avoid "lost in the middle" context degradation.

**Execute steps sequentially by reading each step file with the Read tool.**

### Steps

1. **Flag Detection** -- `steps/01-flag-detection.md` -- Retrieve CI/CD signals, detect telemetry source, evaluate all 7 flags
2. **Core Agents** -- `steps/02-core-agents.md` -- Spawn qe-metrics-optimizer, qe-defect-predictor, qe-root-cause-analyzer in parallel
3. **Batch 1 Results** -- `steps/03-batch1-results.md` -- Wait for core agents, extract all metrics
4. **Conditional Agents** -- `steps/04-conditional-agents.md` -- Spawn flagged conditional agents in parallel
5. **Decision Synthesis** -- `steps/05-decision-synthesis.md` -- Apply HEALTHY/DEGRADED/CRITICAL logic
6. **Report Generation** -- `steps/06-report-generation.md` -- Generate executive summary and full report
7. **Learning Persistence** -- `steps/07-learning-persistence.md` -- Store findings to memory, save persistence record
8. **Feedback Loop** -- `steps/08-feedback-loop.md` -- Run learning coordinator then transfer specialist (sequential)
9. **Final Output** -- `steps/09-final-output.md` -- Display completion summary with all scores

### Execution Instructions

1. Use the Read tool to load the current step file (e.g., `Read({ file_path: ".claude/skills/qcsd-production-swarm/steps/01-flag-detection.md" })`)
2. Execute the step's instructions completely
3. Verify all success criteria are met before proceeding
4. Pass the step's output as context to the next step
5. If a step fails, halt and report the failure point -- do not skip ahead

### Resume Support

To resume from a specific step: specify `--from-step N` and the orchestrator will
skip to step N. Ensure you have the required prerequisite data from prior steps.
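
The execution and resume rules above can be sketched as a small runner. This is an illustration under stated assumptions, not the orchestrator's actual implementation: `run_steps` and the `execute` callback are hypothetical, and requiring an output for every earlier step on resume is one possible reading of "required prerequisite data".

```python
from typing import Callable, Dict, List, Optional

SKILL_DIR = ".claude/skills/qcsd-production-swarm"

STEP_FILES: List[str] = [
    "steps/01-flag-detection.md",
    "steps/02-core-agents.md",
    "steps/03-batch1-results.md",
    "steps/04-conditional-agents.md",
    "steps/05-decision-synthesis.md",
    "steps/06-report-generation.md",
    "steps/07-learning-persistence.md",
    "steps/08-feedback-loop.md",
    "steps/09-final-output.md",
]

# Hypothetical callback: (step file path, outputs so far) -> this step's output
StepExecutor = Callable[[str, Dict[int, object]], object]


def run_steps(execute: StepExecutor, from_step: int = 1,
              prior_context: Optional[Dict[int, object]] = None) -> Dict[int, object]:
    """Run steps in order, halting at the first failure."""
    if not 1 <= from_step <= len(STEP_FILES):
        raise ValueError(f"from_step must be between 1 and {len(STEP_FILES)}")
    context = dict(prior_context or {})
    # Assumption: resuming at step N needs the outputs of steps 1..N-1.
    missing = [n for n in range(1, from_step) if n not in context]
    if missing:
        raise ValueError(f"missing prerequisite output for steps {missing}")
    for number in range(from_step, len(STEP_FILES) + 1):
        path = f"{SKILL_DIR}/{STEP_FILES[number - 1]}"
        try:
            context[number] = execute(path, context)  # output feeds later steps
        except Exception as exc:
            # Halt and report the failure point; never skip ahead.
            raise RuntimeError(f"step {number} failed ({path}): {exc}") from exc
    return context
```

Calling `run_steps(execute, from_step=8, prior_context=saved)` would correspond to `--from-step 8`, provided `saved` holds the outputs of steps 1 to 7.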

---

## Agent Inventory

| Agent | Type | Domain | Batch |
|-------|------|--------|-------|
| qe-metrics-optimizer | Core (always) | learning-optimization | 1 |
| qe-defect-predictor | Core (always) | defect-intelligence | 1 |
| qe-root-cause-analyzer | Core (always) | defect-intelligence | 1 |
| qe-chaos-engineer | Conditional (HAS_INFRASTRUCTURE_CHANGE) | chaos-resilience | 2 |
| qe-performance-tester | Conditional (HAS_PERFORMANCE_SLA) | chaos-resilience | 2 |
| qe-regression-analyzer | Conditional (HAS_REGRESSION_RISK) | defect-intelligence | 2 |
| qe-pattern-learner | Conditional (HAS_RECURRING_INCIDENTS) | defect-intelligence | 2 |
| qe-middleware-validator | Conditional (HAS_MIDDLEWARE) | enterprise-integration | 2 |
| qe-sap-rfc-tester | Conditional (HAS_SAP_INTEGRATION) | enterprise-integration | 2 |
| qe-sod-analyzer | Conditional (HAS_AUTHORIZATION) | enterprise-integration | 2 |
| qe-learning-coordinator | Feedback (always, sequential) | learning-optimization | 3 |
| qe-transfer-specialist | Feedback (always, sequential) | learning-optimization | 3 |

**Total: 12 agents (3 core + 7 conditional + 2 feedback)**
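
As a rough illustration of how the inventory maps flags to batches, the sketch below derives the three batches from a flag dictionary. The flag and agent names are taken from the table above; `plan_batches` is a hypothetical helper, and treating an absent flag as FALSE is an assumption.

```python
from typing import Dict, List

CORE_AGENTS = ["qe-metrics-optimizer", "qe-defect-predictor", "qe-root-cause-analyzer"]

# Flag -> conditional agent, in the order listed in the inventory table.
CONDITIONAL_AGENTS = {
    "HAS_INFRASTRUCTURE_CHANGE": "qe-chaos-engineer",
    "HAS_PERFORMANCE_SLA": "qe-performance-tester",
    "HAS_REGRESSION_RISK": "qe-regression-analyzer",
    "HAS_RECURRING_INCIDENTS": "qe-pattern-learner",
    "HAS_MIDDLEWARE": "qe-middleware-validator",
    "HAS_SAP_INTEGRATION": "qe-sap-rfc-tester",
    "HAS_AUTHORIZATION": "qe-sod-analyzer",
}

FEEDBACK_AGENTS = ["qe-learning-coordinator", "qe-transfer-specialist"]


def plan_batches(flags: Dict[str, bool]) -> Dict[int, List[str]]:
    """Return the agents to run per batch for the given flag values."""
    unknown = set(flags) - set(CONDITIONAL_AGENTS)
    if unknown:
        raise ValueError(f"unknown flags: {sorted(unknown)}")
    # Assumption: a flag that was not reported counts as FALSE.
    batch2 = [agent for flag, agent in CONDITIONAL_AGENTS.items() if flags.get(flag, False)]
    return {1: list(CORE_AGENTS), 2: batch2, 3: list(FEEDBACK_AGENTS)}
```

With every flag TRUE, the three batches add up to the 12 agents stated above.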

---

## Quality Gate Thresholds

| Metric | HEALTHY | DEGRADED | CRITICAL |
|--------|---------|----------|----------|
| DORA Score | >= 0.7 | 0.4 - 0.69 | < 0.4 |
| SLA Compliance | >= 99% | 95 - 98.9% | < 95% |
| Incident Severity | P3/P4/NONE | P2 | P0/P1 |
| Defect Trend | declining, or stable (density <= 2) | stable (density > 2) | increasing + density > 5 |
| RCA Completeness | >= 80% | 50 - 79% | < 50% |
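
The numeric rows of the table can be read as simple band checks. The sketch below is an illustration only: it is not the Step 5 decision logic, which lives in `steps/05-decision-synthesis.md`. Three assumptions are made here: values that fall between the printed bands (for example a DORA score of 0.695) are treated as DEGRADED, the defect-trend row is left out because the table does not cover every trend and density combination, and `worst_of` is a hypothetical aggregation, not the documented one.

```python
from enum import IntEnum


class Health(IntEnum):
    HEALTHY = 0
    DEGRADED = 1
    CRITICAL = 2


def classify_band(value: float, healthy_min: float, degraded_min: float) -> Health:
    """Classify a higher-is-better metric against two lower bounds."""
    if value >= healthy_min:
        return Health.HEALTHY
    if value >= degraded_min:
        return Health.DEGRADED
    return Health.CRITICAL


def classify_dora(score: float) -> Health:
    return classify_band(score, 0.7, 0.4)


def classify_sla(percent: float) -> Health:
    return classify_band(percent, 99.0, 95.0)


def classify_rca(percent: float) -> Health:
    return classify_band(percent, 80.0, 50.0)


def classify_incident(severity: str) -> Health:
    sev = severity.upper()
    if sev in {"P0", "P1"}:
        return Health.CRITICAL
    if sev == "P2":
        return Health.DEGRADED
    if sev in {"P3", "P4", "NONE"}:
        return Health.HEALTHY
    raise ValueError(f"unknown severity: {severity!r}")


def worst_of(*levels: Health) -> Health:
    # Assumption for illustration: the most severe per-metric level wins.
    return max(levels)
```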

---

## Report Filename Mapping

| Agent | Report Filename | Step |
|-------|----------------|------|
| qe-metrics-optimizer | `02-dora-metrics.md` | 2 |
| qe-defect-predictor | `03-defect-prediction.md` | 2 |
| qe-root-cause-analyzer | `04-root-cause-analysis.md` | 2 |
| qe-chaos-engineer | `05-chaos-resilience.md` | 4 |
| qe-performance-tester | `06-performance-sla.md` | 4 |
| qe-regression-analyzer | `07-regression-analysis.md` | 4 |
| qe-pattern-learner | `08-pattern-analysis.md` | 4 |
| Learning Persistence | `09-learning-persistence.json` | 7 |
| qe-middleware-validator | `10-middleware-health.md` | 4 |
| qe-sap-rfc-tester | `11-sap-health.md` | 4 |
| qe-sod-analyzer | `12-sod-compliance.md` | 4 |
| Feedback agents | `13-feedback-loops.md` | 8 |
| Synthesis | `01-executive-summary.md` | 6 |
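
One way to use this mapping is to check that every expected report exists before the final output step. The sketch below assumes that the executive summary, the learning persistence record, and the feedback report are always produced (they come from mandatory steps 6, 7, and 8); `expected_reports` and `missing_reports` are hypothetical helpers, not part of the skill.

```python
from pathlib import Path
from typing import List

# Agent -> report filename, copied from the mapping table.
REPORT_FILES = {
    "qe-metrics-optimizer": "02-dora-metrics.md",
    "qe-defect-predictor": "03-defect-prediction.md",
    "qe-root-cause-analyzer": "04-root-cause-analysis.md",
    "qe-chaos-engineer": "05-chaos-resilience.md",
    "qe-performance-tester": "06-performance-sla.md",
    "qe-regression-analyzer": "07-regression-analysis.md",
    "qe-pattern-learner": "08-pattern-analysis.md",
    "qe-middleware-validator": "10-middleware-health.md",
    "qe-sap-rfc-tester": "11-sap-health.md",
    "qe-sod-analyzer": "12-sod-compliance.md",
}

# Assumption: these three come from mandatory steps and are always written.
ALWAYS_WRITTEN = [
    "01-executive-summary.md",
    "09-learning-persistence.json",
    "13-feedback-loops.md",
]


def expected_reports(output_folder: Path, spawned_agents: List[str]) -> List[Path]:
    """List the report paths a run should have produced, in filename order."""
    names = set(ALWAYS_WRITTEN)
    for agent in spawned_agents:
        names.add(REPORT_FILES[agent])  # KeyError flags an agent not in the table
    return [output_folder / name for name in sorted(names)]


def missing_reports(output_folder: Path, spawned_agents: List[str]) -> List[Path]:
    """Return the expected reports that are not present on disk."""
    return [p for p in expected_reports(output_folder, spawned_agents) if not p.is_file()]
```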

---

## Execution Model Options

| Model | When to Use | Agent Spawn |
|-------|-------------|-------------|
| **Task Tool** (PRIMARY) | Claude Code sessions | `Task({ subagent_type, run_in_background: true })` |
| **MCP Tools** | MCP server available | `fleet_init({})` / `task_submit({})` |
| **CLI** | Terminal/scripts | `swarm init` / `agent spawn` |

---

## Key Principle

**Production health is measured by outcomes, not intentions. This swarm provides
evidence-based production assessment and closes the QCSD feedback loop.**
