ci-validation-gates

Defensive CI/CD patterns: semver validation, token checks, retry logic, draft detection — earned from v0.8.22

1,828 stars

Best use case

ci-validation-gates is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Defensive CI/CD patterns: semver validation, token checks, retry logic, draft detection — earned from v0.8.22

Teams using ci-validation-gates should expect a more consistent output, faster repeated execution, less prompt rewriting.

When to use this skill

  • You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

  • You only need a quick one-off answer and do not need a reusable workflow.
  • You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$curl -o ~/.claude/skills/ci-validation-gates/SKILL.md --create-dirs "https://raw.githubusercontent.com/bradygaster/squad/main/.copilot/skills/ci-validation-gates/SKILL.md"

Manual Installation

  1. Download SKILL.md from GitHub
  2. Place it in .claude/skills/ci-validation-gates/SKILL.md inside your project
  3. Restart your AI agent — it will auto-discover the skill

How ci-validation-gates Compares

Feature / Agentci-validation-gatesStandard Approach
Platform SupportNot specifiedLimited / Varies
Context Awareness High Baseline
Installation ComplexityUnknownN/A

Frequently Asked Questions

What does this skill do?

Defensive CI/CD patterns: semver validation, token checks, retry logic, draft detection — earned from v0.8.22

Where can I find the source code?

You can find the source code on GitHub using the link provided at the top of the page.

Related Guides

SKILL.md Source

## Context

CI workflows must be defensive. These patterns were learned from the v0.8.22 release disaster where invalid semver, wrong token types, missing retry logic, and draft releases caused a multi-hour outage. Both Drucker (CI/CD) and Trejo (Release Manager) carried this knowledge in their charters — now centralized here.

## Patterns

### Semver Validation Gate
Every publish workflow MUST validate version format before `npm publish`. 4-part versions (e.g., 0.8.21.4) are NOT valid semver — npm mangles them.

```yaml
- name: Validate semver
  run: |
    VERSION="${{ github.event.release.tag_name }}"
    VERSION="${VERSION#v}"
    if ! npx semver "$VERSION" > /dev/null 2>&1; then
      echo "❌ Invalid semver: $VERSION"
      echo "Only 3-part versions (X.Y.Z) or prerelease (X.Y.Z-tag.N) are valid."
      exit 1
    fi
    echo "✅ Valid semver: $VERSION"
```

### NPM Token Type Verification
NPM_TOKEN MUST be an Automation token, not a User token with 2FA:
- User tokens require OTP — CI can't provide it → EOTP error
- Create Automation tokens at npmjs.com → Settings → Access Tokens → Automation
- Verify before first publish in any workflow

### Retry Logic for npm Registry Propagation
npm registry uses eventual consistency. After `npm publish` succeeds, the package may not be immediately queryable.
- Propagation: typically 5-30s, up to 2min in rare cases
- All verify steps: 5 attempts, 15-second intervals
- Log each attempt: "Attempt 1/5: Checking package..."
- Exit loop on success, fail after max attempts

```yaml
- name: Verify package (with retry)
  run: |
    MAX_ATTEMPTS=5
    WAIT_SECONDS=15
    for attempt in $(seq 1 $MAX_ATTEMPTS); do
      echo "Attempt $attempt/$MAX_ATTEMPTS: Checking $PACKAGE@$VERSION..."
      if npm view "$PACKAGE@$VERSION" version > /dev/null 2>&1; then
        echo "✅ Package verified"
        exit 0
      fi
      [ $attempt -lt $MAX_ATTEMPTS ] && sleep $WAIT_SECONDS
    done
    echo "❌ Failed to verify after $MAX_ATTEMPTS attempts"
    exit 1
```

### Draft Release Detection
Draft releases don't emit `release: published` event. Workflows MUST:
- Trigger on `release: published` (NOT `created`)
- If using workflow_dispatch: verify release is published via GitHub API before proceeding

### Build Script Protection
Set `SKIP_BUILD_BUMP=1` (or `$env:SKIP_BUILD_BUMP = "1"` on Windows) before ANY release build. bump-build.mjs is for dev builds ONLY — it silently mutates versions.

## Known Failure Modes (v0.8.22 Incident)

| # | What Happened | Root Cause | Prevention |
|---|---------------|-----------|------------|
| 1 | 4-part version published, npm mangled it | No semver validation gate | `npx semver` check before every publish |
| 2 | CI failed 5+ times with EOTP | User token with 2FA | Automation token only |
| 3 | Verify returned false 404 | No retry logic for propagation | 5 attempts, 15s intervals |
| 4 | Workflow never triggered | Draft release doesn't emit event | Never create draft releases |
| 5 | Version mutated during release | bump-build.mjs ran in release | SKIP_BUILD_BUMP=1 |

## Anti-Patterns
- ❌ Publishing without semver validation gate
- ❌ Single-shot verification without retry
- ❌ Hard-coded secrets in workflows
- ❌ Silent CI failures — every error needs actionable output with remediation
- ❌ Assuming npm publish is instantly queryable

Related Skills

My Skill

1828
from bradygaster/squad

No description provided.

rework-rate

1828
from bradygaster/squad

Measure and interpret PR rework rate — the emerging 5th DORA metric

project-conventions

1828
from bradygaster/squad

Core conventions and patterns for this codebase

tiered-memory

1828
from bradygaster/squad

Three-tier agent memory model (hot/cold/wiki) for 20-55% context reduction per spawn

test-discipline

1828
from bradygaster/squad

Update tests when changing APIs — no exceptions

Skill: Retro Enforcement

1828
from bradygaster/squad

## Purpose

reflect

1828
from bradygaster/squad

Learning capture system that extracts HIGH/MED/LOW confidence patterns from conversations to prevent repeating mistakes. Use after user corrections ("no", "wrong"), praise ("perfect", "exactly"), or when discovering edge cases. Complements .squad/agents/{agent}/history.md and .squad/decisions.md.

notification-routing

1828
from bradygaster/squad

Route agent notifications to specific channels by type — prevent alert fatigue from single-channel flooding

iterative-retrieval

1828
from bradygaster/squad

Max-3-cycle protocol for agent sub-tasks with WHY context and coordinator validation. Use when spawning sub-agents to complete scoped work.

error-recovery

1828
from bradygaster/squad

Standard recovery patterns for all squad agents. When something fails, adapt — don't just report the failure.

docs-standards

1828
from bradygaster/squad

Microsoft Style Guide + Squad-specific documentation patterns

{skill-name}

1828
from bradygaster/squad

{what this skill teaches agents}