gh-actions-wisdom
GitHub Actions workflow best practices and pitfalls reference. Use when: (1) Writing or reviewing .yml workflows, (2) Setting up CI/CD pipelines, (3) Debugging slow, expensive, or stuck workflow runs, (4) User says 'gh actions', 'github actions', 'workflow best practices', (5) Before creating or modifying any .github/workflows/ file. Keywords: GitHub Actions, CI/CD, workflow, timeout, concurrency, security, caching.
Best use case
gh-actions-wisdom is best used when you need a repeatable AI agent workflow instead of a one-off prompt.
GitHub Actions workflow best practices and pitfalls reference. Use when: (1) Writing or reviewing .yml workflows, (2) Setting up CI/CD pipelines, (3) Debugging slow, expensive, or stuck workflow runs, (4) User says 'gh actions', 'github actions', 'workflow best practices', (5) Before creating or modifying any .github/workflows/ file. Keywords: GitHub Actions, CI/CD, workflow, timeout, concurrency, security, caching.
Teams using gh-actions-wisdom should expect a more consistent output, faster repeated execution, less prompt rewriting.
When to use this skill
- You want a reusable workflow that can be run more than once with consistent structure.
When not to use this skill
- You only need a quick one-off answer and do not need a reusable workflow.
- You cannot install or maintain the underlying files, dependencies, or repository context.
Installation
Claude Code / Cursor / Codex
Manual Installation
- Download SKILL.md from GitHub
- Place it in
.claude/skills/gh-actions-wisdom/SKILL.mdinside your project - Restart your AI agent — it will auto-discover the skill
How gh-actions-wisdom Compares
| Feature / Agent | gh-actions-wisdom | Standard Approach |
|---|---|---|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |
Frequently Asked Questions
What does this skill do?
GitHub Actions workflow best practices and pitfalls reference. Use when: (1) Writing or reviewing .yml workflows, (2) Setting up CI/CD pipelines, (3) Debugging slow, expensive, or stuck workflow runs, (4) User says 'gh actions', 'github actions', 'workflow best practices', (5) Before creating or modifying any .github/workflows/ file. Keywords: GitHub Actions, CI/CD, workflow, timeout, concurrency, security, caching.
Where can I find the source code?
You can find the source code on GitHub using the link provided at the top of the page.
SKILL.md Source
# GitHub Actions Wisdom
Reference best practices before writing or reviewing any GitHub Actions workflow.
Load topic-specific references as needed from `references/`.
## Runner Context Matters
Several rules below depend on whether your jobs run on **ephemeral cloud runners** (GitHub-hosted `ubuntu-latest`, RunsOn, BuildJet, Namespace, etc. — fresh VM per job, wiped between runs) or **persistent self-hosted runners** (long-lived machines with state that carries across runs). Advice that is correct in one context can be a hard-to-debug bug in the other.
| Concern | Ephemeral cloud runners | Persistent self-hosted runners |
| ---------------------------------- | ---------------------------------------------------- | ----------------------------------------------------------- |
| `actions/cache` for build tools | **Use it** — disk is wiped between runs | Avoid — local disk is already the cache |
| `set-safe-directory: false` | **Don't set** — containers need the default | Set it — avoids `~/.gitconfig` pollution |
| Manual workspace cleanup steps | Not needed — fresh VM each run | Often needed — workspace persists |
| `chown` workspace at job end | Not needed — VM is destroyed | Sometimes needed for next-run access |
| `detect-runner` fallback pattern | Obsolete — the cloud runner IS the runner | Useful when mixing self-hosted + GitHub-hosted |
**Migration warning.** When moving a workflow from self-hosted to ephemeral (or vice versa), audit every step and option that was added "for the runner". Leftover self-hosted-isms on a cloud runner produce mysterious failures: `pnpm: command not found` (no setup step because pnpm was preinstalled), `Cache not found` between jobs (cache backend differs), `fatal: detected dubious ownership` (because `set-safe-directory: false` is now actively wrong), etc. Specific rules below are gated by runner context where it matters.
## Critical Rules (Always Apply)
### 1. Always Set `timeout-minutes`
The default timeout is **360 minutes (6 hours)**. A stuck job silently burns runner minutes.
```yaml
jobs:
build:
runs-on: ubuntu-latest
timeout-minutes: 15 # ALWAYS set this
```
Recommended values:
| Job type | timeout-minutes |
| ---------------- | --------------- |
| Lint / typecheck | 5-10 |
| Unit tests | 10-15 |
| Build | 15-30 |
| E2E tests | 30-60 |
| Docker build | 15-30 |
| Deploy | 10-15 |
| Notification | 5 |
### 2. Always Set Concurrency Control
Prevent redundant runs and protect production deploys.
```yaml
# PR checks: cancel previous runs on new push
concurrency:
group: ${{ github.workflow }}-${{ github.ref }}
cancel-in-progress: true
# Production deploy: never cancel in-progress
concurrency:
group: deploy-production
cancel-in-progress: false
```
### 3. Always Declare Permissions
Never rely on default permissions. Declare explicitly per workflow or per job.
```yaml
permissions:
contents: read
jobs:
deploy:
permissions:
contents: read
deployments: write
```
### 4. Pin Actions to Full SHA
Tags are mutable. The March 2025 `tj-actions/changed-files` supply chain attack (CVE-2025-30066) compromised 23,000+ repos via rewritten tags.
```yaml
# Bad - tag can be rewritten
- uses: actions/checkout@v4
# Good - immutable SHA
- uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2
```
**Caveat**: Some repos (e.g., `pnpm/action-setup`) have force-pushed, invalidating previously pinned SHAs. If CI fails with `Unable to resolve action ... unable to find version`, look up the current SHA via `gh api repos/OWNER/REPO/git/ref/tags/vX.Y.Z`. See [references/security.md](references/security.md) for the full diagnostic procedure.
### 5. Do NOT Cache Package Managers (pnpm/npm/yarn)
Do **not** use `cache: 'pnpm'` (or `cache: 'npm'`, `cache: 'yarn'`) in `actions/setup-node`. GitHub Actions cache restore is often **slower** than a fresh `pnpm install` from npm's CDN. npm's CDN is highly optimized for package downloads, while GitHub's cache API has significant overhead for large stores (especially 1GB+). Benchmarking confirmed: direct install from CDN consistently beats cache restore + install.
```yaml
# BAD - cache restore adds overhead, slower than fresh install
- uses: actions/setup-node@v4
with:
node-version-file: .node-version
cache: pnpm # REMOVE THIS
# GOOD - just install directly
- uses: actions/setup-node@v4
with:
node-version-file: .node-version
- run: pnpm install
```
This is especially true for **self-hosted runners** where the pnpm store is already local — caching to GitHub's remote cache and restoring it is pointless overhead.
### 6. `set-safe-directory`: leave default on ephemeral runners, set `false` on self-hosted
`actions/checkout` defaults `set-safe-directory` to `true`, which runs `git config --global --add safe.directory` on every run.
**Ephemeral cloud runners** — leave the default (`true`). Each run is a fresh VM, so there is no gitconfig to pollute. The default is also required for container jobs whose UID differs from the host runner user; without it, git inside the container errors with `fatal: detected dubious ownership` when it tries to operate on the mounted workspace.
```yaml
# GOOD on ephemeral runners — let checkout do its default thing
- uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2
```
**Persistent self-hosted runners** — set it to `false`. Otherwise `~/.gitconfig` accumulates a duplicate `safe.directory` entry on every run, polluting the shared gitconfig across every repo on that machine.
```yaml
# GOOD on self-hosted runners — prevent gitconfig pollution
- uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2
with:
set-safe-directory: false
```
When migrating self-hosted → ephemeral, **forgetting to remove `set-safe-directory: false`** is a common gotcha. Non-container jobs may still work (the runner user owns the workspace), but the moment a job runs in a container, git inside hits dubious-ownership and fails with confusing errors.
#### Container jobs need an extra manual step (regardless of runner type)
`actions/checkout` (a node action) writes safe.directory to the node-action HOME (`/root/.gitconfig` inside many containers). Shell `run:` steps inside the container have a different HOME (`/github/home`), so they read a different gitconfig and don't see the safe.directory entry. Lifecycle scripts (`pnpm install` calling `prepare` → `lefthook install` → git) then fail with `fatal: detected dubious ownership`.
For **container jobs**, add a manual step before checkout that writes safe.directory to the shell-side gitconfig:
```yaml
test:
runs-on: ubuntu-latest
container:
image: foo:bar
steps:
- name: Mark workspace as safe for git
run: git config --global --add safe.directory "$GITHUB_WORKSPACE"
- uses: actions/checkout@v4
# ... rest of the job
```
This is orthogonal to the `set-safe-directory` option — it covers shell-step git invocations, which checkout's option doesn't reliably reach in container jobs. Plain (non-container) jobs do not need it.
### 7. `actions/cache` for build tools: yes on ephemeral, no on self-hosted
**Persistent self-hosted runners** — build tool caches (Cargo, Go modules, Gradle, etc.) already persist on the runner's local disk. Using `actions/cache` uploads them to GitHub's remote cache API on every run and creates duplicate entries, wasting storage.
```yaml
# BAD on self-hosted — uploads local cache to remote on every run
- uses: actions/cache@v4
with:
path: ~/.cargo/registry
key: cargo-${{ hashFiles('Cargo.lock') }}
# GOOD on self-hosted — just use the local disk cache directly
# (no actions/cache step needed)
```
**Ephemeral cloud runners** — disk is wiped between runs, so `actions/cache` is essential to avoid re-downloading the dependency tree from scratch every time. Use it for `~/.cargo/registry`, `~/.gradle/caches`, the Go module cache, etc.
```yaml
# GOOD on ephemeral runners — survives across runs
- uses: actions/cache@v4
with:
path: ~/.cargo/registry
key: cargo-${{ runner.os }}-${{ hashFiles('Cargo.lock') }}
```
Note: rule 5 ("Don't cache package managers in `setup-node`") still applies on both runner types — that rule is about npm package downloads where the CDN is faster than cache restore. Rule 7 is about general build-tool caches.
### 8. Avoid `curl | sh` Installers — Use Prebuilt-Binary Actions
Installer scripts like `curl https://.../init.sh | sh` (wasm-pack, rustup, many language toolchains) do **one** HTTP request with no retry. A single transient 5xx from the redirect target (e.g., a GitHub release asset) kills the entire workflow. Seen in the wild: rustwasm.github.io → `github.com/rustwasm/wasm-pack/releases/...` returning 504 mid-deploy.
```yaml
# BAD — one curl, no retry, fails on any 5xx
- name: Install wasm-pack
run: curl https://rustwasm.github.io/wasm-pack/installer/init.sh -sSf | sh
# GOOD — prebuilt binary from GitHub releases, with retries + runner caching
- uses: taiki-e/install-action@v2
with:
tool: wasm-pack
```
`taiki-e/install-action` covers most Rust/Go/Node tools (`wasm-pack`, `cargo-nextest`, `just`, `mdbook`, etc.). For tools it doesn't cover, use `actions/cache` on a pinned-version binary, or wrap the curl in a retry loop with `curl --retry 5 --retry-all-errors --retry-delay 5`.
### 9. Use Cache (Not Artifacts) for Inter-Job Data Sharing
`upload-artifact`/`download-artifact` counts toward **shared org storage** (often limited). For passing build output between jobs in the same workflow, use `actions/cache` instead — it has a **separate 10 GB per-repo limit**.
```yaml
# BAD — artifacts accumulate in shared org storage
- uses: actions/upload-artifact@v4
with:
name: build-output
path: dist/
retention-days: 1
# GOOD — cache uses separate per-repo quota
- uses: actions/cache/save@v4
with:
path: dist/
key: build-${{ github.run_id }}
# In the downstream job:
- uses: actions/cache/restore@v4
with:
path: dist/
key: build-${{ github.run_id }}
```
If the build and deploy steps can run on the same runner, merging them into a single job is even simpler.
**Caveat for cloud runners that proxy the cache layer** (e.g., RunsOn with `extras=s3-cache+magic-cache`) — the runner injects a sidecar at `ACTIONS_RESULTS_URL` that intercepts both v2 cache and v4 artifact API calls. The sidecar speaks the cache protocol but **does not always speak the v4 artifact protocol**. With magic-cache enabled, `actions/upload-artifact@v4` may fail with `Unexpected token '...' is not valid JSON` because the sidecar returns plain-text errors for artifact endpoints.
If you hit this, the symptoms vary by transport:
- **Cache-based passing across instances**: works only if the sidecar is reachable. From inside a container job whose docker network is isolated from the runner host, the sidecar's host IP is unreachable → `Cache not found` even when the upstream job successfully saved.
- **Artifact-based passing**: works if you remove the proxy interception (drop `magic-cache` from the `runs-on` label) so v4 artifact calls reach `api.github.com` directly.
When in doubt on a cloud runner that proxies caching, prefer `upload-artifact`/`download-artifact` over `actions/cache` and disable any cache-proxy extras. Artifacts go straight to the GitHub API which is reachable from any container or instance.
## Quick Reference by Topic
For detailed guidance, read the appropriate reference file:
- **Timeouts and resource limits**: See [references/timeouts.md](references/timeouts.md)
- **Security**: See [references/security.md](references/security.md) - action pinning, `pull_request_target`, script injection, secrets, OIDC
- **Performance**: See [references/performance.md](references/performance.md) - caching, path filters, matrix, parallelization
- **Reliability**: See [references/reliability.md](references/reliability.md) - retries, error handling, conditional execution
- **Anti-patterns**: See [references/anti-patterns.md](references/anti-patterns.md) - common foot guns and how to avoid them
- **Workflow organization**: See [references/organization.md](references/organization.md) - reusable workflows, composite actions, splitting strategies
## Debugging: Local First, Push Second
**Never debug CI issues by pushing and waiting.** CI runs consume time (10-15 min per cycle) and runner minutes. Always verify locally first:
```bash
# Run the same checks CI runs, locally
pnpm check # typecheck + lint + format
pnpm build # production build
pnpm test # unit tests
# Only after ALL pass locally:
git push
# Then monitor:
/watch-ci
```
**The workflow**: fix locally → verify locally → push once → `/watch-ci`. If CI fails after local verification, it's either an environment difference (Node version, missing env vars) or a path/dependency issue specific to CI — much easier to diagnose than a code bug.
## Workflow Review Checklist
When reviewing or writing a workflow, verify:
1. Every job has `timeout-minutes`
2. `concurrency` group is set with appropriate `cancel-in-progress`
3. `permissions` are declared (least privilege)
4. Third-party actions pinned to SHA with version comment
5. `pull_request_target` is NOT used with PR code checkout
6. No string interpolation of user-controlled values in `run:` blocks
7. Secrets passed individually, not via `secrets: inherit`
8. No `cache:` parameter in `setup-node` (fresh install from CDN is faster — see rule 5)
9. Path filters used where possible to skip irrelevant runs
10. Deploy steps have retry logic for network operations
11. `actions/checkout` matches the runner type — default on ephemeral, `set-safe-directory: false` on self-hosted only (see rule 6)
12. `actions/cache` for build tools matches the runner type — used on ephemeral, NOT on self-hosted (see rule 7)
13. No `curl | sh` installers — use `taiki-e/install-action` or similar with retries (see rule 8)
14. Inter-job data sharing uses `actions/cache` not `upload-artifact` to avoid org storage limits — but switch to artifacts when a cloud runner's cache-proxy sidecar (e.g. RunsOn `magic-cache`) breaks v4 caching from container jobs (see rule 9)
15. When migrating between self-hosted and ephemeral runners, audit every step for runner-type-specific options that may now be wrong (see "Runner Context Matters")Related Skills
prototype-first-wisdom
Solve a complex bug or design problem by building a tiny isolated prototype first, instead of patching the production system in place. Trigger PROACTIVELY when (1) the same bug has resisted 2+ in-place fix attempts (fail-retry loop), (2) the user mentions "minimal prototype", "from zero", "from scratch", "simple script", "sandbox", "standalone", "isolate", "play around", or "try a sandbox version", (3) you find yourself ranking a list of suspects and ruling them out via source-grep on a runtime/visual bug, (4) the user is brainstorming many design options for a UI surface and wants speed (e.g., "make 20 patterns of the top page"), (5) the next reasonable step would be "instrument the existing complex code" — pause and consider this skill instead. Build the prototype in the repo-scoped Dropbox-synced cclogs dir (`$DROPBOX_CCLOGS_DIR/<repo>/<descriptive-name>/`) so it survives switching between Mac and WSL; the exception is a prototype that must import the repo's production code or use its workspace/Vite tooling — keep that one in `__inbox/<descriptive-name>/` in the project root (in-repo, gitignored) so relative imports resolve. Match the project's tech stack (HTML+CSS+vanilla JS for static sites, Vite+React for React apps, Node script for CLI/utility logic). Don't commit it — its value is the learning, not the artifact. **Variant for repeated regression cycles (8+ in-place fixes on the same bug class):** keep the prototype as a committed sub-package named `packages/prototype-<topic>/` — see the "Variant: project-level reference prototype" section below.
dev-gh-actions-doc-auto-merge
Create a GitHub Actions workflow that auto-merges a production branch into a documentation branch. Use when: (1) Setting up auto-sync from production to doc branch, (2) User mentions 'doc auto merge', 'auto sync docs', 'document branch sync', (3) User wants docs to stay up-to-date with production automatically.
dev-actions-self-runner
Add self-hosted runner support with automatic fallback to GitHub-hosted runners in GitHub Actions workflows. Use when: (1) User wants to add self-hosted runner support, (2) User says 'self-hosted runner', 'add self runner', 'self-hosted fallback', (3) User wants to save GitHub Actions minutes.
b4push-wisdom
Guide for setting up before-push validation (b4push) and CI checking. Covers analyzing project structure, creating run-b4push.sh, adding package.json entry, creating project-specific b4push skill, setting up GitHub Actions CI. Use when: (1) User says 'set up b4push', 'add CI', 'before push checks', (2) Setting up a new project's validation workflow, (3) User wants CI + local validation.
zudoesa-articlify
Convert conversation context into an esa article via the zudoesa-writer subagent. ONLY invoke when the user explicitly asks — NEVER proactively propose. Triggers: 'write esa article', 'esa記事', 'esaに書いて', 'articlify for esa', or /zudoesa-articlify. Gathers context, creates a writing brief, delegates to the writer subagent.
zudoesa-apply-voice
Apply Takazudo's esa writing voice and vocabulary rules to text. Use when: (1) User wants to write/rewrite text in Takazudo's esa style, (2) User says 'apply voice', 'esa voice', 'esa文体で', 'esa風に書いて', '文体を適用', (3) User provides text to transform to esa style. Reads writing-style.md and vocabulary-rule.md from takazudo-esa-writing repo and applies the rules.
zudocg-articlify
Convert conversation context into a CodeGrid article via the zudocg-writer subagent. ONLY invoke when the user explicitly asks — NEVER proactively propose. Triggers: 'write codegrid article', 'CodeGrid記事', 'codegridに書いて', 'articlify for codegrid', or /zudocg-articlify. Gathers context, creates a writing brief, delegates to the writer subagent.
zudocg-apply-voice
Apply Takazudo's CodeGrid writing voice and vocabulary rules to text. Use when: (1) User wants to write/rewrite text in Takazudo's CodeGrid style, (2) User says 'apply voice', 'codegrid voice', 'codegrid文体で', 'codegrid風に書いて', '文体を適用', (3) User provides text to transform to CodeGrid style. Reads writing-style.md and vocabulary-rule.md from takazudo-codegrid-writing repo and applies the rules.
zpaper-articlify
Convert conversation context into a zpaper blog article via the zpaper-writer subagent. ONLY invoke when the user explicitly asks — NEVER proactively propose. Triggers: 'write zpaper article', 'zpaper記事', 'zpaperに書いて', 'articlify for zpaper', or /zpaper-articlify. Gathers context, creates a writing brief, delegates to the writer subagent.
zpaper-apply-voice
Apply Takazudo's zpaper blog writing voice and vocabulary rules to text. Use when: (1) User wants to write/rewrite text in Takazudo's zpaper style, (2) User says 'apply voice', 'zpaper voice', 'zpaper文体で', 'zpaper風に書いて', 'ブログ文体を適用', (3) User provides text to transform to zpaper style. Reads writing-style.md and vocabulary-rule.md from the zpaper repo and applies the rules.
xlsx
Spreadsheet creation, editing, and analysis. Use when working with .xlsx, .xlsm, .csv, .tsv files for: (1) Creating spreadsheets with formulas and formatting, (2) Reading or analyzing data, (3) Modifying existing spreadsheets while preserving formulas, (4) Data analysis and visualization, (5) Recalculating formulas.
x
Facade for development workflows. Routes on two axes: plan-first vs implement-now (escalates to /big-plan -a when the request needs research / decomposition / has unclear scope — the appended -a makes the plan chain into implementation in-session), then single vs multi on the ready-to-build fast paths (/x-as-pr single-topic, /x-wt-teams multi-topic parallel). Use when: (1) User says '/x' followed by dev instructions, (2) User wants to start development without choosing the workflow skill, (3) User says 'dev', 'implement', or 'build' with a task. Default option: -v (verify-ui). Review-loop (-l) is opt-in — without -l the downstream skill runs a single /deep-review pass. Forwards -a (autonomy/auto-chain) and -m (merge at the end + cleanup + CI watch) through every route; auto-fix of raised findings (-f) and issue-raising (-ri) are downstream defaults, with -nf/--no-fix and -nori/--no-raise-issues as the forwarded opt-outs. -a and -m are orthogonal — full hands-off end-to-end is -a -m.