orchestrating-test-execution
Coordinate parallel test execution across multiple environments and frameworks. Use when performing specialized testing. Trigger with phrases like "orchestrate tests", "run parallel tests", or "coordinate test execution".
Best use case
orchestrating-test-execution is best used when you need a repeatable AI agent workflow instead of a one-off prompt.
Teams using orchestrating-test-execution should expect more consistent output, faster repeated execution, and less prompt rewriting.
When to use this skill
- You want a reusable workflow that can be run more than once with consistent structure.
When not to use this skill
- You only need a quick one-off answer and do not need a reusable workflow.
- You cannot install or maintain the underlying files, dependencies, or repository context.
Installation
Claude Code / Cursor / Codex
Manual Installation
- Download SKILL.md from GitHub
- Place it in `.claude/skills/orchestrating-test-execution/SKILL.md` inside your project
- Restart your AI agent; it will auto-discover the skill
How orchestrating-test-execution Compares
| Feature / Agent | orchestrating-test-execution | Standard Approach |
|---|---|---|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |
Frequently Asked Questions
What does this skill do?
It coordinates parallel test execution across multiple environments and frameworks. Use it when performing specialized testing. Trigger it with phrases like "orchestrate tests", "run parallel tests", or "coordinate test execution".
Where can I find the source code?
You can find the source code on GitHub using the link provided at the top of the page.
SKILL.md Source
# Test Orchestrator
## Overview
Coordinate parallel test execution across multiple test suites, frameworks, and environments. Manages test splitting, worker allocation, result aggregation, and intelligent retry strategies.
## Prerequisites
- Test runner with parallel execution support (Jest, Vitest, pytest-xdist, Playwright, or JUnit 5)
- CI/CD platform configured (GitHub Actions, GitLab CI, CircleCI, or Jenkins)
- Test suite with consistent pass rates (flaky tests identified and tagged)
- Sufficient CI runner resources for parallel worker count
- Test result reporting tool (JUnit XML, Allure, or equivalent)
## Instructions
1. Analyze the existing test suite using Grep and Glob to catalog all test files, their framework, approximate run time, and dependency requirements.
2. Classify tests into execution tiers:
   - **Tier 1 (Fast)**: Unit tests with no I/O -- target under 30 seconds total.
   - **Tier 2 (Medium)**: Integration tests requiring local services -- target under 3 minutes.
   - **Tier 3 (Slow)**: E2E and browser tests -- target under 10 minutes.
3. Configure parallel execution for each tier:
   - Split unit tests across N workers using `jest --shard=i/N` or `pytest -n auto`.
   - Shard E2E tests by test file using Playwright `--shard=i/N` or Cypress parallelization.
   - Assign heavier integration tests to dedicated workers with more resources.
4. Create a CI pipeline configuration that runs tiers in parallel:
   - Tier 1 and Tier 2 run concurrently on separate jobs.
   - Tier 3 runs after a fast pre-check gate passes.
   - Each tier reports results to a unified collection step.
5. Implement intelligent retry logic for flaky tests:
   - Tag known flaky tests with `@flaky` or equivalent marker.
   - Retry failed tests up to 2 times before marking as failed.
   - Track flaky test frequency in a log file for triage.
6. Aggregate results from all parallel workers into a single report:
   - Merge JUnit XML files from each shard.
   - Calculate total pass/fail/skip counts and execution time.
   - Identify the slowest tests for optimization targets.
7. Write the orchestration configuration to the project's CI config file and validate it with a dry run.
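The splitting in step 3 is normally handled by the runner itself (`--shard` or `-n`). When a custom split is needed, for example to balance shards by measured run time rather than by file count, one possible approach is a greedy "slowest file first" assignment. The following is a minimal sketch under that assumption; `split_into_shards` and the timing values are hypothetical and not part of any runner's API.

```python
import heapq
from typing import Dict, List


def split_into_shards(durations: Dict[str, float], shard_count: int) -> List[List[str]]:
    """Assign test files to shards so the total run time per shard is roughly even."""
    if shard_count < 1:
        raise ValueError("shard_count must be at least 1")
    # Min-heap of (total_seconds, shard_index): the lightest shard is always on top.
    heap = [(0.0, index) for index in range(shard_count)]
    heapq.heapify(heap)
    shards: List[List[str]] = [[] for _ in range(shard_count)]
    # Place the slowest files first; ties are broken by file name for determinism.
    for path, seconds in sorted(durations.items(), key=lambda kv: (-kv[1], kv[0])):
        total, index = heapq.heappop(heap)
        shards[index].append(path)
        heapq.heappush(heap, (total + seconds, index))
    return shards


if __name__ == "__main__":
    # Hypothetical timings in seconds, e.g. read from a previous JUnit report.
    timings = {"a_test.py": 40.0, "b_test.py": 30.0, "c_test.py": 20.0, "d_test.py": 10.0}
    for number, files in enumerate(split_into_shards(timings, 2), start=1):
        print(f"shard {number}/2: {files}")
```

Each resulting list can be passed to the runner as explicit file arguments. If `shard_count` exceeds the number of files, some shards come back empty, which is worth checking before a job is started.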
## Output
- CI pipeline configuration file (`.github/workflows/test.yml`, `.gitlab-ci.yml`, or equivalent)
- Test sharding configuration with worker count and split strategy
- Merged test result report in JUnit XML or JSON format
- Execution timeline showing parallel job durations and bottlenecks
- Flaky test inventory with retry counts and failure patterns
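The flaky test inventory can be fed by a thin wrapper around the test command. The sketch below illustrates the "retry up to 2 times" rule from the instructions using only the Python standard library; `run_with_retries`, `append_to_log`, and the `flaky-tests.jsonl` file name are hypothetical names chosen for illustration, not part of any test runner.

```python
import json
import subprocess
from typing import List


def run_with_retries(command: List[str], max_retries: int = 2) -> dict:
    """Run a test command, retrying on a non-zero exit code up to max_retries times."""
    attempts = 0
    while True:
        attempts += 1
        exit_code = subprocess.run(command).returncode
        if exit_code == 0 or attempts > max_retries:
            break
    return {
        "command": command,
        "attempts": attempts,
        "passed": exit_code == 0,
        # Passed only after at least one retry: a candidate for the flaky inventory.
        "flaky": exit_code == 0 and attempts > 1,
    }


def append_to_log(record: dict, log_path: str = "flaky-tests.jsonl") -> None:
    """Append one JSON line per run so flaky frequency can be counted later."""
    with open(log_path, "a", encoding="utf-8") as log:
        log.write(json.dumps(record) + "\n")
```

A run that passes on the first attempt is not marked flaky, and a run that still fails after the last retry is reported as failed. In practice the retried command would usually be narrowed to the failing tests (for pytest, `--last-failed` does this) rather than rerunning the whole shard.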
## Error Handling
| Error | Cause | Solution |
|-------|-------|---------|
| Shard produces zero tests | Uneven test distribution or incorrect shard index | Verify that the shard count does not exceed the number of test files and that shard indices run from 1 to N; use file-based splitting |
| Worker out of memory | Too many parallel processes on one runner | Reduce `--maxWorkers` or `-n` count; increase runner memory; use `--workerIdleMemoryLimit` |
| Test ordering dependency | Tests pass in isolation but fail in specific shard order | Add `--randomize` flag; fix shared state leaks; enforce test independence |
| Result aggregation mismatch | Missing shard results due to job timeout | Set job-level timeouts higher than test timeouts; add result upload as a separate step |
| CI cache miss slowing startup | Dependencies not cached between parallel jobs | Configure dependency caching per lockfile hash; use a shared setup job |
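The "Result aggregation mismatch" row is easier to catch when the aggregation step checks which shards actually reported before it sums the counts. The following is a minimal sketch, assuming each shard uploads one JUnit XML file whose `<testsuite>` elements carry the usual `tests`, `failures`, `errors`, `skipped`, and `time` attributes; `summarize_reports` and `missing_shards` are hypothetical helper names.

```python
import xml.etree.ElementTree as ET
from typing import Dict, Iterable, List


def summarize_reports(paths: Iterable[str]) -> Dict[str, float]:
    """Sum test counts and run time over every <testsuite> element in the given files."""
    totals: Dict[str, float] = {"tests": 0, "failures": 0, "errors": 0, "skipped": 0, "time": 0.0}
    for path in paths:
        root = ET.parse(path).getroot()
        # The root is either <testsuites> or a single <testsuite>; iter() covers both.
        # Assumption: suites are not nested, otherwise counts would be added twice.
        for suite in root.iter("testsuite"):
            for key in ("tests", "failures", "errors", "skipped"):
                totals[key] += int(suite.get(key, 0))
            totals["time"] += float(suite.get("time", 0))
    return totals


def missing_shards(expected: int, found: Iterable[int]) -> List[int]:
    """Return the 1-based shard numbers that did not deliver a report."""
    present = set(found)
    return [number for number in range(1, expected + 1) if number not in present]
```

Failing the aggregation job when `missing_shards` returns a non-empty list keeps a timed-out shard from being read as a smaller but fully passing run.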
## Examples
**GitHub Actions matrix strategy for Jest sharding:**
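```yaml
jobs:
  test:
    runs-on: ubuntu-latest
    strategy:
      fail-fast: false  # let every shard finish so the merged report is complete
      matrix:
        shard: [1, 2, 3, 4]
    steps:
      - uses: actions/checkout@v4
      - run: npm ci  # assumes a lockfile and jest-junit in devDependencies
      - run: npx jest --shard=${{ matrix.shard }}/4 --ci --reporters=jest-junit
      - uses: actions/upload-artifact@v4
        if: always()  # upload results even when the shard has failing tests
        with:
          name: results-${{ matrix.shard }}
          path: junit.xml
  merge:
    needs: test
    if: always()  # aggregate even when a shard failed
    runs-on: ubuntu-latest
    steps:
      - uses: actions/download-artifact@v4
        with:
          path: results  # each artifact lands in results/<artifact-name>/
      # Assumes the npm junit-merge CLI; -r recurses into the per-shard folders.
      - run: npx junit-merge -r -d results -o merged-results.xml
```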
**pytest-xdist parallel execution:**
```bash
pytest -n auto --dist worksteal -q --junitxml=results.xml
```
**Playwright sharded execution:**
```bash
npx playwright test --shard=1/3 --reporter=junit
```
## Resources
- Jest sharding: https://jestjs.io/docs/cli#--shardshardindex-shardcount
- pytest-xdist: https://pytest-xdist.readthedocs.io/
- Playwright test sharding: https://playwright.dev/docs/test-sharding
- GitHub Actions matrix strategy: https://docs.github.com/en/actions/using-jobs/using-a-matrix-for-your-jobs
- JUnit XML merge tools: https://github.com/imsky/junit-merge
Related Skills
test-skill
Test skill for E2E validation. Trigger with "run test skill" or "execute test". Use this skill when testing skill activation and tool permissions.
testing-visual-regression
Detect visual changes in UI components using screenshot comparison. Use when detecting unintended UI changes or pixel differences. Trigger with phrases like "test visual changes", "compare screenshots", or "detect UI regressions".
generating-unit-tests
Automatically generate comprehensive unit tests from source code covering happy paths, edge cases, and error conditions. Use when creating test coverage for functions, classes, or modules. Trigger with phrases like "generate unit tests", "create tests for", or "add test coverage".
generating-test-reports
Generate comprehensive test reports with metrics, coverage, and visualizations. Use when performing specialized testing. Trigger with phrases like "generate test report", "create test documentation", or "show test metrics".
managing-test-environments
Provision and manage isolated test environments with configuration and data. Use when performing specialized testing. Trigger with phrases like "manage test environment", "provision test env", or "setup test infrastructure".
generating-test-doubles
Generate mocks, stubs, spies, and fakes for dependency isolation. Use when creating mocks, stubs, or test isolation fixtures. Trigger with phrases like "generate mocks", "create test doubles", or "setup stubs".
generating-test-data
Generate realistic test data including edge cases and boundary conditions. Use when creating realistic fixtures or edge case test data. Trigger with phrases like "generate test data", "create fixtures", or "setup test database".
analyzing-test-coverage
Analyze code coverage metrics and identify untested code paths. Use when analyzing untested code or coverage gaps. Trigger with phrases like "analyze coverage", "check test coverage", or "find untested code".
managing-snapshot-tests
Create and validate component snapshots for UI regression testing. Use when performing specialized testing. Trigger with phrases like "update snapshots", "test UI snapshots", or "validate component snapshots".
running-smoke-tests
Execute fast smoke tests validating critical functionality after deployment. Use when performing specialized testing. Trigger with phrases like "run smoke tests", "quick validation", or "test critical paths".
performing-security-testing
Automate security vulnerability testing covering OWASP Top 10, SQL injection, XSS, CSRF, and authentication issues. Use when performing security assessments, penetration tests, or vulnerability scans. Trigger with phrases like "scan for vulnerabilities", "test security", or "run penetration test".
tracking-regression-tests
Track and manage regression test suites across releases. Use when performing specialized testing. Trigger with phrases like "track regressions", "manage regression suite", or "validate against baseline".