Best use case
test is best used when you need a repeatable AI agent workflow instead of a one-off prompt.
Comprehensive testing workflow - unit tests ∥ integration tests → E2E tests
Teams using test should expect more consistent output, faster repeated execution, and less prompt rewriting.
When to use this skill
- You want a reusable workflow that can be run more than once with consistent structure.
When not to use this skill
- You only need a quick one-off answer and do not need a reusable workflow.
- You cannot install or maintain the underlying files, dependencies, or repository context.
Installation
Claude Code / Cursor / Codex
Manual Installation
- Download SKILL.md from GitHub
- Place it in `.claude/skills/test/SKILL.md` inside your project
- Restart your AI agent; it will auto-discover the skill
How test Compares
| Feature / Agent | test | Standard Approach |
|---|---|---|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |
Frequently Asked Questions
What does this skill do?
Comprehensive testing workflow - unit tests ∥ integration tests → E2E tests
Where can I find the source code?
You can find the source code on GitHub using the link provided at the top of the page.
SKILL.md Source
# /test - Testing Workflow
Run comprehensive test suite with parallel execution.
## When to Use
- "Run all tests"
- "Test the feature"
- "Verify everything works"
- "Full test suite"
- Before releases or merges
- After major changes
## Workflow Overview
```
┌─────────────┐ ┌───────────┐
│ diagnostics │ ──▶ │ arbiter │ ─┐
│ (type check)│ │ (unit) │ │
└─────────────┘ └───────────┘ │
├──▶ ┌─────────┐
┌───────────┐ │ │ atlas │
│ arbiter │ ─┘ │ (e2e) │
│ (integ) │ └─────────┘
└───────────┘
Pre-flight Parallel Sequential
(~1 second) fast tests slow tests
```
## Agent Sequence
| # | Agent | Role | Execution |
|---|-------|------|-----------|
| 1 | **arbiter** | Unit tests, type checks, linting | Parallel |
| 1 | **arbiter** | Integration tests | Parallel |
| 2 | **atlas** | E2E/acceptance tests | After 1 passes |
## Why This Order?
1. **Fast feedback**: Unit tests fail fast
2. **Parallel efficiency**: No dependency between unit and integration
3. **E2E gating**: Only run slow E2E tests if faster tests pass
## Execution
### Phase 0: Pre-flight Diagnostics (NEW)
Before running tests, check for type errors - they often cause test failures:
```bash
tldr diagnostics . --project --format text 2>/dev/null | grep "^E " | head -10
```
**Why diagnostics first?**
- Type check is instant (~1s), tests take longer
- Diagnostics show ROOT CAUSE, tests show symptoms
- "Expected int, got str" is clearer than "AttributeError at line 50"
- Catches errors in untested code paths
**If errors found:** Fix them BEFORE running tests. Type errors usually mean tests will fail anyway.
**If clean:** Proceed to Phase 1.
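The Phase 0 gate can be sketched in plain shell. This is a sketch, not part of the skill itself: it assumes the diagnostics text is piped in on stdin (so the gating logic works even without `tldr` installed) and that error lines start with `E `, as in the grep above.

```shell
# Gate on pre-flight diagnostics: read diagnostics text on stdin and
# refuse to proceed if any error lines ("E ...") are present.
gate_on_type_errors() {
  errors=$(grep -c '^E ' || true)
  if [ "$errors" -gt 0 ]; then
    echo "Found $errors type error(s). Fix them before running tests."
    return 1
  fi
  echo "Diagnostics clean. Proceeding to Phase 1."
}

# Hypothetical wiring into the real command:
# tldr diagnostics . --project --format text 2>/dev/null | gate_on_type_errors
```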
### Phase 0.5: Change Impact (Optional)
For large test suites, find only affected tests:
```bash
tldr change-impact --session
# or for explicit files:
tldr change-impact src/changed_file.py
```
This returns which tests to run based on what changed. Skip this for small projects or when you want full coverage.
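Assuming `change-impact` prints one affected test path per line (an assumption about its output format, not something documented here), the subset could be handed straight to a runner with `xargs`. The helper below uses `echo` so the mapping is visible without running anything:

```shell
# build_cmd turns an affected-test list (stdin, one path per line) into a
# runner invocation. GNU xargs -r produces nothing when the list is empty.
build_cmd() { xargs -r echo pytest; }

# Hypothetical wiring: tldr change-impact --session | xargs -r pytest
printf 'tests/test_pay.py\ntests/test_refund.py\n' | build_cmd
# prints: pytest tests/test_pay.py tests/test_refund.py
```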
### Phase 1: Parallel Tests
```
# Run both in parallel
Task(
subagent_type="arbiter",
prompt="""
Run unit tests for: [SCOPE]
Include:
- Unit tests
- Type checking
- Linting
Report: Pass/fail count, failures detail
""",
run_in_background=true
)
Task(
subagent_type="arbiter",
prompt="""
Run integration tests for: [SCOPE]
Include:
- Integration tests
- API tests
- Database tests
Report: Pass/fail count, failures detail
""",
run_in_background=true
)
# Wait for both
[Check TaskOutput for both]
```
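Outside an agent runtime, the same fan-out can be approximated with background jobs and `wait`. The helper below takes the two suite commands as strings; the pytest paths in the usage line are assumptions for illustration:

```shell
# Run two commands concurrently; succeed only if both succeed.
run_parallel() {
  sh -c "$1" & pid1=$!
  sh -c "$2" & pid2=$!
  rc1=0; wait "$pid1" || rc1=$?
  rc2=0; wait "$pid2" || rc2=$?
  [ "$rc1" -eq 0 ] && [ "$rc2" -eq 0 ]
}

# Hypothetical usage mirroring Phase 1:
# run_parallel "pytest tests/unit" "pytest tests/integration" || exit 1
```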
### Phase 2: E2E Tests (If Phase 1 Passes)
```
Task(
subagent_type="atlas",
prompt="""
Run E2E tests for: [SCOPE]
Include:
- End-to-end flows
- Acceptance tests
- UI tests if applicable
Report: Pass/fail count, screenshots on failure
"""
)
```
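The Phase 1 → Phase 2 gate is just short-circuit evaluation. A minimal shell sketch, with suite commands passed as strings since the real commands depend on the project:

```shell
# Run the slow E2E suite only if the fast suites exited 0.
gate_e2e() {
  sh -c "$1" && sh -c "$2"
}

# e.g. (paths are assumptions):
# gate_e2e "pytest tests/unit tests/integration" "pytest tests/e2e"
```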
## Test Scopes
### Full Suite
```
User: /test
→ All unit + integration + E2E tests
```
### Feature Scope
```
User: /test authentication
→ Only auth-related tests
```
### Quick Check
```
User: /test --quick
→ Only unit tests (skip integration and E2E)
```
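One way to translate these scopes into concrete runner invocations is a small dispatcher; the pytest flags and paths here are assumptions for illustration, not part of the skill:

```shell
# Map a /test scope argument onto a concrete test command.
scope_to_cmd() {
  case "$1" in
    "")      echo "pytest" ;;              # full suite
    --quick) echo "pytest tests/unit" ;;   # unit tests only
    *)       echo "pytest -k $1" ;;        # feature scope via keyword filter
  esac
}
```

For example, `scope_to_cmd authentication` prints `pytest -k authentication`.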
## Example
```
User: /test the new payment feature
Claude: Starting /test workflow for payment feature...
Phase 0: Pre-flight diagnostics...
$ tldr diagnostics . --project --format text | grep "^E "
(no type errors found)
Phase 1: Running parallel tests...
┌─────────────────────────────────────────┐
│ arbiter: Running unit tests... │
│ arbiter: Running integration tests... │
└─────────────────────────────────────────┘
arbiter: ✅ 45/45 unit tests passing
arbiter: ✅ 12/12 integration tests passing
Phase 2: Running E2E tests...
atlas: ✅ 8/8 E2E tests passing
Test Summary:
┌─────────────┬─────────┬────────┐
│ Type │ Passed │ Failed │
├─────────────┼─────────┼────────┤
│ Unit │ 45 │ 0 │
│ Integration │ 12 │ 0 │
│ E2E │ 8 │ 0 │
├─────────────┼─────────┼────────┤
│ TOTAL │ 65 │ 0 │
└─────────────┴─────────┴────────┘
All tests passing! ✅
```
### Example with Type Errors
```
User: /test
Claude: Starting /test workflow...
Phase 0: Pre-flight diagnostics...
$ tldr diagnostics . --project --format text | grep "^E "
E src/payment.py:45:12: Argument of type 'str' not assignable to 'int'
E src/refund.py:23:8: Return type 'None' not assignable to 'float'
Found 2 type errors. Fixing before running tests...
[Claude fixes the type errors]
Re-running diagnostics... clean.
Phase 1: Running parallel tests...
```
## Failure Handling
If Phase 1 fails:
```
arbiter: ❌ 43/45 tests passing
2 failures:
- test_payment_validation: expected 'invalid' got 'valid'
- test_refund_calculation: off by $0.01
Stopping workflow. Fix failures before running E2E tests.
```
## Flags
- `--quick`: Unit tests only
- `--no-e2e`: Skip E2E tests
- `--coverage`: Include coverage report
- `--watch`: Re-run on file changes

Related Skills
- **test-strategy**: Test pyramid decision matrix, coverage targets, when to write which test type, mock vs real dependency decisions, and test ROI analysis.
- **python-testing**: Python testing strategies using pytest, TDD methodology, fixtures, mocking, parametrization, and coverage requirements.
- **property-based-testing**: Property-based testing (PBT) patterns with fast-check (JS/TS), Hypothesis (Python), and gopter (Go). Generate random inputs, define invariants, shrink failures to minimal cases. Adapted from Trail of Bits. Use when testing pure functions, parsers, serializers, state machines, or any code where example-based tests miss edge cases.
- **performance-testing**: Load testing with k6/Artillery, response time thresholds, memory leak detection, N+1 query detection, and CI integration.
- **load-testing-patterns**: k6 script templates, load profiles, response time thresholds, SLO validation, and performance testing strategies.
- **golang-testing**: Go testing patterns including table-driven tests, subtests, benchmarks, fuzzing, and test coverage. Follows TDD methodology with idiomatic Go practices.
- **contract-testing-patterns**: Pact consumer-driven contracts, provider verification, and schema evolution.
- **accessibility-testing**: axe-core integration, WCAG 2.2 AA checklist, keyboard navigation testing, screen reader testing, and ARIA pattern validation.
- **workflow-router**: Goal-based workflow orchestration that routes tasks to specialist agents based on user goals.
- **wiring**: Wiring verification.
- **websocket-patterns**: Connection management, room patterns, reconnection strategies, message buffering, and binary protocol design.
- **visual-verdict**: Screenshot comparison QA for frontend development. Takes a screenshot of the current implementation, scores it across multiple visual dimensions, and returns a structured PASS/REVISE/FAIL verdict with concrete fixes. Use when implementing UI from a design reference or verifying visual correctness.