test

Test planning, generation, and execution -- unit, integration, and end-to-end testing workflows

39 stars

Best use case

test is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Teams using test should expect more consistent output, faster repeated execution, and less prompt rewriting.

When to use this skill

  • You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

  • You only need a quick one-off answer and do not need a reusable workflow.
  • You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$ curl -o ~/.claude/skills/test/SKILL.md --create-dirs "https://raw.githubusercontent.com/InugamiDev/ultrathink-oss/main/.claude/skills/test/SKILL.md"

Manual Installation

  1. Download SKILL.md from GitHub
  2. Place it in .claude/skills/test/SKILL.md inside your project
  3. Restart your AI agent — it will auto-discover the skill

How test Compares

| Feature / Agent | test | Standard Approach |
|-----------------|------|-------------------|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |

Frequently Asked Questions

What does this skill do?

Test planning, generation, and execution -- unit, integration, and end-to-end testing workflows

Where can I find the source code?

You can find the source code on GitHub using the link provided at the top of the page.

SKILL.md Source

# Test Skill

## Purpose

Plan, generate, and execute tests that verify code correctness, prevent regressions, and document expected behavior. This skill covers the full testing lifecycle from strategy through execution.

Tests are not bureaucracy. Tests are executable documentation that proves your code works.

## Workflow

### Mode 1: Test Planning

Use when you need to decide WHAT to test before writing tests.

1. **Identify the testing target** -- What code/feature needs tests?
2. **Read the target code** -- Understand what it does, its inputs, outputs, and edge cases.
3. **Detect existing test infrastructure**
   - What test framework is in use? (Jest, Vitest, pytest, Go testing, etc.)
   - Where do tests live? (co-located, separate `tests/` directory, `__tests__` folders)
   - What testing utilities exist? (test helpers, factories, mocks, fixtures)
   - What is the test runner command?
4. **Categorize test needs**:
   - **Unit tests**: Individual functions/methods in isolation
   - **Integration tests**: Multiple components working together
   - **E2E tests**: Full user flows through the system
5. **Produce the test plan**:

```markdown
# Test Plan: [Target]

## Target
[What is being tested]

## Test Infrastructure
- **Framework**: [name + version]
- **Runner command**: [command]
- **Test location**: [where tests should go]

## Unit Tests
| Test | Input | Expected Output | Edge Case? |
|------|-------|----------------|------------|
| [function] handles valid input | [input] | [output] | No |
| [function] handles empty input | [] | [] or error | Yes |
| [function] handles null | null | throws TypeError | Yes |

## Integration Tests
| Test | Components | Scenario |
|------|-----------|----------|
| [feature] happy path | A + B + C | [description] |
| [feature] error propagation | A + B | [description] |

## Edge Cases to Cover
- [edge case 1]
- [edge case 2]

## Not Testing (and why)
- [thing]: [reason -- e.g., "covered by upstream tests"]
```
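
Step 3 (detecting existing test infrastructure) can be partly automated. The following is a minimal sketch, assuming a Node.js project whose `package.json` has already been parsed into an object. `detectTestFramework`, `PackageJson`, `TestInfrastructure`, and the list of known frameworks are hypothetical names chosen for illustration, not part of this skill.

```typescript
// Hypothetical shape of the parts of package.json this sketch reads.
interface PackageJson {
  scripts?: Record<string, string>;
  dependencies?: Record<string, string>;
  devDependencies?: Record<string, string>;
}

interface TestInfrastructure {
  framework: string | null; // null when no known framework is declared
  version: string | null; // version range as written in package.json
  runnerCommand: string | null; // null when no "test" script exists
}

// Assumed list of frameworks to look for; extend it for your project.
const KNOWN_FRAMEWORKS = ["vitest", "jest", "mocha"];

function detectTestFramework(pkg: PackageJson): TestInfrastructure {
  // Merge both dependency maps; either may be missing.
  const deps: Record<string, string> = {
    ...pkg.dependencies,
    ...pkg.devDependencies,
  };
  const framework = KNOWN_FRAMEWORKS.find((name) => name in deps) ?? null;
  return {
    framework,
    version: framework ? deps[framework] ?? null : null,
    runnerCommand: pkg.scripts?.test ? "npm test" : null,
  };
}
```

The result maps directly onto the "Test Infrastructure" section of the plan template. Non-JavaScript projects (pytest, Go testing) would need a different detection source, which this sketch does not cover.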

### Mode 2: Test Generation

Use when you need to WRITE tests.

1. **Read the target code** thoroughly.
2. **Read existing tests** in the same area to match conventions.
3. **Generate tests** following these principles:
   - **Arrange-Act-Assert** (AAA) pattern for every test
   - **One assertion per test** (prefer focused tests over multi-assert tests)
   - **Descriptive test names** that read as specifications: `it('returns 404 when user does not exist')`
   - **Test behavior, not implementation** -- tests should not break when internals are refactored
   - **Cover all four case types**:
     - Happy path (normal correct behavior)
     - Edge cases (empty inputs, boundaries, max values)
     - Error cases (invalid inputs, network failures, permission errors)
     - Null/undefined handling

4. **Test file structure**:
```typescript
describe('[ModuleName]', () => {
  describe('[functionName]', () => {
    // Setup shared across this function's tests
    beforeEach(() => { /* ... */ });

    it('does X when given Y', () => {
      // Arrange
      const input = createTestInput();

      // Act
      const result = functionName(input);

      // Assert
      expect(result).toEqual(expectedOutput);
    });

    it('throws when given invalid input', () => {
      expect(() => functionName(null)).toThrow(ValidationError);
    });

    it('handles edge case: empty array', () => {
      const result = functionName([]);
      expect(result).toEqual([]);
    });
  });
});
```

5. **Mocking strategy**:
   - Mock external dependencies (APIs, databases, file system)
   - Do NOT mock the code under test
   - Prefer dependency injection over module mocking where possible
   - Use factory functions for test data, not inline literals
   - Reset mocks between tests

6. **Write the test file** using the Write tool (new file) or Edit tool (adding to existing).
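
The mocking guidance in step 5 can be illustrated without any mocking library. This is a sketch under stated assumptions: `InvoiceCalculator`, `TaxRateProvider`, `createTestOrder`, and `createFakeTaxRates` are hypothetical names invented for the example. It shows dependency injection in place of module mocking, a factory function for test data, and a fake that is created fresh for each test so there is no state to reset.

```typescript
// Hypothetical external dependency: in production this might query a tax API.
interface TaxRateProvider {
  rateFor(region: string): number;
}

interface Order {
  id: string;
  region: string;
  subtotalCents: number;
}

// Code under test. It receives its dependency through the constructor,
// so a test can pass a fake without mocking any module.
class InvoiceCalculator {
  private readonly taxRates: TaxRateProvider;

  constructor(taxRates: TaxRateProvider) {
    this.taxRates = taxRates;
  }

  totalCents(order: Order): number {
    const rate = this.taxRates.rateFor(order.region);
    return Math.round(order.subtotalCents * (1 + rate));
  }
}

// Factory function for test data: sensible defaults, overridable per test.
function createTestOrder(overrides: Partial<Order> = {}): Order {
  return { id: "order-1", region: "EU", subtotalCents: 1000, ...overrides };
}

// Hand-written fake for the external dependency. It records the regions it
// was asked about so a test can assert on the interaction.
function createFakeTaxRates(rate: number) {
  const requestedRegions: string[] = [];
  const provider: TaxRateProvider = {
    rateFor(region) {
      requestedRegions.push(region);
      return rate;
    },
  };
  return { provider, requestedRegions };
}
```

In a test, `new InvoiceCalculator(createFakeTaxRates(0.25).provider).totalCents(createTestOrder())` evaluates to `1250`. The calculator itself is never mocked; only the dependency at its boundary is replaced.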

### Mode 3: Test Execution

Use when you need to RUN tests and interpret results.

1. **Determine the test command** from project configuration.
2. **Run tests**:
   - Specific test file: `npm test -- path/to/test.ts`
   - Specific test: `npm test -- --testNamePattern="test name"`
   - Full suite: `npm test`
   - With coverage: `npm test -- --coverage`
3. **Interpret results**:
   - **All pass**: Report success with summary
   - **Failures**: For each failure:
     - What test failed?
     - What was expected vs. actual?
     - Is this a test bug or a code bug?
   - **Errors**: Test infrastructure problems (missing deps, config issues)
4. **If tests fail**:
   - Determine if the test is correct and the code is wrong (hand off to `fix`)
   - Or if the code is correct and the test needs updating (fix the test)
   - Or if there is a test infrastructure issue (fix the setup)
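
The four invocations in step 2 differ only in the arguments passed after `npm test`. As a rough sketch, they can be assembled by one helper. `buildTestArgs` and `TestRunOptions` are hypothetical names, and the flags assume the project's `test` script runs a runner that accepts `--testNamePattern` and `--coverage`, as the commands above imply.

```typescript
// Options mirror the invocations listed in step 2.
interface TestRunOptions {
  file?: string; // run a single test file
  testName?: string; // run tests whose name matches this pattern
  coverage?: boolean; // collect coverage
}

// Returns the argument list for `npm`, for example to pass to a process
// spawner. Keeping arguments as an array avoids shell-quoting problems
// with test names that contain spaces.
function buildTestArgs(options: TestRunOptions = {}): string[] {
  const extra: string[] = [];
  if (options.file) extra.push(options.file);
  if (options.testName) extra.push(`--testNamePattern=${options.testName}`);
  if (options.coverage) extra.push("--coverage");
  // npm forwards everything after "--" to the underlying test script.
  return extra.length > 0 ? ["test", "--", ...extra] : ["test"];
}
```

For example, `buildTestArgs({ file: "path/to/test.ts", coverage: true })` yields `["test", "--", "path/to/test.ts", "--coverage"]`, and `buildTestArgs()` yields `["test"]` for the full suite.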

### Mode 4: Coverage Analysis

1. **Run with coverage** enabled.
2. **Identify uncovered areas**:
   - Uncovered lines/branches in the target code
   - Missing edge case coverage
   - Untested error paths
3. **Prioritize coverage gaps** by risk:
   - High risk: Error handling, security-related, data mutation
   - Medium risk: Business logic branches, edge cases
   - Low risk: Logging, formatting, display-only code
4. **Generate additional tests** for high-priority gaps.
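
The risk tiers in step 3 amount to a sort key. The sketch below is one possible encoding, assuming coverage gaps have already been extracted and tagged by hand or by another tool. `CoverageGap`, `GapCategory`, and `prioritizeGaps` are hypothetical names, and the category labels are simply the ones named in the list above.

```typescript
type Risk = "high" | "medium" | "low";

// Categories taken from the risk tiers in step 3.
type GapCategory =
  | "error-handling"
  | "security"
  | "data-mutation"
  | "business-logic"
  | "edge-case"
  | "logging"
  | "formatting"
  | "display";

interface CoverageGap {
  file: string;
  lines: string; // e.g. "42-57"
  category: GapCategory;
}

const RISK_BY_CATEGORY: Record<GapCategory, Risk> = {
  "error-handling": "high",
  security: "high",
  "data-mutation": "high",
  "business-logic": "medium",
  "edge-case": "medium",
  logging: "low",
  formatting: "low",
  display: "low",
};

const RISK_ORDER: Record<Risk, number> = { high: 0, medium: 1, low: 2 };

function riskOf(gap: CoverageGap): Risk {
  return RISK_BY_CATEGORY[gap.category];
}

// Returns a new array with the highest-risk gaps first. The input is copied
// so the caller's array is not mutated; ties keep their original order.
function prioritizeGaps(gaps: CoverageGap[]): CoverageGap[] {
  return [...gaps].sort(
    (a, b) => RISK_ORDER[riskOf(a)] - RISK_ORDER[riskOf(b)],
  );
}
```

Step 4 then works from the front of the returned list, generating tests for the `high` entries before anything else.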

## Usage

### Plan tests
```
/test Plan tests for the authentication module
```

### Write tests
```
/test Write unit tests for src/lib/utils/formatDate.ts
```

### Run tests
```
/test Run the test suite and report results
```

### After a fix
```
/test Write a regression test for the bug fixed in src/lib/orders.ts
```

### Coverage improvement
```
/test Improve test coverage for the payment processing module
```

## Examples

### Example: Unit test generation

**Target**: `formatCurrency(amount: number, currency: string): string`

**Generated tests**:
- `formatCurrency(10.5, 'USD')` returns `'$10.50'`
- `formatCurrency(0, 'USD')` returns `'$0.00'`
- `formatCurrency(-5, 'USD')` returns `'-$5.00'`
- `formatCurrency(1000000, 'USD')` returns `'$1,000,000.00'`
- `formatCurrency(10.5, 'EUR')` returns `'EUR10.50'` (or locale-appropriate)
- `formatCurrency(10.5, 'INVALID')` throws `UnsupportedCurrencyError`
- `formatCurrency(NaN, 'USD')` throws `InvalidAmountError`
- `formatCurrency(Infinity, 'USD')` throws `InvalidAmountError`
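
To make the cases above concrete, here is one possible implementation they could run against. It is a sketch, not the skill's own code: it assumes the `en-US` locale, a supported set of just `USD` and `EUR`, and the built-in `Intl.NumberFormat` for formatting. The error class names come from the list above; everything else is an assumption.

```typescript
class UnsupportedCurrencyError extends Error {}
class InvalidAmountError extends Error {}

// Assumption: only these two currencies are supported in this sketch.
const SUPPORTED_CURRENCIES = new Set(["USD", "EUR"]);

function formatCurrency(amount: number, currency: string): string {
  // Rejects NaN, Infinity, and -Infinity.
  if (!Number.isFinite(amount)) {
    throw new InvalidAmountError(`Invalid amount: ${amount}`);
  }
  if (!SUPPORTED_CURRENCIES.has(currency)) {
    throw new UnsupportedCurrencyError(`Unsupported currency: ${currency}`);
  }
  // Assumption: output is always formatted for the en-US locale.
  return new Intl.NumberFormat("en-US", {
    style: "currency",
    currency,
  }).format(amount);
}
```

With this implementation, `formatCurrency(10.5, 'EUR')` returns `'€10.50'` rather than `'EUR10.50'`, which falls under the "locale-appropriate" allowance in the list. The USD cases match the listed outputs exactly.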

### Example: Integration test

**Target**: User registration flow (API endpoint + database + email)

**Generated tests**:
- POST `/api/register` with valid data creates user and sends welcome email
- POST `/api/register` with existing email returns 409 Conflict
- POST `/api/register` with invalid email returns 400 with validation errors
- POST `/api/register` when email service is down creates user but logs email failure
- Verify password is hashed before storage (never stored in plaintext)
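
The scenarios above can be exercised against in-memory stand-ins before wiring up a real HTTP server. The sketch below is illustrative only: `registerUser`, `RegistrationDeps`, and `createTestDeps` are hypothetical names, the email check is a deliberately simple pattern, the "hash" is a placeholder and not a real hashing function, and everything is synchronous for brevity where a real handler would typically be asynchronous.

```typescript
interface StoredUser {
  email: string;
  passwordHash: string;
}

// The three collaborators from the target, plus a logger, passed in explicitly.
interface RegistrationDeps {
  users: Map<string, StoredUser>; // stand-in for the database
  hashPassword(plain: string): string;
  sendWelcomeEmail(email: string): void; // throws when the service is down
  logError(message: string): void;
}

interface RegistrationResult {
  status: 201 | 400 | 409;
  errors?: string[];
}

// Hypothetical handler logic behind POST /api/register.
function registerUser(
  deps: RegistrationDeps,
  body: { email: string; password: string },
): RegistrationResult {
  // Simplified email check, for illustration only.
  if (!/^[^@\s]+@[^@\s]+\.[^@\s]+$/.test(body.email)) {
    return { status: 400, errors: ["email is invalid"] };
  }
  if (deps.users.has(body.email)) {
    return { status: 409 };
  }
  deps.users.set(body.email, {
    email: body.email,
    passwordHash: deps.hashPassword(body.password),
  });
  try {
    deps.sendWelcomeEmail(body.email);
  } catch {
    // The user is still created; the email failure is only logged.
    deps.logError(`welcome email failed for ${body.email}`);
  }
  return { status: 201 };
}

// Builds fresh in-memory dependencies for each test.
function createTestDeps(emailServiceUp = true) {
  const sentEmails: string[] = [];
  const loggedErrors: string[] = [];
  const deps: RegistrationDeps = {
    users: new Map(),
    // Placeholder transform, NOT a real hash; it only proves the stored
    // value differs from the plaintext.
    hashPassword: (plain) => `fake-hash:${plain.split("").reverse().join("")}`,
    sendWelcomeEmail: (email) => {
      if (!emailServiceUp) throw new Error("email service down");
      sentEmails.push(email);
    },
    logError: (message) => {
      loggedErrors.push(message);
    },
  };
  return { deps, sentEmails, loggedErrors };
}
```

Each listed scenario becomes one call: for instance, `createTestDeps(false)` followed by a valid registration should return status `201`, leave one entry in `users`, and leave one entry in `loggedErrors`.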

## Guidelines

- **Test behavior, not implementation** -- A test should still pass after a refactor that does not change behavior.
- **One test, one concern** -- If a test name has "and" in it, split it into two tests.
- **Tests are documentation** -- Someone reading only the tests should understand what the code does.
- **Fast tests are run more often** -- Keep unit tests fast. Save slow tests for CI.
- **Deterministic always** -- Tests must produce the same result every time. No random data, no time-dependent logic without mocking.
- **Independent tests** -- Tests must not depend on execution order or shared mutable state.
- **Match project conventions** -- Use the same test framework, file naming, and patterns as the rest of the project.
- **Do not test third-party code** -- Trust that libraries work. Test YOUR code's interaction with them.
- **Regression tests tell stories** -- A regression test's name should reference the bug it prevents: `it('does not crash when API returns error without items field (fixes #123)')`.
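
The "Deterministic always" guideline is easiest to follow when time is passed in rather than read from the system. A minimal sketch, assuming a hypothetical `isExpired` helper; the `Clock` type and `fixedClock` are likewise invented for the example:

```typescript
// A clock is just a function that returns the current time.
type Clock = () => Date;

// Production callers omit `now` and get the system clock;
// tests pass a fixed clock so the result never depends on when they run.
function isExpired(expiresAt: Date, now: Clock = () => new Date()): boolean {
  return now().getTime() >= expiresAt.getTime();
}

// Fixed clock for tests: always returns the same instant.
const fixedClock: Clock = () => new Date("2024-01-01T00:00:00Z");
```

With `fixedClock`, `isExpired(new Date("2023-12-31T23:59:59Z"), fixedClock)` is `true` and `isExpired(new Date("2024-01-01T00:00:01Z"), fixedClock)` is `false`, on every run and in every time zone.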
