test
Test planning, generation, and execution -- unit, integration, and end-to-end testing workflows
Best use case
test is best used when you need a repeatable AI agent workflow instead of a one-off prompt.
Teams using test should expect more consistent output, faster repeated execution, and less prompt rewriting.
When to use this skill
- You want a reusable workflow that can be run more than once with consistent structure.
When not to use this skill
- You only need a quick one-off answer and do not need a reusable workflow.
- You cannot install or maintain the underlying files, dependencies, or repository context.
Installation
Claude Code / Cursor / Codex
Manual Installation
- Download SKILL.md from GitHub
- Place it in `.claude/skills/test/SKILL.md` inside your project
- Restart your AI agent; it will auto-discover the skill
How test Compares
| Feature / Agent | test | Standard Approach |
|---|---|---|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |
Frequently Asked Questions
What does this skill do?
Test planning, generation, and execution -- unit, integration, and end-to-end testing workflows
Where can I find the source code?
You can find the source code on GitHub using the link provided at the top of the page.
SKILL.md Source
# Test Skill
## Purpose
Plan, generate, and execute tests that verify code correctness, prevent regressions, and document expected behavior. This skill covers the full testing lifecycle from strategy through execution.
Tests are not bureaucracy. Tests are executable documentation that proves your code works.
## Workflow
### Mode 1: Test Planning
Use when you need to decide WHAT to test before writing tests.
1. **Identify the testing target** -- What code/feature needs tests?
2. **Read the target code** -- Understand what it does, its inputs, outputs, and edge cases.
3. **Detect existing test infrastructure**
   - What test framework is in use? (Jest, Vitest, pytest, Go testing, etc.)
   - Where do tests live? (co-located, separate `tests/` directory, `__tests__` folders)
   - What testing utilities exist? (test helpers, factories, mocks, fixtures)
   - What is the test runner command?
4. **Categorize test needs**:
   - **Unit tests**: Individual functions/methods in isolation
   - **Integration tests**: Multiple components working together
   - **E2E tests**: Full user flows through the system
5. **Produce the test plan**:
```markdown
# Test Plan: [Target]
## Target
[What is being tested]
## Test Infrastructure
- **Framework**: [name + version]
- **Runner command**: [command]
- **Test location**: [where tests should go]
## Unit Tests
| Test | Input | Expected Output | Edge Case? |
|------|-------|----------------|------------|
| [function] handles valid input | [input] | [output] | No |
| [function] handles empty input | [] | [] or error | Yes |
| [function] handles null | null | throws TypeError | Yes |
## Integration Tests
| Test | Components | Scenario |
|------|-----------|----------|
| [feature] happy path | A + B + C | [description] |
| [feature] error propagation | A + B | [description] |
## Edge Cases to Cover
- [edge case 1]
- [edge case 2]
## Not Testing (and why)
- [thing]: [reason -- e.g., "covered by upstream tests"]
```
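For a Node project, the infrastructure detection in step 3 can be partly automated by reading `package.json`. The following is a minimal sketch under stated assumptions: the `PackageJson` and `TestInfrastructure` shapes, the `KNOWN_FRAMEWORKS` list, and the function name `detectTestInfrastructure` are all hypothetical, and non-Node stacks (pytest, Go testing) would need their own detection.

```typescript
// Hypothetical shape: only the package.json fields this sketch reads.
interface PackageJson {
  scripts?: Record<string, string>;
  dependencies?: Record<string, string>;
  devDependencies?: Record<string, string>;
}

interface TestInfrastructure {
  framework: string; // "unknown" when no known package is declared
  version: string | null; // the declared version range, not the installed version
  runnerCommand: string | null;
}

// Assumption: a starting list of npm package names, not an exhaustive one.
const KNOWN_FRAMEWORKS = ["vitest", "jest", "mocha"];

function detectTestInfrastructure(pkg: PackageJson): TestInfrastructure {
  const deps: Record<string, string> = {
    ...pkg.dependencies,
    ...pkg.devDependencies,
  };
  const framework = KNOWN_FRAMEWORKS.find((name) => name in deps);
  return {
    framework: framework ?? "unknown",
    version: framework !== undefined ? (deps[framework] ?? null) : null,
    // Assumption: a "test" script means `npm test` is the runner command.
    runnerCommand: pkg.scripts?.test !== undefined ? "npm test" : null,
  };
}

// Illustrative input; the version ranges are made up for the example.
const detected = detectTestInfrastructure({
  scripts: { test: "vitest run" },
  devDependencies: { vitest: "^1.6.0", typescript: "^5.4.0" },
});
console.log(detected);
```

The result maps directly onto the "Test Infrastructure" fields of the plan template. Test location and existing helpers still have to be found by reading the repository.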
### Mode 2: Test Generation
Use when you need to WRITE tests.
1. **Read the target code** thoroughly.
2. **Read existing tests** in the same area to match conventions.
3. **Generate tests** following these principles:
   - **Arrange-Act-Assert** (AAA) pattern for every test
   - **One assertion per test** (prefer focused tests over multi-assert tests)
   - **Descriptive test names** that read as specifications: `it('returns 404 when user does not exist')`
   - **Test behavior, not implementation** -- tests should not break when internals are refactored
   - **Cover the testing triangle**:
     - Happy path (normal correct behavior)
     - Edge cases (empty inputs, boundaries, max values)
     - Error cases (invalid inputs, network failures, permission errors)
     - Null/undefined handling
4. **Test file structure**:
```typescript
describe('[ModuleName]', () => {
  describe('[functionName]', () => {
    // Setup shared across this function's tests
    beforeEach(() => { /* ... */ });

    it('does X when given Y', () => {
      // Arrange
      const input = createTestInput();
      // Act
      const result = functionName(input);
      // Assert
      expect(result).toEqual(expectedOutput);
    });

    it('throws when given invalid input', () => {
      expect(() => functionName(null)).toThrow(ValidationError);
    });

    it('handles edge case: empty array', () => {
      const result = functionName([]);
      expect(result).toEqual([]);
    });
  });
});
```
5. **Mocking strategy**:
   - Mock external dependencies (APIs, databases, file system)
   - Do NOT mock the code under test
   - Prefer dependency injection over module mocking where possible
   - Use factory functions for test data, not inline literals
   - Reset mocks between tests
6. **Write the test file** using the Write tool (new file) or Edit tool (adding to existing).
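Two points from the mocking strategy, dependency injection and factory functions for test data, can be sketched without any test framework. Everything named here (`User`, `Clock`, `createTestUser`, `accountAgeInDays`) is hypothetical and exists only to illustrate the pattern.

```typescript
interface User {
  id: string;
  email: string;
  createdAt: Date;
}

// Factory function: sensible defaults, with per-test overrides.
function createTestUser(overrides: Partial<User> = {}): User {
  return {
    id: "user-1",
    email: "user@example.com",
    createdAt: new Date("2024-01-01T00:00:00Z"),
    ...overrides,
  };
}

// The time source is an injected dependency, so no module mocking is needed.
interface Clock {
  now(): Date;
}

// Code under test: it is never mocked, only its dependency is replaced.
function accountAgeInDays(user: User, clock: Clock): number {
  const msPerDay = 24 * 60 * 60 * 1000;
  const elapsedMs = clock.now().getTime() - user.createdAt.getTime();
  return Math.floor(elapsedMs / msPerDay);
}

// Arrange: a fixed clock keeps the result the same on every run.
const fixedClock: Clock = { now: () => new Date("2024-01-31T00:00:00Z") };
const user = createTestUser({ createdAt: new Date("2024-01-01T00:00:00Z") });

// Act
const ageInDays = accountAgeInDays(user, fixedClock);

// Assert: 30 full days separate 1 January and 31 January 2024.
console.log(ageInDays === 30);
```

Because the fake clock is a plain object built inside the test, there is no shared mock state to reset between tests. Module-level mocks, where they are unavoidable, still need the reset step listed above.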
### Mode 3: Test Execution
Use when you need to RUN tests and interpret results.
1. **Determine the test command** from project configuration.
2. **Run tests**:
   - Specific test file: `npm test -- path/to/test.ts`
   - Specific test: `npm test -- --testNamePattern="test name"`
   - Full suite: `npm test`
   - With coverage: `npm test -- --coverage`
3. **Interpret results**:
   - **All pass**: Report success with summary
   - **Failures**: For each failure:
     - What test failed?
     - What was expected vs. actual?
     - Is this a test bug or a code bug?
   - **Errors**: Test infrastructure problems (missing deps, config issues)
4. **If tests fail**:
   - Determine if the test is correct and the code is wrong (hand off to `fix`)
   - Or if the code is correct and the test needs updating (fix the test)
   - Or if there is a test infrastructure issue (fix the setup)
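The interpretation and triage steps can be sketched as two small functions: one that turns raw results into a report, and one that routes a diagnosis to the next action. The `TestResult` and `Diagnosis` types and both function names are assumptions made for illustration; real runners report results in their own formats, and the diagnosis itself remains a judgment call.

```typescript
type TestStatus = "passed" | "failed" | "errored";

interface TestResult {
  name: string;
  status: TestStatus;
  expected?: string;
  actual?: string;
}

// The three outcomes of triage described in step 4.
type Diagnosis = "code-bug" | "test-bug" | "infrastructure";

function nextAction(diagnosis: Diagnosis): string {
  switch (diagnosis) {
    case "code-bug":
      return "hand off to fix";
    case "test-bug":
      return "fix the test";
    case "infrastructure":
      return "fix the setup";
  }
}

function summarize(results: TestResult[]): string {
  const count = (status: TestStatus): number =>
    results.filter((r) => r.status === status).length;
  const lines = [
    `${count("passed")} passed, ${count("failed")} failed, ${count("errored")} errored`,
  ];
  for (const r of results) {
    if (r.status === "failed") {
      lines.push(`FAIL ${r.name}: expected ${r.expected ?? "?"}, got ${r.actual ?? "?"}`);
    } else if (r.status === "errored") {
      lines.push(`ERROR ${r.name}: check dependencies and configuration`);
    }
  }
  return lines.join("\n");
}

// Illustrative results, not output from a real run.
const sampleResults: TestResult[] = [
  { name: "returns 404 when user does not exist", status: "passed" },
  { name: "handles empty input", status: "failed", expected: "[]", actual: "undefined" },
  { name: "loads config", status: "errored" },
];
console.log(summarize(sampleResults));
console.log(nextAction("code-bug"));
```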
### Mode 4: Coverage Analysis
1. **Run with coverage** enabled.
2. **Identify uncovered areas**:
   - Uncovered lines/branches in the target code
   - Missing edge case coverage
   - Untested error paths
3. **Prioritize coverage gaps** by risk:
   - High risk: Error handling, security-related, data mutation
   - Medium risk: Business logic branches, edge cases
   - Low risk: Logging, formatting, display-only code
4. **Generate additional tests** for high-priority gaps.
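The risk tiers in step 3 amount to a lookup table plus a sort. A minimal sketch follows, assuming a hypothetical `CoverageGap` record whose `category` has already been assigned by whoever reviewed the coverage report; the category names and file paths are illustrative.

```typescript
type Risk = "high" | "medium" | "low";

type GapCategory =
  | "error-handling"
  | "security"
  | "data-mutation"
  | "business-logic"
  | "edge-case"
  | "logging"
  | "formatting"
  | "display";

interface CoverageGap {
  file: string;
  description: string;
  category: GapCategory;
}

// Mirrors the high / medium / low tiers listed in step 3.
const RISK_BY_CATEGORY: Record<GapCategory, Risk> = {
  "error-handling": "high",
  security: "high",
  "data-mutation": "high",
  "business-logic": "medium",
  "edge-case": "medium",
  logging: "low",
  formatting: "low",
  display: "low",
};

const RISK_ORDER: Record<Risk, number> = { high: 0, medium: 1, low: 2 };

// Returns a new array, highest risk first; the input is left untouched.
function prioritizeGaps(gaps: CoverageGap[]): CoverageGap[] {
  const rank = (gap: CoverageGap): number => RISK_ORDER[RISK_BY_CATEGORY[gap.category]];
  return [...gaps].sort((a, b) => rank(a) - rank(b));
}

// Illustrative gaps, not taken from a real coverage report.
const prioritized = prioritizeGaps([
  { file: "src/lib/format.ts", description: "date formatting branch", category: "formatting" },
  { file: "src/lib/orders.ts", description: "refund calculation branch", category: "business-logic" },
  { file: "src/lib/orders.ts", description: "rejected payment path", category: "error-handling" },
]);
console.log(prioritized.map((gap) => gap.description));
```

Step 4 would then work from the top of the sorted list, stopping once the high-risk entries have tests.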
## Usage
### Plan tests
```
/test Plan tests for the authentication module
```
### Write tests
```
/test Write unit tests for src/lib/utils/formatDate.ts
```
### Run tests
```
/test Run the test suite and report results
```
### After a fix
```
/test Write a regression test for the bug fixed in src/lib/orders.ts
```
### Coverage improvement
```
/test Improve test coverage for the payment processing module
```
## Examples
### Example: Unit test generation
**Target**: `formatCurrency(amount: number, currency: string): string`
**Generated tests**:
- `formatCurrency(10.5, 'USD')` returns `'$10.50'`
- `formatCurrency(0, 'USD')` returns `'$0.00'`
- `formatCurrency(-5, 'USD')` returns `'-$5.00'`
- `formatCurrency(1000000, 'USD')` returns `'$1,000,000.00'`
- `formatCurrency(10.5, 'EUR')` returns `'EUR10.50'` (or locale-appropriate)
- `formatCurrency(10.5, 'INVALID')` throws `UnsupportedCurrencyError`
- `formatCurrency(NaN, 'USD')` throws `InvalidAmountError`
- `formatCurrency(Infinity, 'USD')` throws `InvalidAmountError`
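To make these cases concrete, here is one possible implementation they could run against. It is a sketch under assumptions, not the target's actual code: the `en-US` locale, the `SUPPORTED_CURRENCIES` allowlist, and the bodies of the two error classes are all invented for illustration. Only the function signature and the error names come from the example above.

```typescript
class UnsupportedCurrencyError extends Error {
  constructor(currency: string) {
    super(`Unsupported currency: ${currency}`);
    this.name = "UnsupportedCurrencyError";
  }
}

class InvalidAmountError extends Error {
  constructor(amount: number) {
    super(`Invalid amount: ${amount}`);
    this.name = "InvalidAmountError";
  }
}

// Assumption: a small allowlist; the example does not say which currencies are supported.
const SUPPORTED_CURRENCIES = new Set(["USD", "EUR"]);

function formatCurrency(amount: number, currency: string): string {
  // Rejects NaN, Infinity, and -Infinity before any formatting happens.
  if (!Number.isFinite(amount)) {
    throw new InvalidAmountError(amount);
  }
  if (!SUPPORTED_CURRENCIES.has(currency)) {
    throw new UnsupportedCurrencyError(currency);
  }
  // Assumption: a fixed en-US locale, so output does not vary by machine.
  return new Intl.NumberFormat("en-US", { style: "currency", currency }).format(amount);
}

// The USD cases from the list above, as input/expected pairs.
const usdCases: Array<[number, string]> = [
  [10.5, "$10.50"],
  [0, "$0.00"],
  [-5, "-$5.00"],
  [1000000, "$1,000,000.00"],
];
for (const [amount, expected] of usdCases) {
  console.log(formatCurrency(amount, "USD") === expected);
}
```

Pinning the locale is a design choice that keeps the tests deterministic. With `en-US`, the EUR case is formatted with the euro sign rather than the literal `EUR` prefix, which is why that expectation is marked locale-dependent in the list.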
### Example: Integration test
**Target**: User registration flow (API endpoint + database + email)
**Generated tests**:
- POST `/api/register` with valid data creates user and sends welcome email
- POST `/api/register` with existing email returns 409 Conflict
- POST `/api/register` with invalid email returns 400 with validation errors
- POST `/api/register` when email service is down creates user but logs email failure
- Verify password is hashed before storage (never stored in plaintext)
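The scenarios above can be exercised against in-memory fakes before any real database or email service is involved. The sketch below is hypothetical throughout: `register`, `RegisterDeps`, the naive email check, and the 201 success status are assumptions, and the handler is synchronous only to keep the example short (a real endpoint would be asynchronous).

```typescript
interface StoredUser {
  email: string;
  passwordHash: string;
}

// Everything external is injected so a test can replace it with a fake.
interface RegisterDeps {
  users: Map<string, StoredUser>; // stands in for the database
  hashPassword: (plain: string) => string;
  sendWelcomeEmail: (email: string) => void; // throws when the service is down
  log: (message: string) => void;
}

interface RegisterResponse {
  status: number;
  errors?: string[];
}

function register(
  body: { email: string; password: string },
  deps: RegisterDeps,
): RegisterResponse {
  // Deliberately naive email check, for illustration only.
  if (!/^[^@\s]+@[^@\s]+$/.test(body.email)) {
    return { status: 400, errors: ["email is invalid"] };
  }
  if (deps.users.has(body.email)) {
    return { status: 409 };
  }
  deps.users.set(body.email, {
    email: body.email,
    passwordHash: deps.hashPassword(body.password),
  });
  try {
    deps.sendWelcomeEmail(body.email);
  } catch {
    // The user is already stored; an email failure is logged, not fatal.
    deps.log(`welcome email failed for ${body.email}`);
  }
  return { status: 201 }; // assumption: 201 Created on success
}

// Arrange: fakes for the database, hasher, email service, and logger.
const logs: string[] = [];
const deps: RegisterDeps = {
  users: new Map(),
  hashPassword: (plain) => `fake-hash:${plain.length}`, // fake, not a real hash
  sendWelcomeEmail: () => {
    throw new Error("email service is down");
  },
  log: (message) => logs.push(message),
};

// Act: first registration, a duplicate, then an invalid email.
const first = register({ email: "new@example.com", password: "s3cret" }, deps);
const duplicate = register({ email: "new@example.com", password: "s3cret" }, deps);
const invalid = register({ email: "not-an-email", password: "s3cret" }, deps);

console.log(first.status, duplicate.status, invalid.status, logs.length);
```

Each listed scenario then becomes one focused assertion: the first call still creates the user while the failure is logged, the duplicate returns 409, the invalid email returns 400, and the stored `passwordHash` differs from the submitted password.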
## Guidelines
- **Test behavior, not implementation** -- A test should still pass after a refactor that does not change behavior.
- **One test, one concern** -- If a test name has "and" in it, split it into two tests.
- **Tests are documentation** -- Someone reading only the tests should understand what the code does.
- **Fast tests are run more often** -- Keep unit tests fast. Save slow tests for CI.
- **Deterministic always** -- Tests must produce the same result every time. No random data, no time-dependent logic without mocking.
- **Independent tests** -- Tests must not depend on execution order or shared mutable state.
- **Match project conventions** -- Use the same test framework, file naming, and patterns as the rest of the project.
- **Do not test third-party code** -- Trust that libraries work. Test YOUR code's interaction with them.
- **Regression tests tell stories** -- A regression test's name should reference the bug it prevents: `it('does not crash when API returns error without items field (fixes #123)')`.
Related Skills
Vitest
> Blazing fast unit testing powered by Vite — Jest-compatible API, native ESM, TypeScript.
testing-toolkit
Unified testing methodology toolkit — Testing Library (accessible queries, user-event, component testing), unit/integration/e2e/property-based testing patterns, test strategy design (pyramid/trophy/diamond, coverage goals), test fixtures (factories, builders, seeders, snapshots), API testing (Supertest, contract testing, endpoint validation). Keeps runtime-specific runners (vitest/playwright/cypress/promptfoo) separate.
test-ui
Multi-viewport UI testing with screenshots, visual regression detection, and accessibility audits
load-testing
Load testing with k6, Artillery, Locust — traffic simulation, performance baselines, and stress testing.
contract-testing
Consumer-driven contract testing with Pact, schema validation, and API compatibility verification.
accessibility-testing
Accessibility testing with axe-core, pa11y, Lighthouse, screen reader testing, and WCAG compliance verification
ab-test-generator
Generate A/B test variants for affiliate content. Triggers on: "create A/B test", "test my headline", "optimize my CTA", "generate variants", "split test ideas", "improve click-through rate", "test my landing page copy", "headline alternatives", "CTA variations", "which version is better", "optimize conversions", "test my email subject line", "compare approaches".
ultrathink
UltraThink Workflow OS — 4-layer skill mesh with persistent memory and privacy hooks for complex engineering tasks. Routes prompts through intent detection to activate the right domain skills automatically.
ultrathink_review
Multi-pass code review powered by UltraThink's quality gate — checks correctness, security (OWASP), performance, readability, and project conventions in a single structured pass.
ultrathink_memory
Persistent memory system for UltraThink — search, save, and recall project context, decisions, and patterns across sessions using Postgres-backed fuzzy search with synonym expansion.
ui-design
Comprehensive UI design system: 230+ font pairings, 48 themes, 65 design systems, 23 design languages, 30 UX laws, 14 color systems, Swiss grid, Gestalt principles, Pencil.dev workflow. Inherits ui-ux-pro-max (99 UX rules) + impeccable-frontend-design (anti-AI-slop). Triggers on any design, UI, layout, typography, color, theme, or styling task.
Zod
> TypeScript-first schema validation with static type inference.