testing
Comprehensive testing specialization covering test strategy, automation, TDD methodology, test writing, and web app testing. Use when setting up test infrastructure, writing tests, implementing TDD workflows, analyzing coverage, integrating tests into CI/CD, or testing web applications with Playwright. Framework-agnostic approach with framework-specific guidance via reference files.
Best use case
testing is best used when you need a repeatable AI agent workflow instead of a one-off prompt.
Teams using testing should expect more consistent output, faster repeated execution, and less prompt rewriting.
When to use this skill
- You want a reusable workflow that can be run more than once with consistent structure.
When not to use this skill
- You only need a quick one-off answer and do not need a reusable workflow.
- You cannot install or maintain the underlying files, dependencies, or repository context.
Installation
Claude Code / Cursor / Codex
Manual Installation
- Download SKILL.md from GitHub
- Place it in `.claude/skills/testing-majiayu000/SKILL.md` inside your project
- Restart your AI agent; it will auto-discover the skill
How testing Compares
| Feature / Agent | testing | Standard Approach |
|---|---|---|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |
Frequently Asked Questions
Where can I find the source code?
You can find the source code on GitHub using the link provided at the top of the page.
SKILL.md Source
# Testing
This skill provides comprehensive testing capabilities including test strategy, automation setup, Test-Driven Development (TDD), test writing best practices, coverage analysis, CI/CD integration, and web application testing with Playwright.
## When to Use This Skill
- When setting up test infrastructure for a project
- When creating test strategies and test plans
- When writing unit, integration, or E2E tests
- When implementing TDD/test-first development
- When analyzing test coverage and quality
- When integrating tests into CI/CD pipelines
- When testing web applications with Playwright
- When debugging test failures or improving test reliability
- When writing test fixtures, mock data, or factory functions
- When mocking external dependencies (APIs, databases, file systems)
- When organizing test file structure and test suites
- When testing async code, Promises, or event-driven behavior
- When implementing snapshot tests for UI components
- When configuring test coverage thresholds
## What This Skill Does
1. **Test Strategy**: Designs comprehensive testing strategies (unit, integration, E2E)
2. **Test Automation**: Sets up test frameworks and automation tools
3. **TDD Methodology**: Implements Test-Driven Development workflows (Red-Green-Refactor)
4. **Test Writing**: Writes focused, maintainable tests with proper patterns
5. **Coverage Analysis**: Analyzes and improves test coverage
6. **CI/CD Integration**: Integrates tests into continuous integration pipelines
7. **Web App Testing**: Tests web applications using Playwright
8. **Test Quality**: Improves test reliability and maintainability
## Test Strategy
### Test Pyramid
**Recommended Distribution:**
- **Unit Tests**: 70% - Fast, isolated, test individual functions
- **Integration Tests**: 20% - Test component interactions
- **E2E Tests**: 10% - Test complete user workflows
**Test Types:**
- Functional tests (happy path, edge cases, error handling)
- Non-functional tests (performance, security, accessibility)
- Regression tests (prevent breaking changes)
- Smoke tests (critical path verification)
### Framework Selection
**JavaScript/TypeScript:**
- Jest, Vitest, Mocha for unit/integration
- Playwright, Cypress for E2E
- React Testing Library for component testing
**Python:**
- pytest for unit/integration
- Selenium, Playwright for E2E
- unittest for standard library testing
**Java:**
- JUnit for unit tests
- TestNG for integration
- Selenium for E2E
**Go:**
- Built-in testing package
- Testify for assertions
**Rust:**
- Built-in test framework
- `cargo test` for running tests
## Test-Driven Development (TDD)
TDD is a **design technique**, not just a testing technique. It produces better-designed, more maintainable code through small, disciplined steps.
### Core Principle
**Write tests before code. Always.** TDD forces you to think about:
- What behavior do I need?
- How will I know it works?
- What's the simplest implementation?
### The Three Laws (Never Violate)
1. **Write NO production code** without a failing test first
2. **Write only enough test** to demonstrate one failure
3. **Write only enough code** to pass that test
### Red-Green-Refactor Cycle
**Phase 1: RED - Write Failing Test**
1. Write ONE test that defines desired behavior
2. Run test - verify it FAILS
3. Verify it fails for the RIGHT reason (not syntax error)
4. DO NOT write implementation yet
**Phase 2: GREEN - Minimal Implementation**
1. Write MINIMAL code to make test pass
2. Resist urge to add extra features
3. Run test - verify it PASSES
4. If test still fails, fix implementation (not test)
**Phase 3: REFACTOR - Clean Code**
1. Remove code duplication (DRY)
2. Improve naming for clarity
3. Extract complex logic into functions
4. Run ALL tests - must stay green throughout
5. Check test coverage on changed lines
After REFACTOR, start new RED phase for next behavior.
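One pass through the cycle might look like the following minimal Python sketch. The `slugify` function and its expected behavior are hypothetical, chosen only for illustration: the test is written first (RED), the smallest passing implementation follows (GREEN), and the final version is a tidy-up that keeps the test green (REFACTOR).

```python
# RED: this test is written first. Before `slugify` exists, calling the test
# fails; once a stub exists it should fail on the assertion itself.
def test_slugify_replaces_spaces_with_hyphens():
    assert slugify("Hello World") == "hello-world"


# GREEN: the first passing version could be as small as
#     return text.lower().replace(" ", "-")

# REFACTOR: rewritten for clarity with a named separator.
# The existing test must stay green after this change.
SEPARATOR = "-"


def slugify(text: str) -> str:
    """Lowercase `text` and join its whitespace-separated words with hyphens."""
    return SEPARATOR.join(text.lower().split())


if __name__ == "__main__":
    test_slugify_replaces_spaces_with_hyphens()
    print("green")
```

The next behavior (for example, stripping punctuation) would begin with a new failing test rather than a change to `slugify`.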
## Test Writing Patterns
### Arrange-Act-Assert (AAA)
**Structure:**
1. **Arrange**: Set up test data and conditions
2. **Act**: Execute the code being tested
3. **Assert**: Verify the expected outcome
**Example:**
```javascript
// Assumes `userService` is imported or constructed in the test setup.
describe('UserService', () => {
  it('should create user with valid data', async () => {
    // Arrange
    const userData = { email: 'test@example.com', name: 'Test User' };

    // Act
    const result = await userService.createUser(userData);

    // Assert
    expect(result).toHaveProperty('id');
    expect(result.email).toBe(userData.email);
  });
});
```
### Given-When-Then (BDD Style)
**Structure:**
1. **Given**: Initial context/preconditions
2. **When**: Action/event that triggers behavior
3. **Then**: Expected outcome
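As a rough illustration of the three steps, here is a small Python sketch. The `Cart` class is hypothetical and exists only to give the test something to exercise; prices are kept in integer cents to avoid floating-point noise.

```python
class Cart:
    """Hypothetical shopping cart used only to illustrate the pattern."""

    def __init__(self):
        self._prices_in_cents = []

    def add(self, price_in_cents: int) -> None:
        self._prices_in_cents.append(price_in_cents)

    def total(self) -> int:
        return sum(self._prices_in_cents)


def test_total_reflects_added_items():
    # Given a cart that already holds one item
    cart = Cart()
    cart.add(1000)

    # When a second item is added
    cart.add(550)

    # Then the total is the sum of both prices
    assert cart.total() == 1550
```

The comments map one-to-one onto Arrange-Act-Assert; the difference is mainly that the wording describes behavior from the user's point of view.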
### Test Organization
**File Structure:**
```
project/
├── src/
│   └── components/
│       └── User.jsx
├── tests/
│   ├── unit/
│   │   └── User.test.jsx
│   ├── integration/
│   │   └── UserAPI.test.js
│   └── e2e/
│       └── user-flow.spec.js
├── jest.config.js
└── playwright.config.js
```
## Coverage Analysis
### Coverage Goals
**Recommended Thresholds:**
- **Lines**: 80%+
- **Functions**: 80%+
- **Branches**: 80%+
- **Statements**: 80%+
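The threshold check itself is simple enough to sketch in a few lines of Python. The shape of the `summary` dictionary is an assumption made for this example; in practice most coverage tools offer a built-in setting for this (for instance Jest's `coverageThreshold` or coverage.py's `fail_under`), which is usually preferable to a hand-rolled check.

```python
# Thresholds from the list above, as percentages.
THRESHOLDS = {"lines": 80.0, "functions": 80.0, "branches": 80.0, "statements": 80.0}


def coverage_failures(summary: dict, thresholds: dict = THRESHOLDS) -> list:
    """Return one message per metric that is missing or below its threshold."""
    failures = []
    for metric, minimum in thresholds.items():
        actual = summary.get(metric)
        if actual is None:
            failures.append(f"{metric}: no data")
        elif actual < minimum:
            failures.append(f"{metric}: {actual:.1f}% < {minimum:.1f}%")
    return failures


if __name__ == "__main__":
    # Hypothetical report values, for illustration only.
    report = {"lines": 91.2, "functions": 85.0, "branches": 72.4, "statements": 90.8}
    for failure in coverage_failures(report):
        print(failure)
```

An empty list means every metric meets its threshold, so a CI step could fail the build whenever the list is non-empty.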
**Critical Paths:**
Always aim for 100% coverage on critical business logic, such as:
- Authentication and authorization
- Payment processing
- Data validation
### Coverage Gaps
**Common Gaps:**
- Error handling paths
- Edge cases
- Boundary conditions
- Integration points
**Improvement Strategies:**
- Identify untested code paths
- Add tests for error scenarios
- Test edge cases and boundaries
- Increase integration test coverage
## CI/CD Integration
### Test Pipeline
**Stages:**
1. **Unit Tests**: Fast feedback, run on every commit
2. **Integration Tests**: Run on pull requests
3. **E2E Tests**: Run before merging to main
4. **Performance Tests**: Run on main branch
**Quality Gates:**
- All tests must pass
- Coverage must meet threshold
- No critical security issues
- Performance benchmarks met
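The stages and gates above could be modeled roughly as follows. This is a sketch, not a real CI configuration: the event names are hypothetical, the 80% default mirrors the coverage thresholds in this document, and re-running earlier stages at later triggers is an assumption of the example.

```python
# Which suites run for which trigger, following the stages listed above.
SUITES_BY_EVENT = {
    "commit": ["unit"],
    "pull_request": ["unit", "integration"],
    "pre_merge": ["unit", "integration", "e2e"],
    "main": ["unit", "integration", "e2e", "performance"],
}


def failed_gates(
    all_tests_passed: bool,
    coverage_percent: float,
    critical_security_issues: int,
    benchmarks_met: bool,
    coverage_threshold: float = 80.0,
) -> list:
    """Return the names of the quality gates that are not satisfied."""
    checks = {
        "tests": all_tests_passed,
        "coverage": coverage_percent >= coverage_threshold,
        "security": critical_security_issues == 0,
        "performance": benchmarks_met,
    }
    return [name for name, ok in checks.items() if not ok]
```

A pipeline step would block the merge whenever `failed_gates(...)` returns a non-empty list and report the gate names it contains.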
## Web Application Testing with Playwright
### Helper Scripts
This skill includes Python helper scripts in `scripts/`:
- **`with_server.py`** - Manages server lifecycle (supports multiple servers). Always run with `--help` first to see usage.
```bash
# Single server
python scripts/with_server.py --server "npm run dev" --port 5173 -- python your_automation.py
# Multiple servers (e.g., backend + frontend)
python scripts/with_server.py \
  --server "cd backend && python server.py" --port 3000 \
  --server "cd frontend && npm run dev" --port 5173 \
  -- python your_automation.py
```
### Decision Tree: Choosing Your Approach
```
User task → Is it static HTML?
    ├─ Yes → Read HTML file directly to identify selectors
    │         ├─ Success → Write Playwright script using selectors
    │         └─ Fails/Incomplete → Treat as dynamic (below)
    │
    └─ No (dynamic webapp) → Is the server already running?
        ├─ No → Run: python scripts/with_server.py --help
        │        Then use the helper + write simplified Playwright script
        │
        └─ Yes → Reconnaissance-then-action:
            1. Navigate and wait for networkidle
            2. Take screenshot or inspect DOM
            3. Identify selectors from rendered state
            4. Execute actions with discovered selectors
```
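The reconnaissance-then-action branch might be sketched as below, assuming a recent Playwright for Python release (one that provides `Locator.all()` and `Page.get_by_role()`). The helper name `discover_button_labels`, the screenshot filename, and the "Submit" label are hypothetical; the URL matches the dev-server port used elsewhere in this document.

```python
def discover_button_labels(page) -> list:
    """Return the visible button labels on an already-navigated page."""
    # Step 1: wait until network activity settles so JS-rendered elements exist.
    page.wait_for_load_state("networkidle")
    # Steps 2-3: inspect the rendered DOM and collect what is actually there.
    buttons = page.locator("button").all()
    return [b.inner_text().strip() for b in buttons if b.is_visible()]


def main(url: str = "http://localhost:5173") -> None:
    # Imported here so the helper above can be exercised without Playwright.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        try:
            page = browser.new_page()
            page.goto(url)
            labels = discover_button_labels(page)
            page.screenshot(path="recon.png", full_page=True)
            print("Discovered buttons:", labels)
            # Step 4: act only on selectors confirmed during reconnaissance.
            if "Submit" in labels:
                page.get_by_role("button", name="Submit").click()
        finally:
            # Close the browser even if a step above raises.
            browser.close()
```

Calling `main()` runs the full flow against a running server. Keeping discovery in a function that takes `page` as a parameter also makes it possible to exercise that logic with a stand-in object.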
### Playwright Best Practices
- **Use bundled scripts as black boxes** - Use `--help` to see usage, then invoke directly
- Use `sync_playwright()` for synchronous scripts
- Always close the browser when done
- Use descriptive selectors: `text=`, `role=`, CSS selectors, or IDs
- Add appropriate waits: `page.wait_for_selector()` or `page.wait_for_timeout()`
- **CRITICAL**: Wait for `page.wait_for_load_state('networkidle')` before inspection on dynamic apps
### Example: Basic Playwright Script
```python
from playwright.sync_api import sync_playwright

with sync_playwright() as p:
    browser = p.chromium.launch(headless=True)
    page = browser.new_page()
    page.goto('http://localhost:5173')
    page.wait_for_load_state('networkidle')  # CRITICAL: Wait for JS to execute
    # ... your automation logic
    browser.close()
```
### Examples
See `examples/` directory for:
- `element_discovery.py` - Discovering buttons, links, and inputs on a page
- `static_html_automation.py` - Using file:// URLs for local HTML
- `console_logging.py` - Capturing console logs during automation
## Reference Files
For detailed testing patterns and workflows, load reference files as needed:
- **`references/framework_workflows.md`** - Framework-specific TDD workflows and examples for Python (pytest), JavaScript (Jest, Vitest), Java (JUnit), Go, Rust
- **`references/test_patterns.md`** - Common test patterns, test organization, naming conventions, test doubles (mocks, stubs, spies), parametrization, and anti-patterns
- **`references/webapp_testing.md`** - Web application testing patterns, Playwright best practices, and E2E testing strategies
- **`references/TESTING_REPORT.template.md`** - Test quality report template with coverage metrics, audit findings, and recommendations
When working with a specific framework, or when you need detailed patterns, load the appropriate reference file.
## Best Practices
### Test Quality
1. **Isolation**: Tests should be independent and runnable in any order
2. **Deterministic**: Tests should produce consistent results
3. **Fast**: Unit tests should run quickly (< 100ms each)
4. **Clear**: Test names should describe what they test
5. **Maintainable**: Tests should be easy to update when code changes
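Isolation and determinism often come down to passing inputs in rather than reading them from the environment. The sketch below illustrates this with a hypothetical `greeting` function that takes the current time as an argument instead of calling `datetime.now()` itself.

```python
from datetime import datetime, timezone


def greeting(now: datetime) -> str:
    """Hypothetical function under test: picks a greeting from the hour."""
    return "Good morning" if now.hour < 12 else "Good afternoon"


def test_greeting_before_noon():
    # A fixed timestamp keeps the result identical on every run and machine.
    fixed = datetime(2024, 1, 1, 9, 0, tzinfo=timezone.utc)
    assert greeting(fixed) == "Good morning"


def test_greeting_after_noon():
    # Builds its own input, so it passes regardless of which test ran first.
    fixed = datetime(2024, 1, 1, 15, 0, tzinfo=timezone.utc)
    assert greeting(fixed) == "Good afternoon"
```

Neither test shares state with the other, so they can run in any order or in parallel.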
### TDD Best Practices
1. **One Behavior Per Test**: Each test verifies ONE behavior
2. **Descriptive Names**: Test names describe the behavior being tested
3. **Independent Tests**: Tests don't depend on each other
4. **Fast Tests**: Mock external dependencies to keep tests fast
5. **Clear Assertions**: Assertions clearly show what's being verified
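To illustrate mocking an external dependency, here is a minimal sketch using the standard library's `unittest.mock`. The `fetch_display_name` function and the client's `get_user` method are hypothetical stand-ins for real application code.

```python
from unittest.mock import Mock


def fetch_display_name(client, user_id: int) -> str:
    """Hypothetical code under test: formats a name returned by an API client."""
    payload = client.get_user(user_id)
    return payload["name"].title()


def test_fetch_display_name_title_cases_the_name():
    # The Mock stands in for a real HTTP client, so no network call is made.
    client = Mock()
    client.get_user.return_value = {"name": "ada lovelace"}

    assert fetch_display_name(client, 42) == "Ada Lovelace"
    # Verify the collaborator was called exactly once with the expected id.
    client.get_user.assert_called_once_with(42)
```

Because the client is passed in as an argument, the test needs no patching of module globals, which tends to keep it fast and easy to read.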
### Common Mistakes to Avoid
- ❌ Writing multiple tests at once (write one test at a time)
- ❌ Skipping refactor phase (always refactor after green)
- ❌ Implementation before test (delete code and start with test)
- ❌ Over-engineering in GREEN (simplest thing that passes)
- ❌ Writing test that passes immediately (must fail first)
### Test Maintenance
- Review and update tests when requirements change
- Remove obsolete tests
- Refactor tests to reduce duplication
- Keep test data factories up to date
- Monitor test execution time
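A test data factory can be as small as one function with sensible defaults. The sketch below is an assumption about what such a factory might look like; the `make_user` name and the record fields are hypothetical.

```python
import itertools

# Monotonic counter so every generated user gets a distinct id and email.
_ids = itertools.count(1)


def make_user(**overrides) -> dict:
    """Build a valid user record; tests override only the fields they care about."""
    n = next(_ids)
    user = {
        "id": n,
        "email": f"user{n}@example.com",
        "name": "Test User",
        "active": True,
    }
    user.update(overrides)
    return user


def test_inactive_user_is_flagged():
    user = make_user(active=False)
    assert user["active"] is False
    assert user["email"].endswith("@example.com")
```

When the record gains a required field, only the factory's defaults need updating, rather than every test that builds a user.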
## Integration with Other Skills
- **debugging**: Use when tests fail unexpectedly
- **code-review**: TDD produces code that's easier to review
- **dead-code-removal**: Tests help identify unused code
- **performance**: Use for performance testing strategies
## Meta-Principle
```
TDD is a DESIGN technique, not a testing technique.
The cycle never changes: RED → GREEN → REFACTOR → Repeat
Writing tests first forces you to think about:
- What behavior do I need?
- How will I know it works?
- What's the simplest implementation?
This produces better-designed, more maintainable code.
```
Related Skills
act-local-testing
Use when testing GitHub Actions workflows locally with act. Covers act CLI usage, Docker configuration, debugging workflows, and troubleshooting common issues when running workflows on your local machine.
web-security-testing
Web application security testing workflow for OWASP Top 10 vulnerabilities including injection, XSS, authentication flaws, and access control issues.
web-app-testing
Gemini 2.5 Computer Use for browser automation with VISIBLE local browser. Watch Gemini AI control your browser in real-time. Perfect for web app testing, automation demos, and debugging.
wallaby-testing
Check test status and debug failing tests using Wallaby.js real-time test results. Use after making code changes to verify tests pass, when checking if tests are failing, debugging test errors, analyzing assertions, inspecting runtime values, checking coverage, updating snapshots, or when user mentions Wallaby, tests, coverage, or test status.
unit-testing-test-generate
Generate comprehensive, maintainable unit tests across languages with strong coverage and edge case focus.
treido-testing
Testing specialist for Treido (Playwright + Next.js). Use for writing/debugging E2E tests, deflaking, selectors, auth state, parallel execution, and CI stability.
testing-workflow
Meta-skill that orchestrates comprehensive testing across a project by coordinating testing-patterns, e2e-testing, and testing agents. Use when setting up testing for a new project, improving coverage for an existing project, establishing a testing strategy, or verifying quality before a release.
testing-strategy
Comprehensive guide for implementing AIDB tests following E2E-first philosophy, DebugInterface abstraction, and MCP response health standards
testing-strategies
Testing strategies, patterns, and best practices for production code
testing-services
Writes unit tests for Python service classes using Arrange-Act-Assert pattern with proper mocking at boundaries. Tests behavior, not implementation. Mocks external systems only (API calls, file I/O, databases). Use when writing tests for services or fixing test coverage.
testing-quality
Plans and executes comprehensive testing strategy across frontend, backend, and AI tiers. Activates when writing tests, testing features, setting up test infrastructure, checking coverage, running E2E tests, or performance testing. Does not handle writing production code (backend-developer or frontend-developer), vulnerability/security review (security), or infrastructure deployment (devops).
testing-patterns
Testing patterns using bun:test with in-memory SQLite. Use when writing unit tests, integration tests, or router tests.