Best use case
subagent-testing is best used when you need a repeatable AI agent workflow instead of a one-off prompt.
It tests skills through a RED/GREEN/REFACTOR TDD cycle, running each test in a fresh subagent.
Teams using subagent-testing should expect more consistent output, faster repeated execution, and less prompt rewriting.
When to use this skill
- You want a reusable workflow that can be run more than once with consistent structure.
When not to use this skill
- You only need a quick one-off answer and do not need a reusable workflow.
- You cannot install or maintain the underlying files, dependencies, or repository context.
Installation
Claude Code / Cursor / Codex
Manual Installation
- Download SKILL.md from GitHub
- Place it in .claude/skills/nm-abstract-subagent-testing/SKILL.md inside your project
- Restart your AI agent; it will auto-discover the skill
How subagent-testing Compares
| Feature / Agent | subagent-testing | Standard Approach |
|---|---|---|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |
Frequently Asked Questions
What does this skill do?
It tests skills with a RED/GREEN/REFACTOR TDD cycle, running each test in a fresh subagent so that earlier conversation history cannot influence the result.
Where can I find the source code?
You can find the source code on GitHub using the link provided at the top of the page.
Related Guides
AI Agents for Coding
Browse AI agent skills for coding, debugging, testing, refactoring, code review, and developer workflows across Claude, Cursor, and Codex.
Best AI Skills for Claude
Explore the best AI skills for Claude and Claude Code across coding, research, workflow automation, documentation, and agent operations.
Cursor vs Codex for AI Workflows
Compare Cursor and Codex for AI coding workflows, repository assistance, debugging, refactoring, and reusable developer skills.
SKILL.md Source
> **Night Market Skill**: ported from [claude-night-market/abstract](https://github.com/athola/claude-night-market/tree/master/plugins/abstract). For the full experience with agents, hooks, and commands, install the Claude Code plugin.

# Subagent Testing - TDD for Skills

Test skills with fresh subagent instances to prevent priming bias and validate effectiveness.

## Table of Contents

1. [Overview](#overview)
2. [Why Fresh Instances Matter](#why-fresh-instances-matter)
3. [Testing Methodology](#testing-methodology)
4. [Quick Start](#quick-start)
5. [Detailed Testing Guide](#detailed-testing-guide)
6. [Success Criteria](#success-criteria)

## Overview

**Fresh instances prevent priming:** Each test uses a new Claude conversation to verify the skill's impact is measured, not conversation history effects.

## Why Fresh Instances Matter

### The Priming Problem

Running tests in the same conversation creates bias:

- Prior context influences responses
- Skill effects get mixed with conversation history
- Can't isolate skill's true impact

### Fresh Instance Benefits

- **Isolation**: Each test starts clean
- **Reproducibility**: Consistent baseline state
- **Measurement**: Clear before/after comparison
- **Validation**: Proves skill effectiveness, not priming

## Testing Methodology

Three-phase TDD-style approach:

### Phase 1: Baseline Testing (RED)

Test without skill to establish baseline behavior.

### Phase 2: With-Skill Testing (GREEN)

Test with skill loaded to measure improvements.

### Phase 3: Rationalization Testing (REFACTOR)

Test skill's anti-rationalization guardrails.

## Quick Start

```bash
# 1. Create baseline tests (without skill)
#    Use 5 diverse scenarios
#    Document full responses

# 2. Create with-skill tests (fresh instances)
#    Load skill explicitly
#    Use identical prompts
#    Compare to baseline

# 3. Create rationalization tests
#    Test anti-rationalization patterns
#    Verify guardrails work
```

## Detailed Testing Guide

For complete testing patterns, examples, and templates:

- **[Testing Patterns](modules/testing-patterns.md)** - Full TDD methodology
- **[Test Examples](modules/testing-patterns.md)** - Baseline, with-skill, rationalization tests
- **[Analysis Templates](modules/testing-patterns.md)** - Scoring and comparison frameworks

## Success Criteria

- **Baseline**: Document 5+ diverse baseline scenarios
- **Improvement**: ≥50% improvement in skill-related metrics
- **Consistency**: Results reproducible across fresh instances
- **Rationalization Defense**: Guardrails prevent ≥80% of rationalization attempts

## See Also

- **skill-authoring**: Creating effective skills
- **bulletproof-skill**: Anti-rationalization patterns
- **test-skill**: Automated skill testing command
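
The numeric success criteria in the SKILL.md above (5+ baseline scenarios, ≥50% improvement, ≥80% of rationalization attempts blocked) can be checked mechanically once each scenario has a score. The following is a minimal Python sketch, not part of the skill itself. The names `SkillTestResults` and `check_success_criteria` are hypothetical, and it assumes that "improvement" means the relative change in the mean score between the baseline (RED) and with-skill (GREEN) runs, which the source does not define. The consistency criterion is left out because the source gives no threshold for it.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class SkillTestResults:
    """Hypothetical container for scores collected from fresh-instance runs."""
    baseline_scores: list[float]      # one score per scenario, skill not loaded (RED)
    with_skill_scores: list[float]    # same scenarios, skill loaded (GREEN)
    rationalization_attempts: int     # total attempts in the REFACTOR phase
    rationalization_blocked: int      # attempts the guardrails stopped


def check_success_criteria(results: SkillTestResults) -> dict[str, bool]:
    """Evaluate the three numeric criteria; thresholds are taken from the SKILL.md."""
    base = mean(results.baseline_scores) if results.baseline_scores else 0.0
    skilled = mean(results.with_skill_scores) if results.with_skill_scores else 0.0

    # Assumption: improvement is the relative change in mean score.
    if base > 0:
        improvement = (skilled - base) / base
    else:
        # A zero baseline makes the ratio undefined; any positive score counts as improved.
        improvement = float("inf") if skilled > 0 else 0.0

    if results.rationalization_attempts > 0:
        defense = results.rationalization_blocked / results.rationalization_attempts
    else:
        defense = 0.0  # no attempts recorded means the defense is unverified

    return {
        "baseline": len(results.baseline_scores) >= 5,
        "improvement": improvement >= 0.5,
        "rationalization_defense": defense >= 0.8,
    }


# Example with made-up scores on an arbitrary 1-5 rubric.
example = SkillTestResults(
    baseline_scores=[2, 3, 2, 3, 2],
    with_skill_scores=[4, 4, 3, 4, 4],
    rationalization_attempts=10,
    rationalization_blocked=9,
)
print(check_success_criteria(example))
```

Returning one boolean per criterion, rather than a single pass/fail, keeps it visible which phase needs more work when a skill falls short.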
Related Skills
limited-info-subagent-skill-verify
Validate whether a skill can be executed successfully by a minimally informed subagent. Use when the user wants to test a skill by giving a subagent only a minimal invocation, such as the skill name plus a single artifact like a link, file, or short prompt, and then grade whether the subagent actually performed the skill rather than merely describing it.
rust-testing-code-review
Reviews Rust test code for unit test patterns, integration test structure, async testing, mocking approaches, and property-based testing. Use when reviewing _test.rs files,
superpowers-subagent-dev
Use when executing implementation plans with independent tasks - coordinates task execution by dispatching subagents per task with verification checkpoints, adapted for OpenClaw's isolated session model
zod-testing
Testing patterns for Zod schemas using Jest and Vitest. Covers schema correctness testing, mock data generation, error assertion patterns, integration testing with API handlers and forms, snapshot testing with z.toJSONSchema(), and property-based testing. Baseline: zod ^4.0.0. Triggers on: test files for Zod schemas, zod-schema-faker imports, mentions of "test schema", "schema test", "zod mock", "zod test", or schema testing patterns.
redux-saga-testing
Write tests for Redux Sagas using redux-saga-test-plan, runSaga, and manual generator testing. Covers expectSaga (integration), testSaga (unit), providers, partial matchers, reducer integration, error simulation, and cancellation testing. Works with Jest and Vitest. Triggers on: test files for sagas, redux-saga-test-plan imports, mentions of "test saga", "saga test", "expectSaga", "testSaga", or "redux-saga-test-plan".
create-subagent
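Create and manage SubAgents. Use when the user needs to: (1) create a new SubAgent to carry out a specific task, (2) view or manage existing SubAgents, (3) terminate or direct a SubAgent. Supports multiple preset types: development, research, writing, data analysis, and more.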
vitest-testing
Vitest testing framework patterns and best practices. Use when writing unit tests, integration tests, configuring vitest.config, mocking with vi.mock/vi.fn, using snapshots, or setting up test coverage. Triggers on describe, it, expect, vi.mock, vi.fn, beforeEach, afterEach, vitest.
swift-testing-code-review
Reviews Swift Testing code for proper use of
pydantic-ai-testing
Test PydanticAI agents using TestModel, FunctionModel, VCR cassettes, and inline snapshots. Use when writing unit tests, mocking LLM responses, or recording API interactions.
QA & Testing Engine — Complete Software Quality System
> The definitive testing methodology for AI agents. From test strategy to execution, coverage to reporting — everything you need to ship quality software.
article-factory-wechat
humanizer
Remove signs of AI-generated writing from text. Use when editing or reviewing text to make it sound more natural and human-written. Based on Wikipedia's comprehensive "Signs of AI writing" guide. Detects and fixes patterns including: inflated symbolism, promotional language, superficial -ing analyses, vague attributions, em dash overuse, rule of three, AI vocabulary words, negative parallelisms, and excessive conjunctive phrases.