hack-through-testing
Manual invocation only. Drive a crashy, hanging, or half-broken system forward along a real production user path using real data. Two subskills: `prepare` to analyze the target and set up `<htt-home>/` with infrastructure dirs (logs, runs, issues); optionally creates `<htt-home>/autotest/` with automatic scripts and interactive guides only when the developer explicitly requests test-case generation. `run` drives testing — with or without autotest artifacts — patching forward through blockers. Run subskill operates in-place by default (stash + test on current branch) or in a disposable snapshot worktree when explicitly requested. Supports automatic and interactive driving. Default when ambiguous: both subskills, in-place, automatic. Not for CI-oriented unit, smoke, or mock-based integration tests.
Best use case
hack-through-testing is best used when you need a repeatable AI agent workflow instead of a one-off prompt.
Teams using hack-through-testing should expect more consistent output, faster repeated execution, and less prompt rewriting.
When to use this skill
- You want a reusable workflow that can be run more than once with consistent structure.
When not to use this skill
- You only need a quick one-off answer and do not need a reusable workflow.
- You cannot install or maintain the underlying files, dependencies, or repository context.
Installation
Claude Code / Cursor / Codex
Manual Installation
- Download SKILL.md from GitHub
- Place it in `.claude/skills/hack-through-testing/SKILL.md` inside your project
- Restart your AI agent; it will auto-discover the skill
How hack-through-testing Compares
| Feature / Agent | hack-through-testing | Standard Approach |
|---|---|---|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |
Frequently Asked Questions
What does this skill do?
It drives a crashy, hanging, or half-broken system forward along a real production user path using real data, patching forward through blockers as they appear. The `prepare` subskill analyzes the target and sets up `<htt-home>/` (logs, runs, issues), and generates `<htt-home>/autotest/` only when test-case generation is explicitly requested. The `run` subskill drives the testing, in-place by default or in a disposable snapshot worktree on request. It is manual-invocation only and is not meant for CI-oriented unit, smoke, or mock-based integration tests.
Where can I find the source code?
You can find the source code on GitHub using the link provided at the top of the page.
SKILL.md Source
# Hack Through Testing

Manual invocation only: use this skill only when the developer explicitly wants this workflow.

Drive a fragile system to the end by patching forward instead of solving each issue cleanly on first contact. Keep each workaround reviewable and finish with a synthesis that guides the real implementation.

If the developer wants testing without code changes, use `test-and-log` instead. If the developer wants a slow, stepwise session with approval before each action, use `do-interactive-test` instead.

## Testing Philosophy: Production-Level End-to-End, Not CI

Hack-through-testing targets **production-level end-to-end paths**: real data, real user workflows, real API calls, real output artifacts. It is not a CI smoke run, not a unit test harness, and not a mock-based integration check.

The distinction matters for choosing what to test:

- **Do target**: a full user workflow from input to final output; a real data pipeline with actual inputs; a multi-step interaction flow a real user would perform; an end-to-end scenario covering multiple system components in concert.
- **Do not target**: existing unit tests, existing smoke tests, isolated module tests, test suites that stub or mock external dependencies. These are already CI's job.

**If the only testable surface you can identify is CI-style** (unit tests, smoke scripts, mock integrations), **stop and ask the developer** what the real production user path or end-to-end scenario is before proceeding. For example:

> I can see unit/smoke/integration tests already covered by CI. What's the real production user path you want to exercise: the end-to-end scenario, the live data workflow, or a specific user journey?

Do not invent a CI-style test run and call it hack-through-testing.

## Subskills

This skill has exactly two subskills.

1. `prepare`
   Analyze the target and set up `<htt-home>/` with infrastructure dirs (logs, runs, issues). If the developer explicitly requests test-case generation (e.g., "prepare test cases", "prepare for auto test", "create autotest", "set up autotest"), also create `<htt-home>/autotest/` with automatic scripts and interactive guides.
   Primary guide: `references/prepare.md`
2. `run`
   Drive testing, patching forward through blockers. Uses autotest artifacts when they exist; otherwise drives testing directly from the target analysis. Operates in-place (default) or in a disposable snapshot worktree.
   Primary guide: `references/run.md`

## Subskill Selection

- "prepare", "bootstrap", "plan", "set up" → `prepare` (infrastructure only, no autotest)
- "prepare test cases", "prepare for auto test", "create autotest", "set up autotest", "create test cases" → `prepare` with autotest generation
- "run", "test", "execute", "drive", "patch forward" → `run` only
- If the developer asks for both, or does not specify → run `prepare` then `run`
- If autotest artifacts already exist and the developer wants to go straight to driving → `run` only

## Shared Defaults

Unless the developer says otherwise, use these defaults across both subskills:

- **Target**: if the developer does not point to a specific file, directory, command, or entrypoint (or says "repo root", "workspace root", "here", "current dir"), use the agent's current working directory as the target
- **Topic slug**: derive from the target; normalize to hyphen-case; keep stable for the whole session
- **HTT home** (`htt-home`): `<repo-root>/.agent-automation/hacktest/<topic-slug>` unless the developer explicitly sets `htt-home=<dir>`
- **Log root**: `<htt-home>/logs`
- **Runs root**: `<htt-home>/runs`
- **Session log**: `<log-root>/<ts>.md`
- **Issue notes**: `<log-root>/issues/<ts>-<what>.md`; one file per underlying issue; append later fixes to the same note rather than creating a new one
- **Run artifacts**: `<runs-root>/<run-ts>/`; always copy generated artifacts here so they survive session cleanup
- **Issue IDs**: `HT-01`, `HT-02`, ...
- **Commit message format (worktree mode)** / **Stash message format (in-place mode)**: `hack-through: <issue-id> <short workaround>`
- **Stopping rule**: first successful end-to-end run, 10 distinct issues, or 90 minutes, whichever comes first
- **Data realism**: use real data, real inputs, and live API calls wherever safe. Synthetic or stubbed inputs are a last resort. Never default to a CI-style smoke run unless the developer explicitly asks for one.
- **Path references in logs**: always use repo-relative paths plus commit SHAs. Logs must remain useful after the session ends.

If this skill creates `.agent-automation/hacktest/`, add it to `.gitignore`. If `.gitignore` already has commented `.agent-automation/hacktest` entries, do not auto-add the rule.

## Guardrails

- Never lose or overwrite the developer's current uncommitted state. In in-place mode, always stash before starting and record the stash ref. Never drop the initial stash until the developer explicitly requests cleanup.
- In worktree mode, never perform hack-through edits in the original checkout after the snapshot worktree exists.
- Never commit helper-managed log or autotest artifacts to `htt-branch` (worktree mode) or mix them into stash snapshots (in-place mode).
- Never present a temporary workaround as the final fix.
- Never keep going silently after a workaround invalidates trust in later observations; log the caveat.
- In worktree mode, never merge the throwaway branch into real work.
- Never reduce interactive guides (`case-<id>.md`) to wrappers that just say "run the automatic script"; they must be independent step-by-step procedures.
- Never skip the prepare subskill silently when the run subskill needs htt-home infrastructure and it is missing.

## Resources

- `./scripts/create_snapshot_worktree.sh`: Create a snapshot branch and separate worktree without touching the active checkout.
- [references/prepare.md](./references/prepare.md): Prepare subskill guide.
- [references/run.md](./references/run.md): Run subskill guide.
- [references/log-template.md](./references/log-template.md): Session log and synthesis template.
- [references/issue-template.md](./references/issue-template.md): Per-issue note template.
- [references/git-snapshot-plumbing.md](./references/git-snapshot-plumbing.md): Git plumbing reference for the non-invasive snapshot technique.

## Example Prompts

- `Use $hack-through-testing on this CLI and keep patching forward until the happy path finishes.`
- `Use $hack-through-testing to prepare for the demo under scripts/demo/foo.` (infrastructure only)
- `Use $hack-through-testing to prepare autotest cases for the demo under scripts/demo/foo.` (with autotest generation)
- `Use $hack-through-testing in run mode with interactive driving; I want to watch each step.`
- `Use $hack-through-testing with a worktree so my checkout stays clean.`
- `Use $hack-through-testing to exercise the full build-then-launch sequence (build a brain, start a session, send a prompt, stop) and patch through every failure.`
- `Use $hack-through-testing, but pause if the only workaround would change the protocol or persistent data format.`
- `Use $hack-through-testing with htt-home=/tmp/my-htt-home so the whole session state is easy to revisit later.`
- `Use $hack-through-testing to prepare test cases, then run them automatically in a shadow repo, stop after 8 blockers.`
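The Shared Defaults in the SKILL.md source can be read as a small set of path and naming rules. The following Python sketch shows one way those rules might fit together. It is not part of the skill: every function and class name here is hypothetical, the slug normalization (including the `root` fallback) is only one plausible reading of "normalize to hyphen-case", and the timestamp strings are placeholders because the source does not fix a timestamp format.

```python
import re
from dataclasses import dataclass
from pathlib import Path
from typing import Optional


def topic_slug(target: str) -> str:
    """Hyphen-case a target name (assumed normalization, not the skill's own)."""
    slug = re.sub(r"[^a-z0-9]+", "-", target.lower()).strip("-")
    return slug or "root"  # hypothetical fallback for an empty slug


@dataclass(frozen=True)
class HttLayout:
    """Paths derived from <htt-home>, following the Shared Defaults list."""
    home: Path

    @property
    def log_root(self) -> Path:
        return self.home / "logs"

    @property
    def runs_root(self) -> Path:
        return self.home / "runs"

    def session_log(self, ts: str) -> Path:
        return self.log_root / f"{ts}.md"

    def issue_note(self, ts: str, what: str) -> Path:
        return self.log_root / "issues" / f"{ts}-{what}.md"

    def run_dir(self, run_ts: str) -> Path:
        return self.runs_root / run_ts


def default_layout(repo_root: Path, target: str,
                   htt_home: Optional[Path] = None) -> HttLayout:
    """Use an explicit htt-home if given, else the documented default location."""
    if htt_home is not None:
        return HttLayout(htt_home)
    return HttLayout(repo_root / ".agent-automation" / "hacktest" / topic_slug(target))


def issue_id(n: int) -> str:
    """Format issue numbers as HT-01, HT-02, ..."""
    return f"HT-{n:02d}"


def workaround_message(n: int, short_workaround: str) -> str:
    """Commit (worktree mode) or stash (in-place mode) message format."""
    return f"hack-through: {issue_id(n)} {short_workaround}"


def should_stop(succeeded: bool, distinct_issues: int, elapsed_minutes: float) -> bool:
    """Stopping rule: first success, 10 distinct issues, or 90 minutes."""
    return succeeded or distinct_issues >= 10 or elapsed_minutes >= 90
```

For example, `default_layout(Path("/repo"), "scripts/demo/foo")` would place the session under `/repo/.agent-automation/hacktest/scripts-demo-foo`, while passing `htt_home=Path("/tmp/my-htt-home")` overrides that location entirely, mirroring the `htt-home=<dir>` option.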
Related Skills
openspec-ext-hack-through-test
Manual invocation only. OpenSpec-specific hack-through-testing workflow targeting production-level end-to-end paths using real data and real user workflows — not CI smoke/unit/integration tests. Three subskills: `propose` to create an OpenSpec change with HTT-ready test cases (automatic scripts and interactive guides) by invoking `openspec-propose` or `openspec-ff-change`, `revise` to update an existing OpenSpec change so its artifacts support hack-through-testing-driven implementation and testing, and `run` to exercise an implemented OpenSpec change through the full hack-through-testing loop (in-place by default, or in a disposable snapshot worktree when requested). Use when the user explicitly asks for `openspec-ext-hack-through-test`, points to `openspec/changes/...` while asking to propose, revise, run, exercise, or prepare work under hack-through-testing principles, or wants OpenSpec work shaped for fast blocker discovery through patch-forward testing.
pixi-make-offline-channel
Use when the user wants to create a self-hosted, offline-installable Conda channel (mirror) containing a specific subset of packages using Pixi.
pixi-make-cu-build-env
Guides the agent to set up a new or existing Pixi environment for compiling C++ and CUDA code. It ensures the correct compilers, toolkits, and CMake configurations are in place for a robust user-space build.
pixi-install-nvidia
Use when the user says "use pixi to install <some nvidia tool>" (or similar) and wants NVIDIA/CUDA/GPU packages installed via Pixi (no sudo/apt), e.g., CUDA toolkit pieces, cuDNN/NCCL, PyTorch CUDA builds, RAPIDS.
pei-docker-usage
Helper for PeiDocker (`pei-docker-cli`). Trigger ONLY when the user explicitly requests PeiDocker usage OR when working within a PeiDocker-generated project (indicated by `user_config.yml`).
conan-basic-usage
Basic operations for the Conan C++ package manager. Use when the user explicitly asks to 'use conan' for tasks like creating projects, installing dependencies, or building packages, or asks for 'how to' guidance on Conan setup.
explore-dnn-model
Manual invocation only; use only when the user explicitly requests `explore-dnn-model` by name. Explore how to run a given DNN model checkpoint in the current Python environment by locating weights + upstream source code, resolving dependencies with user confirmation, running reproducible experiments under `tmp/`, and producing reports about I/O contracts, timing, and profiling.
openspec-ext-revise-by-decision
Manual invocation only; use only when the user explicitly requests `openspec-ext-revise-by-decision` by exact name. Revise OpenSpec change artifacts from a review or decision document that contains questions plus `DECISION` blocks, applying chosen decisions from a review file such as `openspec/changes/<change>/review/review-*.md` back into proposal, design, specs, and tasks.
openspec-ext-review-plan
Review an OpenSpec change (or a single OpenSpec change artifact file) for completeness, coherence, and alignment with existing system design; capture actionable feedback plus open questions; write a review report under the change directory (review/review-YYYYMMDD-HHMMSS.md).
openspec-ext-respond-to-review
Read an OpenSpec review report critically, evaluate the reviewer's proposals and findings against the current change artifacts and repository context, and write developer-owned final decisions/responses back into the review document. Use when the user explicitly mentions `openspec` or points to a path under `openspec/` while asking to examine a review report carefully, decide open questions, respond to findings, fill `DECISION` blocks, respond to an OpenSpec review file, or record final answers in an OpenSpec review document without yet revising the proposal, design, specs, or tasks.
openspec-ext-explain
Create or update OpenSpec change explanation docs that capture developer-facing questions and answers under `openspec/changes/.../explain/`. Use when the user explicitly mentions `openspec` or points to a path under `openspec/` while asking to create, update, document, or maintain a Q&A, FAQ, explain note, or question-and-answer doc for an OpenSpec change based on user questions, implementation notes, review questions, or current chat context.
deepface-basic-usage
Basic usage guide for the DeepFace library (face recognition, verification, analysis).