kaizen:root-cause-tracing
Use when errors occur deep in execution and you need to trace back to find the original trigger - systematically traces bugs backward through call stack, adding instrumentation when needed, to identify source of invalid data or incorrect behavior
Best use case
kaizen:root-cause-tracing is best used when you need a repeatable AI agent workflow instead of a one-off prompt.
Use when errors occur deep in execution and you need to trace back to find the original trigger - systematically traces bugs backward through call stack, adding instrumentation when needed, to identify source of invalid data or incorrect behavior
Teams using kaizen:root-cause-tracing should expect a more consistent output, faster repeated execution, less prompt rewriting.
When to use this skill
- You want a reusable workflow that can be run more than once with consistent structure.
When not to use this skill
- You only need a quick one-off answer and do not need a reusable workflow.
- You cannot install or maintain the underlying files, dependencies, or repository context.
Installation
Claude Code / Cursor / Codex
Manual Installation
- Download SKILL.md from GitHub
- Place it in
.claude/skills/root-cause-tracing/SKILL.mdinside your project - Restart your AI agent — it will auto-discover the skill
How kaizen:root-cause-tracing Compares
| Feature / Agent | kaizen:root-cause-tracing | Standard Approach |
|---|---|---|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |
Frequently Asked Questions
What does this skill do?
Use when errors occur deep in execution and you need to trace back to find the original trigger - systematically traces bugs backward through call stack, adding instrumentation when needed, to identify source of invalid data or incorrect behavior
Where can I find the source code?
You can find the source code on GitHub using the link provided at the top of the page.
SKILL.md Source
# Root Cause Tracing
## Overview
Bugs often manifest deep in the call stack (git init in wrong directory, file created in wrong location, database opened with wrong path). Your instinct is to fix where the error appears, but that's treating a symptom.
**Core principle:** Trace backward through the call chain until you find the original trigger, then fix at the source.
## When to Use
```dot
digraph when_to_use {
"Bug appears deep in stack?" [shape=diamond];
"Can trace backwards?" [shape=diamond];
"Fix at symptom point" [shape=box];
"Trace to original trigger" [shape=box];
"BETTER: Also add defense-in-depth" [shape=box];
"Bug appears deep in stack?" -> "Can trace backwards?" [label="yes"];
"Can trace backwards?" -> "Trace to original trigger" [label="yes"];
"Can trace backwards?" -> "Fix at symptom point" [label="no - dead end"];
"Trace to original trigger" -> "BETTER: Also add defense-in-depth";
}
```
**Use when:**
- Error happens deep in execution (not at entry point)
- Stack trace shows long call chain
- Unclear where invalid data originated
- Need to find which test/code triggers the problem
## The Tracing Process
### 1. Observe the Symptom
```
Error: git init failed in /Users/jesse/project/packages/core
```
### 2. Find Immediate Cause
**What code directly causes this?**
```typescript
await execFileAsync('git', ['init'], { cwd: projectDir });
```
### 3. Ask: What Called This?
```typescript
WorktreeManager.createSessionWorktree(projectDir, sessionId)
→ called by Session.initializeWorkspace()
→ called by Session.create()
→ called by test at Project.create()
```
### 4. Keep Tracing Up
**What value was passed?**
- `projectDir = ''` (empty string!)
- Empty string as `cwd` resolves to `process.cwd()`
- That's the source code directory!
### 5. Find Original Trigger
**Where did empty string come from?**
```typescript
const context = setupCoreTest(); // Returns { tempDir: '' }
Project.create('name', context.tempDir); // Accessed before beforeEach!
```
## Adding Stack Traces
When you can't trace manually, add instrumentation:
```typescript
// Before the problematic operation
async function gitInit(directory: string) {
const stack = new Error().stack;
console.error('DEBUG git init:', {
directory,
cwd: process.cwd(),
nodeEnv: process.env.NODE_ENV,
stack,
});
await execFileAsync('git', ['init'], { cwd: directory });
}
```
**Critical:** Use `console.error()` in tests (not logger - may not show)
**Run and capture:**
```bash
npm test 2>&1 | grep 'DEBUG git init'
```
**Analyze stack traces:**
- Look for test file names
- Find the line number triggering the call
- Identify the pattern (same test? same parameter?)
## Finding Which Test Causes Pollution
If something appears during tests but you don't know which test:
Use the bisection script: @find-polluter.sh
```bash
./find-polluter.sh '.git' 'src/**/*.test.ts'
```
Runs tests one-by-one, stops at first polluter. See script for usage.
## Real Example: Empty projectDir
**Symptom:** `.git` created in `packages/core/` (source code)
**Trace chain:**
1. `git init` runs in `process.cwd()` ← empty cwd parameter
2. WorktreeManager called with empty projectDir
3. Session.create() passed empty string
4. Test accessed `context.tempDir` before beforeEach
5. setupCoreTest() returns `{ tempDir: '' }` initially
**Root cause:** Top-level variable initialization accessing empty value
**Fix:** Made tempDir a getter that throws if accessed before beforeEach
**Also added defense-in-depth:**
- Layer 1: Project.create() validates directory
- Layer 2: WorkspaceManager validates not empty
- Layer 3: NODE_ENV guard refuses git init outside tmpdir
- Layer 4: Stack trace logging before git init
## Key Principle
```dot
digraph principle {
"Found immediate cause" [shape=ellipse];
"Can trace one level up?" [shape=diamond];
"Trace backwards" [shape=box];
"Is this the source?" [shape=diamond];
"Fix at source" [shape=box];
"Add validation at each layer" [shape=box];
"Bug impossible" [shape=doublecircle];
"NEVER fix just the symptom" [shape=octagon, style=filled, fillcolor=red, fontcolor=white];
"Found immediate cause" -> "Can trace one level up?";
"Can trace one level up?" -> "Trace backwards" [label="yes"];
"Can trace one level up?" -> "NEVER fix just the symptom" [label="no"];
"Trace backwards" -> "Is this the source?";
"Is this the source?" -> "Trace backwards" [label="no - keeps going"];
"Is this the source?" -> "Fix at source" [label="yes"];
"Fix at source" -> "Add validation at each layer";
"Add validation at each layer" -> "Bug impossible";
}
```
**NEVER fix just where the error appears.** Trace back to find the original trigger.
## Stack Trace Tips
**In tests:** Use `console.error()` not logger - logger may be suppressed
**Before operation:** Log before the dangerous operation, not after it fails
**Include context:** Directory, cwd, environment variables, timestamps
**Capture stack:** `new Error().stack` shows complete call chain
## Real-World Impact
From debugging session (2025-10-03):
- Found root cause through 5-level trace
- Fixed at source (getter validation)
- Added 4 layers of defense
- 1847 tests passed, zero pollutionRelated Skills
kaizen:analyse
Auto-selects best Kaizen method (Gemba Walk, Value Stream, or Muda) for target
kaizen:kaizen
Use when Code implementation and refactoring, architecturing or designing systems, process and workflow improvements, error handling and validation. Provide tehniquest to avoid over-engineering and apply iterative improvements.
kaizen:why
Iterative Five Whys root cause analysis drilling from symptoms to fundamentals
kaizen:plan-do-check-act
Iterative PDCA cycle for systematic experimentation and continuous improvement
kaizen:analyse-problem
Comprehensive A3 one-page problem analysis with root cause and action plan
distributed-tracing
Implement distributed tracing with Jaeger and Tempo to track requests across microservices and identify performance bottlenecks. Use when debugging microservices, analyzing request flows, or implem...
rootly-automation
Automate Rootly tasks via Rube MCP (Composio). Always search tools first for current schemas.
cost-optimization
Optimize cloud costs through resource rightsizing, tagging strategies, reserved instances, and spending analysis. Use when reducing cloud expenses, analyzing infrastructure costs, or implementing c...
copilot-usage-metrics
Retrieve and display GitHub Copilot usage metrics for organizations and enterprises using the GitHub CLI and REST API.
copilot-sdk
Build agentic applications with GitHub Copilot SDK. Use when embedding AI agents in apps, creating custom tools, implementing streaming responses, managing sessions, connecting to MCP servers, or creating custom agents. Triggers on Copilot SDK, GitHub SDK, agentic app, embed Copilot, programmable agent, MCP server, custom agent.
copilot-instructions-blueprint-generator
Technology-agnostic blueprint generator for creating comprehensive copilot-instructions.md files that guide GitHub Copilot to produce code consistent with project standards, architecture patterns, and exact technology versions by analyzing existing codebase patterns and avoiding assumptions.
copilot-docs
Configure GitHub Copilot with custom instructions. Use when setting up .github/copilot-instructions.md, customizing Copilot behavior, or creating repository-specific AI guidance. Triggers on Copilot instructions, copilot-instructions.md, GitHub Copilot config.