analyze-project
Forensic root cause analyzer for Antigravity sessions. Classifies scope deltas, rework patterns, root causes, hotspots, and auto-improves prompts/health.
Best use case
analyze-project is best used when you need a repeatable AI agent workflow instead of a one-off prompt. It is especially useful for teams that run many AI-assisted coding sessions and want forensic postmortems: classifying scope deltas, rework patterns, root causes, and friction hotspots, and feeding those findings back into better prompts and project health.
Users can expect more consistent workflow output, faster repeated execution, and less time spent rewriting prompts from scratch.
Practical example
Example input
Use the "analyze-project" skill to help with this workflow task. Context: Forensic root cause analyzer for Antigravity sessions. Classifies scope deltas, rework patterns, root causes, hotspots, and auto-improves prompts/health.
Example output
A structured workflow result with clearer steps, more consistent formatting, and an output that is easier to reuse in the next run.
When to use this skill
- Use this skill when you want a reusable workflow rather than writing the same prompt again and again.
When not to use this skill
- Do not use this when you only need a one-off answer and do not need a reusable workflow.
- Do not use it if you cannot install or maintain the related files, repository context, or supporting tools.
Installation
Claude Code / Cursor / Codex
Manual Installation
- Download SKILL.md from GitHub
- Place it in `.claude/skills/analyze-project/SKILL.md` inside your project
- Restart your AI agent — it will auto-discover the skill
How analyze-project Compares
| Feature / Agent | analyze-project | Standard Approach |
|---|---|---|
| Platform Support | Claude Code, Cursor, Codex | Limited / Varies |
| Context Awareness | High (reads session artifacts) | Baseline |
| Installation Complexity | Low (single SKILL.md file) | N/A |
Frequently Asked Questions
What does this skill do?
Forensic root cause analyzer for Antigravity sessions. Classifies scope deltas, rework patterns, root causes, hotspots, and auto-improves prompts/health.
Where can I find the source code?
You can find the source code on GitHub using the link provided at the top of the page.
Related Guides
Best AI Skills for ChatGPT
Find the best AI skills to adapt into ChatGPT workflows for research, writing, summarization, planning, and repeatable assistant tasks.
Best AI Skills for Claude
Explore the best AI skills for Claude and Claude Code across coding, research, workflow automation, documentation, and agent operations.
ChatGPT vs Claude for Agent Skills
Compare ChatGPT and Claude for AI agent skills across coding, writing, research, and reusable workflow execution.
SKILL.md Source
# /analyze-project — Root Cause Analyst Workflow

Analyze AI-assisted coding sessions in `~/.gemini/antigravity/brain/` and produce a report that explains not just **what happened**, but **why it happened**, **who/what caused it**, and **what should change next time**.

## Goal

For each session, determine:

1. What changed from the initial ask to the final executed work
2. Whether the main cause was:
   - user/spec
   - agent
   - repo/codebase
   - validation/testing
   - legitimate task complexity
3. Whether the opening prompt was sufficient
4. Which files/subsystems repeatedly correlate with struggle
5. What changes would most improve future sessions

## When to Use

- You need a postmortem on AI-assisted coding sessions, especially when scope drift or repeated rework occurred.
- You want root-cause analysis that separates user/spec issues from agent mistakes, repo friction, or validation gaps.
- You need evidence-backed recommendations for improving future prompts, repo health, or delivery workflows.

## Global Rules

- Treat `.resolved.N` counts as **iteration signals**, not proof of failure
- Separate **human-added scope**, **necessary discovered scope**, and **agent-introduced scope**
- Separate **agent error** from **repo friction**
- Every diagnosis must include **evidence** and **confidence**
- Confidence levels:
  - **High** = direct artifact/timestamp evidence
  - **Medium** = multiple supporting signals
  - **Low** = plausible inference, not directly proven
- Evidence precedence:
  - artifact contents > timestamps > metadata summaries > inference
- If evidence is weak, say so

---

## Step 0.5: Session Intent Classification

Classify the primary session intent from objective + artifacts:

- `DELIVERY`
- `DEBUGGING`
- `REFACTOR`
- `RESEARCH`
- `EXPLORATION`
- `AUDIT_ANALYSIS`

Record:

- `session_intent`
- `session_intent_confidence`

Use intent to contextualize severity and rework shape. Do not judge exploratory or research sessions by the same standards as narrow delivery sessions.
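As a rough illustration, the intent classification in Step 0.5 could be sketched as a keyword heuristic. The intent labels and confidence levels come from the skill; the trigger words and the keyword-to-confidence mapping below are assumptions for illustration only.

```python
# Hypothetical keyword heuristic for Step 0.5 intent classification.
# The category names are from the skill; the trigger words are illustrative.
INTENT_KEYWORDS = {
    "DEBUGGING": ("fix", "bug", "error", "crash"),
    "REFACTOR": ("refactor", "clean up", "restructure"),
    "RESEARCH": ("investigate", "research", "compare"),
    "AUDIT_ANALYSIS": ("audit", "analyze", "review"),
    "EXPLORATION": ("explore", "prototype", "spike"),
}

def classify_intent(objective: str) -> tuple[str, str]:
    """Return (session_intent, session_intent_confidence)."""
    text = objective.lower()
    for intent, words in INTENT_KEYWORDS.items():
        if any(w in text for w in words):
            # A keyword match alone is a single signal, so Medium at best.
            return intent, "Medium"
    # No signal: fall back to DELIVERY with low confidence.
    return "DELIVERY", "Low"
```

A real implementation would also weigh the artifacts (plan, walkthrough), not just the objective text, before committing to an intent.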
---

## Step 1: Discover Conversations

1. Read available conversation summaries from system context
2. List conversation folders in the user's Antigravity `brain/` directory
3. Build a conversation index with:
   - `conversation_id`
   - `title`
   - `objective`
   - `created`
   - `last_modified`
4. If the user supplied a keyword/path, filter to matching conversations; otherwise analyze all

Output: indexed list of conversations to analyze.

---

## Step 2: Extract Session Evidence

For each conversation, read if present:

### Core artifacts

- `task.md`
- `implementation_plan.md`
- `walkthrough.md`

### Metadata

- `*.metadata.json`

### Version snapshots

- `task.md.resolved.0 ... N`
- `implementation_plan.md.resolved.0 ... N`
- `walkthrough.md.resolved.0 ... N`

### Additional signals

- other `.md` artifacts
- timestamps across artifact updates
- file/folder/subsystem names mentioned in plans/walkthroughs
- validation/testing language
- explicit acceptance criteria, constraints, non-goals, and file targets

Record per conversation:

#### Lifecycle

- `has_task`
- `has_plan`
- `has_walkthrough`
- `is_completed`
- `is_abandoned_candidate` = task exists but no walkthrough

#### Revision / change volume

- `task_versions`
- `plan_versions`
- `walkthrough_versions`
- `extra_artifacts`

#### Scope

- `task_items_initial`
- `task_items_final`
- `task_completed_pct`
- `scope_delta_raw`
- `scope_creep_pct_raw`

#### Timing

- `created_at`
- `completed_at`
- `duration_minutes`

#### Content / quality

- `objective_text`
- `initial_plan_summary`
- `final_plan_summary`
- `initial_task_excerpt`
- `final_task_excerpt`
- `walkthrough_summary`
- `mentioned_files_or_subsystems`
- `validation_requirements_present`
- `acceptance_criteria_present`
- `non_goals_present`
- `scope_boundaries_present`
- `file_targets_present`
- `constraints_present`

---

## Step 3: Prompt Sufficiency

Score the opening request on a 0–2 scale for:

- **Clarity**
- **Boundedness**
- **Testability**
- **Architectural specificity**
- **Constraint awareness**
- **Dependency awareness**

Create:

- `prompt_sufficiency_score`
- `prompt_sufficiency_band` = High / Medium / Low

Then note which missing prompt ingredients likely contributed to later friction. Do not punish short prompts by default; a narrow, obvious task can still have high sufficiency.

---

## Step 4: Scope Change Classification

Classify scope change into:

- **Human-added scope** — new asks beyond the original task
- **Necessary discovered scope** — work required to complete the original task correctly
- **Agent-introduced scope** — likely unnecessary work introduced by the agent

Record:

- `scope_change_type_primary`
- `scope_change_type_secondary` (optional)
- `scope_change_confidence`
- evidence

Keep one short example in mind for calibration:

- Human-added: "also refactor nearby code while you're here"
- Necessary discovered: hidden dependency must be fixed for original task to work
- Agent-introduced: extra cleanup or redesign not requested and not required

---

## Step 5: Rework Shape

Classify each session into one primary pattern:

- **Clean execution**
- **Early replan then stable finish**
- **Progressive scope expansion**
- **Reopen/reclose churn**
- **Late-stage verification churn**
- **Abandoned mid-flight**
- **Exploratory / research session**

Record:

- `rework_shape`
- `rework_shape_confidence`
- evidence

---

## Step 6: Root Cause Analysis

For every non-clean session, assign:

### Primary root cause

One of:

- `SPEC_AMBIGUITY`
- `HUMAN_SCOPE_CHANGE`
- `REPO_FRAGILITY`
- `AGENT_ARCHITECTURAL_ERROR`
- `VERIFICATION_CHURN`
- `LEGITIMATE_TASK_COMPLEXITY`

### Secondary root cause

Optional if materially relevant

### Root-cause guidance

- **SPEC_AMBIGUITY**: opening ask lacked boundaries, targets, criteria, or constraints
- **HUMAN_SCOPE_CHANGE**: scope expanded because the user broadened the task
- **REPO_FRAGILITY**: hidden coupling, brittle files, unclear architecture, or environment issues forced extra work
- **AGENT_ARCHITECTURAL_ERROR**: wrong files, wrong assumptions, wrong approach, hallucinated structure
- **VERIFICATION_CHURN**: implementation mostly worked, but testing/validation caused loops
- **LEGITIMATE_TASK_COMPLEXITY**: revisions were expected for the difficulty and not clearly avoidable

Every root-cause assignment must include:

- evidence
- why stronger alternative causes were rejected
- confidence

---

## Step 6.5: Session Severity Scoring (0–100)

Assign each session a severity score to prioritize attention.

Components (sum, clamp 0–100):

- **Completion failure**: 0–25 (`abandoned = 25`)
- **Replanning intensity**: 0–15
- **Scope instability**: 0–15
- **Rework shape severity**: 0–15
- **Prompt sufficiency deficit**: 0–10 (`low = 10`)
- **Root cause impact**: 0–10 (`REPO_FRAGILITY` / `AGENT_ARCHITECTURAL_ERROR` highest)
- **Hotspot recurrence**: 0–10

Bands:

- **0–19 Low**
- **20–39 Moderate**
- **40–59 Significant**
- **60–79 High**
- **80–100 Critical**

Record:

- `session_severity_score`
- `severity_band`
- `severity_drivers` = top 2–4 contributors
- `severity_confidence`

Use severity as a prioritization signal, not a verdict. Always explain the drivers. Contextualize severity using session intent so research/exploration sessions are not over-penalized.

---

## Step 7: Subsystem / File Clustering

Across all conversations, cluster repeated struggle by file, folder, or subsystem.

For each cluster, calculate:

- number of conversations touching it
- average revisions
- completion rate
- abandonment rate
- common root causes
- average severity

Goal: identify whether friction is mostly prompt-driven, agent-driven, or concentrated in specific repo areas.
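The per-cluster statistics in Step 7 can be sketched as a small aggregation over the session records from Step 2. Field names follow Step 2; using `plan_versions` as the revision signal is an illustrative choice, and a fuller version would also fold in task/walkthrough versions, abandonment, and root causes.

```python
from collections import defaultdict
from statistics import mean

def cluster_hotspots(sessions: list[dict]) -> dict[str, dict]:
    """Aggregate per-subsystem friction stats (Step 7 sketch).

    Each session dict is assumed to carry fields recorded in Step 2,
    e.g. 'mentioned_files_or_subsystems', 'plan_versions',
    'is_completed', 'session_severity_score'.
    """
    clusters = defaultdict(list)
    for s in sessions:
        for subsystem in s.get("mentioned_files_or_subsystems", []):
            clusters[subsystem].append(s)
    report = {}
    for subsystem, hits in clusters.items():
        report[subsystem] = {
            "conversations": len(hits),
            "avg_revisions": mean(s["plan_versions"] for s in hits),
            "completion_rate": mean(1.0 if s["is_completed"] else 0.0 for s in hits),
            "avg_severity": mean(s["session_severity_score"] for s in hits),
        }
    return report
```

Sorting the result by `avg_severity` or `conversations` surfaces the hotspots that the report's Friction Hotspots section asks for.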
---

## Step 8: Comparative Cohorts

Compare:

- first-shot successes vs re-planned sessions
- completed vs abandoned
- high prompt sufficiency vs low prompt sufficiency
- narrow-scope vs high-scope-growth
- short sessions vs long sessions
- low-friction subsystems vs high-friction subsystems

For each comparison, identify:

- what differs materially
- which prompt traits correlate with smoother execution
- which repo traits correlate with repeated struggle

Do not just restate averages; extract cautious evidence-backed patterns.

---

## Step 9: Non-Obvious Findings

Generate 3–7 findings that are not simple metric restatements.

Each finding must include:

- observation
- why it matters
- evidence
- confidence

Examples of strong findings:

- replans cluster around weak file targeting rather than weak acceptance criteria
- scope growth often begins after initial success, suggesting post-success human expansion
- auth-related struggle is driven more by repo fragility than agent hallucination

---

## Step 10: Report Generation

Create `session_analysis_report.md` with this structure:

# 📊 Session Analysis Report — [Project Name]

**Generated**: [timestamp]
**Conversations Analyzed**: [N]
**Date Range**: [earliest] → [latest]

## Executive Summary

| Metric | Value | Rating |
|:---|:---|:---|
| First-Shot Success Rate | X% | 🟢/🟡/🔴 |
| Completion Rate | X% | 🟢/🟡/🔴 |
| Avg Scope Growth | X% | 🟢/🟡/🔴 |
| Replan Rate | X% | 🟢/🟡/🔴 |
| Median Duration | Xm | — |
| Avg Session Severity | X | 🟢/🟡/🔴 |
| High-Severity Sessions | X / N | 🟢/🟡/🔴 |

Thresholds:

- First-shot: 🟢 >70 / 🟡 40–70 / 🔴 <40
- Scope growth: 🟢 <15 / 🟡 15–40 / 🔴 >40
- Replan rate: 🟢 <20 / 🟡 20–50 / 🔴 >50

Avg severity guidance:

- 🟢 <25
- 🟡 25–50
- 🔴 >50

Note: avg severity is an aggregate health signal, not the same as per-session severity bands.

Then add a short narrative summary of what is going well, what is breaking down, and whether the main issue is prompt quality, repo fragility, workflow discipline, or validation churn.
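The traffic-light thresholds above can be expressed as two small helpers: one for "higher is better" metrics (first-shot success) and one for "lower is better" metrics (scope growth, replan rate, average severity). This is a minimal sketch of the threshold logic only, not part of the skill itself.

```python
def rate_lower_better(value: float, green_below: float, red_above: float) -> str:
    """Metrics where lower is healthier, e.g. scope growth (15, 40)."""
    if value < green_below:
        return "🟢"
    if value > red_above:
        return "🔴"
    return "🟡"

def rate_higher_better(value: float, green_above: float, red_below: float) -> str:
    """Metrics where higher is healthier, e.g. first-shot success (70, 40)."""
    if value > green_above:
        return "🟢"
    if value < red_below:
        return "🔴"
    return "🟡"
```

Usage follows the report spec directly: `rate_higher_better(first_shot_pct, 70, 40)`, `rate_lower_better(scope_growth_pct, 15, 40)`, `rate_lower_better(replan_pct, 20, 50)`, and `rate_lower_better(avg_severity, 25, 50)`.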
## Root Cause Breakdown

| Root Cause | Count | % | Notes |
|:---|:---|:---|:---|

## Prompt Sufficiency Analysis

- common traits of high-sufficiency prompts
- common missing inputs in low-sufficiency prompts
- which missing prompt ingredients correlate most with replanning or abandonment

## Scope Change Analysis

Separate:

- Human-added scope
- Necessary discovered scope
- Agent-introduced scope

## Rework Shape Analysis

Summarize the main failure patterns across sessions.

## Friction Hotspots

Show the files/folders/subsystems most associated with replanning, abandonment, verification churn, and high severity.

## First-Shot Successes

List the cleanest sessions and extract what made them work.

## Non-Obvious Findings

List 3–7 evidence-backed findings with confidence.

## Severity Triage

List the highest-severity sessions and say whether the best intervention is:

- prompt improvement
- scope discipline
- targeted skill/workflow
- repo refactor / architecture cleanup
- validation/test harness improvement

## Recommendations

For each recommendation, use:

- **Observed pattern**
- **Likely cause**
- **Evidence**
- **Change to make**
- **Expected benefit**
- **Confidence**

## Per-Conversation Breakdown

| # | Title | Intent | Duration | Scope Δ | Plan Revs | Task Revs | Root Cause | Rework Shape | Severity | Complete? |
|:---|:---|:---|:---|:---|:---|:---|:---|:---|:---|:---|

---

## Step 11: Optional Post-Analysis Improvements

If appropriate, also:

- update any local project-health or memory artifact (if present) with recurring failure modes and fragile subsystems
- generate `prompt_improvement_tips.md` from high-sufficiency / first-shot-success sessions
- suggest missing skills or workflows when the same subsystem or task sequence repeatedly causes struggle

Only recommend workflows/skills when the pattern appears repeatedly.

---

## Final Output Standard

The workflow must produce:

1. metrics summary
2. root-cause diagnosis
3. prompt-sufficiency assessment
4. subsystem/friction map
5. severity triage and prioritization
6. evidence-backed recommendations
7. non-obvious findings

Prefer explicit uncertainty over fake precision.

## Limitations

- Use this skill only when the task clearly matches the scope described above.
- Do not treat the output as a substitute for environment-specific validation, testing, or expert review.
- Stop and ask for clarification if required inputs, permissions, safety boundaries, or success criteria are missing.
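The severity scoring in Step 6.5 of the SKILL.md source above can be sketched as a component sum. The component ceilings, band boundaries, and field names come from the skill; how each raw signal (e.g. `plan_versions`, `scope_creep_pct_raw`) maps onto its 0-N range is an illustrative assumption.

```python
def severity_score(s: dict) -> tuple[int, str]:
    """Sum the Step 6.5 components and clamp to 0-100 (sketch).

    Component weights follow the skill; the per-signal mappings
    below are illustrative, not specified by the skill.
    """
    score = 0
    score += 25 if s.get("is_abandoned_candidate") else 0          # completion failure
    score += min(s.get("plan_versions", 0) * 3, 15)                # replanning intensity
    score += min(int(abs(s.get("scope_creep_pct_raw", 0))) // 5, 15)  # scope instability
    shape_weight = {"Clean execution": 0, "Early replan then stable finish": 5,
                    "Late-stage verification churn": 10, "Reopen/reclose churn": 15,
                    "Abandoned mid-flight": 15}
    score += shape_weight.get(s.get("rework_shape"), 8)            # rework shape severity
    score += {"High": 0, "Medium": 5, "Low": 10}[s.get("prompt_sufficiency_band", "Medium")]
    rc = s.get("root_cause")
    if rc in ("REPO_FRAGILITY", "AGENT_ARCHITECTURAL_ERROR"):
        score += 10                                                # highest-impact causes
    elif rc:
        score += 5
    score += min(s.get("hotspot_recurrence", 0) * 2, 10)           # hotspot recurrence
    score = max(0, min(score, 100))
    band = ("Low" if score < 20 else "Moderate" if score < 40 else
            "Significant" if score < 60 else "High" if score < 80 else "Critical")
    return score, band
```

As the skill stresses, the score is a prioritization signal only; the top contributing components should be reported as `severity_drivers` alongside it.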
Related Skills
project-skill-audit
Audit a project and recommend the highest-value skills to add or update.
weightloss-analyzer
Analyze weight-loss data, calculate metabolic rate, track energy deficits, and manage weight-loss phases.
travel-health-analyzer
Analyze travel health data, assess destination health risks, provide vaccination recommendations, and generate multilingual emergency medical information cards. Supports professional-grade travel health risk assessment with WHO/CDC data integration.
tcm-constitution-analyzer
Analyze TCM constitution data, identify constitution types, assess constitutional traits, and provide personalized wellness recommendations. Supports correlation analysis with nutrition, exercise, sleep, and other health data.
systems-programming-rust-project
You are a Rust project architecture expert specializing in scaffolding production-ready Rust applications. Generate complete project structures with cargo tooling, proper module organization, testing
startup-business-analyst-financial-projections
Create detailed 3-5 year financial model with revenue, costs, cash flow, and scenarios
sred-project-organizer
Take a list of projects and their related documentation, and organize them into the SRED format for submission.
sleep-analyzer
Analyze sleep data, identify sleep patterns, assess sleep quality, and provide personalized sleep-improvement recommendations. Supports correlation analysis with other health data.
skin-health-analyzer
Analyze skin health data, identify skin problem patterns, assess skin health status. Supports correlation analysis with nutrition, chronic diseases, and medication data.
sexual-health-analyzer
Sexual Health Analyzer
seek-and-analyze-video
Seek and analyze video content using Memories.ai Large Visual Memory Model for persistent video intelligence
rehabilitation-analyzer
Analyze rehabilitation training data, identify recovery patterns, assess rehabilitation progress, and provide personalized rehabilitation recommendations.