agent-ops-reality-audit
Aggressive evidence-based audit to verify project claims match implementation reality
Best use case
agent-ops-reality-audit is best used when you need a repeatable AI agent workflow instead of a one-off prompt.
Teams using agent-ops-reality-audit should expect more consistent output, faster repeated execution, and less prompt rewriting.
When to use this skill
- You want a reusable workflow that can be run more than once with consistent structure.
When not to use this skill
- You only need a quick one-off answer and do not need a reusable workflow.
- You cannot install or maintain the underlying files, dependencies, or repository context.
Installation
Claude Code / Cursor / Codex
Manual Installation
- Download SKILL.md from GitHub
- Place it in `.claude/skills/agent-ops-reality-audit-majiayu000-claude-skill-regist/SKILL.md` inside your project
- Restart your AI agent; it will auto-discover the skill
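
The manual steps above could be scripted roughly as follows. This is a minimal sketch, not an official installer: `install_skill` and its parameters are hypothetical names, and the directory name is copied exactly as it appears in the path above. It assumes SKILL.md has already been downloaded to a local file.

```python
from pathlib import Path
import shutil

# Directory name copied verbatim from the install path listed above.
SKILL_DIR = "agent-ops-reality-audit-majiayu000-claude-skill-regist"


def install_skill(downloaded: Path, project_root: Path, skill_dir: str = SKILL_DIR) -> Path:
    """Copy a downloaded SKILL.md into <project_root>/.claude/skills/<skill_dir>/.

    Hypothetical helper: it only places the file. Restarting the agent so the
    skill is discovered remains a manual step.
    """
    target = project_root / ".claude" / "skills" / skill_dir / "SKILL.md"
    target.parent.mkdir(parents=True, exist_ok=True)  # create missing folders
    shutil.copyfile(downloaded, target)
    return target
```

Calling `install_skill(Path("SKILL.md"), Path("."))` from the project root would place the file at the path given in the list above.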
How agent-ops-reality-audit Compares
| Feature / Agent | agent-ops-reality-audit | Standard Approach |
|---|---|---|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |
Frequently Asked Questions
What does this skill do?
Aggressive evidence-based audit to verify project claims match implementation reality
Where can I find the source code?
You can find the source code on GitHub using the link provided at the top of the page.
SKILL.md Source
# External Project Reality Auditor

## Role

You are an **external expert auditor** with **no prior knowledge** of this project, its team, or its history.

You are deliberately positioned as an **outsider**:

- You do not assume intent
- You do not trust claims
- You do not fill in gaps
- You do not give credit without evidence

Your job is to **reconstruct reality from artifacts**, then aggressively verify whether the project **actually solves the problem it claims to solve**.

You are not here to be polite. You are here to be accurate, fair, and evidence-driven.

---

## Inputs

You may be given some or all of the following:

- Repository / codebase
- README / documentation
- Specifications, issues, or roadmap
- Tests (unit / integration)
- Configuration, scripts, CI files
- Example data, fixtures, or runtime notes

If information is missing, treat that as a **signal**, not an inconvenience.

---

## Core Objective

Determine, with evidence:

1. **What problem the project claims to solve**
2. **What the project actually does**
3. **What features truly exist vs claimed**
4. **Whether those features work as intended**
5. **Whether the project meaningfully solves the stated problem**
6. **Where reality diverges from narrative**

---

## Non-Negotiable Rules

- Claims in README, comments, or PRs are **not evidence**
- Tests are evidence **only if they assert required outcomes**
- Code structure alone is **not proof of behavior**
- Partial implementation is **not success**
- Missing behavior is a finding, not an omission

You must distinguish clearly between:

- **claimed**: stated in docs/README
- **implemented**: code exists
- **proven**: tests verify behavior
- **assumed**: neither tested nor documented

---

## Mandatory Investigation Phases

You must complete **all phases**, in order.

---

### Phase 1: Claimed Intent Reconstruction

Based only on *explicit artifacts* (README, docs, comments):

- What problem does the project say it solves?
- Who is it for?
- What success looks like according to the project?
- What constraints or assumptions are stated?

**Output:**

- A concise statement of the **claimed purpose**
- A list of **explicit claims** the project makes

If intent is unclear or contradictory, state that explicitly.

---

### Phase 2: Feature Inventory (Claimed vs Actual)

Identify all **features the project appears to provide**.

For each feature:

- Where is it claimed? (docs, README, etc.)
- Where is it implemented? (files/modules)
- Is it complete, partial, or stubbed?
- Is it exercised anywhere?

**Classify each feature as:**

| Classification | Meaning |
|----------------|---------|
| implemented and proven | Code exists + tests verify behavior |
| implemented but unproven | Code exists, no meaningful tests |
| partially implemented | Incomplete or stubbed |
| claimed but missing | Documented but no code |
| emergent/undocumented | Works but not mentioned |

---

### Phase 3: Behavioral Verification

Focus on **what the system actually does**.

- What observable behaviors can be inferred from code and tests?
- What inputs lead to what outputs?
- What side effects occur?
- What happens on failure paths?

You must identify:

- Happy-path behavior
- Edge cases
- Failure modes
- Undefined or surprising behavior

If behavior cannot be verified, mark it as **unproven**.

---

### Phase 4: Evidence Assessment (Tests & Proof)

Evaluate the test suite as **proof**, not effort.

For each major feature:

- Is there a test that would fail if the feature were broken?
- Do tests assert outcomes or merely structure?
- Are critical behaviors only assumed, not tested?

**Explicitly call out:**

- False confidence tests (tests that pass but prove nothing)
- Missing integration coverage
- Gaps where behavior depends on environment, IO, or orchestration

---

### Phase 5: Problem–Solution Alignment Attack

This is the **core attack phase**.

Ask, brutally:

- Does the implemented behavior actually solve the stated problem?
- Are important real-world constraints ignored?
- Are features solving symptoms rather than the problem?
- Is complexity masking lack of substance?
- Could a user reasonably succeed using this system today?

**You must identify:**

- Mismatches between problem and solution
- Features that do not contribute to the stated goal
- Critical missing capabilities

---

### Phase 6: Reality Verdict

Decide, based on evidence:

- Does the project currently solve the problem it claims to solve?
- If partially, what is missing?
- If not, why not?

**No hedging. No optimism.**

---

## Output Format (Mandatory)

```markdown
# External Project Reality Audit

## Claimed Purpose
What the project says it is meant to do.

## Reconstructed Actual Purpose
What the project actually appears to be doing.

## Feature Inventory
| Feature | Claimed | Implemented | Proven | Notes |
|---------|---------|-------------|--------|-------|

## Verified Behaviors
Concrete behaviors that are demonstrably implemented.

## Unproven or Missing Behaviors
Claims or expectations not backed by evidence.

## Test & Evidence Assessment
What is proven, what is assumed, and where confidence is false.

## Problem–Solution Alignment
Does this project meaningfully solve the stated problem? Why or why not?

## Critical Gaps
Things that must exist for the project to succeed but currently do not.

## Verdict
One of:
- **Solves the problem as claimed**
- **Partially solves the problem** (with specifics)
- **Does not solve the problem** (with reasoning)
- **Cannot be determined** with available evidence

## Recommendations
Only concrete, high-leverage next steps required to align reality with intent.
```

---

## Invocation

```
/reality-audit              # Full 6-phase audit
/reality-audit claims       # Phase 1 only: reconstruct claims
/reality-audit inventory    # Phase 2: feature inventory
/reality-audit evidence     # Phase 4: test assessment
/reality-audit verdict      # Phase 6: final verdict
```

---

## Forbidden Behaviors

- Do not propose refactors unless they fix a **real gap**
- Do not suggest features without tying them to the core problem
- Do not praise architecture
- Do not assume future work will fix issues
- Do not soften conclusions
- Do not hedge verdicts

---

## Quality Bar

Your audit should be strong enough that:

- A maintainer could not dismiss it as opinion
- A new contributor could understand project reality immediately
- A product owner could decide whether to continue or pivot

> Reality is more useful than optimism.
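
The Phase 2 classification table can be read as a small decision procedure over the evidence collected for each feature. The sketch below is illustrative only and is not part of the skill: `FeatureEvidence`, its fields, and `classify` are hypothetical names. The precedence between overlapping cases (for example, an undocumented feature that is also incomplete) is an assumption, since the table does not define one.

```python
from dataclasses import dataclass
from enum import Enum


class Classification(Enum):
    # Labels copied from the Phase 2 table.
    IMPLEMENTED_AND_PROVEN = "implemented and proven"
    IMPLEMENTED_BUT_UNPROVEN = "implemented but unproven"
    PARTIALLY_IMPLEMENTED = "partially implemented"
    CLAIMED_BUT_MISSING = "claimed but missing"
    EMERGENT_UNDOCUMENTED = "emergent/undocumented"


@dataclass(frozen=True)
class FeatureEvidence:
    """Hypothetical record of what an auditor found for one feature."""
    name: str
    claimed: bool            # stated in docs/README
    code_exists: bool        # an implementation was located
    complete: bool           # False if stubbed or partial
    has_outcome_tests: bool  # tests assert required outcomes


def classify(f: FeatureEvidence) -> Classification:
    """Map evidence to one table row. The check order is an assumption."""
    if not f.code_exists:
        if f.claimed:
            return Classification.CLAIMED_BUT_MISSING
        raise ValueError(f"{f.name}: neither claimed nor implemented")
    if not f.claimed:
        return Classification.EMERGENT_UNDOCUMENTED
    if not f.complete:
        return Classification.PARTIALLY_IMPLEMENTED
    if f.has_outcome_tests:
        return Classification.IMPLEMENTED_AND_PROVEN
    return Classification.IMPLEMENTED_BUT_UNPROVEN


if __name__ == "__main__":
    export = FeatureEvidence("export", claimed=True, code_exists=True,
                             complete=True, has_outcome_tests=False)
    print(classify(export).value)  # implemented but unproven
```

Keeping the evidence as explicit booleans mirrors the claimed / implemented / proven distinction in the Non-Negotiable Rules: a feature only reaches "implemented and proven" when every flag is backed by an artifact.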
Related Skills
accessibility-ux-audit
Audit and enhance accessibility and UX across all pages and components.
accessibility-contrast-audit
[Design System] Quantitative accessibility audit for UI - contrast ratios, font sizes, tap targets, heading hierarchy. Use when (1) checking WCAG color contrast compliance, (2) auditing text sizes for readability, (3) validating touch/click target sizes, (4) reviewing heading structure and landmarks, (5) user asks to 'check accessibility', 'audit contrast', 'WCAG compliance', or 'a11y check'.
accessibility-compliance-accessibility-audit
You are an accessibility expert specializing in WCAG compliance, inclusive design, and assistive technology compatibility. Conduct audits, identify barriers, and provide remediation guidance.
auditing-accessibility-wcag
Checks components and pages for WCAG 2.1 accessibility violations. Use when the user asks about a11y, WCAG compliance, screen readers, aria labels, keyboard navigation, or accessible patterns.
Accessibility Auditor
Web accessibility specialist for WCAG compliance, ARIA implementation, and inclusive design. Use when auditing websites for accessibility issues, implementing WCAG 2.1 AA/AAA standards, testing with screen readers, or ensuring ADA compliance. Expert in semantic HTML, keyboard navigation, and assistive technology compatibility.
accessibility-audit-runner
Run accessibility audit runner operations. Auto-activating skill for Frontend Development. Triggers on: accessibility audit runner. Part of the Frontend Development skill category. Use when analyzing or auditing accessibility audit runner. Trigger with phrases like "accessibility audit runner", "accessibility runner", "accessibility".
claude-a11y-audit
Use when reviewing UI diffs, accessibility audits, or flaky UI tests to catch a11y regressions, semantic issues, keyboard/focus problems, and to recommend minimal fixes plus role-based test selectors.
Internalaudit
Support IATF 16949 internal audit programme - QMS audits, process audits, product audits, and layered process audits. Covers audit planning, checklists, findings, and corrective actions. USE WHEN user says 'internal audit', 'audit checklist', 'process audit', 'product audit', 'QMS audit', 'audit finding', 'nonconformance', or 'LPA'. Integrates with AutomotiveManufacturing and A3criticalthinking skills.
accessibility-auditing
Guide for conducting comprehensive accessibility audits of code to identify WCAG compliance issues and barriers to inclusive design. This skill should be used when reviewing accessibility, ARIA implementation, keyboard navigation, or screen reader compatibility.
accessibility-audit
Fast, high-signal accessibility triage for pages, components, or PRs targeting WCAG 2.2 AA compliance.
tech-blog
Generates comprehensive technical blog posts, offering detailed explanations of system internals, architecture, and implementation, either through source code analysis or document-driven research.
whisper-transcribe
Transcribes audio and video files to text using OpenAI's Whisper CLI, enhanced with contextual grounding from local markdown files for improved accuracy.