Codex

research-quality-audit

Audit research corpus for shallow stubs, incomplete sections, missing source files, and doc depth issues. Detects docs written from abstracts rather than full papers and optionally auto-dispatches expansion agents.

104 stars

by jmagly

Best use case

research-quality-audit is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

It is a strong fit for teams already working in Codex.

Teams using research-quality-audit should expect more consistent output, faster repeated execution, and less prompt rewriting.

When to use this skill

You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

You only need a quick one-off answer and do not need a reusable workflow.
You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$ curl -o ~/.claude/skills/research-quality-audit/SKILL.md --create-dirs "https://raw.githubusercontent.com/jmagly/aiwg/main/.agents/skills/research-quality-audit/SKILL.md"

Manual Installation

1. Download SKILL.md from GitHub.
2. Place it in .claude/skills/research-quality-audit/SKILL.md inside your project.
3. Restart your AI agent; it will auto-discover the skill.

How research-quality-audit Compares

Feature / Agent	research-quality-audit	Standard Approach
Platform Support	Codex	Limited / Varies
Context Awareness	High	Baseline
Installation Complexity	Unknown	N/A

Frequently Asked Questions

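What does this skill do?

It audits a research corpus for shallow stubs, incomplete sections, and missing source files, flags docs written from abstracts rather than full papers, and can optionally dispatch expansion agents to deepen them.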

Which AI agents support this skill?

This skill is designed for Codex.

Where can I find the source code?

You can find the source code on GitHub using the link provided at the top of the page.

SKILL.md Source

# Research Quality Audit

Audit the research corpus for shallow stubs, incomplete documentation, and missing source files. Detects analysis docs written from abstracts alone (the root cause of the 88-stub incident) and reports doc depth metrics across the corpus.

## Triggers

- "audit research quality"
- "check for stubs"
- "find shallow docs"
- "research quality audit"
- "how deep are the analysis docs?"
- `/research-quality-audit`

## Parameters

### `--range REF-XXX:YYY` (optional)
Audit a specific range of REF identifiers. Default: entire corpus.
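
As an illustration, a range value such as `REF-253:372` could be parsed as follows. This is a minimal sketch: the function names `parse_ref_range` and `in_range` are hypothetical, and treating both ends of the range as inclusive is an assumption, since the skill does not say.

```python
import re

# Matches the documented form REF-XXX:YYY, e.g. "REF-253:372".
RANGE_PATTERN = re.compile(r"^REF-(\d+):(\d+)$")


def parse_ref_range(spec: str) -> tuple[int, int]:
    """Parse 'REF-253:372' into a (start, end) pair (assumed inclusive)."""
    match = RANGE_PATTERN.match(spec)
    if match is None:
        raise ValueError(f"expected REF-XXX:YYY, got {spec!r}")
    start, end = int(match.group(1)), int(match.group(2))
    if start > end:
        raise ValueError(f"range start {start} exceeds end {end}")
    return start, end


def in_range(ref_id: str, bounds: tuple[int, int]) -> bool:
    """Return True when an identifier such as 'REF-255' falls inside bounds."""
    number = int(ref_id.removeprefix("REF-"))
    return bounds[0] <= number <= bounds[1]
```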

### `--fix` (optional)
Auto-dispatch expansion agents to deepen stubs. Each stub gets a focused agent that reads the full PDF/source and rewrites the analysis doc.

### `--threshold N` (optional)
Minimum content-line count for a doc to be considered non-stub. Default: 80.

### `--format` (optional)
Output format: `full` (default), `summary`, or `json`.

### `--pdf-check` (optional)
Also verify that each REF has an actual PDF or source file, not just metadata.

## Execution Flow

### Phase 1: Corpus Scan

1. **Glob** all finding docs: `.aiwg/research/findings/REF-*.md`
   (and/or `documentation/references/REF-*.md` depending on corpus layout)
2. For each doc, collect:
   - **Line count** (total lines)
   - **Content lines** (non-empty, non-frontmatter, non-heading lines)
   - **Section count** (number of `##` headings)
   - **Key quote count** (blockquotes or inline quotes)
   - **Source availability** — does the PDF exist at the referenced `pdf_location`?
   - **Full text available** — does `sources/text/REF-XXX.txt` exist?
   - **Frontmatter completeness** — required fields present?
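
The text-based metrics above can be sketched in a few lines of Python. This is an illustrative sketch, not the skill's implementation: `DocMetrics` and `collect_metrics` are hypothetical names, frontmatter is assumed to be a leading block delimited by `---` lines, and only blockquote lines are counted as quotes (inline quotes are ignored). Lines starting with `#` inside fenced code would be miscounted as headings, which a fuller version would need to handle.

```python
import re
from dataclasses import dataclass


@dataclass
class DocMetrics:
    line_count: int      # total lines in the file
    content_lines: int   # non-empty, non-frontmatter, non-heading lines
    section_count: int   # number of level-2 ("##") headings
    quote_count: int     # number of blockquote lines


def collect_metrics(text: str) -> DocMetrics:
    lines = text.splitlines()

    # Skip a leading frontmatter block delimited by '---' lines (assumed layout).
    body_start = 0
    if lines and lines[0].strip() == "---":
        for i in range(1, len(lines)):
            if lines[i].strip() == "---":
                body_start = i + 1
                break
    body = lines[body_start:]

    content = [ln for ln in body if ln.strip() and not ln.lstrip().startswith("#")]
    sections = sum(1 for ln in body if re.match(r"^##\s", ln))
    quotes = sum(1 for ln in body if ln.lstrip().startswith(">"))
    return DocMetrics(len(lines), len(content), sections, quotes)
```

Source availability and frontmatter completeness depend on the corpus layout, so they are left out of this sketch.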

### Phase 2: Classification

Classify each doc into quality tiers:

| Tier | Content Lines | Sections | Quotes | Verdict |
|------|-------------|----------|--------|---------|
| **Full** | >= 150 | >= 8 | >= 3 | Comprehensive analysis |
| **Adequate** | 80-149 | >= 5 | >= 1 | Meets minimum depth |
| **Stub** | 40-79 | >= 3 | 0 | Written from abstract — needs expansion |
| **Skeleton** | < 40 | any | 0 | Placeholder only — needs full rewrite |

Additional flags:
- **No PDF**: analysis exists but source PDF is missing
- **No full text**: PDF exists but text extraction was not run
- **Abstract-only indicators**: doc mentions "abstract" but has no methodology or results sections
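
One way to read the tier table and flags is sketched below. The table does not say how to treat a doc that meets a tier's line count but misses its section or quote minimum, so this sketch assumes such a doc is demoted to the next tier down, and that anything with 40 or more content lines is at least a stub. The function names, the handling of `threshold`, and the heading keywords used for the abstract-only check are all assumptions.

```python
def classify(content_lines: int, sections: int, quotes: int, threshold: int = 80) -> str:
    """Return 'full', 'adequate', 'stub', or 'skeleton' for one doc."""
    # Assumption: a raised --threshold also raises the floor for the Full tier.
    full_floor = max(150, threshold)
    if content_lines >= full_floor and sections >= 8 and quotes >= 3:
        return "full"
    if content_lines >= threshold and sections >= 5 and quotes >= 1:
        return "adequate"
    if content_lines >= 40:
        return "stub"
    return "skeleton"


def source_flags(pdf_present: bool, text_present: bool) -> list[str]:
    """Map source availability onto the 'No PDF' / 'No full text' flags."""
    if not pdf_present:
        return ["No PDF"]
    if not text_present:
        return ["No full text"]
    return []


def abstract_only(text: str) -> bool:
    """True when the doc mentions 'abstract' but has no methodology/results heading."""
    lowered = text.lower()
    headings = [ln for ln in lowered.splitlines() if ln.startswith("#")]
    has_depth = any("method" in h or "result" in h for h in headings)
    return "abstract" in lowered and not has_depth
```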

### Phase 3: Report

```
Research Quality Audit
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

Corpus: 372 documents
Threshold: 80 content lines

Quality Distribution:
  Full (150+):      124 (33%)  ████████████████░░░░░░░░░░
  Adequate (80-149): 89 (24%)  ████████████░░░░░░░░░░░░░░
  Stub (40-79):      98 (26%)  █████████████░░░░░░░░░░░░░
  Skeleton (<40):    61 (16%)  ████████░░░░░░░░░░░░░░░░░░

Statistics:
  Mean content lines:  112
  Median:              94
  Min:                 12 (REF-299)
  Max:                 591 (REF-018)

Source Availability:
  PDF present:         348 / 372 (94%)
  Full text extracted:  201 / 372 (54%)
  Missing PDF:          24 papers
  Missing text:        171 papers

Stubs Requiring Expansion (159):
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

  REF-253  22 lines  skeleton  No PDF    "Agentic Design Patterns"
  REF-254  35 lines  skeleton  Has PDF   "Multi-Agent Debate"
  REF-255  45 lines  stub      Has PDF   "Language Agent Tree Search"
  REF-256  48 lines  stub      No text   "ReAct: Synergizing Reasoning"
  ...

Top 10 Shallowest (candidates for immediate expansion):
  1. REF-299  12 lines  skeleton  "Toolformer: Language Models Can..."
  2. REF-312  15 lines  skeleton  "WebArena: A Realistic Web..."
  3. REF-253  22 lines  skeleton  "Agentic Design Patterns..."
  ...
```
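
The distribution figures in the sample report follow directly from the tier counts. The sketch below shows that arithmetic; the function names are hypothetical, and counting both stubs and skeletons toward the expansion backlog is inferred from the sample (98 + 61 = 159).

```python
def quality_distribution(tier_counts: dict[str, int]) -> dict[str, int]:
    """Convert per-tier counts into whole-number percentages of the corpus."""
    total = sum(tier_counts.values())
    if total == 0:
        raise ValueError("corpus is empty")
    return {tier: round(100 * n / total) for tier, n in tier_counts.items()}


def expansion_backlog(tier_counts: dict[str, int]) -> int:
    """Number of docs listed under 'Stubs Requiring Expansion'."""
    return tier_counts.get("stub", 0) + tier_counts.get("skeleton", 0)


def shallowest(content_lines_by_ref: dict[str, int], n: int = 10) -> list[str]:
    """REF identifiers ordered from fewest content lines to most, top n only."""
    return sorted(content_lines_by_ref, key=content_lines_by_ref.get)[:n]
```

With the sample counts (124, 89, 98, 61), the rounded percentages sum to 99 rather than 100, which is expected when each tier is rounded independently.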

### Phase 4: Auto-Fix (if --fix)

When `--fix` is specified:

1. **Filter fixable stubs** — only expand docs that have a PDF or full text available
2. **Batch by priority** — shallowest docs first, batch into groups of 10
3. **Dispatch expansion agents** — each agent:
   - Reads the full PDF/extracted text for the source
   - Rewrites the analysis doc with comprehensive content
   - Target: 150+ content lines with methodology, findings, limitations, key quotes
4. **Re-audit after expansion** — run Phase 1-3 again to verify improvements
5. **Report** — docs expanded, mean line improvement, remaining stubs

```
Auto-Fix Results:
  Dispatched: 10 expansion agents (batch 1 of 16)
  Expanded: 10 / 10
  Mean improvement: 77 → 161 lines (+109%)
  Remaining stubs: 149

  Run again with --fix to process next batch.
```
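
The filtering and batching steps of Phase 4 can be sketched as follows. This is a sketch under stated assumptions: `AuditedDoc` and `plan_batches` are hypothetical names, and dispatching the expansion agents themselves is out of scope here.

```python
from dataclasses import dataclass


@dataclass
class AuditedDoc:
    ref: str
    content_lines: int
    tier: str            # "full", "adequate", "stub", or "skeleton"
    pdf_present: bool
    text_present: bool


def plan_batches(docs: list[AuditedDoc], batch_size: int = 10) -> list[list[AuditedDoc]]:
    """Select fixable stubs, order shallowest first, and split into batches."""
    fixable = [
        d for d in docs
        if d.tier in ("stub", "skeleton") and (d.pdf_present or d.text_present)
    ]
    fixable.sort(key=lambda d: d.content_lines)
    return [fixable[i:i + batch_size] for i in range(0, len(fixable), batch_size)]
```

A doc with neither a PDF nor extracted text is left out of every batch, which matches the rule that only docs with an available source are expanded.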

## Integration Points

| Component | Relationship |
|-----------|-------------|
| `induct-research` | Quality audit should auto-run after batch induction |
| `corpus-snapshot` | Gates on stub rate > 10% (#814) |
| `research-lint` | `ref-frontmatter` rule catches incomplete metadata; quality-audit catches shallow content |
| `research-status` | Doc depth is a component of corpus health scoring |
| `research-acquire` | For stubs with missing PDFs, triggers acquisition before expansion |

## Distinction from Other Tools

| Tool | What it checks |
|------|---------------|
| `research-lint` | **Structural** — frontmatter fields, naming, references resolve |
| `research-quality-audit` | **Depth** — is the content substantive? Was the source actually read? |
| `research-quality` | **Evidence** — GRADE assessment of the source's research quality |
| `corpus-health` | **Aggregate** — overall corpus metrics including depth, structure, coverage |

## Examples

```bash
# Full corpus audit
/research-quality-audit

# Audit specific range
/research-quality-audit --range REF-253:372

# Auto-expand stubs (batch of 10)
/research-quality-audit --fix

# Strict threshold (120 lines minimum)
/research-quality-audit --threshold 120

# Check source file availability
/research-quality-audit --pdf-check

# JSON for programmatic use
/research-quality-audit --format json
```

## References

- @$AIWG_ROOT/agentic/code/frameworks/research-complete/skills/induct-research/SKILL.md — Source of stubs when acquisition is skipped
- @$AIWG_ROOT/agentic/code/frameworks/research-complete/skills/research-acquire/SKILL.md — Acquires PDFs for stub expansion
- @$AIWG_ROOT/agentic/code/frameworks/research-complete/skills/research-lint/SKILL.md — Structural validation (complementary)
- @$AIWG_ROOT/agentic/code/frameworks/research-complete/skills/research-quality/SKILL.md — GRADE evidence assessment (complementary)
- @$AIWG_ROOT/agentic/code/frameworks/research-complete/skills/research-status/SKILL.md — Health scoring includes depth metrics
