research-gap-detect
Build the mutual citation graph, find connected components, identify isolated clusters, and optionally search for bridge candidates and file gap issues. Automates the manual cluster analysis workflow.
Best use case
research-gap-detect is best used when you need a repeatable AI agent workflow instead of a one-off prompt.
It is a strong fit for teams already working in Codex.
Teams using research-gap-detect can expect more consistent output, faster repeated execution, and less prompt rewriting.
When to use this skill
- You want a reusable workflow that can be run more than once with consistent structure.
When not to use this skill
- You only need a quick one-off answer and do not need a reusable workflow.
- You cannot install or maintain the underlying files, dependencies, or repository context.
Installation
Claude Code / Cursor / Codex
Manual Installation
- Download SKILL.md from GitHub
- Place it in `.claude/skills/research-gap-detect/SKILL.md` inside your project
- Restart your AI agent; it will auto-discover the skill
How research-gap-detect Compares
| Feature / Agent | research-gap-detect | Standard Approach |
|---|---|---|
| Platform Support | Codex | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |
Frequently Asked Questions
Which AI agents support this skill?
This skill is designed for Codex.
Where can I find the source code?
You can find the source code on GitHub using the link provided at the top of the page.
SKILL.md Source
# Research Gap Detect
Analyze the research corpus citation graph to find disconnected clusters, isolated papers, and gap opportunities. Optionally searches for bridge paper candidates and files gap issues.
## Triggers
- "find research gaps"
- "detect clusters"
- "cluster analysis"
- "find isolated papers"
- "bridge candidate search"
- `/research-gap-detect`
## Parameters
### `--clusters-only` (optional)
Only run cluster detection — skip bridge search and issue filing.
### `--file-issues` (optional)
Auto-file gap issues for each disconnected cluster pair.
### `--search-bridges` (optional)
Search external databases for papers that could bridge disconnected clusters.
### `--min-cluster-size N` (optional)
Minimum papers in a cluster to report. Default: 2.
### `--format` (optional)
Output format: `full` (default), `summary`, or `json`.
## Execution Flow
### Phase 1: Build Citation Graph
1. Read the citation-network index (from `/corpus-index-build --graph citation-network`)
   - If stale or missing: run `/corpus-index-build --graph citation-network` first
2. Build an adjacency list from outgoing + incoming edges
3. Treat as undirected for cluster detection (A cites B ≡ A connected to B)
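Steps 2 and 3 can be sketched as follows. This is a minimal sketch, assuming the citation-network index can be reduced to a list of paper ids plus `(citing, cited)` pairs; the index's real format is not specified here, and `build_undirected_adjacency` is a hypothetical helper name.

```python
def build_undirected_adjacency(nodes, edges):
    """Build an undirected adjacency map from directed citation pairs.

    nodes: iterable of paper ids (so papers without citations still appear).
    edges: iterable of (citing, cited) pairs.
    """
    adjacency = {node: set() for node in nodes}
    for citing, cited in edges:
        adjacency.setdefault(citing, set())
        adjacency.setdefault(cited, set())
        if citing != cited:  # ignore self-citations
            # "A cites B" and "B is cited by A" become one undirected link.
            adjacency[citing].add(cited)
            adjacency[cited].add(citing)
    return adjacency


# Hypothetical toy data, not taken from a real corpus.
nodes = ["REF-001", "REF-016", "REF-024", "REF-299"]
edges = [("REF-001", "REF-016"), ("REF-024", "REF-016")]
graph = build_undirected_adjacency(nodes, edges)
print(sorted(graph["REF-016"]))
```

Using sets means an edge recorded in both the outgoing and incoming lists is stored only once per pair.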
### Phase 2: Connected Components (BFS)
Run BFS/connected-components on the undirected citation graph:
1. Initialize: all nodes unvisited
2. For each unvisited node: BFS to find its connected component
3. Collect components sorted by size (largest first)
**Output**:
```
Connected Components: 9

Cluster 1: "Agentic Workflows" (124 papers)
  Hub: REF-016 (34 connections)
  Topics: agentic-workflows, multi-agent, orchestration
  Sample: REF-001, REF-016, REF-024, REF-121 ...

Cluster 2: "GUI Agents" (31 papers)
  Hub: REF-198 (12 connections)
  Topics: gui-agents, web-agents, screen-understanding
  Sample: REF-198, REF-201, REF-215 ...

...

Cluster 9: "Isolated" (3 papers)
  No hub (all degree 1)
  REF-299, REF-312, REF-350
```
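The BFS steps of this phase, plus the hub line shown in the sample output, could be sketched as below. This is an illustration under assumptions, not the skill's actual implementation: the adjacency map is a plain dict of sets, `connected_components` and `find_hub` are hypothetical names, and "no hub" is assumed to mean that every paper in the cluster has degree 1 or less.

```python
from collections import deque


def connected_components(adjacency):
    """Return components as sorted lists of node ids, largest first."""
    visited = set()
    components = []
    for start in adjacency:
        if start in visited:
            continue
        # A BFS from an unvisited node collects exactly one component.
        visited.add(start)
        queue = deque([start])
        component = []
        while queue:
            node = queue.popleft()
            component.append(node)
            for neighbor in adjacency.get(node, ()):
                if neighbor not in visited:
                    visited.add(neighbor)
                    queue.append(neighbor)
        components.append(sorted(component))
    components.sort(key=len, reverse=True)
    return components


def find_hub(component, adjacency):
    """Return (node, degree) for the best-connected paper, or None.

    Assumption: a cluster has no hub when every paper has degree <= 1.
    """
    if not component:
        return None
    # Sorting first makes tie-breaking deterministic (smallest id wins).
    hub = max(sorted(component), key=lambda n: len(adjacency.get(n, ())))
    degree = len(adjacency.get(hub, ()))
    return (hub, degree) if degree > 1 else None


# Hypothetical toy graph, not taken from a real corpus.
adjacency = {
    "REF-001": {"REF-016"},
    "REF-016": {"REF-001", "REF-024"},
    "REF-024": {"REF-016"},
    "REF-198": {"REF-201"},
    "REF-201": {"REF-198"},
}
for component in connected_components(adjacency):
    print(len(component), component, find_hub(component, adjacency))
```

Applying `--min-cluster-size N` would then amount to keeping only the components with at least `N` entries.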
### Phase 3: Gap Analysis
For each pair of clusters, assess the gap:
1. **Topic overlap** — do the clusters share any tags?
2. **Temporal overlap** — do they cover the same years?
3. **Author overlap** — do any authors appear in both clusters?
4. **Bridgeability** — could a single paper connect them?
Prioritize gaps by:
- **Size product** — larger clusters disconnected = higher priority
- **Topic proximity** — clusters with related but not identical topics
- **Recency** — newer clusters may simply be missing recent cross-citations
**Output**:
```
Gap Analysis: 12 cluster pairs

Priority 1: "Agentic Workflows" ↔ "GUI Agents"
  Gap: 124 × 31 = 3,844 (size product)
  Topic overlap: agent, llm (2 shared tags)
  Bridge opportunity: HIGH
  Suggested search: "LLM agent GUI interaction orchestration"

Priority 2: "Evaluation" ↔ "Reproducibility"
  Gap: 45 × 28 = 1,260
  Topic overlap: evaluation, benchmark (2 shared tags)
  Bridge opportunity: MEDIUM
  Suggested search: "reproducible LLM evaluation benchmarks"

...
```
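The size-product and topic-overlap figures in this phase can be computed with a few lines. The sketch below assumes each cluster is summarized as a size and a set of tags; the `rank_gaps` name, the input shape, and the tie-breaking rule (more shared tags first) are illustrative assumptions, and the HIGH/MEDIUM/LOW thresholds are left out because the source does not define them.

```python
from itertools import combinations


def rank_gaps(clusters):
    """Rank cluster pairs by size product, then by number of shared tags.

    clusters: dict mapping cluster name -> {"size": int, "tags": set of str}.
    """
    gaps = []
    for (name_a, a), (name_b, b) in combinations(clusters.items(), 2):
        gaps.append({
            "pair": (name_a, name_b),
            "size_product": a["size"] * b["size"],
            "shared_tags": sorted(a["tags"] & b["tags"]),
        })
    gaps.sort(key=lambda g: (g["size_product"], len(g["shared_tags"])), reverse=True)
    return gaps


# Sizes follow the sample output; the tag sets are hypothetical.
clusters = {
    "Agentic Workflows": {"size": 124, "tags": {"agent", "llm", "orchestration"}},
    "GUI Agents": {"size": 31, "tags": {"agent", "llm", "gui"}},
    "Isolated": {"size": 3, "tags": {"misc"}},
}
for gap in rank_gaps(clusters):
    print(gap["pair"], gap["size_product"], gap["shared_tags"])
```

The shared tags of a pair are also a natural seed for the suggested search query.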
### Phase 4: Bridge Search (if --search-bridges)
For each high-priority gap:
1. Generate search queries from cluster topic overlap
2. Search external databases (Semantic Scholar, arXiv, Google Scholar)
3. Filter candidates by:
   - Cites papers from BOTH clusters
   - Published in overlapping time range
   - High citation count (likely to be connecting work)
4. Rank candidates by bridge potential
**Output**:
```
Bridge Candidates Found: 8

For gap "Agentic Workflows" ↔ "GUI Agents":
  1. "WebAgent: World-Centric Web Navigation" (2024)
     Cites: REF-016 (Cluster 1), REF-198 (Cluster 2)
     Citations: 87
     Bridge potential: HIGH

  2. "Agent-E: Vision-Language Planning for Web Tasks" (2024)
     Cites: REF-024 (Cluster 1), REF-201 (Cluster 2)
     Citations: 45
     Bridge potential: MEDIUM
```
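The filtering and ranking in steps 3 and 4 might look like the sketch below. It assumes the search results have already been fetched and normalized into dicts with `title`, `year`, `cites`, and `citations` keys; that record shape and the `rank_bridge_candidates` name are assumptions for illustration, and no external database API is called. Ranking by citation count alone is also a simplification of "bridge potential".

```python
def rank_bridge_candidates(candidates, cluster_a, cluster_b, year_range=None):
    """Keep candidates that cite both clusters, ranked by citation count.

    candidates: list of dicts with "title", "year", "cites" (set of paper ids),
                and "citations" (int). This record shape is an assumption.
    cluster_a, cluster_b: sets of paper ids in the two disconnected clusters.
    year_range: optional (first_year, last_year) overlap window, inclusive.
    """
    bridges = []
    for paper in candidates:
        cites_a = paper["cites"] & cluster_a
        cites_b = paper["cites"] & cluster_b
        if not (cites_a and cites_b):
            continue  # must cite papers from BOTH clusters
        if year_range and not (year_range[0] <= paper["year"] <= year_range[1]):
            continue  # outside the overlapping time range
        bridges.append({**paper, "cites_a": sorted(cites_a), "cites_b": sorted(cites_b)})
    bridges.sort(key=lambda p: p["citations"], reverse=True)
    return bridges


# Hypothetical candidates; titles and counts are placeholders.
cluster_a = {"REF-016", "REF-024"}
cluster_b = {"REF-198", "REF-201"}
candidates = [
    {"title": "Candidate B", "year": 2024, "cites": {"REF-024", "REF-201"}, "citations": 45},
    {"title": "Candidate A", "year": 2024, "cites": {"REF-016", "REF-198"}, "citations": 87},
    {"title": "Candidate C", "year": 2024, "cites": {"REF-016"}, "citations": 300},
]
for paper in rank_bridge_candidates(candidates, cluster_a, cluster_b, (2023, 2024)):
    print(paper["title"], paper["cites_a"], paper["cites_b"], paper["citations"])
```

Candidate C is dropped despite its high citation count because it cites only one of the two clusters.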
### Phase 5: File Issues (if --file-issues)
For each gap with bridge candidates, file a research induction issue:
```markdown
## Research Gap: [Cluster A] ↔ [Cluster B]
**Gap Size**: [N × M papers disconnected]
**Bridge Candidates**: [list]
**Suggested Action**: Induct [top candidate] to connect clusters
### Bridge Papers to Induct
- [ ] "WebAgent: World-Centric Web Navigation" — arxiv:2401.XXXXX
- [ ] "Agent-E: Vision-Language Planning" — arxiv:2403.XXXXX
```
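Filling the template above is plain string formatting. The sketch below is one possible rendering, assuming candidates arrive as `(title, identifier)` tuples with the best candidate first; `render_gap_issue` is a hypothetical helper, and how the issue is actually submitted to a tracker is not covered.

```python
EM_DASH = "\u2014"  # separator used in the template's checklist lines


def render_gap_issue(cluster_a, size_a, cluster_b, size_b, candidates):
    """Fill the gap-issue template for one pair of clusters.

    candidates: list of (title, identifier) tuples, best candidate first.
    """
    if not candidates:
        # Phase 5 only files issues for gaps that have bridge candidates.
        raise ValueError("no bridge candidates for this gap")
    top_title = candidates[0][0]
    lines = [
        f"## Research Gap: {cluster_a} ↔ {cluster_b}",
        f"**Gap Size**: {size_a} × {size_b} papers disconnected",
        "**Bridge Candidates**: " + ", ".join(f'"{title}"' for title, _ in candidates),
        f'**Suggested Action**: Induct "{top_title}" to connect clusters',
        "### Bridge Papers to Induct",
    ]
    lines += [f'- [ ] "{title}" {EM_DASH} {identifier}' for title, identifier in candidates]
    return "\n".join(lines)


# Placeholder titles and identifiers, for illustration only.
body = render_gap_issue(
    "Agentic Workflows", 124, "GUI Agents", 31,
    [("Candidate A", "id-a"), ("Candidate B", "id-b")],
)
print(body)
```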
### Phase 6: Report
```
Research Gap Detection
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Graph: 372 nodes, 1,247 edges
Connected components: 9
  Largest cluster: 124 papers ("Agentic Workflows")
  Isolated papers: 3

Gap analysis: 12 cluster pairs
  HIGH priority: 4 (bridge candidates available)
  MEDIUM priority: 5
  LOW priority: 3

Bridge candidates found: 8 papers
Issues filed: 4
Papers recommended for induction: 8
```
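The graph statistics at the top of the report can be derived directly from the adjacency map and the component list. The sketch below assumes each undirected edge is counted once; whether the skill counts directed or undirected edges is not stated, and `summarize_graph` is a hypothetical name.

```python
def summarize_graph(adjacency, components):
    """Compute the headline numbers for the report.

    adjacency: dict mapping paper id -> set of neighbor ids (undirected).
    components: list of components, each a list of paper ids.
    """
    # Each undirected edge appears in two neighbor sets, so halve the total.
    edge_count = sum(len(neighbors) for neighbors in adjacency.values()) // 2
    return {
        "nodes": len(adjacency),
        "edges": edge_count,
        "components": len(components),
        "largest_cluster": max((len(c) for c in components), default=0),
    }


# Hypothetical toy graph with two components.
adjacency = {"A": {"B"}, "B": {"A", "C"}, "C": {"B"}, "D": set()}
components = [["A", "B", "C"], ["D"]]
stats = summarize_graph(adjacency, components)
print(f"Graph: {stats['nodes']:,} nodes, {stats['edges']:,} edges")
print(f"Connected components: {stats['components']}")
```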
## Distinction from research-gap
| Tool | Approach | Output |
|------|----------|--------|
| `research-gap` | **Intellectual** — topic coverage, missing areas, GRADE gaps | Gap report with search queries |
| `research-gap-detect` | **Structural** — citation graph topology, disconnected components | Cluster map, bridge candidates, filed issues |
`research-gap` answers "what topics are we missing?" while `research-gap-detect` answers "which existing papers don't cite each other but should?"
## Examples
```bash
# Full analysis with bridge search
/research-gap-detect --search-bridges
# Just show clusters
/research-gap-detect --clusters-only
# Detect and auto-file issues
/research-gap-detect --file-issues
# Combined: search + file
/research-gap-detect --search-bridges --file-issues
# JSON for visualization
/research-gap-detect --format json
```
## References
- @$AIWG_ROOT/agentic/code/frameworks/research-complete/skills/corpus-index-build/SKILL.md — Builds the citation-network graph
- @$AIWG_ROOT/agentic/code/frameworks/research-complete/skills/citation-backfill/SKILL.md — Prerequisite: complete bidirectional edges
- @$AIWG_ROOT/agentic/code/frameworks/research-complete/skills/research-gap/SKILL.md — Complementary intellectual gap analysis
- @$AIWG_ROOT/agentic/code/frameworks/research-complete/skills/induct-research/SKILL.md — Inducts bridge candidates
Related Skills
research-workflow
Execute multi-stage research workflows
research-status
Show research corpus health and statistics
research-query
Search the local research corpus, read matching findings, and synthesize an answer with inline citations to REF-XXX sources. The "query" operation for the research pipeline.
research-quality
Assess source quality using GRADE methodology
research-quality-audit
Audit research corpus for shallow stubs, incomplete sections, missing source files, and doc depth issues. Detects docs written from abstracts rather than full papers and optionally auto-dispatches expansion agents.
research-provenance
Query provenance chains and artifact relationships
research-lint
Run the research corpus lint ruleset to detect structural and referential integrity issues — orphan notes, missing frontmatter, broken references, missing GRADE assessments.
research-gap
Analyze gaps in research coverage
research-document
Generate summaries and literature notes from research papers
research-discover
Search for research papers across academic databases
research-cite
Generate properly formatted citation from research corpus
research-archive
Package research artifacts for long-term archival