AI Agent Skill HUB

Codex

research-provenance

Query provenance chains and artifact relationships

104 stars

View on GitHub Installation ↓

Best use case

research-provenance is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

It is a strong fit for teams already working in Codex.

Query provenance chains and artifact relationships

Teams using research-provenance should expect a more consistent output, faster repeated execution, less prompt rewriting.

When to use this skill

You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

You only need a quick one-off answer and do not need a reusable workflow.
You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$curl -o ~/.claude/skills/research-provenance/SKILL.md --create-dirs "https://raw.githubusercontent.com/jmagly/aiwg/main/.agents/skills/research-provenance/SKILL.md"

Manual Installation

Download SKILL.md from GitHub
Place it in .claude/skills/research-provenance/SKILL.md inside your project
Restart your AI agent — it will auto-discover the skill

How research-provenance Compares

Feature / Agent	research-provenance	Standard Approach
Platform Support	Codex	Limited / Varies
Context Awareness	High	Baseline
Installation Complexity	Unknown	N/A

Frequently Asked Questions

What does this skill do?

Query provenance chains and artifact relationships

Which AI agents support this skill?

This skill is designed for Codex.

Where can I find the source code?

You can find the source code on GitHub using the link provided at the top of the page.

Related Guides

AI Agents for Startups

Explore AI agent skills for startup validation, product research, growth experiments, documentation, and fast execution with small teams.

AI Agent for YouTube Script Writing

Find AI agent skills for YouTube script writing, video research, content outlining, and repeatable channel production workflows.

AI Agent for Product Research

Browse AI agent skills for product research, competitive analysis, customer discovery, and structured product decision support.

SKILL.md Source

# Research Provenance Command

Query W3C PROV-compliant provenance chains to trace artifact derivations and relationships.

## Instructions

When invoked, perform provenance queries:

1. **Load Provenance Records**
   - If REF-XXX, load all provenance records for that paper
   - If artifact path, load records for that specific artifact
   - Build provenance graph from PROV relationships

2. **Execute Query**
   - Query types:
     - `what-derives` - What artifacts derive from this source?
     - `what-cites` - What documents cite this source?
     - `history` - Full derivation history of this artifact
     - `lineage` - Complete lineage from source to current state
     - `impact` - Impact analysis (what would be affected by changes?)

3. **Traverse Graph**
   - Follow `wasDerivedFrom` relationships
   - Follow `wasGeneratedBy` activities
   - Follow `wasAssociatedWith` agents
   - Collect all related entities, activities, and agents

4. **Format Results**
   - Visualize as tree, graph, or table
   - Show derivation chains
   - Highlight gaps or inconsistencies
   - Calculate impact metrics

5. **Report**
   - Display provenance chain
   - Show relationship types
   - Include timestamps and agents
   - Flag any provenance issues

## Arguments

- `[ref-id or path]` - Source identifier or artifact path (required)
- `--query [what-derives|what-cites|history|lineage|impact]` - Query type (default: what-derives)
- `--depth [n]` - Maximum graph traversal depth (default: 5)
- `--output [tree|graph|table|json]` - Output format (default: tree)
- `--validate` - Validate provenance chain integrity
- `--export-dot` - Export as GraphViz DOT format

## Examples

```bash
# Find what derives from a paper
/research-provenance REF-022 --query what-derives

# Find citation usage
/research-provenance REF-022 --query what-cites

# Get full history of an artifact
/research-provenance .aiwg/architecture/agent-orchestration-sad.md --query history

# Analyze impact of changes
/research-provenance REF-022 --query impact --depth 10

# Validate provenance chain
/research-provenance REF-022 --validate

# Export as graph
/research-provenance REF-022 --query lineage --export-dot
```

## Expected Output

### What Derives Query

```
/research-provenance REF-022 --query what-derives

Provenance Query: REF-022 - What Derives From This Source?
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

Source: REF-022 (AutoGen: Enabling Next-Gen LLM Applications...)

Derivation Tree:
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

REF-022.pdf (source paper)
│
├─→ REF-022-autogen.md (finding document)
│   │   Relationship: wasDerivedFrom
│   │   Activity: documentation
│   │   Agent: documentation-agent
│   │   Date: 2026-02-03T12:15:00Z
│   │
│   ├─→ REF-022-notes.md (literature notes)
│   │   Relationship: wasDerivedFrom
│   │   Activity: synthesis
│   │   Agent: documentation-agent
│   │   Date: 2026-02-03T12:20:00Z
│   │
│   └─→ UC-174-conversable-agent.md (use case)
│       Relationship: wasInformedBy
│       Activity: requirements_analysis
│       Agent: requirements-analyst
│       Date: 2026-02-03T13:00:00Z
│
├─→ REF-022-assessment.yaml (quality assessment)
│   Relationship: wasDerivedFrom
│   Activity: quality_assessment
│   Agent: quality-agent
│   Date: 2026-02-03T12:30:00Z
│
└─→ .claude/rules/conversable-agent-interface.md (implementation rule)
    Relationship: wasInformedBy
    Activity: rule_creation
    Agent: architect
    Date: 2026-02-03T14:00:00Z

Summary:
  Total derived artifacts: 5
  Derivation depth: 2 levels
  Agents involved: 4 (documentation-agent, quality-agent, requirements-analyst, architect)
  Time span: 2026-02-03 12:15 - 14:00 (1h 45m)
```

### What Cites Query

```
/research-provenance REF-022 --query what-cites

Provenance Query: REF-022 - Citation Usage
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

Source: REF-022 (AutoGen: Enabling Next-Gen LLM Applications...)

Citation Map:
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

Documents citing REF-022:

1. .aiwg/architecture/agent-orchestration-sad.md
   Lines: 78, 142, 267
   Context: Multi-agent conversation patterns
   Quality compliance: ✓ APPROPRIATE (MODERATE hedging for LOW evidence)

2. .aiwg/requirements/use-cases/UC-174-conversable-agent.md
   Lines: 23, 45
   Context: Conversable agent interface design
   Quality compliance: ✓ APPROPRIATE

3. .claude/rules/conversable-agent-interface.md
   Lines: 12, 89
   Context: Research foundation for agent protocol
   Quality compliance: ✓ APPROPRIATE

4. .claude/rules/auto-reply-chains.md
   Lines: 15, 34, 67
   Context: Auto-reply pattern implementation
   Quality compliance: ✓ APPROPRIATE

5. docs/agent-framework.md
   Lines: 78
   Context: Agent capabilities overview
   Quality compliance: ✗ VIOLATION - "Research demonstrates" too strong for LOW evidence
   Suggestion: Change to "Limited evidence suggests"

6. .aiwg/architecture/adr-012-agent-protocol.md
   Lines: 45
   Context: Protocol design rationale
   Quality compliance: ✗ VIOLATION - "Studies prove" too strong
   Suggestion: Change to "Preliminary findings indicate"

Summary:
  Total citations: 12 (across 6 documents)
  Compliant citations: 10 (83%)
  Policy violations: 2 (17%)
  Remediation needed: docs/agent-framework.md, .aiwg/architecture/adr-012-agent-protocol.md
```

### History Query

```
/research-provenance .aiwg/architecture/agent-orchestration-sad.md --query history

Provenance Query: agent-orchestration-sad.md - Derivation History
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

Artifact: .aiwg/architecture/agent-orchestration-sad.md

Derivation History (chronological):
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

2026-01-15T10:00:00Z - CREATION
  Activity: initial_architecture_design
  Agent: architect
  Based on:
    - .aiwg/requirements/use-cases/UC-001-orchestrator.md
    - .aiwg/intake/solution-profile.md

2026-01-20T14:30:00Z - REFINEMENT
  Activity: architecture_refinement
  Agent: architect
  Informed by:
    - REF-001-production-agentic.md (best practices)
    - Technical review feedback

2026-02-03T13:45:00Z - ENHANCEMENT
  Activity: research_integration
  Agent: architect
  Integrated findings from:
    - REF-022-autogen.md (conversable agent interface)
    - REF-057-agent-laboratory.md (HITL patterns)
  Changes:
    - Added conversable agent interface section
    - Enhanced HITL gate definitions
    - Updated agent communication patterns

Current State:
  Version: 3.0
  Last modified: 2026-02-03T13:45:00Z
  Size: 47 KB
  Sections: 12
  Referenced by: 8 artifacts
  Checksum: def456...

Provenance Chain:
  .aiwg/requirements/UC-001-orchestrator.md
    → .aiwg/architecture/agent-orchestration-sad.md (v1.0)
       ← REF-001-production-agentic.md
    → .aiwg/architecture/agent-orchestration-sad.md (v2.0)
       ← REF-022-autogen.md
       ← REF-057-agent-laboratory.md
    → .aiwg/architecture/agent-orchestration-sad.md (v3.0, current)
```

### Impact Analysis

```
/research-provenance REF-022 --query impact

Provenance Query: REF-022 - Impact Analysis
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

Source: REF-022 (AutoGen: Enabling Next-Gen LLM Applications...)

Impact Analysis: What would be affected by changes to REF-022?
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

Direct Dependencies (5):
  - REF-022-autogen.md (finding document) - CRITICAL
  - REF-022-assessment.yaml (quality assessment) - HIGH
  - REF-022-notes.md (literature notes) - MEDIUM
  - UC-174-conversable-agent.md (use case) - HIGH
  - .claude/rules/conversable-agent-interface.md - HIGH

Indirect Dependencies (12):
  - .aiwg/architecture/agent-orchestration-sad.md
  - src/agents/conversable-agent-interface.ts
  - test/unit/agents/conversable-agent.test.ts
  - .claude/rules/auto-reply-chains.md
  - docs/agent-framework.md
  ... (7 more)

Citation Dependencies (12 citations across 6 documents):
  - 10 citations in architecture/requirements
  - 2 citations in documentation

Implementation Dependencies (3):
  - src/agents/conversable-agent-interface.ts (implements patterns)
  - src/orchestration/conversation-manager.ts (uses patterns)
  - test/integration/multi-agent-conversation.test.ts (validates patterns)

Impact Metrics:
  Total affected artifacts: 17
  Critical dependencies: 1
  High priority dependencies: 4
  Medium priority dependencies: 3
  Citation count: 12

Risk Assessment:
  If REF-022 quality assessment changes from LOW to VERY LOW:
    - 2 citations would become violations (overclaiming)
    - 1 use case would need revision
    - 1 implementation rule would need hedging update

  If REF-022 findings are contradicted by new research:
    - 5 artifacts would require immediate review
    - 12 citations would need revalidation
    - 3 implementation patterns would need reassessment
```

## Validation

When --validate is used:

```
/research-provenance REF-022 --validate

Validating Provenance Chain: REF-022
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

Validation Checks:
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

✓ All provenance records exist
✓ All referenced artifacts exist
✓ All agents are registered
✓ All activities have timestamps
✓ All derivation chains are complete
✓ No circular dependencies detected
✓ All checksums match fixity manifest
✗ 2 citation policy violations found

Issues:
  [WARNING] Citation policy violation in docs/agent-framework.md:78
    - Hedging too strong for GRADE level
    - Recommendation: Update to "Limited evidence suggests"

  [WARNING] Citation policy violation in .aiwg/architecture/adr-012-agent-protocol.md:45
    - Hedging too strong for GRADE level
    - Recommendation: Update to "Preliminary findings indicate"

Overall Status: PASS with warnings
  Critical issues: 0
  Warnings: 2
  Info: 0

Remediation:
  Run: /research-quality REF-022 --check-citations --fix
```

## GraphViz Export

Export provenance graph for visualization:

```bash
/research-provenance REF-022 --query lineage --export-dot

Output:
  Provenance graph exported to: .aiwg/research/provenance/graphs/REF-022-lineage.dot

  To visualize:
    dot -Tpng REF-022-lineage.dot -o REF-022-lineage.png
    dot -Tsvg REF-022-lineage.dot -o REF-022-lineage.svg
```

## References

- @$AIWG_ROOT/agentic/code/frameworks/research-complete/agents/provenance-agent.md - Provenance Agent
- @$AIWG_ROOT/src/research/services/provenance-service.ts - Provenance query implementation
- @$AIWG_ROOT/agentic/code/frameworks/sdlc-complete/schemas/provenance/prov-record.yaml - PROV schema
- @.aiwg/research/provenance/README.md - Provenance tracking
- @$AIWG_ROOT/agentic/code/frameworks/sdlc-complete/rules/provenance-tracking.md - Provenance requirements
- https://www.w3.org/TR/prov-dm/ - W3C PROV-DM specification

Related Skills

Provenance Tracking

from jmagly/aiwg

W3C PROV-O patterns for tracking media derivation chains and production history

research-workflow

from jmagly/aiwg

Execute multi-stage research workflows

research-status

from jmagly/aiwg

Show research corpus health and statistics

research-query

from jmagly/aiwg

Search the local research corpus, read matching findings, and synthesize an answer with inline citations to REF-XXX sources. The "query" operation for the research pipeline.

research-quality

from jmagly/aiwg

Assess source quality using GRADE methodology

research-quality-audit

from jmagly/aiwg

Audit research corpus for shallow stubs, incomplete sections, missing source files, and doc depth issues. Detects docs written from abstracts rather than full papers and optionally auto-dispatches expansion agents.

research-lint

from jmagly/aiwg

Run the research corpus lint ruleset to detect structural and referential integrity issues — orphan notes, missing frontmatter, broken references, missing GRADE assessments.

research-gap

from jmagly/aiwg

Analyze gaps in research coverage

research-gap-detect

from jmagly/aiwg

Build the mutual citation graph, find connected components, identify isolated clusters, and optionally search for bridge candidates and file gap issues. Automates the manual cluster analysis workflow.

research-document

from jmagly/aiwg

Generate summaries and literature notes from research papers

research-discover

from jmagly/aiwg

Search for research papers across academic databases

research-cite

from jmagly/aiwg

Generate properly formatted citation from research corpus