public-knowledge-graph-governance

Maintain public-safe knowledge graph artifacts for llm-wiki and similar markdown knowledge bases. Use when changing graph generators, validators, schema docs, weekly freshness checks, or public/private source-scope boundaries.

by vamseeachanta

Best use case

public-knowledge-graph-governance is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Teams using public-knowledge-graph-governance can expect more consistent output, faster repeated execution, and less prompt rewriting.

When to use this skill

You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

You only need a quick one-off answer and do not need a reusable workflow.
You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$ curl -o ~/.claude/skills/public-knowledge-graph-governance/SKILL.md --create-dirs "https://raw.githubusercontent.com/vamseeachanta/workspace-hub/main/.claude/skills/research/public-knowledge-graph-governance/SKILL.md"

Manual Installation

Download SKILL.md from GitHub
Place it in .claude/skills/public-knowledge-graph-governance/SKILL.md inside your project
Restart your AI agent — it will auto-discover the skill

How public-knowledge-graph-governance Compares

| Feature / Agent | public-knowledge-graph-governance | Standard Approach |
| --- | --- | --- |
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |

Frequently Asked Questions

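What does this skill do?

It maintains public-safe knowledge graph artifacts for llm-wiki and similar markdown knowledge bases. It applies when changing graph generators, validators, schema docs, weekly freshness checks, or public/private source-scope boundaries.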

Where can I find the source code?

You can find the source code on GitHub using the link provided at the top of the page.

SKILL.md Source

# Public Knowledge Graph Governance

Use this skill when a repository emits public-safe graph manifests such as `nodes.jsonl`, `edges.jsonl`, CSV mirrors, summary metadata, or Markdown reports derived from an llm-wiki-style corpus.

## Core Contract

Keep generator, validator, tests, schema docs, and report text in lockstep. A graph change is not complete until all five surfaces agree:

1. Generator output fields and allowed relation behavior.
2. Validator required fields, allowed relations, CSV parity, and freshness checks.
3. Tests for both accepted artifacts and rejection cases.
4. Schema documentation for each JSONL/CSV/report field.
5. Generated artifacts/reports validated against the current corpus.

## Public-safe Provenance Fields

For public graph artifacts, each node and edge should carry bounded provenance that proves source scope without exposing private/raw corpora.

Recommended node fields:

- `schema_version`
- `node_id`
- `path`
- `title`
- `domain`
- `kind`
- `tags`
- `public_safe`
- `source_scope`
- `source_family`
- `source_corpus_digest`
- `backlinks`

Recommended edge fields:

- `schema_version`
- `edge_id`
- `source_node`
- `relation`
- `target_node`
- `evidence_path`
- `evidence`
- `source_scope`
- `source_family`
- `source_corpus_digest`

Use stable, repo-relative paths. `source_scope` should be explicit and bounded, for example `public-wiki`; do not imply that private raw data, client data, or unreviewed readable-raw corpora are part of the public graph.
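As a minimal sketch of what records carrying these fields might look like, the Python below builds one node and one edge and checks that every recommended field is present. The field names come from the lists above. The values, the `sha256:` digest notation, the `"1"` schema version, the `llm-wiki` source family, the `links-to` relation, and the `missing_fields` helper are illustrative assumptions, not the repository's actual schema.

```python
import json

# Field names come from the lists above; every value below is a placeholder.
NODE_FIELDS = (
    "schema_version", "node_id", "path", "title", "domain", "kind", "tags",
    "public_safe", "source_scope", "source_family", "source_corpus_digest",
    "backlinks",
)
EDGE_FIELDS = (
    "schema_version", "edge_id", "source_node", "relation", "target_node",
    "evidence_path", "evidence", "source_scope", "source_family",
    "source_corpus_digest",
)


def missing_fields(record, required):
    """Return the required field names that are absent from a record."""
    return [name for name in required if name not in record]


# Hypothetical node: repo-relative path, explicit bounded source scope.
example_node = {
    "schema_version": "1",
    "node_id": "concepts/example-page",
    "path": "wiki/concepts/example-page.md",
    "title": "Example Page",
    "domain": "example-domain",
    "kind": "concept",
    "tags": ["example"],
    "public_safe": True,
    "source_scope": "public-wiki",
    "source_family": "llm-wiki",
    "source_corpus_digest": "sha256:<digest of the public corpus>",
    "backlinks": [],
}

# Hypothetical edge pointing at another public node.
example_edge = {
    "schema_version": "1",
    "edge_id": "concepts/example-page->concepts/other-page:links-to",
    "source_node": "concepts/example-page",
    "relation": "links-to",
    "target_node": "concepts/other-page",
    "evidence_path": "wiki/concepts/example-page.md",
    "evidence": "See [[other-page]].",
    "source_scope": "public-wiki",
    "source_family": "llm-wiki",
    "source_corpus_digest": "sha256:<digest of the public corpus>",
}

# One JSON object per line, as in nodes.jsonl / edges.jsonl.
node_line = json.dumps(example_node, sort_keys=True)
edge_line = json.dumps(example_edge, sort_keys=True)
```

Serializing with `sort_keys=True` is one way to keep diffs deterministic; a real generator may prefer a fixed field order that matches the schema doc.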

## Relation Allowlist Discipline

Treat relation names as part of the public schema. Remove provisional workflow/control relations from v1 public artifacts unless they are intentionally documented and tested.

Common pitfall: leaving implementation-stage relation names such as `blocked-by-clearance` in a public v1 allowlist after reviewers reject them. Fix the validator, schema doc, generator tests, and any fixture artifacts together.
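A validator-side check for this pitfall could look roughly like the sketch below. The allowlist contents and the `disallowed_relations` helper are hypothetical; the real relation names belong in the repository's schema doc and tests.

```python
# Hypothetical v1 allowlist; the real names live in the schema doc.
ALLOWED_RELATIONS = frozenset({"links-to", "cites", "defines"})


def disallowed_relations(edges):
    """Return sorted relation names used by edges but absent from the allowlist."""
    used = {edge["relation"] for edge in edges}
    return sorted(used - ALLOWED_RELATIONS)


edges = [
    {"edge_id": "e1", "relation": "cites"},
    {"edge_id": "e2", "relation": "blocked-by-clearance"},  # provisional, must fail
]
violations = disallowed_relations(edges)
```

Keeping the allowlist in a single constant that the generator tests, validator, and schema doc check all read from makes it harder for one surface to drift.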

## Edge-Field Leakage Guard

Do not treat public-safe filtering as complete just because excluded files were not emitted as nodes. Edge fields can still leak excluded surfaces through `target_ref`, `evidence_path`, `evidence_locator`, CSV mirrors, summaries, or reports.

Fail closed when schema says a target class is excluded:

- If unresolved targets are documented as dropped, `target_layer == "unresolved"` must not be an accepted validator target layer and must not appear in generated artifacts.
- Agent instruction surfaces such as `CLAUDE.md` and `AGENTS.md` must be rejected anywhere in public artifacts, not only excluded from source discovery.
- High-risk code/result relations need direct negative coverage for forged curated evidence and malformed curated lines with zero or multiple external URLs.

See `references/session-2026-05-unresolved-agent-edge-leakage.md` for the concrete patch and validation pattern.
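The fail-closed rules above can be sketched as a per-edge check. This is an illustration under assumptions, not the patch described in the reference file: the `edge_leaks` helper is hypothetical, and the case-insensitive substring match is deliberately over-broad so that locator suffixes such as `#anchor` cannot hide a forbidden file name.

```python
# File names and field names are taken from the text above.
FORBIDDEN_NAMES = ("CLAUDE.md", "AGENTS.md")
PATH_LIKE_FIELDS = ("target_ref", "evidence_path", "evidence_locator")


def edge_leaks(edge):
    """Return the reasons an edge must be rejected; an empty list means clean."""
    reasons = []
    # Unresolved targets are documented as dropped, so they may never be emitted.
    if edge.get("target_layer") == "unresolved":
        reasons.append("target_layer=unresolved")
    for field in PATH_LIKE_FIELDS:
        value = str(edge.get(field, "")).lower()
        for name in FORBIDDEN_NAMES:
            # Over-broad on purpose: any mention of the name fails closed.
            if name.lower() in value:
                reasons.append(f"{field} references {name}")
    return reasons


clean = edge_leaks({"target_layer": "wiki", "evidence_path": "wiki/a.md"})
leaky = edge_leaks({"target_layer": "unresolved", "target_ref": "docs/CLAUDE.md#setup"})
```

The same check would need to run over every output surface (JSONL, CSV mirrors, summaries, and reports), since an edge that is clean in one serialization can still leak through another.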

## Backlinks

If graph consumers need reverse navigation, compute backlinks from emitted edges rather than hand-maintaining them in source pages.

Rules:

1. Initialize backlinks for all emitted nodes.
2. For each edge whose target is a known node, add the source node unless it is self-referential.
3. Sort backlink arrays before writing artifacts for deterministic diffs.
4. Validate CSV mirrors against JSONL after serialization, not independently.
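Rules 1 to 3 can be sketched in a few lines of Python. The `compute_backlinks` name and the dict-based edge shape are assumptions for illustration; the field names `source_node` and `target_node` follow the edge fields recommended earlier.

```python
def compute_backlinks(node_ids, edges):
    """Derive sorted backlink lists for every emitted node from emitted edges."""
    # Rule 1: every emitted node gets an entry, even with no inbound edges.
    backlinks = {node_id: set() for node_id in node_ids}
    for edge in edges:
        source, target = edge["source_node"], edge["target_node"]
        # Rule 2: only known targets count, and self-references are skipped.
        if target in backlinks and source != target:
            backlinks[target].add(source)
    # Rule 3: sort so regenerated artifacts diff deterministically.
    return {node_id: sorted(sources) for node_id, sources in backlinks.items()}


result = compute_backlinks(
    ["a", "b", "c"],
    [
        {"source_node": "a", "target_node": "b"},
        {"source_node": "c", "target_node": "b"},
        {"source_node": "b", "target_node": "b"},        # self-reference, ignored
        {"source_node": "a", "target_node": "missing"},  # unknown target, ignored
    ],
)
```

Using a set per node also deduplicates sources when several edges connect the same pair of nodes.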

## Default Freshness Validation

Weekly graph validation should detect stale artifacts by default. Do not make current-corpus validation opt-in unless there is a deliberate test-only escape hatch.

Recommended behavior:

1. Resolve the repository root explicitly from CLI input or infer it from the artifact/report path and cwd.
2. Rebuild an in-memory graph from the current public corpus.
3. Compare summary counts, corpus digest, schema version, and key output parity against the checked-in artifacts.
4. Fail closed if artifacts are stale, missing, or generated from a different public-safe source set.
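Steps 2 to 4 might be sketched as follows. The digest recipe (SHA-256 over sorted repo-relative paths and file bytes), the `**/*.md` glob, and the summary keys `node_count` and `edge_count` are assumptions made for this example; the repository's generator defines the real digest and summary format.

```python
import hashlib
from pathlib import Path


def corpus_digest(root, pattern="**/*.md"):
    """Hash repo-relative paths and file bytes in sorted order for a stable digest."""
    root = Path(root)
    files = sorted(
        (path.relative_to(root).as_posix(), path)
        for path in root.glob(pattern)
        if path.is_file()
    )
    hasher = hashlib.sha256()
    for relative, path in files:
        hasher.update(relative.encode("utf-8"))
        hasher.update(b"\0")
        hasher.update(path.read_bytes())
        hasher.update(b"\0")
    return hasher.hexdigest()


def staleness_reasons(checked_in, rebuilt):
    """Compare a checked-in summary with one rebuilt from the current corpus."""
    keys = ("schema_version", "source_corpus_digest", "node_count", "edge_count")
    reasons = []
    for key in keys:
        # Fail closed: a key missing from the checked-in summary counts as stale.
        if key not in checked_in or checked_in[key] != rebuilt.get(key):
            reasons.append(
                f"{key}: checked-in={checked_in.get(key)!r} rebuilt={rebuilt.get(key)!r}"
            )
    return reasons
```

A weekly job could exit non-zero whenever `staleness_reasons` returns a non-empty list, which keeps current-corpus validation on by default rather than opt-in.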

## Update Workflow

1. Write or update failing tests first for the intended contract drift.
   - Prefer contract-targeted RED tests over incidental corpus-shape assertions. For graph hardening, test unresolved target rejection/sanitization, private/source leakage across every output surface, CSV-vs-JSONL parity, and schema/generator alignment directly.
   - Avoid bare exact node/edge counts unless the fixture is deliberately minimal and every created page is part of the asserted contract.
2. Patch the generator.
3. Patch the validator immediately after the generator; do not leave CSV headers or allowed relation lists stale.
4. Patch schema docs and report text in the same change.
5. Regenerate artifacts if the repository tracks them.
6. Normalize artifact/report paths before validation: pick one final report date/path, regenerate once, and stage only artifacts validated against that exact report.
7. Run targeted tests, full tests, artifact validator, legal/safety scan for public/private leakage, and adversarial review before closeout.
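As one example of a contract-targeted check from step 1, the sketch below compares a serialized CSV mirror against serialized JSONL rather than asserting on corpus-shaped counts. The `csv_parity_errors` helper and the column names are hypothetical, and the comparison only covers scalar columns; list-valued fields such as `tags` would need a documented CSV encoding first.

```python
import csv
import io
import json


def csv_parity_errors(jsonl_text, csv_text, columns):
    """Compare a serialized CSV mirror against serialized JSONL, row by row."""
    records = [json.loads(line) for line in jsonl_text.splitlines() if line.strip()]
    reader = csv.DictReader(io.StringIO(csv_text))
    errors = []
    # The CSV header must match the documented column order exactly.
    if reader.fieldnames != list(columns):
        errors.append(f"header mismatch: {reader.fieldnames}")
    rows = list(reader)
    if len(rows) != len(records):
        errors.append(f"row count: jsonl={len(records)} csv={len(rows)}")
    for index, (record, row) in enumerate(zip(records, rows)):
        for column in columns:
            if str(record.get(column, "")) != (row.get(column) or ""):
                errors.append(f"row {index} column {column} differs")
    return errors


def test_csv_mirror_matches_jsonl():
    jsonl_text = '{"edge_id": "e1", "relation": "cites"}\n'
    csv_text = "edge_id,relation\ne1,cites\n"
    assert csv_parity_errors(jsonl_text, csv_text, ["edge_id", "relation"]) == []
```

Written first, a test like this fails (RED) whenever a generator change adds a JSONL field without updating the CSV header, which is the drift step 3 warns about.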

## Closeout Hygiene for Tracked Artifacts

Public graph work often produces generated files plus dated reports. Before committing:

- Treat `AM` status on generated artifacts as a warning: reset/re-stage final versions so staged content matches the validated working tree.
- Do not validate one dated report while staging another dated report; the validator command, summary `run_date`, report filename, and committed report must align.
- Keep transient review scratch directories such as `.planning/quick/` out of commits unless the repository explicitly tracks review artifacts.
- Re-run the artifact validator after final staging normalization, not only after the first generation.
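Two of these checks lend themselves to a small pre-commit helper, sketched below. It assumes the caller passes in the text output of `git status --porcelain` (version 1 format, where each line is a two-character status code, a space, and the path) and that report filenames embed their date as `YYYY-MM-DD`. Both helper names and the example paths are hypothetical.

```python
import re


def partially_staged(porcelain):
    """Return paths whose porcelain status code is AM (staged, then modified again)."""
    return [line[3:] for line in porcelain.splitlines() if line.startswith("AM ")]


def report_date_mismatches(run_date, report_paths):
    """Return report paths whose embedded YYYY-MM-DD date differs from run_date."""
    pattern = re.compile(r"\d{4}-\d{2}-\d{2}")
    mismatched = []
    for path in report_paths:
        match = pattern.search(path)
        # A report with no recognizable date is treated as a mismatch.
        if match is None or match.group(0) != run_date:
            mismatched.append(path)
    return mismatched


status = "AM graph/nodes.jsonl\n M README.md\nA  graph/edges.jsonl\n"
warnings = partially_staged(status)
drift = report_date_mismatches(
    "2026-05-20",
    ["reports/graph-2026-05-20.md", "reports/graph-2026-05-19.md"],
)
```

A non-empty result from either helper would be a cue to re-stage the final artifacts and re-run the validator before committing.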

## Reference

- `references/session-2026-05-public-graph-review-majors.md` captures the review-major remediation pattern that motivated this skill.
- `references/session-2026-05-public-graph-closeout-hygiene.md` captures the artifact/report date-drift and staging-normalization closeout pattern.
- `references/session-2026-05-graph-hardening-red-test-targets.md` captures the RED-test targeting pattern: test contract violations directly instead of incidental graph counts.

Related Skills

domain-knowledge-sweep

from vamseeachanta/workspace-hub

Systematic multi-source research of an engineering domain. Spawns parent issue → 6 research subissues (Standards, Academic, Industry, LinkedIn-marketing, Code-audit, Synthesis) → gap implementation subissues. Replaces LinkedIn-only extraction with defensible comprehensive sourcing.

llm-wiki-public-private-routing

from vamseeachanta/workspace-hub

Firewall between the public llm-wiki repo (vamseeachanta/llm-wiki, MIT + CC-BY-4.0) and per-client private wikis (vamseeachanta/llm-wiki-<client>, e.g. llm-wiki-mkt-a per #2746). Use when (1) deciding whether a converted wiki page lands in public or private surface, (2) applying the project-name abstraction rule to public-bound content, (3) evaluating the public-availability exception that lets actual project names pass through unmodified, (4) promoting content from private to public after sanitization. Encodes the 2026-05-20 user routing directive verbatim: exact client results → private; abstracted (project-name only) → public; project name + all key data publicly available → exception applies. Companion to research/llm-wiki-page-shape-contract (which calls this skill at Rule 8) and research/llm-wiki-source-extraction-coverage (which produces the source pages this skill decides where to send).

llm-wiki-cadence-governance

from vamseeachanta/workspace-hub

Weekly governance workflow for keeping an llm-wiki repository current, code-development-useful, and connected to actionable GitHub issue planning.

workspace-knowledge-doc-contracts

from vamseeachanta/workspace-hub

Class-level workspace knowledge, LLM-wiki, repo mission contracts, stale doc references, semantic taxonomy, and knowledge-source reconnaissance.

split-plan-governance-vs-substance

from vamseeachanta/workspace-hub

When repeated adversarial plan reviews keep returning MAJOR findings about validator mechanics, evidence bookkeeping, or governance traceability rather than the core product or mission decision, stop tightening the monolithic plan and split it into a content/decision packet and a tooling/enforcement packet.

plan-governance-vs-execution-boundary-for-adversarial-review

from vamseeachanta/workspace-hub

Keep stale-approval/governance remediation out of execution-path pseudocode, TDD, files-to-change, and deliverable acceptance when hardening a GitHub issue plan under adversarial review.

knowledge-pipeline

from vamseeachanta/workspace-hub

Workflow for maintaining workspace-hub knowledge and learning pipelines across scripts/knowledge, scripts/learning, and docs/superpowers, including indexing, archive synthesis, issue updates, and pipeline troubleshooting.

knowledge-base-builder

from vamseeachanta/workspace-hub

Build searchable knowledge bases from document collections (PDFs, Word, text files). Use for creating technical libraries, standards repositories, research databases, or any large document collection requiring full-text search.

baoyu-infographic

from vamseeachanta/workspace-hub

Infographics: 21 layouts x 21 styles (信息图, 可视化).

roadmap-anchor-issue-governance

from vamseeachanta/workspace-hub

Reopen and reuse the original roadmap/umbrella issue as the governance anchor; create new issues only for genuinely missing scoped work, then retarget/link all execution issues back to the anchor.

plan-review-queue-governance-cleanup

from vamseeachanta/workspace-hub

Clean a stale `status:plan-review` queue by separating missing-plan drift from real review blockers, posting governance comments, removing stale labels, and materializing durable review artifacts for delegated/Hermes reviews.

plan-exit-governance-drift-handoff

from vamseeachanta/workspace-hub

When ending a session on an iterated plan draft, audit and document approval-state drift across GitHub labels, local approval markers, README status, and latest review verdicts.