llm-wiki-weekly-freshness

Class-level governance workflow for keeping llm-wiki-style markdown knowledge bases current, public-safe, graph/index-valid, and useful for code development. Use when reviewing llm-wiki architecture/content, scanning new LLM concepts, maintaining public knowledge graphs, producing an issue roadmap, or running recurring freshness cadence.

by vamseeachanta

Best use case

llm-wiki-weekly-freshness is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Teams using llm-wiki-weekly-freshness can expect more consistent output, faster repeated execution, and less prompt rewriting.

When to use this skill

You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

You only need a quick one-off answer and do not need a reusable workflow.
You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$ curl -o ~/.claude/skills/llm-wiki-weekly-freshness/SKILL.md --create-dirs "https://raw.githubusercontent.com/vamseeachanta/workspace-hub/main/.claude/skills/research/llm-wiki-weekly-freshness/SKILL.md"

Manual Installation

1. Download SKILL.md from GitHub.
2. Place it in .claude/skills/llm-wiki-weekly-freshness/SKILL.md inside your project.
3. Restart your AI agent; it will auto-discover the skill.

How llm-wiki-weekly-freshness Compares

| Feature / Agent | llm-wiki-weekly-freshness | Standard Approach |
| --- | --- | --- |
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |

Frequently Asked Questions

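What does this skill do?

It provides a class-level governance workflow for keeping llm-wiki-style markdown knowledge bases current, public-safe, graph/index-valid, and useful for code development.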

Where can I find the source code?

The source code is on GitHub in the vamseeachanta/workspace-hub repository, at .claude/skills/research/llm-wiki-weekly-freshness/SKILL.md.

SKILL.md Source

# LLM Wiki Weekly Freshness

## Trigger
Use this skill when the task involves any of:
- Reviewing, improving, or maintaining an `llm-wiki` or similar markdown knowledge-base repository.
- Comparing repository knowledge coverage against current LLM/software-development concepts.
- Creating GitHub issues to keep the wiki useful for engineering/code development.
- Designing or running a weekly freshness cadence for LLM knowledge ingestion.
- Changing public-safe graph/index generators, validators, schema docs, CSV/JSONL mirrors, reports, or source-scope boundaries.
- Closing out llm-wiki maintenance work where generated artifacts, validation evidence, issue state, and commits must align.

## Operating contract
1. **Treat this as planning/governance unless the user has approved implementation.**
   - For GitHub issue work, follow the local repo's planning gates.
   - Do not jump from gap discovery directly into implementation unless the relevant issue is already plan-approved.
2. **Keep the wiki public-safe by default.**
   - No private/raw archives, credentials, local absolute paths, vendor/client material, dotfiles, symlink escapes, connection strings, or machine-specific manifests.
   - Prefer committed markdown and deterministic generated metadata over local runtime state.
3. **Optimize for code-development leverage.**
   - Rank gaps by how much they improve implementation decisions, architecture review, testing strategy, agent prompts, retrieval, and issue planning.
   - Avoid generic news collection unless it maps to code-development utility.
4. **Keep generator, validator, tests, schema docs, and reports in lockstep.**
   - A graph/index change is incomplete until all five surfaces agree.
   - Validate parity between JSONL, CSV, summaries, reports, and documented schema, not just row counts.
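
A minimal sketch of what "parity, not just row counts" could look like for a JSONL artifact and its CSV mirror. The function name `parity_errors`, the `node_id` key, and the compared columns are assumptions chosen for illustration; only scalar string columns are compared here.

```python
import csv
import io
import json


def parity_errors(jsonl_text, csv_text, key="node_id", fields=("title", "domain")):
    """Compare a JSONL artifact with its CSV mirror row by row.

    `key` and `fields` are illustrative; only scalar string columns are compared.
    """
    jsonl_rows = {}
    for line in jsonl_text.splitlines():
        if line.strip():
            row = json.loads(line)
            jsonl_rows[row[key]] = row
    csv_rows = {row[key]: row for row in csv.DictReader(io.StringIO(csv_text))}

    errors = []
    # Rows that exist in only one of the two mirrors.
    for row_id in sorted(jsonl_rows.keys() ^ csv_rows.keys()):
        errors.append(f"{row_id}: present in only one mirror")
    # Rows present in both mirrors must agree field by field.
    for row_id in sorted(jsonl_rows.keys() & csv_rows.keys()):
        for field in fields:
            left = jsonl_rows[row_id].get(field)
            right = csv_rows[row_id].get(field)
            if left != right:
                errors.append(f"{row_id}.{field}: jsonl={left!r} csv={right!r}")
    return errors


# Both mirrors have two rows, so a row-count check alone would pass.
jsonl_text = (
    '{"node_id": "a", "title": "Alpha", "domain": "agents"}\n'
    '{"node_id": "b", "title": "Beta", "domain": "evals"}\n'
)
csv_text = "node_id,title,domain\na,Alpha,agents\nb,Beta (old),evals\n"
print(parity_errors(jsonl_text, csv_text))
```

The sample data is made up; the point is that equal row counts do not imply equal content.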

## Weekly workflow

### 1. Repo architecture review
Inspect the repo structure and identify:
- Domain taxonomy and whether it matches current engineering/code workflows.
- Markdown source layout and generated artifacts.
- Existing ingestion/index/validation scripts.
- Reports or manifests used by agents for retrieval.
- Public-safety boundaries for generated graphs, indexes, reports, and issue artifacts.

Evidence to gather:
- `git status --short`
- `git remote -v`
- key docs/README/schema files
- current issue labels and open freshness/indexing issues
- test and validation entry points
- generated artifact/report paths and schema version strings

### 2. Current LLM concept scan
Review current concepts from high-signal sources, then convert only durable items into wiki work:
- model release notes and provider docs
- evaluation/benchmarking practices
- agentic coding workflows
- retrieval/context engineering
- structured outputs/tool calling
- inference/runtime/serving changes
- security, prompt-injection, data-boundary patterns

For each concept, record:
- concept name
- why it matters for code development
- target wiki domain/page
- source URL/citation
- freshness date
- proposed validation or example artifact
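
One way to keep these records uniform is a small typed structure. The sketch below is illustrative only: the class name `ConceptRecord`, its attribute names, and every example value (including the URL and date) are placeholders, not part of the skill.

```python
import json
from dataclasses import asdict, dataclass
from datetime import date


@dataclass(frozen=True)
class ConceptRecord:
    """One durable concept captured during the weekly scan (hypothetical shape)."""

    name: str
    why_it_matters: str        # relevance to code development
    target_page: str           # target wiki domain/page, repo-relative
    source_url: str            # source URL/citation
    freshness_date: date
    proposed_validation: str   # proposed validation or example artifact

    def to_jsonl(self) -> str:
        """Serialize to one deterministic JSON line."""
        row = asdict(self)
        row["freshness_date"] = self.freshness_date.isoformat()
        return json.dumps(row, sort_keys=True)


# Placeholder values for illustration only.
record = ConceptRecord(
    name="structured outputs",
    why_it_matters="lets agents emit schema-checked tool arguments",
    target_page="wikis/agents/structured-outputs.md",
    source_url="https://example.com/provider-docs",
    freshness_date=date(2026, 5, 1),
    proposed_validation="add a schema-conformance example page",
)
print(record.to_jsonl())
```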

### 3. Code-development usefulness mapping
For each candidate gap, ask what repo decision, code pattern, evaluation, documentation contract, automation pipeline, or agent workflow it would improve. Prioritize items that improve engineering velocity, correctness, agent routing, reusable architecture guidance, retrieval quality, or public-safety guarantees.

### 4. Gap-to-issue conversion
Open issues only when the gap is actionable. Each issue should include:
- Problem statement tied to code-development leverage.
- Resource intel: repo files, external sources, related issues.
- Plan shape: expected files/artifacts, tests, validation.
- Public-safety constraints.
- Acceptance criteria.
- Labels for planning status and domain.

Prefer issue classes:
- **Taxonomy/schema gaps** — missing domains, stale page schema, weak metadata.
- **Freshness automation** — weekly scanner, source manifest, stale-page report.
- **Retrieval utility** — graph manifests, cross-links, query surfaces, agent context exports.
- **Concept coverage** — current LLM concepts mapped to durable wiki pages.
- **Validation/legal gates** — public-safe checks, link validation, artifact consistency.
- **Public graph/index hardening** — schema/validator/generator/report parity, source-scope boundaries, leakage guards, stale artifact detection.

### 5. Validation before closeout
For any repo change or generated artifact, run the relevant verification loop:
- targeted tests for changed scripts/pages
- artifact generation + validator
- full test suite when code changed
- legal/public-safety scan if artifacts or public content changed
- adversarial review for non-trivial planning or implementation

Do not close issues when adversarial review has unresolved MAJOR findings.

If closeout cannot finish, preserve restart state in a repo-tracked handoff with exact validation evidence, dirty files, issue status, and the next checkpoint. Do not lose partial closeout state after reports were generated but commits/pushes/issue closures are unfinished.

## Public-safe graph/index governance
Use this section when a repository emits public-safe graph manifests such as `nodes.jsonl`, `edges.jsonl`, CSV mirrors, summary metadata, or Markdown reports derived from an llm-wiki-style corpus.

### Provenance and schema fields
Recommended node fields:
- `schema_version`
- `node_id`
- `path`
- `title`
- `domain`
- `kind`
- `tags`
- `public_safe`
- `source_scope`
- `source_family`
- `source_corpus_digest`
- `backlinks`

Recommended edge fields:
- `schema_version`
- `edge_id`
- `source_node`
- `relation`
- `target_node`
- `evidence_path`
- `evidence`
- `source_scope`
- `source_family`
- `source_corpus_digest`

Use stable, repo-relative paths. `source_scope` should be explicit and bounded, for example `public-wiki`; do not imply that private raw data, client data, or unreviewed readable-raw corpora are part of the public graph.
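
The field lists and path rule above could be checked with something like the following sketch. The helper names and the single-entry `ALLOWED_SOURCE_SCOPES` set are assumptions; a real validator would take its allowed scopes from the documented schema. The path check only handles POSIX-style paths.

```python
from pathlib import PurePosixPath

NODE_FIELDS = {
    "schema_version", "node_id", "path", "title", "domain", "kind", "tags",
    "public_safe", "source_scope", "source_family", "source_corpus_digest",
    "backlinks",
}
EDGE_FIELDS = {
    "schema_version", "edge_id", "source_node", "relation", "target_node",
    "evidence_path", "evidence", "source_scope", "source_family",
    "source_corpus_digest",
}
# Assumption: only the example scope from the text is allowed.
ALLOWED_SOURCE_SCOPES = {"public-wiki"}


def is_repo_relative(value):
    """True for POSIX-style paths that stay inside the repository root."""
    path = PurePosixPath(value)
    return (
        not path.is_absolute()
        and ".." not in path.parts
        and not value.startswith("~")
    )


def row_errors(row, required):
    """Return human-readable problems for one node or edge row."""
    errors = [f"missing field: {name}" for name in sorted(required - set(row))]
    if row.get("source_scope") not in ALLOWED_SOURCE_SCOPES:
        errors.append(f"unexpected source_scope: {row.get('source_scope')!r}")
    for name in ("path", "evidence_path"):
        value = row.get(name)
        if isinstance(value, str) and not is_repo_relative(value):
            errors.append(f"{name} is not repo-relative: {value}")
    return errors


# An otherwise complete edge that carries a machine-specific absolute path.
edge = dict.fromkeys(EDGE_FIELDS, "x")
edge.update(source_scope="public-wiki", evidence_path="/home/user/wiki/page.md")
print(row_errors(edge, EDGE_FIELDS))
```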

### Relation allowlist discipline
Treat relation names as public schema. Remove provisional workflow/control relations from public v1 artifacts unless they are intentionally documented and tested. If reviewers reject relation names, update validator, schema docs, generator tests, fixture artifacts, and report text together.
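
As a sketch, the allowlist can live in one constant that the validator checks every emitted edge against. The relation names below are hypothetical placeholders, not the repository's actual schema.

```python
# Hypothetical relation names; the real allowlist comes from the schema docs.
ALLOWED_RELATIONS = frozenset({"links_to", "cites"})


def relation_errors(edges, allowed=ALLOWED_RELATIONS):
    """Report every relation name that is not part of the public schema."""
    unknown = sorted({edge["relation"] for edge in edges} - allowed)
    return [f"relation not in public allowlist: {name}" for name in unknown]


edges = [
    {"relation": "links_to"},
    {"relation": "blocked_by_review"},  # provisional workflow relation
]
print(relation_errors(edges))
```

Keeping the allowlist in a single place makes it easier to update the validator, tests, and docs together when a reviewer rejects a name.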

### Edge-field leakage guard
Do not treat public-safe filtering as complete just because excluded files were not emitted as nodes. Edge fields can still leak excluded surfaces through `target_ref`, `evidence_path`, `evidence_locator`, CSV mirrors, summaries, or reports.

Fail closed when schema says a target class is excluded:
- If unresolved targets are documented as dropped, the validator must reject `target_layer == "unresolved"`, and that value must not appear in generated artifacts.
- Agent instruction surfaces such as `CLAUDE.md` and `AGENTS.md` must be rejected anywhere in public artifacts, not only excluded from source discovery.
- High-risk code/result relations need direct negative coverage for forged curated evidence and malformed curated lines with zero or multiple external URLs.
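
A minimal sketch of the first two fail-closed rules, assuming each edge is a flat dictionary. It scans every string value in every field rather than a fixed list of fields. The function names are illustrative, and the curated-evidence checks in the third bullet are not covered.

```python
FORBIDDEN_SURFACES = ("CLAUDE.md", "AGENTS.md")


def leaked_values(row):
    """Return (field, value) pairs where any string value mentions a forbidden surface."""
    hits = []
    for field, value in row.items():
        values = value if isinstance(value, list) else [value]
        for item in values:
            if isinstance(item, str) and any(
                name.lower() in item.lower() for name in FORBIDDEN_SURFACES
            ):
                hits.append((field, item))
    return hits


def edge_errors(edge):
    """Fail closed: any leak or unresolved target makes the edge invalid."""
    errors = [f"{field} leaks agent surface: {value}" for field, value in leaked_values(edge)]
    if edge.get("target_layer") == "unresolved":
        errors.append("unresolved target must be dropped, not emitted")
    return errors


# The source node is public, but the edge fields still point at an excluded file.
edge = {
    "source_node": "wikis/agents/overview.md",
    "evidence_path": "docs/AGENTS.md",
    "target_layer": "unresolved",
}
print(edge_errors(edge))
```

The same scan could be run over CSV rows, summaries, and report text, since those surfaces can leak the same values.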

### Backlinks
If graph consumers need reverse navigation, compute backlinks from emitted edges rather than hand-maintaining them in source pages:
1. Initialize backlinks for all emitted nodes.
2. For each edge whose target is a known node, add the source node unless it is self-referential.
3. Sort backlink arrays before writing artifacts for deterministic diffs.
4. Validate CSV mirrors against JSONL after serialization, not independently.
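
Steps 1 to 3 can be sketched as below, using the `node_id`, `source_node`, and `target_node` field names recommended earlier. The function name `compute_backlinks` is an assumption, and step 4 (CSV-versus-JSONL validation) is not covered here.

```python
def compute_backlinks(nodes, edges):
    """Derive sorted backlink lists for every emitted node from emitted edges."""
    # Step 1: every emitted node gets an entry, even if nothing links to it.
    backlinks = {node["node_id"]: set() for node in nodes}
    # Step 2: record the source of each edge whose target is a known node,
    # skipping self-references.
    for edge in edges:
        source, target = edge["source_node"], edge["target_node"]
        if target in backlinks and source != target:
            backlinks[target].add(source)
    # Step 3: sort so regenerated artifacts produce deterministic diffs.
    return {node_id: sorted(sources) for node_id, sources in backlinks.items()}


nodes = [{"node_id": "a"}, {"node_id": "b"}, {"node_id": "c"}]
edges = [
    {"source_node": "c", "target_node": "a"},
    {"source_node": "b", "target_node": "a"},
    {"source_node": "a", "target_node": "a"},        # self-reference, ignored
    {"source_node": "a", "target_node": "missing"},  # unknown target, ignored
]
print(compute_backlinks(nodes, edges))  # {'a': ['b', 'c'], 'b': [], 'c': []}
```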

### Default freshness validation
Weekly graph validation should detect stale artifacts by default. Do not make current-corpus validation opt-in unless there is a deliberate test-only escape hatch.

Recommended behavior:
1. Resolve the repository root explicitly from CLI input or infer it from the artifact/report path and cwd.
2. Rebuild an in-memory graph from the current public corpus.
3. Compare summary counts, corpus digest, schema version, and key output parity against the checked-in artifacts.
4. Fail closed if artifacts are stale, missing, or generated from a different public-safe source set.
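
The comparison in steps 2 to 4 might look like the following sketch. The digest recipe (SHA-256 over sorted relative paths and file bytes), the `**/*.md` pattern, and the summary keys `node_count` and `edge_count` are assumptions; step 1 (resolving the repository root) is left out.

```python
import hashlib
import tempfile
from pathlib import Path


def corpus_digest(root, pattern="**/*.md"):
    """Hash relative paths and contents of the public corpus in a stable order."""
    digest = hashlib.sha256()
    files = [p for p in Path(root).glob(pattern) if p.is_file()]
    for path in sorted(files, key=lambda p: p.relative_to(root).as_posix()):
        digest.update(path.relative_to(root).as_posix().encode("utf-8"))
        digest.update(b"\0")
        digest.update(path.read_bytes())
        digest.update(b"\0")
    return digest.hexdigest()


def staleness_errors(current, checked_in):
    """Fail closed: a missing summary or any mismatch is reported as an error."""
    if checked_in is None:
        return ["checked-in summary is missing"]
    keys = ("schema_version", "source_corpus_digest", "node_count", "edge_count")
    return [
        f"{key} differs from current corpus"
        for key in keys
        if current.get(key) != checked_in.get(key)
    ]


# Simulate artifacts generated before a new page was added to the corpus.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "page.md").write_text("# Page\n", encoding="utf-8")
    checked_in = {"schema_version": "v1", "source_corpus_digest": corpus_digest(root), "node_count": 1}
    (root / "new.md").write_text("# New\n", encoding="utf-8")
    current = {"schema_version": "v1", "source_corpus_digest": corpus_digest(root), "node_count": 2}
    errors = staleness_errors(current, checked_in)
print(errors)
```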

### TDD and artifact update workflow
1. Write or update failing tests first for the intended contract drift.
   - Prefer contract-targeted RED tests over incidental corpus-shape assertions.
   - For graph hardening, test unresolved target rejection/sanitization, private/source leakage across every output surface, CSV-vs-JSONL parity, and schema/generator alignment directly.
   - Avoid bare exact node/edge counts unless the fixture is deliberately minimal and every created page is part of the asserted contract.
2. Patch the generator.
3. Patch the validator immediately after the generator; do not leave CSV headers or allowed relation lists stale.
4. Patch schema docs and report text in the same change.
5. Regenerate artifacts if the repository tracks them.
6. Normalize artifact/report paths before validation: pick one final report date/path, regenerate once, and stage only artifacts validated against that exact report.
7. Run targeted tests, full tests, artifact validator, legal/safety scan for public/private leakage, and adversarial review before closeout.

### Closeout hygiene for tracked artifacts
Public graph work often produces generated files plus dated reports. Before committing:
- Treat `AM` status on generated artifacts as a warning: reset/re-stage final versions so staged content matches the validated working tree.
- Do not validate one dated report while staging another dated report; the validator command, summary `run_date`, report filename, and committed report must align.
- Keep transient review scratch directories such as `.planning/quick/` out of commits unless the repository explicitly tracks review artifacts.
- Re-run the artifact validator after final staging normalization, not only after the first generation.
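
The `AM` warning can be automated with a small parser over `git status --porcelain` output, which uses the same two-letter status codes as `git status --short`. This is a sketch: the function name and sample paths are hypothetical, and quoted or renamed paths are not handled.

```python
def staged_then_modified(porcelain_text):
    """Return paths whose staged content differs from the working tree (e.g. AM, MM)."""
    flagged = []
    for line in porcelain_text.splitlines():
        if len(line) < 4:
            continue
        # Porcelain v1 lines are "XY path": X is index status, Y is worktree status.
        index_status, worktree_status, path = line[0], line[1], line[3:]
        if index_status not in " ?!" and worktree_status != " ":
            flagged.append(path)
    return flagged


# Sample output; in practice this could come from
# subprocess.run(["git", "status", "--porcelain"], capture_output=True, text=True).stdout
sample = "\n".join([
    "AM reports/graph-report.md",   # staged, then changed again: re-stage before commit
    "A  graph/nodes.jsonl",         # staged and unchanged since
    " M README.md",                 # unstaged change only
    "?? .planning/quick/notes.md",  # untracked scratch file
])
print(staged_then_modified(sample))  # ['reports/graph-report.md']
```

Any path returned here should be re-staged and the artifact validator re-run before committing.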

## Weekly report format
Produce a compact report with:
1. Current state.
2. Evidence inspected.
3. New concept signals.
4. Repo architecture gaps.
5. Public graph/index health, if applicable.
6. Recommended issue backlog ranked by leverage.
7. Automation/freshness cadence proposal.
8. Blockers/risks.
9. Exact next action.

## Issue quality checklist
A good llm-wiki maintenance issue includes:
- Clear problem statement tied to code-development leverage.
- Current evidence path: repo files, generated reports, graph metrics, issue links, or source pages.
- Scope boundaries: what is in/out for this issue.
- Acceptance criteria with validation commands or report artifacts.
- Public/private data classification where source material may cross governance boundaries.
- Dependencies on existing architecture contracts, graph schema, citation model, or source-ingest workflows.

## Pitfalls
- Do not create a flat backlog of generic “add topic X” issues. Tie each issue to a reusable architecture or development outcome.
- Do not treat external AI trends as authoritative without source provenance and update dates.
- Do not bypass issue planning gates just because the task is documentation-heavy.
- Do not close an llm-wiki issue after only generating reports; verify that committed files, pushed state, issue labels/comments, and tests all match the claimed closeout.
- Do not let public-safe filtering stop at source discovery; validate every emitted field and every mirror/report surface.
- Do not let generated artifact path drift invalidate closeout evidence.

## References
- `references/public-safe-graph-validation.md` — session-derived checks for public-safe graph manifests and adversarial-review blockers.
- `references/issue-closeout-handoff-pattern.md` — restart-handoff pattern from an llm-wiki public-graph issue closeout where validation passed but implementation files still needed final commit and issue closure.
- `references/session-2026-05-unresolved-agent-edge-leakage.md` — concrete edge-field leakage patch and validation pattern.
- `references/session-2026-05-public-graph-review-majors.md` — review-major remediation pattern for public graph artifacts.
- `references/session-2026-05-public-graph-closeout-hygiene.md` — artifact/report date-drift and staging-normalization closeout pattern.
- `references/session-2026-05-graph-hardening-red-test-targets.md` — RED-test targeting pattern for direct contract violations.
- `references/absorbed-llm-wiki-cadence-governance.md` — archived original narrow cadence-governance skill body for traceability.
- `references/absorbed-public-knowledge-graph-governance.md` — archived original public graph governance skill body for traceability.

Related Skills

llm-wiki-source-extraction-coverage

from vamseeachanta/workspace-hub

Doc-type-aware extraction contract for llm-wiki source ingestion with measurable coverage and source-anchored traceability. Use when (1) ingesting a PDF, DOCX, XLSX, PPTX, HTML, or scanned-image source into a wiki `sources/` page, (2) computing the pre-extraction estimate (what fraction of the source we expect to recover) and post-extraction yield (what fraction we actually recovered), (3) anchoring wiki claims back to specific page / paragraph / cell / slide positions in the source so a reviewer can re-verify or revise against the actual document, (4) deciding whether OCR fallback or manual transcription is needed. Codifies workspace-hub's existing OCR fallback chain and python-docx / openpyxl / trafilatura patterns into a format-specific routing table. Companion to research/llm-wiki-page-shape-contract (Rule 7 input-layer pages) and research/llm-wiki — this skill is the defense against silent extraction failure.

llm-wiki-public-private-routing

from vamseeachanta/workspace-hub

Firewall between the public llm-wiki repo (vamseeachanta/llm-wiki, MIT + CC-BY-4.0) and per-client private wikis (vamseeachanta/llm-wiki-<client>, e.g. llm-wiki-mkt-a per #2746). Use when (1) deciding whether a converted wiki page lands in public or private surface, (2) applying the project-name abstraction rule to public-bound content, (3) evaluating the public-availability exception that lets actual project names pass through unmodified, (4) promoting content from private to public after sanitization. Encodes the 2026-05-20 user routing directive verbatim: exact client results → private; abstracted (project-name only) → public; project name + all key data publicly available → exception applies. Companion to research/llm-wiki-page-shape-contract (which calls this skill at Rule 8) and research/llm-wiki-source-extraction-coverage (which produces the source pages this skill decides where to send).

llm-wiki-page-shape-contract

from vamseeachanta/workspace-hub

Enforce the page-shape contract when a repo-side document or analysis output gets converted into an llm-wiki page. Use when (1) running `scripts/knowledge/llm_wiki.py ingest`, (2) writing or rewriting a wiki page from docs/reports/*, docs/handoffs/*, scripts/review/results/*, or calc citation outputs, (3) deciding whether a page should be split into a folder of sub-pages, (4) reviewing wiki PRs for length / diagram / divide-and-conquer compliance. Codifies the Karpathy + Astro-Han + lewislulu page rules applied to workspace-hub's domain-wiki layout under /mnt/local-analysis/llm-wiki/wikis/<domain>/. Sibling to research/llm-wiki (which owns the CLI ops) — this skill is the quality gate every converted page must clear before commit.

llm-wiki-cadence-governance

from vamseeachanta/workspace-hub

Weekly governance workflow for keeping an llm-wiki repository current, code-development-useful, and connected to actionable GitHub issue planning.

llm-wiki-audit-feedback-loop

from vamseeachanta/workspace-hub

Durable feedback loop for correcting llm-wiki pages without losing the correction to chat history. Use when (1) a human notices a wiki page is wrong, outdated, or contradicts a source, (2) processing the `audit/` inbox of a domain wiki, (3) reviewing what feedback has been resolved vs deferred, (4) needing to leave a comment on a specific text range that survives line-number drift. Implements the anchored-text audit file pattern from lewislulu/llm-wiki-skill, adapted for workspace-hub's domain-wiki layout under /mnt/local-analysis/llm-wiki/wikis/<domain>/. Extends the 5-op model (compile/ingest/query/lint) from research/llm-wiki with the missing `audit` op. Never silently delete feedback — rejected audits stay archived with rejection rationale.

oss-wiki-development-arc

from vamseeachanta/workspace-hub

Three-phase methodology (Substrate → Depth → Quality) for building open-source engineering wikis efficiently. Skip 70%+ of empirical iteration cost by pre-loading the pattern.

client-llm-wiki-factory

from vamseeachanta/workspace-hub

Operator checklist for instantiating a new per-client private llm-wiki repo under workspace-hub [#2746](https://github.com/vamseeachanta/workspace-hub/issues/2746) + [#2731](https://github.com/vamseeachanta/workspace-hub/issues/2731) D4 (amended) naming convention `llm-wiki-<client>`.

metadata-only-wiki-sweep-workflow

from vamseeachanta/workspace-hub

Disciplined inventory process for cataloging documents by filename/path without content claims, using parent-centric grouping to prevent stub proliferation

exclude-wiki-Codex-md-from-harness-line-limit-hook

from vamseeachanta/workspace-hub

Fix false-positive pre-commit failures where workspace-hub's AGENTS.md line-limit hook blocks edits to auto-generated wiki schema files under knowledge/wikis/.

repair-legacy-llm-wiki-frontmatter-dates

from vamseeachanta/workspace-hub

Diagnose and repair legacy llm-wiki source pages that have ingested timestamps but are missing required added/last_updated frontmatter dates.

parallel-llm-wiki-gap-to-issues

from vamseeachanta/workspace-hub

Use parallel subagents to mine remaining LLM-wiki/document-intelligence gaps, de-duplicate against existing GitHub issues, then create only the strongest bounded follow-on issues.

llm-wiki-ecosystem-gap-to-issues

from vamseeachanta/workspace-hub

Review the workspace-hub LLM-wiki/document-intelligence ecosystem, identify high-leverage gaps, and create grounded GitHub feature issues without duplicating existing work.