building-github-index

Generate progressive disclosure indexes for GitHub repositories to use as Claude project knowledge. Use when setting up projects referencing external documentation, creating searchable indexes of technical blogs or knowledge bases, combining multiple repos into one index, or when user mentions "index", "github repo", "project knowledge", or "documentation reference".

16 stars

Best use case

building-github-index is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Teams using building-github-index should expect more consistent output, faster repeated execution, and less prompt rewriting.

When to use this skill

  • You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

  • You only need a quick one-off answer and do not need a reusable workflow.
  • You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$ curl -o ~/.claude/skills/building-github-index/SKILL.md --create-dirs "https://raw.githubusercontent.com/diegosouzapw/awesome-omni-skill/main/skills/tools/building-github-index/SKILL.md"

Manual Installation

  1. Download SKILL.md from GitHub
  2. Place it in .claude/skills/building-github-index/SKILL.md inside your project
  3. Restart your AI agent — it will auto-discover the skill

How building-github-index Compares

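| Feature / Agent | building-github-index | Standard Approach |
|-----------------|-----------------------|-------------------|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |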

SKILL.md Source

# Building GitHub Index

Create markdown indexes of GitHub repositories optimized for Claude project knowledge. Indexes enable retrieval via GitHub API with semantic descriptions for effective matching.

## Quick Start

```bash
# Documentation repos (markdown/notebooks)
python scripts/github_index.py owner/repo -o index.md

# Code repos (extract symbols via tree-sitter)
python scripts/github_index.py owner/repo --code-symbols -o index.md

# Multiple repos combined
python scripts/github_index.py owner/repo1 owner/repo2 -o combined.md
```

## Script Options

| Flag | Description |
|------|-------------|
| `-o, --output` | Output file (default: `github_index.md`) |
| `--token` | GitHub PAT; also reads `GITHUB_TOKEN` env |
| `--include-patterns` | Only index matching globs: `"docs/**" "src/**"` |
| `--exclude-patterns` | Skip matching globs: `"test/**"` |
| `--max-files` | Cap files per repo (default: 200) |
| `--skip-fetch` | Tree only, no content fetch (fast, filename-only descriptions) |
| `--code-symbols` | Include code files, extract function/class names via tree-sitter |
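
As a rough illustration of how the include, exclude, and max-files options could interact, here is a minimal sketch. The `select_paths` helper and its matching rules are assumptions made for this example; the real script may order or match things differently.

```python
from fnmatch import fnmatchcase


def select_paths(paths, include=None, exclude=None, max_files=200):
    """Filter repo paths by glob patterns, then cap the result.

    Hypothetical helper: the real script's matching rules may differ.
    """
    selected = []
    for path in sorted(paths):
        # With include patterns, a path must match at least one of them.
        if include and not any(fnmatchcase(path, pat) for pat in include):
            continue
        # Any matching exclude pattern drops the path.
        if exclude and any(fnmatchcase(path, pat) for pat in exclude):
            continue
        selected.append(path)
    # Apply the per-repo cap last, after filtering.
    return selected[:max_files]


paths = [
    "README.md",
    "docs/guide/intro.md",
    "docs/api.md",
    "test/test_api.py",
    "src/main.py",
]
print(select_paths(paths, include=["docs/**", "src/**"], exclude=["test/**"]))
```

Note that `fnmatchcase` lets `*` match across `/`, so `docs/**` also covers nested directories in this sketch.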

## Description Extraction Priority

1. **YAML frontmatter** - `title:` and `description:` fields
2. **Markdown headings** - First h1/h2 as title, subsequent as topics
3. **Notebook cells** - First markdown cell heading
4. **Code symbols** - Public function/class names (with `--code-symbols`)
5. **Path-derived** - Convert filename to words (fallback)
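
The priority order above can be read as a fallback chain. The sketch below covers only steps 1, 2, and 5 (frontmatter, headings, path-derived) and uses simple regular expressions rather than a YAML parser; `describe` is a hypothetical name, not the script's actual function.

```python
import re
from pathlib import PurePosixPath


def describe(path, text=""):
    """Return a description via a frontmatter > heading > path fallback chain."""
    # 1. YAML frontmatter: title:/description: between leading '---' fences.
    fm = re.match(r"^---[ \t]*\n(.*?)\n---[ \t]*(?:\n|$)", text, re.DOTALL)
    if fm:
        fields = dict(
            re.findall(
                r"^(title|description):[ \t]*(.+?)[ \t]*$",
                fm.group(1),
                re.MULTILINE,
            )
        )
        parts = [fields[k].strip("\"'") for k in ("title", "description") if k in fields]
        if parts:
            return ": ".join(parts)
    # 2. First h1/h2 heading in the text.
    heading = re.search(r"^#{1,2}[ \t]+(.+?)[ \t]*$", text, re.MULTILINE)
    if heading:
        return heading.group(1)
    # 5. Fallback: convert the filename into words.
    stem = PurePosixPath(path).stem
    return re.sub(r"[-_]+", " ", stem).strip().capitalize()


print(describe("docs/getting_started-guide.md"))
```

A heading-like line inside a fenced code block would also match step 2 here; a fuller implementation would likely skip fenced regions first.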

## When Descriptions Fail

Some repos have stub files (links to external docs, empty readmes). In these cases:

**Manual curation recommended.** Use the tree output and domain knowledge:

```bash
# Get tree structure only (fast)
python scripts/github_index.py owner/repo --skip-fetch -o skeleton.md
# Then manually enhance descriptions based on domain knowledge
```

For code-heavy repos with embedded apps:
- Directory names encode purpose: `acc_wav_gen` → "ACC waveform generation"
- Peripheral acronyms map to functions: AFEC=ADC, MCAN=CAN, TWIHS=I2C
- Operation modes: blocking, interrupt, dma, polled
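
To make these naming hints concrete, here is a small sketch that expands a directory name into a description. The lookup tables contain only the examples listed above, and `describe_dir` is a hypothetical helper; real curation would need a table built from domain knowledge.

```python
# Illustrative lookup tables built from the examples above (not exhaustive).
PERIPHERALS = {"afec": "ADC", "mcan": "CAN", "twihs": "I2C"}
ABBREVIATIONS = {"wav": "waveform", "gen": "generation"}
MODES = {"blocking", "interrupt", "dma", "polled"}


def describe_dir(name):
    """Turn a directory name such as 'acc_wav_gen' into a readable description."""
    words = []
    for i, token in enumerate(name.lower().split("_")):
        if token in PERIPHERALS:
            # Show the acronym together with the function it maps to.
            words.append(f"{token.upper()} ({PERIPHERALS[token]})")
        elif token in ABBREVIATIONS:
            words.append(ABBREVIATIONS[token])
        elif token in MODES:
            words.append(f"{token} mode")
        elif i == 0:
            # Assumption: an unknown leading token is a peripheral acronym.
            words.append(token.upper())
        else:
            words.append(token)
    return " ".join(words)


print(describe_dir("acc_wav_gen"))
print(describe_dir("afec_dma"))
```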

## Output Format

```markdown
# {Repo} - Content Index

**Repository:** {url}
**Branch:** `{branch}`

## Retrieval Method
{API curl commands}

---

## {Category}

| Description | Path |
|-------------|------|
| {What this covers} | `{path/file.md}` |
```

Description column leads (relevance matching), path follows (retrieval key).
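
A minimal sketch of rendering this layout, assuming the entries are already grouped by category. `render_index` is a hypothetical function, and the Retrieval Method section is omitted for brevity.

```python
def render_index(repo, url, branch, categories):
    """Render {category: [(description, path), ...]} as a markdown index."""
    lines = [
        f"# {repo} - Content Index",
        "",
        f"**Repository:** {url}",
        f"**Branch:** `{branch}`",
        "",
        "---",
    ]
    for category, rows in categories.items():
        lines += [
            "",
            f"## {category}",
            "",
            "| Description | Path |",
            "|-------------|------|",
        ]
        for description, path in rows:
            # Escape pipes so a description cannot break the table row.
            safe = description.replace("|", "\\|")
            lines.append(f"| {safe} | `{path}` |")
    return "\n".join(lines) + "\n"


print(
    render_index(
        "owner/repo",
        "https://github.com/owner/repo",
        "main",
        {"Guides": [("Install and configure", "docs/setup.md")]},
    )
)
```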

## API Access

Enumerate files:
```bash
curl -sL "https://api.github.com/repos/OWNER/REPO/git/trees/BRANCH?recursive=1"
```

Fetch content:
```bash
curl -s "https://api.github.com/repos/OWNER/REPO/contents/PATH?ref=BRANCH" \
  -H "Accept: application/vnd.github+json" | \
  python3 -c "import sys,json,base64; print(base64.b64decode(json.load(sys.stdin)['content']).decode())"
```
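
For readers who prefer Python over the shell one-liner, the sketch below shows how the two JSON responses might be handled once fetched. It works on already-parsed response bodies, so no network call is made; the sample payloads are trimmed, made-up examples of the response shape, and the helper names are hypothetical.

```python
import base64


def blob_paths(tree_response):
    """Return file paths from a parsed git/trees?recursive=1 response."""
    # Entries of type "tree" are directories; "blob" entries are files.
    return [
        entry["path"]
        for entry in tree_response.get("tree", [])
        if entry.get("type") == "blob"
    ]


def decode_content(contents_response):
    """Decode the base64 'content' field of a parsed contents response."""
    if contents_response.get("encoding") != "base64":
        raise ValueError("unexpected encoding; content is not inline base64")
    # b64decode ignores the newlines GitHub inserts into the encoded text.
    return base64.b64decode(contents_response["content"]).decode("utf-8")


# Trimmed sample payloads for illustration only.
sample_tree = {
    "tree": [
        {"path": "docs", "type": "tree"},
        {"path": "docs/intro.md", "type": "blob"},
    ],
    "truncated": False,
}
sample_file = {
    "encoding": "base64",
    "content": base64.b64encode(b"# Intro\n").decode("ascii"),
}

print(blob_paths(sample_tree))
print(decode_content(sample_file))
```

For large repositories, the tree response can set `truncated` to true, in which case the listing is incomplete.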

## Network

Both scripts (`github_index.py` and `pk_index.py`) download a repo tarball in a single HTTP request, which avoids per-file rate limits, then process the files locally. Allowlist: `api.github.com` (the tarball download redirects via this endpoint).
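
The local processing step might look roughly like the sketch below, which walks a gzipped tarball held in memory. It builds a tiny demo archive instead of downloading one; `iter_text_files` and `make_demo_tarball` are hypothetical names, and the actual scripts may work differently.

```python
import io
import tarfile


def iter_text_files(tar_bytes, suffixes=(".md", ".ipynb")):
    """Yield (repo-relative path, text) for matching files in a gzipped tarball."""
    with tarfile.open(fileobj=io.BytesIO(tar_bytes), mode="r:gz") as archive:
        for member in archive:
            if not member.isfile() or not member.name.endswith(suffixes):
                continue
            # GitHub tarballs wrap everything in one top-level directory; drop it.
            _, _, rel_path = member.name.partition("/")
            data = archive.extractfile(member).read()
            yield rel_path, data.decode("utf-8", errors="replace")


def make_demo_tarball():
    """Build a small in-memory archive that mimics the tarball layout."""
    files = [
        ("owner-repo-abc123/docs/intro.md", b"# Intro\n"),
        ("owner-repo-abc123/logo.png", b"\x89PNG"),
    ]
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:gz") as archive:
        for name, body in files:
            info = tarfile.TarInfo(name)
            info.size = len(body)
            archive.addfile(info, io.BytesIO(body))
    return buf.getvalue()


print(list(iter_text_files(make_demo_tarball())))
```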

## Related Skills

- `accessing-github-repos` - Private repos, PAT setup, tarball download
- `mapping-codebases` - Detailed code structure (methods, imports, line numbers)

## Condensed Format (pk_index.py)

For token-constrained project knowledge, use the condensed script:

```bash
python scripts/pk_index.py owner/repo -o repo_pk.md
```

Produces ~80% smaller output:
- Single line per file: `path` — description
- Symbols only (no signatures)
- 15 files max per category
- No retrieval instructions section

Ideal when adding multiple repo indexes to project knowledge.
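
The condensed layout could be produced along these lines. This is a sketch under assumptions: `condensed_index` is a hypothetical function, and the `## category` heading style is a guess, since the condensed output is not shown in full here.

```python
SEPARATOR = " \u2014 "  # em dash, matching the line format listed above


def condensed_index(categories, max_per_category=15):
    """Render one '`path` <dash> description' line per file, capped per category."""
    lines = []
    for category, rows in categories.items():
        lines.append(f"## {category}")
        # Keep only the first max_per_category entries of each category.
        for path, description in rows[:max_per_category]:
            lines.append(f"`{path}`{SEPARATOR}{description}")
        lines.append("")
    return "\n".join(lines)


rows = [(f"docs/f{i}.md", f"Topic {i}") for i in range(20)]
print(condensed_index({"Docs": rows}))
```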
