skill-forge

Ultimate Claude Code skill creator and architect. Designs, scaffolds, builds, reviews, evolves, and publishes production-grade Claude Code skills following the Agent Skills open standard and 3-layer architecture (directive, orchestration, execution). Handles single-file skills, multi-skill orchestrators with sub-skills and subagents, MCP-enhanced workflows, and full skill ecosystems. Industry detection for skill domain. Triggers on: "create skill", "build skill", "new skill", "skill creator", "skill builder", "skill-forge", "design skill", "scaffold skill", "review skill", "improve skill", "publish skill", "skill architecture", "convert skill", "port skill", "multi-platform", "cross-platform", "eval skill", "test skill", "benchmark skill", "skill evals", "measure skill", "skill performance", "skill A/B test".

39 stars

Best use case

skill-forge is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Teams using skill-forge should expect more consistent output, faster repeated execution, and less prompt rewriting.

When to use this skill

  • You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

  • You only need a quick one-off answer and do not need a reusable workflow.
  • You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$ curl -o ~/.claude/skills/skill-forge/SKILL.md --create-dirs "https://raw.githubusercontent.com/AgriciDaniel/skill-forge/main/skill-forge/SKILL.md"

Manual Installation

  1. Download SKILL.md from GitHub
  2. Place it in .claude/skills/skill-forge/SKILL.md inside your project
  3. Restart your AI agent — it will auto-discover the skill

How skill-forge Compares

| Feature / Agent | skill-forge | Standard Approach |
|-----------------|-------------|-------------------|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |

Frequently Asked Questions

Where can I find the source code?

The source code is on GitHub in the AgriciDaniel/skill-forge repository.

SKILL.md Source

# Skill Forge — Ultimate Claude Code Skill Creator

Build production-grade Claude Code skills following the Agent Skills open standard,
progressive disclosure architecture, and battle-tested patterns from high-performing
skills like claude-seo and claude-ads.

## Quick Reference

| Command | What it does |
|---------|-------------|
| `/skill-forge` | Interactive skill creation wizard |
| `/skill-forge plan <domain>` | Architecture and design planning |
| `/skill-forge build <name>` | Scaffold and build a skill from plan |
| `/skill-forge review <path>` | Audit an existing skill for quality |
| `/skill-forge evolve <path>` | Improve skill based on feedback/issues |
| `/skill-forge eval <path>` | Run eval pipeline to test skill quality |
| `/skill-forge benchmark <path>` | Benchmark skill with variance analysis |
| `/skill-forge publish <path>` | Package and prepare for distribution |
| `/skill-forge convert <path>` | Convert skill to Codex/Gemini/Antigravity/Cursor |

## Orchestration Logic

### Interactive Mode (`/skill-forge`)

Walk the user through the full skill creation lifecycle:

1. **Discovery**: Ask about the domain, use cases, and target users
2. **Architecture**: Determine skill complexity tier and design structure
3. **Build**: Generate all files following the chosen template
4. **Review**: Validate structure, frontmatter, triggers, and quality
5. **Eval**: Run eval pipeline with assertions and grading
6. **Benchmark**: Measure pass rate, time, tokens with variance analysis
7. **Iterate**: Refine based on eval results and feedback
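
The benchmark step (pass rate, time, and tokens with variance analysis) could be aggregated roughly as follows. This is a minimal sketch, not the skill's own benchmark code: the trial record keys (`passed`, `seconds`, `tokens`) and the function name are hypothetical.

```python
from statistics import mean, pstdev


def summarize_trials(trials):
    """Aggregate repeated trials of one eval into summary statistics.

    Each trial is a dict with hypothetical keys: 'passed' (bool),
    'seconds' (float), and 'tokens' (int).
    """
    if not trials:
        raise ValueError("at least one trial is required")
    # Encode pass/fail as 1.0/0.0 so the mean is the pass rate.
    passes = [1.0 if t["passed"] else 0.0 for t in trials]
    return {
        "pass_rate": mean(passes),
        # Population standard deviation across trials shows run-to-run variance.
        "pass_rate_stdev": pstdev(passes),
        "mean_seconds": mean(t["seconds"] for t in trials),
        "mean_tokens": mean(t["tokens"] for t in trials),
    }
```

A high `pass_rate_stdev` suggests the skill behaves inconsistently across runs, which is worth fixing before comparing versions.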

### Command Routing

For specific commands, load the relevant sub-skill:
- `/skill-forge plan` -> `skills/skill-forge-plan/SKILL.md`
- `/skill-forge build` -> `skills/skill-forge-build/SKILL.md`
- `/skill-forge review` -> `skills/skill-forge-review/SKILL.md`
- `/skill-forge evolve` -> `skills/skill-forge-evolve/SKILL.md`
- `/skill-forge eval` -> `skills/skill-forge-eval/SKILL.md`
- `/skill-forge benchmark` -> `skills/skill-forge-benchmark/SKILL.md`
- `/skill-forge publish` -> `skills/skill-forge-publish/SKILL.md`
- `/skill-forge convert` -> `skills/skill-forge-convert/SKILL.md`
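
In the skill itself this routing is carried out by Claude following the instructions above, not by code. Purely as an illustration of the mapping, a sketch in Python might look like this; the `route` function and the `"interactive"` return value are hypothetical.

```python
# Sub-commands listed in the routing table above.
SUB_SKILLS = {
    "plan", "build", "review", "evolve",
    "eval", "benchmark", "publish", "convert",
}


def route(command: str) -> str:
    """Map a /skill-forge command line to the sub-skill file to load."""
    parts = command.split()
    if not parts or parts[0] != "/skill-forge":
        raise ValueError("not a /skill-forge command")
    if len(parts) == 1:
        # Bare command: run the interactive wizard instead of a sub-skill.
        return "interactive"
    sub = parts[1]
    if sub not in SUB_SKILLS:
        raise ValueError(f"unknown sub-command: {sub}")
    return f"skills/skill-forge-{sub}/SKILL.md"
```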

## Skill Complexity Tiers

Detect the appropriate tier based on the user's description:

### Tier 1: Single Skill (1 SKILL.md)
- Simple workflow or document generation
- No sub-skills or subagents needed
- Under 200 lines of instructions
- **Template**: `assets/templates/minimal.md`

### Tier 2: Skill + Scripts (SKILL.md + scripts/)
- Needs deterministic execution (validation, data processing)
- Python/Bash scripts for fragile operations
- **Template**: `assets/templates/workflow.md`

### Tier 3: Multi-Skill Orchestrator (main + sub-skills)
- Complex domain with multiple distinct workflows
- Main skill routes to specialized sub-skills
- Shared references across sub-skills
- **Template**: `assets/templates/multi-skill.md`

### Tier 4: Full Ecosystem (orchestrator + sub-skills + agents + scripts)
- Enterprise-grade skill with parallel subagent delegation
- Multiple execution scripts for deterministic tasks
- Industry templates and reference knowledge
- **Template**: `assets/templates/ecosystem.md`
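
The tier choice is a judgment made from the user's description, but the decision order can be sketched as a simple rule cascade. The inputs below (`workflows`, `needs_scripts`, `needs_subagents`) are hypothetical simplifications of that description, and the function is an illustration, not part of skill-forge.

```python
# Template paths as listed for each tier above.
TEMPLATES = {
    1: "assets/templates/minimal.md",
    2: "assets/templates/workflow.md",
    3: "assets/templates/multi-skill.md",
    4: "assets/templates/ecosystem.md",
}


def pick_tier(workflows: int = 1,
              needs_scripts: bool = False,
              needs_subagents: bool = False) -> tuple[int, str]:
    """Return (tier, template path), checking the most demanding tier first."""
    if needs_subagents:
        tier = 4  # parallel subagent delegation implies a full ecosystem
    elif workflows > 1:
        tier = 3  # several distinct workflows need an orchestrator
    elif needs_scripts:
        tier = 2  # deterministic execution needs scripts/
    else:
        tier = 1  # a single SKILL.md is enough
    return tier, TEMPLATES[tier]
```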

## Core Principles (Enforce in ALL generated skills)

### 1. Progressive Disclosure (3 Levels)
- **Level 1 (frontmatter)**: Always in system prompt. Name + description only (~50-100 tokens)
- **Level 2 (SKILL.md body)**: Loaded on activation. Core instructions (<500 lines, <5000 tokens)
- **Level 3 (references/scripts/assets)**: Loaded on-demand. Detailed knowledge and execution

### 2. Description is King
The `description` field determines when the skill activates. It MUST contain:
- WHAT the skill does (capabilities)
- WHEN to use it (trigger phrases users would say)
- Key domain keywords for matching

Read `references/description-guide.md` for the complete framework.
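
As a rough illustration of the WHAT + WHEN structure, a description can be assembled from a capability summary and a list of trigger phrases. The helper below is a hypothetical sketch; the "Use when user says" wording mirrors the sub-skill descriptions in this project, and the 1024-character limit comes from the Quality Gates.

```python
def build_description(what: str, triggers: list[str], limit: int = 1024) -> str:
    """Join a capability summary (WHAT) with trigger phrases (WHEN)."""
    if not what.strip() or not triggers:
        raise ValueError("both a capability summary and trigger phrases are required")
    # Quote each phrase so it reads as something a user would literally say.
    quoted = ", ".join(f'"{t}"' for t in triggers)
    description = f"{what.strip()} Use when user says {quoted}."
    if len(description) >= limit:
        raise ValueError(f"description must be under {limit} characters")
    return description
```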

### 3. The 3-Layer Architecture
- **Layer 1 (Directive)**: SKILL.md instructions, reference files = the "what"
- **Layer 2 (Orchestration)**: Claude's routing and decision-making = the "how"
- **Layer 3 (Execution)**: Scripts in scripts/ = the "do"

Push deterministic work into scripts. Keep probabilistic decisions in instructions.

### 4. Naming Conventions
- Skill folder: `kebab-case` (lowercase + hyphens only)
- Name field must match folder name exactly
- Sub-skills: `{parent}-{child}` (e.g., `seo-audit`, `ads-google`)
- Agents: `agents/{skill}-{role}.md` (e.g., `agents/seo-technical.md`)
- No "claude" or "anthropic" in skill names (reserved)
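
These naming rules are mechanical enough to check in code. The sketch below is one possible check, not the project's validator: `check_name` is a hypothetical helper, it assumes digits are acceptable in kebab-case names, and it applies the 1-64 character limit from the Quality Gates.

```python
import re

# Lowercase alphanumeric words joined by single hyphens.
# Allowing digits is an assumption; the rule above says "lowercase + hyphens only".
KEBAB = re.compile(r"[a-z0-9]+(?:-[a-z0-9]+)*")
RESERVED = ("claude", "anthropic")


def check_name(name: str, folder: str) -> list[str]:
    """Return a list of naming problems; an empty list means the name passes."""
    problems = []
    if not KEBAB.fullmatch(name):
        problems.append("name is not kebab-case")
    if not 1 <= len(name) <= 64:
        problems.append("name must be 1-64 characters")
    if name != folder:
        problems.append("name does not match folder name")
    if any(word in name.lower() for word in RESERVED):
        problems.append("name contains a reserved word")
    return problems
```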

### 5. File Rules
- Required: `SKILL.md` (exact case)
- No `README.md` inside skill folders
- No XML angle brackets in frontmatter
- Reference files: focused, small, loaded on-demand
- Scripts: atomic, testable, well-documented

## Quality Gates

Before marking any generated skill as complete:
- [ ] SKILL.md exists with valid YAML frontmatter
- [ ] Name is valid kebab-case (1-64 chars)
- [ ] Description includes WHAT + WHEN + keywords (<1024 chars)
- [ ] No XML tags in frontmatter
- [ ] Instructions are specific and actionable (not vague)
- [ ] Error handling included for common failures
- [ ] Examples provided for key workflows
- [ ] SKILL.md body under 500 lines
- [ ] Reference files linked (not inlined) for detailed knowledge
- [ ] Scripts have docstrings, type hints, error handling

Run `python scripts/validate_skill.py <path>` to verify programmatically.
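
For readers who want to see what such checks involve, here is a minimal sketch covering a few of the gates above. It is not the contents of `scripts/validate_skill.py`: the function names are hypothetical, and it assumes flat, single-line `key: value` frontmatter rather than full YAML.

```python
def split_frontmatter(text: str) -> tuple[dict[str, str], str]:
    """Split SKILL.md text into a flat frontmatter dict and the body."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("missing opening '---' frontmatter delimiter")
    try:
        end = next(i for i in range(1, len(lines)) if lines[i].strip() == "---")
    except StopIteration:
        raise ValueError("missing closing '---' frontmatter delimiter") from None
    fields = {}
    for line in lines[1:end]:
        # Assumption: one 'key: value' pair per line, no nested YAML.
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields, "\n".join(lines[end + 1:])


def check_skill_md(text: str) -> list[str]:
    """Return quality-gate failures; an empty list means these gates pass."""
    try:
        fields, body = split_frontmatter(text)
    except ValueError as err:
        return [str(err)]
    problems = []
    for key in ("name", "description"):
        if not fields.get(key):
            problems.append(f"frontmatter is missing '{key}'")
    if len(fields.get("description", "")) >= 1024:
        problems.append("description must be under 1024 characters")
    if any("<" in v or ">" in v for v in fields.values()):
        problems.append("frontmatter contains angle brackets")
    if len(body.splitlines()) >= 500:
        problems.append("body must be under 500 lines")
    return problems
```

Gates that need judgment, such as whether instructions are specific and actionable, are left to the review step rather than a script.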

## Reference Files

Load on-demand as needed -- do NOT load all at startup:
- `references/anatomy.md` -- Skill file structure, naming rules, agent format
- `references/patterns.md` -- Proven workflow patterns with examples
- `references/frontmatter-spec.md` -- YAML frontmatter specification (skills)
- `references/description-guide.md` -- Writing trigger-optimized descriptions
- `references/testing-guide.md` -- Testing methodology and checklist
- `references/pro-agent.md` -- 3-layer architecture deep dive
- `references/tools-reference.md` -- All tool names, permission patterns, MCP
- `references/hooks-reference.md` -- Hook events, types, quality gate patterns
- `references/skills-activation.md` -- Skill discovery, activation, advanced features
- `references/platforms.md` -- Platform specs and conversion rules

## Sub-Skills

This skill orchestrates 8 specialized sub-skills:

1. **skill-forge-plan** -- Architecture design and use case planning
2. **skill-forge-build** -- Scaffold and generate skill files
3. **skill-forge-review** -- Audit and validate existing skills
4. **skill-forge-evolve** -- Improve skills based on feedback
5. **skill-forge-eval** -- Run eval pipeline with assertions and grading
6. **skill-forge-benchmark** -- Benchmark performance with variance analysis
7. **skill-forge-publish** -- Package and prepare for distribution
8. **skill-forge-convert** -- Convert skills for Codex, Gemini CLI, Antigravity, Cursor

Related Skills

skill-forge-review

39
from AgriciDaniel/skill-forge

Audit and validate existing Claude Code skills for quality, triggering accuracy, structure compliance, and best practices. Scores skills on a 0-100 scale and provides prioritized improvement recommendations. Use when user says "review skill", "audit skill", "check skill", "validate skill", or "skill quality".

skill-forge-publish

39
from AgriciDaniel/skill-forge

Package and distribute Claude Code skills for sharing via GitHub, Claude.ai uploads, or team deployment. Creates install scripts, documentation, and .skill packages. Use when user says "publish skill", "share skill", "package skill", "distribute skill", or "release skill".

skill-forge-plan

39
from AgriciDaniel/skill-forge

Architecture and design planning for new Claude Code skills. Guides through use case definition, complexity tier selection, sub-skill decomposition, and file structure planning. Use when user says "plan skill", "design skill", "skill architecture", or "skill planning".

skill-forge-evolve

39
from AgriciDaniel/skill-forge

Improve and iterate on existing Claude Code skills based on usage feedback, test results, or changing requirements. Handles under/over-triggering fixes, instruction refinement, new sub-skill addition, and architecture evolution. Use when user says "improve skill", "fix skill", "skill not triggering", "skill triggers too much", "update skill", or "evolve skill".

skill-forge-eval

39
from AgriciDaniel/skill-forge

Run evaluation pipelines on Claude Code skills to test triggering accuracy, workflow correctness, and output quality. Spawns executor, grader, comparator, and analyzer sub-agents for parallel evaluation. Generates eval_metadata.json, grading.json, and feedback reports. Use when user says "eval skill", "test skill", "run evals", "evaluate skill", "skill evals", "test skill quality", "run skill tests", or "skill evaluation".

skill-forge-convert

39
from AgriciDaniel/skill-forge

Convert Claude Code skills to work on OpenAI Codex, Google Gemini CLI, Google Antigravity, and Cursor. Analyzes platform-specific features, generates target files (openai.yaml, AGENTS.md, GEMINI.md, .mdc rules), adapts frontmatter, converts MCP config, and produces compatibility reports. Use when user says "convert skill", "port skill", "multi-platform", "skill for codex", "skill for gemini", "skill for antigravity", "skill for cursor", "cross-platform skill", "convert to codex", "convert to gemini", "convert to antigravity", or "convert to cursor".

skill-forge-build

39
from AgriciDaniel/skill-forge

Scaffold and build Claude Code skills from plans or descriptions. Generates SKILL.md files, sub-skills, scripts, references, agents, and templates following the Agent Skills standard. Use when user says "build skill", "scaffold skill", "generate skill", "create SKILL.md", or "implement skill".

skill-forge-benchmark

39
from AgriciDaniel/skill-forge

Benchmark Claude Code skill performance with variance analysis, tracking pass rate, execution time, and token usage across iterations. Runs multiple trials per eval for statistical reliability, aggregates results into benchmark.json, and generates comparison reports between skill versions. Use when user says "benchmark skill", "measure skill performance", "skill metrics", "compare skill versions", "skill performance", "track skill improvement", "skill regression test", or "skill A/B test".

plugin-forge

24269
from davila7/claude-code-templates

Create and manage Claude Code plugins with proper structure, manifests, and marketplace integration. Use when creating plugins for a marketplace, adding plugin components (commands, agents, hooks), bumping plugin versions, or working with plugin.json/marketplace.json manifests.

torchforge-rl-training

24269
from davila7/claude-code-templates

Provides guidance for PyTorch-native agentic RL using torchforge, Meta's library separating infra from algorithms. Use when you want clean RL abstractions, easy algorithm experimentation, or scalable training with Monarch and TorchTitan.

exploiting-server-side-request-forgery

4032
from mukul975/Anthropic-Cybersecurity-Skills

Identifying and exploiting SSRF vulnerabilities to access internal services, cloud metadata, and restricted network resources during authorized penetration tests.

detecting-golden-ticket-forgery

4032
from mukul975/Anthropic-Cybersecurity-Skills

Detect Kerberos Golden Ticket forgery by analyzing Windows Event ID 4769 for RC4 encryption downgrades (0x17), abnormal ticket lifetimes, and krbtgt account anomalies in Splunk and Elastic SIEM.