html-structure-validate

Validate HTML5 structure and basic syntax. BLOCKING quality gate - stops pipeline if validation fails. Ensures deterministic output quality.

by ComeOnOliver

Best use case

html-structure-validate is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Teams using html-structure-validate can expect more consistent output, faster repeated execution, and less prompt rewriting.

When to use this skill

You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

You only need a quick one-off answer and do not need a reusable workflow.
You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$ curl -o ~/.claude/skills/html-structure-validate/SKILL.md --create-dirs "https://raw.githubusercontent.com/ComeOnOliver/skillshub/main/skills/aiskillstore/marketplace/abejitsu/html-structure-validate/SKILL.md"

Manual Installation

Download SKILL.md from GitHub
Place it in .claude/skills/html-structure-validate/SKILL.md inside your project
Restart your AI agent — it will auto-discover the skill

How html-structure-validate Compares

Feature / Agent	html-structure-validate	Standard Approach
Platform Support	Not specified	Limited / Varies
Context Awareness	High	Baseline
Installation Complexity	Unknown	N/A

Frequently Asked Questions

What does this skill do?

Validate HTML5 structure and basic syntax. BLOCKING quality gate - stops pipeline if validation fails. Ensures deterministic output quality.

Where can I find the source code?

The source code is in the ComeOnOliver/skillshub repository on GitHub; the full path to SKILL.md appears in the URL in the Installation section.

SKILL.md Source

# HTML Structure Validate Skill

## Purpose

This skill is a **BLOCKING quality gate** that ensures generated HTML meets minimum structural requirements. It is the **first deterministic validation** of probabilistic AI-generated output.

The skill checks:
- **HTML5 compliance** - Proper DOCTYPE, tags
- **Tag closure** - All tags properly closed
- **Required elements** - Meta tags, stylesheet links
- **Well-formedness** - Valid structure

If validation fails, the pipeline **STOPS** and triggers a hook to notify the user.

This enforces the principle: **Python validates, ensuring deterministic quality**.

## What to Do

1. **Load HTML file to validate**
   - Read `04_page_XX.html` generated by AI skill
   - Verify file exists and is readable
   - Confirm file is text (not binary)

2. **Run validation checks**
   - Check HTML5 structure compliance
   - Verify tag closure
   - Validate head section
   - Check required CSS link
   - Validate page container structure

3. **Generate validation report**
   - Document all checks performed
   - List any errors found
   - Note warnings (non-blocking)
   - Record informational findings

4. **Save validation report** as JSON
   - Save to: `output/chapter_XX/page_artifacts/page_YY/06_validation_structure.json`
   - Include timestamp
   - Include all check results

5. **Exit with appropriate code**
   - Return 0 if VALID (continue pipeline)
   - Return 1 if INVALID (STOP pipeline, trigger hook)
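
The five steps above can be sketched end to end as follows. This is a minimal illustration, not the actual `validate_html.py`: the function names `run_gate` and `check_not_empty` are hypothetical, the single placeholder check stands in for the real checks described later, and the report contains only a subset of the documented fields.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def check_not_empty(html: str) -> list[str]:
    """Placeholder check (hypothetical): an empty document is an error."""
    return [] if html.strip() else ["HTML file is empty"]


def run_gate(html_file: str, output_dir: str, checks=(check_not_empty,)) -> int:
    """Load the HTML, run each check, save the report, return an exit code."""
    path = Path(html_file)
    errors: list[str] = []
    try:
        # Step 1: a missing file or a decode failure counts as "not readable text".
        html = path.read_text(encoding="utf-8")
    except (OSError, UnicodeDecodeError) as exc:
        errors.append(f"Cannot read {path.name}: {exc}")
    else:
        # Step 2: every check returns a list of error messages.
        for check in checks:
            errors.extend(check(html))

    # Steps 3 and 4: build the report and save it as JSON.
    report = {
        "validation_type": "structure",
        "validation_timestamp": datetime.now(timezone.utc).isoformat(),
        "overall_status": "FAIL" if errors else "PASS",
        "error_count": len(errors),
        "errors": errors,
    }
    out = Path(output_dir)
    out.mkdir(parents=True, exist_ok=True)
    (out / "06_validation_structure.json").write_text(
        json.dumps(report, indent=2), encoding="utf-8"
    )

    # Step 5: 0 lets the pipeline continue, 1 stops it.
    return 1 if errors else 0
```

A caller would pass the return value to `sys.exit()` so that the surrounding pipeline can react to the exit code.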

## Input Parameters

```
html_file: <str>         - Path to 04_page_XX.html
output_dir: <str>        - Directory for validation report
strict_mode: <bool>      - If true, warnings also fail (default: false)
page_number: <int>       - Page number (for reporting)
chapter: <int>           - Chapter number (for reporting)
```

## Validation Checks

### Check 1: DOCTYPE Declaration

**Requirement**: File must start with proper DOCTYPE
```html
<!DOCTYPE html>
```

**Check**:
- [ ] File contains `<!DOCTYPE html>` (case-insensitive)
- [ ] DOCTYPE appears before any tags
- [ ] DOCTYPE is on first line or near beginning

**Error if**: Missing or incorrect DOCTYPE
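
One way to express this check is with two regular expressions. This is a sketch under the assumption that only the short HTML5 form `<!DOCTYPE html>` is accepted; the name `check_doctype` is illustrative and not taken from `validate_html.py`.

```python
import re

# Only the short HTML5 form is accepted; legacy DOCTYPEs with a public ID fail.
DOCTYPE_RE = re.compile(r"<!DOCTYPE\s+html\s*>", re.IGNORECASE)
# Any element start tag; "<!" (DOCTYPE, comments) does not match.
TAG_RE = re.compile(r"<[a-zA-Z]")


def check_doctype(html: str) -> list[str]:
    """Return error messages; an empty list means the check passed."""
    match = DOCTYPE_RE.search(html)
    if match is None:
        return ["Missing or incorrect DOCTYPE (expected <!DOCTYPE html>)"]
    first_tag = TAG_RE.search(html)
    if first_tag is not None and first_tag.start() < match.start():
        return ["DOCTYPE must appear before any tags"]
    return []
```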

### Check 2: HTML Tags

**Requirement**: Proper `<html>` opening and closing tags
```html
<html lang="en">
    ...
</html>
```

**Checks**:
- [ ] `<html>` tag present
- [ ] `</html>` closing tag present
- [ ] Tags are properly paired
- [ ] No unclosed `<html>` tags

**Error if**: Missing either tag or improperly paired

### Check 3: Head Section

**Requirement**: Complete `<head>` section with metadata
```html
<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>...</title>
    <link rel="stylesheet" href="../../styles/main.css">
</head>
```

**Checks**:
- [ ] `<head>` and `</head>` tags present
- [ ] `<meta charset="UTF-8">` present
- [ ] `<meta name="viewport">` present (warning if missing)
- [ ] `<title>` tag with content present
- [ ] CSS `<link>` tag present with href attribute

**Error if**: Missing charset, title, or CSS link
**Warning if**: Missing viewport meta tag
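
The head checks could be implemented with the standard library's `html.parser`. The sketch below is one possible approach; `HeadInspector` and `check_head` are hypothetical names. A missing `<head>` tag is not reported separately here and simply shows up as three missing-element errors.

```python
from html.parser import HTMLParser


class HeadInspector(HTMLParser):
    """Collects the facts Check 3 needs while walking <head>."""

    def __init__(self) -> None:
        super().__init__()
        self.in_head = False
        self.in_title = False
        self.has_charset = False
        self.has_viewport = False
        self.has_stylesheet = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag == "head":
            self.in_head = True
        elif not self.in_head:
            return  # elements outside <head> do not count
        elif tag == "meta" and (attr.get("charset") or "").lower() == "utf-8":
            self.has_charset = True
        elif tag == "meta" and attr.get("name") == "viewport":
            self.has_viewport = True
        elif tag == "title":
            self.in_title = True
        elif tag == "link" and attr.get("rel") == "stylesheet" and attr.get("href"):
            self.has_stylesheet = True

    def handle_endtag(self, tag):
        if tag == "head":
            self.in_head = False
        elif tag == "title":
            self.in_title = False

    def handle_data(self, data):
        if self.in_title:
            self.title += data


def check_head(html: str) -> tuple[list[str], list[str]]:
    """Return (errors, warnings) for the head section."""
    inspector = HeadInspector()
    inspector.feed(html)
    inspector.close()
    errors, warnings = [], []
    if not inspector.has_charset:
        errors.append('Missing <meta charset="UTF-8">')
    if not inspector.title.strip():
        errors.append("Missing or empty <title>")
    if not inspector.has_stylesheet:
        errors.append("Missing CSS <link> with href")
    if not inspector.has_viewport:
        warnings.append("Missing viewport meta tag")
    return errors, warnings
```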

### Check 4: Body Section

**Requirement**: Proper `<body>` tags with content
```html
<body>
    <div class="page-container">
        <main class="page-content">
            ...
        </main>
    </div>
</body>
```

**Checks**:
- [ ] `<body>` and `</body>` tags present
- [ ] `<div class="page-container">` present
- [ ] `<main class="page-content">` present inside container
- [ ] Body contains substantial content (> 100 bytes)

**Error if**: Missing tags or required container divs
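
A sketch of the container checks, again using `html.parser`. The names are illustrative, and the 100-byte size check is left out to keep the example short. Nesting is tracked by counting open `<div>` elements once the page container has been seen.

```python
from html.parser import HTMLParser


class BodyInspector(HTMLParser):
    """Tracks <body>, div.page-container, and main.page-content nesting."""

    def __init__(self) -> None:
        super().__init__()
        self.body_open = False
        self.body_closed = False
        self.has_container = False
        self.main_in_container = False
        self.open_divs = 0  # <div> depth, counted from the page container

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if tag == "body":
            self.body_open = True
        elif tag == "div" and "page-container" in classes:
            self.has_container = True
            self.open_divs = 1
        elif tag == "div" and self.open_divs:
            self.open_divs += 1
        elif tag == "main" and "page-content" in classes and self.open_divs:
            self.main_in_container = True

    def handle_endtag(self, tag):
        if tag == "body":
            self.body_closed = True
        elif tag == "div" and self.open_divs:
            self.open_divs -= 1


def check_body(html: str) -> list[str]:
    """Return error messages; an empty list means the check passed."""
    inspector = BodyInspector()
    inspector.feed(html)
    inspector.close()
    errors = []
    if not (inspector.body_open and inspector.body_closed):
        errors.append("Missing <body> or </body>")
    if not inspector.has_container:
        errors.append('Missing <div class="page-container">')
    if not inspector.main_in_container:
        errors.append('Missing <main class="page-content"> inside the page container')
    return errors
```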

### Check 5: Tag Closure Validation

**Requirement**: All tags must be properly closed

**Checks for**:
- Unmatched opening tags (e.g., `<p>` without `</p>`)
- Improper nesting (e.g., `<p><h2>text</h2></p>`)
- Void elements used correctly (e.g., `<br>`, `<img>`; the trailing slash in `<br/>` is optional in HTML5 and these elements never take a closing tag)
- Comment blocks properly formatted (`<!-- -->`)

**Validation method**:
- Parse HTML into tree structure
- Verify all nodes properly matched
- Check nesting doesn't violate HTML5 rules

**Error if**: Any unmatched or improperly nested tags
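
The matching step can be illustrated with a stack: push each opening tag, pop on the matching closing tag, and report whatever does not line up. The sketch below assumes the strict rule stated above (every non-void tag must be explicitly closed), which is stricter than HTML5 itself, where end tags such as `</p>` and `</li>` are optional. It does not check content-model rules such as `<h2>` inside `<p>`. The class and function names are hypothetical.

```python
from html.parser import HTMLParser

# Void elements never take a closing tag in HTML5.
VOID = {"area", "base", "br", "col", "embed", "hr", "img", "input",
        "link", "meta", "source", "track", "wbr"}


class TagBalanceChecker(HTMLParser):
    def __init__(self) -> None:
        super().__init__()
        self.stack: list[tuple[str, int]] = []  # (tag, line number)
        self.errors: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag not in VOID:
            self.stack.append((tag, self.getpos()[0]))

    def handle_endtag(self, tag):
        if tag in VOID:
            return
        if self.stack and self.stack[-1][0] == tag:
            self.stack.pop()
            return
        line = self.getpos()[0]
        if tag in [open_tag for open_tag, _ in self.stack]:
            # Everything opened after the matching tag was left unclosed.
            while self.stack[-1][0] != tag:
                bad, bad_line = self.stack.pop()
                self.errors.append(
                    f"<{bad}> opened on line {bad_line} is not closed "
                    f"before </{tag}> on line {line}"
                )
            self.stack.pop()
        else:
            self.errors.append(f"Unexpected </{tag}> on line {line}")


def check_tag_closure(html: str) -> list[str]:
    """Return error messages; an empty list means all tags are balanced."""
    checker = TagBalanceChecker()
    checker.feed(html)
    checker.close()
    errors = list(checker.errors)
    errors += [f"<{tag}> opened on line {line} is never closed"
               for tag, line in checker.stack]
    return errors
```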

### Check 6: Heading Tags (h1-h6)

**Requirement**: Valid heading hierarchy
```html
<h1>Chapter Title</h1>
<h2>Section Heading</h2>
<h3>Subsection</h3>
```

**Checks**:
- [ ] All heading tags properly closed
- [ ] First heading should be h1 (warning if not)
- [ ] Heading levels don't skip dramatically (h1 → h4 is suspicious)
- [ ] All headings have text content (not empty)

**Error if**: Heading tags improperly closed
**Warning if**: Suspicious hierarchy
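
The hierarchy warnings might be computed as below. Two things here are assumptions, because the text above does not pin them down: the `max_jump` threshold for what counts as a dramatic skip, and treating an empty heading as a warning rather than an error. Unclosed headings are not detected by this regex and are left to the tag closure check.

```python
import re

# Matches a complete heading; the closing tag must use the same level.
HEADING_RE = re.compile(r"<h([1-6])\b[^>]*>(.*?)</h\1\s*>", re.IGNORECASE | re.DOTALL)
INNER_TAG_RE = re.compile(r"<[^>]+>")


def check_headings(html: str, max_jump: int = 1) -> list[str]:
    """Return warnings about heading order and empty headings."""
    warnings = []
    previous = None
    for match in HEADING_RE.finditer(html):
        level = int(match.group(1))
        text = INNER_TAG_RE.sub("", match.group(2)).strip()
        if not text:
            warnings.append(f"Empty <h{level}> heading")
        if previous is None and level != 1:
            warnings.append(f"First heading is h{level}, expected h1")
        if previous is not None and level - previous > max_jump:
            warnings.append(f"Heading level jumps from h{previous} to h{level}")
        previous = level
    return warnings
```

Raising `max_jump` to 2 would tolerate h1 → h3 while still flagging h1 → h4.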

### Check 7: Content Structure

**Requirement**: Meaningful content in page container

**Checks**:
- [ ] `<main class="page-content">` contains elements
- [ ] Content includes headings or paragraphs
- [ ] No completely empty content area
- [ ] Text nodes or elements present (> 100 words total)

**Error if**: No content or empty structure
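
A sketch of the content check that counts elements and words inside `<main class="page-content">`. The section above only names empty content as an error, so this example reports a below-threshold word count as a warning; that classification is an assumption, as are the names used.

```python
from html.parser import HTMLParser


class MainContentCounter(HTMLParser):
    """Counts elements and words inside <main class="page-content">."""

    def __init__(self) -> None:
        super().__init__()
        self.in_main = False
        self.elements = 0
        self.words = 0

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if tag == "main" and "page-content" in classes:
            self.in_main = True
        elif self.in_main:
            self.elements += 1

    def handle_endtag(self, tag):
        if tag == "main":
            self.in_main = False

    def handle_data(self, data):
        if self.in_main:
            self.words += len(data.split())


def check_content(html: str, min_words: int = 100) -> tuple[list[str], list[str]]:
    """Return (errors, warnings) for the main content area."""
    counter = MainContentCounter()
    counter.feed(html)
    counter.close()
    if counter.elements == 0 and counter.words == 0:
        return ['Content area is empty or <main class="page-content"> is missing'], []
    if counter.words <= min_words:
        return [], [f"Only {counter.words} words in content area (expected more than {min_words})"]
    return [], []
```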

### Check 8: List Integrity

**Requirement**: All lists properly structured

**Checks** for each `<ul>` or `<ol>`:
- [ ] List opening and closing tags matched
- [ ] List contains `<li>` elements
- [ ] All `<li>` tags properly closed
- [ ] `<li>` count matches opening/closing pairs
- [ ] No nested `<ul>` or `<ol>` improperly closed

**Error if**: Empty lists or unmatched `<li>` tags
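
List integrity can be checked with a stack of open lists, each carrying its own `<li>` count, so that nested lists are handled correctly. The following is an illustrative sketch with hypothetical names, not the tool's actual logic.

```python
from html.parser import HTMLParser


class ListChecker(HTMLParser):
    def __init__(self) -> None:
        super().__init__()
        self.open_lists: list[list] = []  # stack of [tag, li_count]
        self.li_opened = 0
        self.li_closed = 0
        self.errors: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag in ("ul", "ol"):
            self.open_lists.append([tag, 0])
        elif tag == "li":
            self.li_opened += 1
            if self.open_lists:
                self.open_lists[-1][1] += 1  # count toward the innermost list
            else:
                self.errors.append("<li> outside of any list")

    def handle_endtag(self, tag):
        if tag == "li":
            self.li_closed += 1
        elif tag in ("ul", "ol"):
            if not self.open_lists or self.open_lists[-1][0] != tag:
                self.errors.append(f"Unmatched </{tag}>")
                return
            _, count = self.open_lists.pop()
            if count == 0:
                self.errors.append(f"Empty <{tag}> list")


def check_lists(html: str) -> list[str]:
    """Return error messages; an empty list means all lists are well formed."""
    checker = ListChecker()
    checker.feed(html)
    checker.close()
    errors = list(checker.errors)
    if checker.li_opened != checker.li_closed:
        errors.append(f"{checker.li_opened} <li> opened but {checker.li_closed} closed")
    errors += [f"<{tag}> is never closed" for tag, _ in checker.open_lists]
    return errors
```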

### Check 9: Image and Link Tags

**Requirement**: Images and links carry their required attributes, and void elements such as `<img>` are properly formatted

**Checks**:
- [ ] All `<img>` tags have `src` and `alt` attributes
- [ ] All `<a>` tags have valid `href` attributes
- [ ] Image paths don't have obvious errors (no broken syntax)
- [ ] Self-closing tags use proper syntax

**Warning if**: Images missing alt text or links missing href
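
A minimal sketch of the attribute checks, assuming that an empty `alt=""` is acceptable (it is the standard way to mark a decorative image) and that only a missing attribute should be flagged. The names are illustrative.

```python
from html.parser import HTMLParser


class AttributeChecker(HTMLParser):
    def __init__(self) -> None:
        super().__init__()
        self.warnings: list[str] = []

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        line = self.getpos()[0]
        if tag == "img":
            for required in ("src", "alt"):
                if required not in attr:
                    self.warnings.append(f"line {line}: <img> missing {required}")
        elif tag == "a" and not attr.get("href"):
            self.warnings.append(f"line {line}: <a> missing href")


def check_images_and_links(html: str) -> list[str]:
    """Return warnings for images and links with missing attributes."""
    checker = AttributeChecker()
    checker.feed(html)
    checker.close()
    return checker.warnings
```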

### Check 10: Table Tags (if present)

**Requirement**: Proper table structure

**Checks**:
- [ ] `<table>`, `<tr>`, `<td>`, `<th>` tags properly nested
- [ ] All rows have consistent column counts
- [ ] Table headers and body properly structured

**Error if**: Malformed table structure
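
Column consistency can be checked by recording the width of each row per table. The sketch below honors `colspan` but ignores `rowspan`, so tables that rely on row spans would be reported as inconsistent; treat it as a simplified illustration with hypothetical names.

```python
from html.parser import HTMLParser


class TableChecker(HTMLParser):
    def __init__(self) -> None:
        super().__init__()
        self.tables: list[list[int]] = []  # one list of row widths per open table
        self.errors: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self.tables.append([])
        elif tag == "tr":
            if self.tables:
                self.tables[-1].append(0)  # start a new row of width 0
            else:
                self.errors.append("<tr> outside <table>")
        elif tag in ("td", "th"):
            if self.tables and self.tables[-1]:
                colspan = dict(attrs).get("colspan") or "1"
                self.tables[-1][-1] += int(colspan) if colspan.isdigit() else 1
            else:
                self.errors.append(f"<{tag}> outside <tr>")

    def handle_endtag(self, tag):
        if tag == "table" and self.tables:
            widths = self.tables.pop()
            if len(set(widths)) > 1:
                self.errors.append(f"Inconsistent column counts per row: {widths}")


def check_tables(html: str) -> list[str]:
    """Return error messages; an empty list means all tables are consistent."""
    checker = TableChecker()
    checker.feed(html)
    checker.close()
    return checker.errors + ["<table> is never closed"] * len(checker.tables)
```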

## Validation Report Format

### Output: `06_validation_structure.json`

```json
{
  "page": 16,
  "book_page": 17,
  "chapter": 2,
  "validation_type": "structure",
  "validation_timestamp": "2025-11-08T14:34:00Z",
  "overall_status": "PASS",
  "error_count": 0,
  "warning_count": 1,
  "checks_performed": [
    {
      "check_name": "DOCTYPE Declaration",
      "status": "PASS",
      "details": "Valid HTML5 DOCTYPE found"
    },
    {
      "check_name": "HTML Tags",
      "status": "PASS",
      "details": "Proper <html> opening and closing tags"
    },
    {
      "check_name": "Head Section",
      "status": "PASS",
      "details": "All required meta tags and title present"
    },
    {
      "check_name": "Body Section",
      "status": "PASS",
      "details": "Body and content structure valid"
    },
    {
      "check_name": "Tag Closure",
      "status": "PASS",
      "details": "All tags properly matched and closed"
    },
    {
      "check_name": "Heading Hierarchy",
      "status": "WARNING",
      "details": "4 headings found; first heading is h2 rather than h1"
    },
    {
      "check_name": "Content Structure",
      "status": "PASS",
      "details": "Main content area contains 245 words across 3 paragraphs"
    },
    {
      "check_name": "List Integrity",
      "status": "PASS",
      "details": "1 list with 3 items, all properly formed"
    },
    {
      "check_name": "Image Tags",
      "status": "PASS",
      "details": "No images on this page"
    },
    {
      "check_name": "Table Tags",
      "status": "PASS",
      "details": "No tables on this page"
    }
  ],
  "errors": [],
  "warnings": [
    {
      "check": "Heading Hierarchy",
      "message": "First heading is h2, typically should be h1 for page opening",
      "severity": "LOW"
    }
  ],
  "summary": {
    "total_checks": 10,
    "passed": 9,
    "failed": 0,
    "warnings": 1,
    "html_valid": true,
    "tags_matched": true,
    "content_substantial": true
  }
}
```
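
A report of this shape could be assembled from per-check results as sketched below. The function name and the `(check_name, errors, warnings)` input format are assumptions for illustration, as is the per-check status value `"WARNING"` for a check that produced warnings but no errors. The `book_page` field and the boolean summary flags are omitted.

```python
import json
from datetime import datetime, timezone


def build_report(page: int, chapter: int,
                 results: list[tuple[str, list[str], list[str]]]) -> dict:
    """Build the report dict from (check_name, errors, warnings) tuples."""
    checks, errors, warnings = [], [], []
    for name, errs, warns in results:
        status = "FAIL" if errs else "WARNING" if warns else "PASS"
        checks.append({
            "check_name": name,
            "status": status,
            "details": "; ".join(errs + warns) or "OK",
        })
        errors += [{"check": name, "message": message} for message in errs]
        warnings += [{"check": name, "message": message} for message in warns]

    def count(status: str) -> int:
        return sum(1 for check in checks if check["status"] == status)

    return {
        "page": page,
        "chapter": chapter,
        "validation_type": "structure",
        "validation_timestamp": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
        "overall_status": "FAIL" if errors else "PASS",
        "error_count": len(errors),
        "warning_count": len(warnings),
        "checks_performed": checks,
        "errors": errors,
        "warnings": warnings,
        "summary": {
            "total_checks": len(checks),
            "passed": count("PASS"),
            "failed": count("FAIL"),
            "warnings": count("WARNING"),
        },
    }
```

The result can be written with `json.dumps(report, indent=2)` to the `06_validation_structure.json` path described earlier.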

## Validation Rules

### PASS Criteria
- DOCTYPE present and valid
- All required tags (`html`, `head`, `body`, `main`, `div.page-container`) present
- All tags properly closed and matched
- Title tag with content
- CSS stylesheet link present
- Content structure valid
- No structural errors

### FAIL Criteria (BLOCKS PIPELINE)
- Missing DOCTYPE
- Missing required tags
- Unmatched or improperly nested tags
- Missing title or CSS link
- Empty content
- Malformed lists or tables

### WARNING (Logged but doesn't block)
- Missing viewport meta tag
- First heading is not h1
- Large heading jumps (h1 → h4)
- Missing alt text on images
- Missing href on links
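
Combined with the `strict_mode` input parameter, these rules reduce to a small decision function. This is a sketch of the stated rules, not code taken from `validate_html.py`; the function name is hypothetical.

```python
def exit_code(error_count: int, warning_count: int, strict_mode: bool = False) -> int:
    """Map error and warning counts to the gate's exit code.

    0 = VALID (continue pipeline), 1 = INVALID (stop pipeline).
    """
    if error_count > 0:
        return 1  # any error always blocks
    if strict_mode and warning_count > 0:
        return 1  # in strict mode, warnings block as well
    return 0
```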

## Implementation: Using Python Script

This validation is performed by the existing `validate_html.py` tool, run in **structure validation mode**:

```bash
cd Calypso/tools

# Validate single page HTML
python3 validate_html.py \
  ../output/chapter_02/page_artifacts/page_16/04_page_16.html \
  --output-json ../output/chapter_02/page_artifacts/page_16/06_validation_structure.json \
  --strict-structure

# Exit code:
# 0 = VALID (continue to next skill)
# 1 = INVALID (STOP pipeline)
```
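
A pipeline orchestrator written in Python might wrap the same command as follows. This is a sketch: `build_command` and `gate_passes` are hypothetical helpers, the flags are copied from the shell example above, and the report is assumed to be written next to the HTML file.

```python
import subprocess
import sys
from pathlib import Path


def build_command(html_file: Path, tool: Path = Path("validate_html.py")) -> list[str]:
    """Assemble the validator command line for one page."""
    report = html_file.parent / "06_validation_structure.json"
    return [
        sys.executable, str(tool), str(html_file),
        "--output-json", str(report),
        "--strict-structure",
    ]


def gate_passes(html_file: Path) -> bool:
    """Run the validator; True means exit code 0 (continue the pipeline)."""
    result = subprocess.run(build_command(html_file), check=False)
    return result.returncode == 0
```

Using `check=False` keeps a non-zero exit code from raising an exception, so the caller can decide how to stop the pipeline.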

## Hook Integration

When validation **FAILS**:

```bash
# Trigger hook: .claude/hooks/validate-structure.sh
# Receives:
#   - Page number
#   - HTML file path
#   - Validation report path
#   - Error details

# Hook behavior:
# - Log failure with details
# - Save error report
# - Notify user
# - STOP pipeline (no further processing)
```

## Error Recovery

**If validation fails**:
1. User reviews validation report
2. User identifies issue in AI-generated HTML
3. Options:
   - Fix HTML manually and re-validate
   - Re-run AI generation with improved prompt
   - Review source extraction data for errors
   - Proceed with caution (expert override)

## Quality Metrics

Validation provides metrics:
- Percentage of checks passing
- Error severity levels
- Content size (word count, element count)
- Structure complexity

These metrics feed into final quality reports.

## Success Criteria

✓ Validation completes successfully
✓ All structural checks pass (0 errors)
✓ Validation report saved in JSON format
✓ Exit code 0 returned (or 1 if invalid)
✓ Clear error messages if validation fails

## Next Steps After PASS

If validation passes:
1. All pages of the chapter are processed through this gate
2. **Skill 4** (consolidate pages) merges individual page HTMLs
3. **Quality Gate 2** (semantic validate) checks semantic structure
4. Continue through validation pipeline

## Next Steps After FAIL

If validation fails:
1. **PIPELINE STOPS**
2. Hook `validate-structure.sh` triggered
3. User receives error report with details
4. User must fix issues and retry

## Design Notes

- This is the **first deterministic quality gate**
- Uses proven `validate_html.py` tool
- Catches structural issues before semantic analysis
- Provides clear, actionable error messages
- Essential for ensuring pipeline reliability

## Testing

To test structure validation:

```bash
# Test with known-good HTML
python3 validate_html.py ../output/chapter_01/chapter_01.html

# Should show: ✓ VALID

# Test with invalid HTML (if needed)
python3 validate_html.py broken_html.html

# Should show: ✗ INVALID with specific errors
```
