html-structure-validate

Validate HTML5 structure and basic syntax. BLOCKING quality gate - stops pipeline if validation fails. Ensures deterministic output quality.

Best use case

html-structure-validate is best used when you need a repeatable AI agent workflow instead of a one-off prompt. It is especially useful for teams working in multi.

Users should expect a more consistent workflow output, faster repeated execution, and less time spent rewriting prompts from scratch.

Practical example

Example input

Use the "html-structure-validate" skill to help with this workflow task. Context: Validate HTML5 structure and basic syntax. BLOCKING quality gate - stops pipeline if validation fails. Ensures deterministic output quality.

Example output

A structured workflow result with clearer steps, more consistent formatting, and an output that is easier to reuse in the next run.

When to use this skill

  • Use this skill when you want a reusable workflow rather than writing the same prompt again and again.

When not to use this skill

  • Do not use this when you only need a one-off answer and do not need a reusable workflow.
  • Do not use it if you cannot install or maintain the related files, repository context, or supporting tools.

Installation

Claude Code / Cursor / Codex

$ curl -o ~/.claude/skills/html-structure-validate/SKILL.md --create-dirs "https://raw.githubusercontent.com/aiskillstore/marketplace/main/skills/abejitsu/html-structure-validate/SKILL.md"

Manual Installation

  1. Download SKILL.md from GitHub
  2. Place it in .claude/skills/html-structure-validate/SKILL.md inside your project
  3. Restart your AI agent — it will auto-discover the skill

How html-structure-validate Compares

Feature / Agent          | html-structure-validate | Standard Approach
Platform Support         | Not specified           | Limited / Varies
Context Awareness        | High                    | Baseline
Installation Complexity  | Unknown                 | N/A

Frequently Asked Questions

What does this skill do?

Validate HTML5 structure and basic syntax. BLOCKING quality gate - stops pipeline if validation fails. Ensures deterministic output quality.

Where can I find the source code?

You can find the source code on GitHub using the link provided at the top of the page.

SKILL.md Source

# HTML Structure Validate Skill

## Purpose

This skill is a **BLOCKING quality gate** that ensures generated HTML meets minimum structural requirements. It is the **first deterministic validation** of probabilistic AI-generated output.

The skill checks:
- **HTML5 compliance** - Proper DOCTYPE, tags
- **Tag closure** - All tags properly closed
- **Required elements** - Meta tags, stylesheet links
- **Well-formedness** - Valid structure

If validation fails, the pipeline **STOPS** and triggers a hook to notify the user.

This enforces the principle: **Python validates, ensuring deterministic quality**.

## What to Do

1. **Load HTML file to validate**
   - Read `04_page_XX.html` generated by AI skill
   - Verify file exists and is readable
   - Confirm file is text (not binary)

2. **Run validation checks**
   - Check HTML5 structure compliance
   - Verify tag closure
   - Validate head section
   - Check required CSS link
   - Validate page container structure

3. **Generate validation report**
   - Document all checks performed
   - List any errors found
   - Note warnings (non-blocking)
   - Record informational findings

4. **Save validation report** as JSON
   - Save to: `output/chapter_XX/page_artifacts/page_YY/06_validation_structure.json`
   - Include timestamp
   - Include all check results

5. **Exit with appropriate code**
   - Return 0 if VALID (continue pipeline)
   - Return 1 if INVALID (STOP pipeline, trigger hook)
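
The five steps above can be sketched as a single function. This is a minimal illustration, not the code of `validate_html.py`: the name `run_gate` and the `(name, function)` shape of `checks` are assumptions made for the example, and the report holds only a subset of the fields shown later in this document.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def run_gate(html_file, report_path, checks):
    """Run each check on the HTML text, write a JSON report, return an exit code.

    `checks` is a list of (name, function) pairs (hypothetical shape); each
    function takes the HTML text and returns a list of error strings.
    """
    try:
        # Raises on a missing/unreadable file or on bytes that are not UTF-8 text.
        html = Path(html_file).read_text(encoding="utf-8")
    except (OSError, UnicodeDecodeError) as exc:
        results = [{"check_name": "Load File", "status": "FAIL", "details": str(exc)}]
    else:
        results = []
        for name, check in checks:
            errors = check(html)
            results.append({
                "check_name": name,
                "status": "FAIL" if errors else "PASS",
                "details": "; ".join(errors) or "ok",
            })

    failed = [r for r in results if r["status"] == "FAIL"]
    report = {
        "validation_type": "structure",
        "validation_timestamp": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
        "overall_status": "FAIL" if failed else "PASS",
        "error_count": len(failed),
        "checks_performed": results,
    }
    out = Path(report_path)
    out.parent.mkdir(parents=True, exist_ok=True)
    out.write_text(json.dumps(report, indent=2), encoding="utf-8")
    return 1 if failed else 0  # 0 = continue pipeline, 1 = stop pipeline
```

A caller would pass the return value to `sys.exit()` so that the pipeline sees the exit code.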

## Input Parameters

```
html_file: <str>         - Path to 04_page_XX.html
output_dir: <str>        - Directory for validation report
strict_mode: <bool>      - If true, warnings also fail (default: false)
page_number: <int>       - Page number (for reporting)
chapter: <int>           - Chapter number (for reporting)
```
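
One way to carry these parameters around in Python is a small dataclass. This is only a sketch: the class name `ValidateParams` and the `report_path` helper are hypothetical, while the field names and the `strict_mode` default come from the list above.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class ValidateParams:
    html_file: str             # path to 04_page_XX.html
    output_dir: str            # directory for the validation report
    page_number: int           # for reporting
    chapter: int               # for reporting
    strict_mode: bool = False  # if true, warnings also fail

    def report_path(self) -> Path:
        # Hypothetical helper: the report file name is taken from this document.
        return Path(self.output_dir) / "06_validation_structure.json"
```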

## Validation Checks

### Check 1: DOCTYPE Declaration

**Requirement**: File must start with proper DOCTYPE
```html
<!DOCTYPE html>
```

**Check**:
- [ ] File contains `<!DOCTYPE html>` (case-insensitive)
- [ ] DOCTYPE appears before any tags
- [ ] DOCTYPE is on first line or near beginning

**Error if**: Missing or incorrect DOCTYPE
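
The DOCTYPE check could be approximated with two regular expressions, as in the sketch below. The function name `check_doctype` is hypothetical, and treating leading whitespace and HTML comments as acceptable before the DOCTYPE is an assumption about what "near beginning" means.

```python
import re

# Leading whitespace and HTML comments may precede the DOCTYPE (assumption).
_DOCTYPE_AT_START = re.compile(
    r"\A\s*(?:<!--.*?-->\s*)*<!DOCTYPE\s+html\s*>",
    re.IGNORECASE | re.DOTALL,
)
_DOCTYPE_ANYWHERE = re.compile(r"<!DOCTYPE\s+html\s*>", re.IGNORECASE)


def check_doctype(html):
    """Return a list of error strings; an empty list means the check passed."""
    if _DOCTYPE_AT_START.search(html):
        return []
    if _DOCTYPE_ANYWHERE.search(html):
        return ["DOCTYPE found, but not before the first tag"]
    return ["Missing <!DOCTYPE html>"]
```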

### Check 2: HTML Tags

**Requirement**: Proper `<html>` opening and closing tags
```html
<html lang="en">
    ...
</html>
```

**Checks**:
- [ ] `<html>` tag present
- [ ] `</html>` closing tag present
- [ ] Tags are properly paired
- [ ] No unclosed `<html>` tags

**Error if**: Missing either tag or improperly paired

### Check 3: Head Section

**Requirement**: Complete `<head>` section with metadata
```html
<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>...</title>
    <link rel="stylesheet" href="../../styles/main.css">
</head>
```

**Checks**:
- [ ] `<head>` and `</head>` tags present
- [ ] `<meta charset="UTF-8">` present
- [ ] `<meta name="viewport">` present (warning if missing)
- [ ] `<title>` tag with content present
- [ ] CSS `<link>` tag present with href attribute

**Error if**: Missing charset, title, or CSS link
**Warning if**: Missing viewport meta tag
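
A possible way to gather the head-section facts is a small subclass of the standard library's `html.parser.HTMLParser`. The names `HeadScanner` and `check_head` are hypothetical, and the sketch does not verify that `<head>` itself is properly closed (Check 5 covers tag closure).

```python
from html.parser import HTMLParser


class HeadScanner(HTMLParser):
    """Collect the facts Check 3 needs while walking the <head> element."""

    def __init__(self):
        super().__init__()
        self.in_head = False
        self.in_title = False
        self.charset = None
        self.has_viewport = False
        self.title = ""
        self.has_stylesheet = False

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag == "head":
            self.in_head = True
        elif not self.in_head:
            return
        elif tag == "meta":
            if "charset" in a:
                self.charset = (a["charset"] or "").lower()
            if (a.get("name") or "").lower() == "viewport":
                self.has_viewport = True
        elif tag == "title":
            self.in_title = True
        elif tag == "link":
            rels = (a.get("rel") or "").lower().split()
            if "stylesheet" in rels and a.get("href"):
                self.has_stylesheet = True

    def handle_endtag(self, tag):
        if tag == "head":
            self.in_head = False
        elif tag == "title":
            self.in_title = False

    def handle_data(self, data):
        if self.in_title:
            self.title += data


def check_head(html):
    """Return (errors, warnings) for the head section."""
    s = HeadScanner()
    s.feed(html)
    s.close()
    errors, warnings = [], []
    if s.charset != "utf-8":
        errors.append('Missing <meta charset="UTF-8">')
    if not s.title.strip():
        errors.append("Missing or empty <title>")
    if not s.has_stylesheet:
        errors.append("Missing stylesheet <link> with href")
    if not s.has_viewport:
        warnings.append("Missing viewport meta tag")
    return errors, warnings
```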

### Check 4: Body Section

**Requirement**: Proper `<body>` tags with content
```html
<body>
    <div class="page-container">
        <main class="page-content">
            ...
        </main>
    </div>
</body>
```

**Checks**:
- [ ] `<body>` and `</body>` tags present
- [ ] `<div class="page-container">` present
- [ ] `<main class="page-content">` present inside container
- [ ] Body contains substantial content (> 100 bytes)

**Error if**: Missing tags or required container divs

### Check 5: Tag Closure Validation

**Requirement**: All tags must be properly closed

**Checks for**:
- Unmatched opening tags (e.g., `<p>` without `</p>`)
- Improper nesting (e.g., `<p><h2>text</h2></p>`)
- Void elements used correctly (e.g., `<br>`, `<img>`; the trailing slash in `<br/>` is allowed in HTML5 but not required)
- Comment blocks properly formatted (`<!-- -->`)

**Validation method**:
- Parse HTML into tree structure
- Verify all nodes properly matched
- Check nesting doesn't violate HTML5 rules

**Error if**: Any unmatched or improperly nested tags
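
The matching part of this check can be illustrated with a stack: push on every opening tag, pop on every closing tag. The sketch below uses the standard library's `html.parser.HTMLParser`; the names `ClosureChecker` and `check_tag_closure` are hypothetical. It is deliberately stricter than the HTML5 parsing rules (it reports omitted `</p>` or `</li>` tags, which HTML5 permits), it treats any `<tag/>` as already closed, and it does not check content-model nesting such as a heading inside a paragraph.

```python
from html.parser import HTMLParser

# Void elements never take a closing tag in HTML5.
VOID = {"area", "base", "br", "col", "embed", "hr", "img", "input",
        "link", "meta", "source", "track", "wbr"}


class ClosureChecker(HTMLParser):
    def __init__(self):
        super().__init__()
        self.stack = []   # (tag, line) for elements that are currently open
        self.errors = []

    def handle_starttag(self, tag, attrs):
        if tag not in VOID:
            self.stack.append((tag, self.getpos()[0]))

    def handle_startendtag(self, tag, attrs):
        pass  # <br/>, <img/>: treated as closed, nothing is pushed

    def handle_endtag(self, tag):
        line = self.getpos()[0]
        if tag in VOID:
            return
        if self.stack and self.stack[-1][0] == tag:
            self.stack.pop()
        elif any(open_tag == tag for open_tag, _ in self.stack):
            # Pop everything above the matching element and report each as unclosed.
            while self.stack[-1][0] != tag:
                open_tag, open_line = self.stack.pop()
                self.errors.append(
                    f"<{open_tag}> opened on line {open_line} "
                    f"not closed before </{tag}> on line {line}"
                )
            self.stack.pop()
        else:
            self.errors.append(f"</{tag}> on line {line} has no matching opening tag")


def check_tag_closure(html):
    """Return a list of error strings; an empty list means all tags matched."""
    c = ClosureChecker()
    c.feed(html)
    c.close()
    for open_tag, open_line in c.stack:
        c.errors.append(f"<{open_tag}> opened on line {open_line} never closed")
    return c.errors
```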

### Check 6: Heading Tags (h1-h6)

**Requirement**: Valid heading hierarchy
```html
<h1>Chapter Title</h1>
<h2>Section Heading</h2>
<h3>Subsection</h3>
```

**Checks**:
- [ ] All heading tags properly closed
- [ ] First heading should be h1 (warning if not)
- [ ] Heading levels don't skip dramatically (h1 → h4 is suspicious)
- [ ] All headings have text content (not empty)

**Error if**: Heading tags improperly closed
**Warning if**: Suspicious hierarchy
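
The hierarchy warnings could be computed from the sequence of heading levels, as in this sketch. The names `HeadingCollector` and `check_headings` are hypothetical. Two points are assumptions because the text above does not fix them: a jump counts as "dramatic" when it skips more than `max_jump` levels (default 2, so h1 → h3 passes and h1 → h4 is flagged), and empty headings are reported as warnings.

```python
import re
from html.parser import HTMLParser

_HEADING = re.compile(r"h[1-6]")


class HeadingCollector(HTMLParser):
    def __init__(self):
        super().__init__()
        self.headings = []   # [level, text] in document order
        self._open = False

    def handle_starttag(self, tag, attrs):
        if _HEADING.fullmatch(tag):
            self.headings.append([int(tag[1]), ""])
            self._open = True

    def handle_endtag(self, tag):
        if _HEADING.fullmatch(tag):
            self._open = False

    def handle_data(self, data):
        if self._open:
            self.headings[-1][1] += data


def check_headings(html, max_jump=2):
    """Return a list of warning strings about the heading hierarchy."""
    c = HeadingCollector()
    c.feed(html)
    c.close()
    warnings = []
    levels = [level for level, _ in c.headings]
    if levels and levels[0] != 1:
        warnings.append(f"First heading is h{levels[0]}, expected h1")
    for prev, cur in zip(levels, levels[1:]):
        if cur - prev > max_jump:
            warnings.append(f"Heading level jumps from h{prev} to h{cur}")
    for level, text in c.headings:
        if not text.strip():
            warnings.append(f"Empty h{level} heading")
    return warnings
```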

### Check 7: Content Structure

**Requirement**: Meaningful content in page container

**Checks**:
- [ ] `<main class="page-content">` contains elements
- [ ] Content includes headings or paragraphs
- [ ] No completely empty content area
- [ ] Text nodes or elements present (> 100 words total)

**Error if**: No content or empty structure
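
A rough way to measure the content area is to count elements and words inside `<main class="page-content">`. The names `MainContentScanner` and `check_content` are hypothetical. The text above only names "no content or empty structure" as an error, so reporting a word count at or below the 100-word threshold as a warning is an assumption of this sketch.

```python
from html.parser import HTMLParser


class MainContentScanner(HTMLParser):
    """Count elements and collect text inside <main class="page-content">."""

    def __init__(self):
        super().__init__()
        self.depth = 0       # > 0 while inside the content area
        self.found = False
        self.elements = 0
        self.text = []

    def handle_starttag(self, tag, attrs):
        if self.depth:
            self.elements += 1
            if tag == "main":
                self.depth += 1
        elif tag == "main":
            classes = (dict(attrs).get("class") or "").split()
            if "page-content" in classes:
                self.found = True
                self.depth = 1

    def handle_endtag(self, tag):
        if self.depth and tag == "main":
            self.depth -= 1

    def handle_data(self, data):
        if self.depth:
            self.text.append(data)


def check_content(html, min_words=100):
    """Return (errors, warnings) for the main content area."""
    s = MainContentScanner()
    s.feed(html)
    s.close()
    if not s.found:
        return ['Missing <main class="page-content">'], []
    words = len(" ".join(s.text).split())
    if s.elements == 0 and words == 0:
        return ["Content area is empty"], []
    warnings = []
    if words <= min_words:
        warnings.append(
            f"Only {words} words in content area (expected more than {min_words})"
        )
    return [], warnings
```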

### Check 8: List Integrity

**Requirement**: All lists properly structured

**Checks** for each `<ul>` or `<ol>`:
- [ ] List opening and closing tags matched
- [ ] List contains `<li>` elements
- [ ] All `<li>` tags properly closed
- [ ] `<li>` count matches opening/closing pairs
- [ ] No nested `<ul>` or `<ol>` improperly closed

**Error if**: Empty lists or unmatched `<li>` tags
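
List integrity can be sketched with a stack of open lists, each carrying a count of its own `<li>` items so that nested lists are counted separately. The names `ListChecker` and `check_lists` are hypothetical, and closure of the `<li>` tags themselves is left to the general tag-closure check.

```python
from html.parser import HTMLParser


class ListChecker(HTMLParser):
    def __init__(self):
        super().__init__()
        self.open_lists = []   # [tag, line, li_count] for each open <ul>/<ol>
        self.errors = []

    def handle_starttag(self, tag, attrs):
        if tag in ("ul", "ol"):
            self.open_lists.append([tag, self.getpos()[0], 0])
        elif tag == "li":
            if self.open_lists:
                self.open_lists[-1][2] += 1   # belongs to the innermost list
            else:
                self.errors.append(f"<li> on line {self.getpos()[0]} is outside any list")

    def handle_endtag(self, tag):
        if tag not in ("ul", "ol"):
            return
        if not self.open_lists or self.open_lists[-1][0] != tag:
            self.errors.append(f"</{tag}> on line {self.getpos()[0]} does not match an open list")
            return
        _, line, count = self.open_lists.pop()
        if count == 0:
            self.errors.append(f"<{tag}> opened on line {line} contains no <li> elements")


def check_lists(html):
    """Return a list of error strings; an empty list means all lists are well formed."""
    c = ListChecker()
    c.feed(html)
    c.close()
    for tag, line, _ in c.open_lists:
        c.errors.append(f"<{tag}> opened on line {line} never closed")
    return c.errors
```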

### Check 9: Image and Link Tags

**Requirement**: Image and link tags carry their required attributes and are properly formatted

**Checks**:
- [ ] All `<img>` tags have `src` and `alt` attributes
- [ ] All `<a>` tags have valid `href` attributes
- [ ] Image paths don't have obvious errors (no broken syntax)
- [ ] Self-closing tags use proper syntax

**Warning if**: Images missing alt text or links missing href
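
The attribute checks could look like the sketch below. The names `AttrChecker` and `check_img_and_links` are hypothetical. The sketch accepts an empty `alt=""` (valid HTML for decorative images) and treats an empty `href` as missing; both are interpretations of the checklist above.

```python
from html.parser import HTMLParser


class AttrChecker(HTMLParser):
    def __init__(self):
        super().__init__()
        self.warnings = []

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        line = self.getpos()[0]
        if tag == "img":
            if not a.get("src"):
                self.warnings.append(f"<img> on line {line} has no src")
            if "alt" not in a:
                self.warnings.append(f"<img> on line {line} has no alt attribute")
        elif tag == "a":
            if not a.get("href"):
                self.warnings.append(f"<a> on line {line} has no href")


def check_img_and_links(html):
    """Return a list of warning strings for <img> and <a> tags."""
    c = AttrChecker()
    c.feed(html)
    c.close()
    return c.warnings
```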

### Check 10: Table Tags (if present)

**Requirement**: Proper table structure

**Checks**:
- [ ] `<table>`, `<tr>`, `<td>`, `<th>` tags properly nested
- [ ] All rows have consistent column counts
- [ ] Table headers and body properly structured

**Error if**: Malformed table structure
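
Column consistency can be estimated by summing cell widths per row, as sketched here. The names `TableChecker` and `check_tables` are hypothetical. The sketch honors `colspan` but ignores `rowspan`, so tables that use `rowspan` would need extra handling.

```python
from html.parser import HTMLParser


class TableChecker(HTMLParser):
    def __init__(self):
        super().__init__()
        self.tables = []   # stack of open tables; each is a list of row widths
        self.errors = []

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self.tables.append([])
        elif tag == "tr":
            if self.tables:
                self.tables[-1].append(0)
            else:
                self.errors.append("<tr> outside <table>")
        elif tag in ("td", "th"):
            if self.tables and self.tables[-1]:
                span = dict(attrs).get("colspan") or "1"
                self.tables[-1][-1] += int(span) if span.isdigit() else 1
            else:
                self.errors.append(f"<{tag}> outside a table row")

    def handle_endtag(self, tag):
        if tag == "table" and self.tables:
            widths = self.tables.pop()
            if not widths:
                self.errors.append("Table has no rows")
            elif len(set(widths)) > 1:
                self.errors.append(f"Inconsistent column counts per row: {widths}")


def check_tables(html):
    """Return a list of error strings; an empty list means tables look consistent."""
    c = TableChecker()
    c.feed(html)
    c.close()
    if c.tables:
        c.errors.append("<table> never closed")
    return c.errors
```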

## Validation Report Format

### Output: `06_validation_structure.json`

```json
{
  "page": 16,
  "book_page": 17,
  "chapter": 2,
  "validation_type": "structure",
  "validation_timestamp": "2025-11-08T14:34:00Z",
  "overall_status": "PASS",
  "error_count": 0,
  "warning_count": 1,
  "checks_performed": [
    {
      "check_name": "DOCTYPE Declaration",
      "status": "PASS",
      "details": "Valid HTML5 DOCTYPE found"
    },
    {
      "check_name": "HTML Tags",
      "status": "PASS",
      "details": "Proper <html> opening and closing tags"
    },
    {
      "check_name": "Head Section",
      "status": "PASS",
      "details": "All required meta tags and title present"
    },
    {
      "check_name": "Body Section",
      "status": "PASS",
      "details": "Body and content structure valid"
    },
    {
      "check_name": "Tag Closure",
      "status": "PASS",
      "details": "All tags properly matched and closed"
    },
    {
      "check_name": "Heading Hierarchy",
      "status": "WARNING",
      "details": "4 headings found, first heading is h2"
    },
    {
      "check_name": "Content Structure",
      "status": "PASS",
      "details": "Main content area contains 245 words across 3 paragraphs"
    },
    {
      "check_name": "List Integrity",
      "status": "PASS",
      "details": "1 list with 3 items, all properly formed"
    },
    {
      "check_name": "Image Tags",
      "status": "PASS",
      "details": "No images on this page"
    },
    {
      "check_name": "Table Tags",
      "status": "PASS",
      "details": "No tables on this page"
    }
  ],
  "errors": [],
  "warnings": [
    {
      "check": "Heading Hierarchy",
      "message": "First heading is h2, typically should be h1 for page opening",
      "severity": "LOW"
    }
  ],
  "summary": {
    "total_checks": 10,
    "passed": 9,
    "failed": 0,
    "warnings": 1,
    "html_valid": true,
    "tags_matched": true,
    "content_substantial": true
  }
}
```

## Validation Rules

### PASS Criteria
- DOCTYPE present and valid
- All required tags (`html`, `head`, `body`, `main`, `div.page-container`) present
- All tags properly closed and matched
- Title tag with content
- CSS stylesheet link present
- Content structure valid
- No structural errors

### FAIL Criteria (BLOCKS PIPELINE)
- Missing DOCTYPE
- Missing required tags
- Unmatched or improperly nested tags
- Missing title or CSS link
- Empty content
- Malformed lists or tables

### WARNING (Logged but doesn't block)
- Missing viewport meta tag
- First heading is not h1
- Large heading jumps (h1 → h4)
- Missing alt text on images
- Missing href on links
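
Combined with the `strict_mode` input parameter (warnings also fail when it is true), the rules above reduce to a small decision function. The name `overall_status` is hypothetical; the status strings and exit codes follow this document.

```python
def overall_status(errors, warnings, strict_mode=False):
    """Map collected errors and warnings to (status, exit_code)."""
    blocking = len(errors)
    if strict_mode:
        blocking += len(warnings)  # in strict mode, warnings block the pipeline too
    return ("FAIL", 1) if blocking else ("PASS", 0)
```

For example, a page whose only finding is a missing viewport meta tag passes by default and fails when `strict_mode` is true.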

## Implementation: Using Python Script

This validation is performed by the existing `validate_html.py` tool, run in **structure validation mode**:

```bash
cd Calypso/tools

# Validate single page HTML
python3 validate_html.py \
  ../output/chapter_02/page_artifacts/page_16/04_page_16.html \
  --output-json ../output/chapter_02/page_artifacts/page_16/06_validation_structure.json \
  --strict-structure

# Exit code:
# 0 = VALID (continue to next skill)
# 1 = INVALID (STOP pipeline)
```

## Hook Integration

When validation **FAILS**:

```bash
# Trigger hook: .claude/hooks/validate-structure.sh
# Receives:
#   - Page number
#   - HTML file path
#   - Validation report path
#   - Error details

# Hook behavior:
# - Log failure with details
# - Save error report
# - Notify user
# - STOP pipeline (no further processing)
```

## Error Recovery

**If validation fails**:
1. User reviews validation report
2. User identifies issue in AI-generated HTML
3. Options:
   - Fix HTML manually and re-validate
   - Re-run AI generation with improved prompt
   - Review source extraction data for errors
   - Proceed with caution (expert override)

## Quality Metrics

Validation provides metrics:
- Percentage of checks passing
- Error severity levels
- Content size (word count, element count)
- Structure complexity

These metrics feed into final quality reports.

## Success Criteria

✓ Validation completes successfully
✓ All structural checks pass (0 errors)
✓ Validation report saved in JSON format
✓ Exit code 0 returned (or 1 if invalid)
✓ Clear error messages if validation fails

## Next Steps After PASS

If validation passes:
1. All pages of chapter processed through this gate
2. **Skill 4** (consolidate pages) merges individual page HTMLs
3. **Quality Gate 2** (semantic validate) checks semantic structure
4. Continue through validation pipeline

## Next Steps After FAIL

If validation fails:
1. **PIPELINE STOPS**
2. Hook `validate-structure.sh` triggered
3. User receives error report with details
4. User must fix issues and retry

## Design Notes

- This is the **first deterministic quality gate**
- Uses proven `validate_html.py` tool
- Catches structural issues before semantic analysis
- Provides clear, actionable error messages
- Essential for ensuring pipeline reliability

## Testing

To test structure validation:

```bash
# Test with known-good HTML
python3 validate_html.py ../output/chapter_01/chapter_01.html

# Should show: ✓ VALID

# Test with invalid HTML (if needed)
python3 validate_html.py broken_html.html

# Should show: ✗ INVALID with specific errors
```
