identify-page-structure

Identify section boundaries and content sequences within a scraped webpage for AEM Edge Delivery Services import. Performs two-level analysis (sections, then sequences per section) and surveys available blocks.

7 stars

Best use case

identify-page-structure is best used when you need a repeatable AI agent workflow instead of a one-off prompt.


Teams using identify-page-structure should expect more consistent output, faster repeated execution, and less prompt rewriting.

When to use this skill

  • You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

  • You only need a quick one-off answer and do not need a reusable workflow.
  • You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$ curl -o ~/.claude/skills/identify-page-structure/SKILL.md --create-dirs "https://raw.githubusercontent.com/ddttom/webcomponents-with-eds/main/.claude/skills/identify-page-structure/SKILL.md"

Manual Installation

  1. Download SKILL.md from GitHub
  2. Place it in .claude/skills/identify-page-structure/SKILL.md inside your project
  3. Restart your AI agent — it will auto-discover the skill

How identify-page-structure Compares

| Feature / Agent | identify-page-structure | Standard Approach |
| --- | --- | --- |
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |

Frequently Asked Questions

What does this skill do?

Identify section boundaries and content sequences within a scraped webpage for AEM Edge Delivery Services import. Performs two-level analysis (sections, then sequences per section) and surveys available blocks.

Where can I find the source code?

The source code is on GitHub in the ddttom/webcomponents-with-eds repository, under .claude/skills/identify-page-structure/.

SKILL.md Source

# Identify Page Structure

Analyze webpage structure using two-level hierarchy: sections, then content sequences within each section.

## When to Use This Skill

Use this skill when:
- You have scraped webpage output (screenshot, HTML, metadata)
- You need to identify section boundaries and content sequences
- You want to understand the page structure before making authoring decisions

**Invoked by:** page-import skill (Step 2)

## Prerequisites

From the scrape-webpage skill, you need:
- ✅ screenshot.png showing full page
- ✅ cleaned.html with page content
- ✅ metadata.json with paths
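
A quick pre-flight check can confirm the three inputs exist before analysis starts. This is a minimal sketch, not part of the skill itself: the function name `missing_inputs` is hypothetical, and it assumes all three files sit in a single scrape output directory, which the source does not state.

```python
from pathlib import Path

# File names come from the prerequisites list; the single-directory layout is an assumption.
REQUIRED_FILES = ("screenshot.png", "cleaned.html", "metadata.json")


def missing_inputs(scrape_dir: str) -> list[str]:
    """Return the required files that are absent from scrape_dir."""
    base = Path(scrape_dir)
    return [name for name in REQUIRED_FILES if not (base / name).is_file()]
```

An empty list means all prerequisites are in place; otherwise the returned names show what the scrape-webpage step still needs to produce.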

## Related Skills

- **page-import** - Orchestrator that invokes this skill
- **scrape-webpage** - Provides input (screenshot, HTML)
- **page-decomposition** - This skill invokes it for EACH section
- **block-inventory** - This skill invokes it to survey available blocks
- **authoring-analysis** - Uses this skill's output to make authoring decisions

## Key Concepts

**CRITICAL:** Content follows a strict two-level hierarchy:

```
DOCUMENT
├── SECTION (top-level container with optional metadata)
│   ├── Content Sequence 1 (default content OR block)
│   ├── Content Sequence 2 (default content OR block)
│   └── ...
├── SECTION
│   └── Content Sequence 1
└── ...
```
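
The hierarchy above could be modeled in code roughly as follows. This is an illustrative sketch only: the class names `Document`, `Section`, and `ContentSequence` and the `outline` method are hypothetical and do not come from the skill or from AEM tooling.

```python
from dataclasses import dataclass, field


@dataclass
class ContentSequence:
    # Neutral description of what is visible; no block name at this stage.
    description: str


@dataclass
class Section:
    number: int
    style: str  # e.g. "light", "dark", "grey", "accent"
    sequences: list[ContentSequence] = field(default_factory=list)


@dataclass
class Document:
    sections: list[Section] = field(default_factory=list)

    def outline(self) -> str:
        """Render sections and their sequences as an indented text outline."""
        lines: list[str] = []
        for section in self.sections:
            lines.append(f"Section {section.number} ({section.style}):")
            for index, sequence in enumerate(section.sequences, start=1):
                lines.append(f"  - Sequence {index}: {sequence.description}")
        return "\n".join(lines)
```

Keeping sequences nested inside sections makes it hard to record a sequence without first deciding which section it belongs to, which mirrors the two-level rule.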

**This skill analyzes BOTH levels:**
- Level 1: Section boundaries (Step 2a)
- Level 2: Content sequences within EACH section (Step 2b per section)

## Structure Identification Workflow

### Step 2a: Identify Section Boundaries (Level 1)

Examine the **screenshot** to find visual/thematic breaks that indicate new sections.

**Visual cues for section boundaries:**
- Background color changes (white → grey → dark → white)
- Spacing/padding changes (tight → wide → normal)
- Clear horizontal breaks or dividers
- Thematic content shifts

**What to exclude:**
- Header/navigation (auto-populated)
- Footer (auto-populated)
- Cookie banners, popups

**For each section, note:**
- Section number (sequential: 1, 2, 3...)
- Visual style (light, dark, grey, accent)
- Brief overview of what's in it

**Example output:**
```
Section 1: light background, hero content
Section 2: light background, grid of features
Section 3: grey background, article cards
Section 4: dark background, tabs
```

---

### Step 2b: Analyze Content Sequences Within Each Section (Level 2)

For EACH section identified in Step 2a, analyze its internal content sequences.

**What is a "content sequence"?**
A vertical flow of related content that will become EITHER:
- Default content (headings, paragraphs, lists, inline images)
- A block (structured, repeating, or interactive component)

**Breaking points between sequences:**
- Change from default content → block
- Change from block → different block
- Change from block → default content

**INVOKE page-decomposition skill FOR EACH SECTION** to get neutral descriptions.

**For each section, get:**
- Sequence 1: [Neutral description - NO block names yet]
- Sequence 2: [Neutral description]
- ...

**Example output:**
```
Section 1 (light):
  - Sequence 1: Large centered heading, paragraph, two buttons
  - Sequence 2: Two images displayed side-by-side

Section 2 (light):
  - Sequence 1: Centered heading
  - Sequence 2: Grid of 8 items, each with icon and short text
  - Sequence 3: Two centered buttons

Section 3 (grey):
  - Sequence 1: Eyebrow text, heading, paragraph, button
  - Sequence 2: Four items in grid, each with image, category tag, heading, description

Section 4 (dark):
  - Sequence 1: Tab navigation with three switchable content panels
```
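
The breaking-point rule can be illustrated with a small grouping function. This is a sketch under assumptions, not how the page-decomposition skill works: it presumes each element has already been tagged with a neutral shape label (the tags `"default"` and `"grid"` below are hypothetical), and it starts a new sequence whenever that tag changes.

```python
from itertools import groupby


def split_into_sequences(elements: list[tuple[str, str]]) -> list[list[str]]:
    """Group consecutive elements that share a shape tag.

    Each element is a (shape, description) pair. A change of shape tag marks a
    breaking point, so default -> structured, structured -> different structure,
    and structured -> default each start a new sequence.
    """
    return [
        [description for _, description in group]
        for _, group in groupby(elements, key=lambda element: element[0])
    ]


# Hypothetical, already-tagged elements from one section.
elements = [
    ("default", "centered heading"),
    ("default", "paragraph"),
    ("grid", "item with icon and short text"),
    ("grid", "item with icon and short text"),
    ("default", "two centered buttons"),
]
sequences = split_into_sequences(elements)
```

Here `sequences` holds three groups: the heading and paragraph, the two grid items, and the buttons.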

---

### Step 2.5: Survey Available Blocks

**STOP: Before making any authoring decisions, understand what blocks are available.**

**INVOKE block-inventory skill** to catalog available blocks.

**Why this matters:**
Real authors see a block library and choose from available options. You need the same context to make authentic authoring decisions following David's Model.

**What this provides:**
- Local blocks already in project
- Common Block Collection blocks that can be added
- Purpose/description for each block
- Live example URLs

**Example output:**
```
Available Blocks:

LOCAL BLOCKS:
- custom-banner: Special promotional banner
- testimonial-slider: Customer testimonials carousel

BLOCK COLLECTION AVAILABLE:
- hero: Large heading, text, buttons for page intro
- cards: Grid of items with images/text
- columns: Side-by-side content layout
- accordion: Expandable Q&A sections
- tabs: Switchable content panels
- carousel: Rotating image/content displays
- quote: Highlighted testimonials
- fragment: Reusable content sections
```
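
If the inventory is held as two name-to-purpose mappings, the listing above could be rendered with a helper like this one. The function `render_palette` and the dictionary shape are assumptions made for illustration; the block-inventory skill may return its results in a different form.

```python
def render_palette(local: dict[str, str], collection: dict[str, str]) -> str:
    """Format local and Block Collection blocks as a plain-text palette."""
    lines = ["Available Blocks:", "", "LOCAL BLOCKS:"]
    lines += [f"- {name}: {purpose}" for name, purpose in local.items()]
    lines += ["", "BLOCK COLLECTION AVAILABLE:"]
    lines += [f"- {name}: {purpose}" for name, purpose in collection.items()]
    return "\n".join(lines)
```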

---

## Output Format

This skill provides complete page structure:

**1. Section boundaries with styling:**
```
Section 1: light background
Section 2: light background
Section 3: grey background (#f5f5f5)
Section 4: dark background (#1a1a1a)
```

**2. Content sequences per section (neutral descriptions):**
```
Section 1 (light):
  - Sequence 1: Large centered heading, paragraph, two call-to-action buttons
  - Sequence 2: Two images displayed side-by-side

Section 2 (light):
  - Sequence 1: Single centered heading
  - Sequence 2: Grid of 8 items, each with icon and short text
  - Sequence 3: Two centered buttons

[Continue for all sections...]
```

**3. Block palette:**
```
LOCAL BLOCKS: [list]
BLOCK COLLECTION AVAILABLE: [list with purposes]
```

**Next step:** Pass these outputs to the authoring-analysis skill
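
One way to hand the three outputs over as a single artifact is to bundle them into JSON. The skill does not prescribe a handoff format, so the function `build_structure_report` and every key name below are hypothetical. The check that each section has at least one sequence reflects the requirement that both levels be analyzed.

```python
import json


def build_structure_report(
    section_styles: dict[int, str],
    sequences: dict[int, list[str]],
    local_blocks: list[str],
    collection_blocks: dict[str, str],
) -> str:
    """Bundle section styles, per-section sequences, and the block palette as JSON."""
    sections = []
    for number in sorted(section_styles):
        # Every section identified at level 1 needs level 2 sequences.
        if not sequences.get(number):
            raise ValueError(f"Section {number} has no content sequences")
        sections.append(
            {
                "number": number,
                "style": section_styles[number],
                "sequences": sequences[number],
            }
        )
    report = {
        "sections": sections,
        "block_palette": {"local": local_blocks, "collection": collection_blocks},
    }
    return json.dumps(report, indent=2)
```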

---

## Key Principles

**Two-level analysis is mandatory:**
- You MUST identify sections first (2a)
- Then analyze each section's content sequences (2b)
- Don't skip levels or combine them

**Stay neutral at this stage:**
- Describe WHAT you see, not WHAT it should be
- "Grid of items with images" not "Cards block"
- Authoring decisions come in next skill
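
A crude check for neutrality is to scan each description for known block names. This is a heuristic sketch, not part of the skill: `find_block_names` is a hypothetical helper, and whole-word matching will produce false positives for descriptions that legitimately use a word such as "columns".

```python
import re


def find_block_names(description: str, block_names: list[str]) -> list[str]:
    """Return block names that appear as whole words in description (case-insensitive)."""
    found = []
    for name in block_names:
        pattern = rf"\b{re.escape(name)}\b"
        if re.search(pattern, description, flags=re.IGNORECASE):
            found.append(name)
    return found
```

Any non-empty result is a prompt to reword the description in terms of what is visible rather than which block it might become.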

**Block inventory before decisions:**
- Survey blocks BEFORE making any authoring choices
- Authors see a library and choose; you need the same context

Related Skills

scrape-webpage

7
from ddttom/webcomponents-with-eds

Scrape webpage content, extract metadata, download images, and prepare for import/migration to AEM Edge Delivery Services. Returns analysis JSON with paths, metadata, cleaned HTML, and local images.

page-import

7
from ddttom/webcomponents-with-eds

Import a single webpage from any URL to structured HTML content for authoring in AEM Edge Delivery Services. Scrapes the page, analyzes structure, maps to existing blocks, and generates HTML for immediate local preview. Also triggered by terms like "migrate", "migration", or "migrating".

page-decomposition

7
from ddttom/webcomponents-with-eds

Analyze content sequences within a section and provide neutral descriptions for AEM Edge Delivery Services. Invoked per section during page import to identify breaking points between default content and blocks.

webapp-testing

7
from ddttom/webcomponents-with-eds

Toolkit for interacting with and testing local web applications using Playwright. Supports verifying frontend functionality, debugging UI behavior, capturing browser screenshots, and viewing browser logs.

theme-factory

7
from ddttom/webcomponents-with-eds

Toolkit for styling artifacts with a theme. These artifacts can be slides, docs, reports, HTML landing pages, etc. There are 10 pre-set themes with colors/fonts that you can apply to any artifact that has been created, or the skill can generate a new theme on the fly.

Testing Blocks

7
from ddttom/webcomponents-with-eds

Guide for testing code changes in AEM Edge Delivery projects including blocks, scripts, and styles. Use this skill after making code changes and before opening a pull request to validate functionality. Covers unit testing for utilities and logic, browser testing with Playwright/Puppeteer, linting, performance validation, and guidance on which tests to maintain vs use as throwaway validation.

template-skill

7
from ddttom/webcomponents-with-eds

Replace with description of the skill and when Claude should use it.

slack-gif-creator

7
from ddttom/webcomponents-with-eds

Toolkit for creating animated GIFs optimized for Slack, with validators for size constraints and composable animation primitives. This skill applies when users request animated GIFs or emoji animations for Slack from descriptions like "make me a GIF for Slack of X doing Y".

skill-developer

7
from ddttom/webcomponents-with-eds

Create and manage Claude Code skills following Anthropic best practices. Use when creating new skills, modifying skill-rules.json, understanding trigger patterns, working with hooks, debugging skill activation, or implementing progressive disclosure. Covers skill structure, YAML frontmatter, trigger types (keywords, intent patterns, file paths, content patterns), enforcement levels (block, suggest, warn), hook mechanisms (UserPromptSubmit, PreToolUse), session tracking, and the 500-line rule.

skill-creator

7
from ddttom/webcomponents-with-eds

Guide for creating effective skills. This skill should be used when users want to create a new skill (or update an existing skill) that extends Claude's capabilities with specialized knowledge, workflows, or tool integrations.

preview-import

7
from ddttom/webcomponents-with-eds

Preview and verify imported content in local AEM Edge Delivery Services dev server. Validates rendering, compares with original page, and troubleshoots common issues.

mcp-builder

7
from ddttom/webcomponents-with-eds

Guide for creating high-quality MCP (Model Context Protocol) servers that enable LLMs to interact with external services through well-designed tools. Use when building MCP servers to integrate external APIs or services, whether in Python (FastMCP) or Node/TypeScript (MCP SDK).