data-substrate-analysis

Analyze fundamental data primitives, type systems, and state management patterns in a codebase. Use when (1) evaluating typing strategies (Pydantic vs TypedDict vs loose dicts), (2) assessing immutability and mutation patterns, (3) understanding serialization approaches, (4) documenting state shape and lifecycle, or (5) comparing data modeling approaches across frameworks.

242 stars

byaiskillstore

View on GitHub Installation ↓

Best use case

data-substrate-analysis is best used when you need a repeatable AI agent workflow instead of a one-off prompt. It is especially useful for teams working in multi. Analyze fundamental data primitives, type systems, and state management patterns in a codebase. Use when (1) evaluating typing strategies (Pydantic vs TypedDict vs loose dicts), (2) assessing immutability and mutation patterns, (3) understanding serialization approaches, (4) documenting state shape and lifecycle, or (5) comparing data modeling approaches across frameworks.

Users should expect a more consistent workflow output, faster repeated execution, and less time spent rewriting prompts from scratch.

Practical example

Example input

Use the "data-substrate-analysis" skill to help with this workflow task. Context: Analyze fundamental data primitives, type systems, and state management patterns in a codebase. Use when (1) evaluating typing strategies (Pydantic vs TypedDict vs loose dicts), (2) assessing immutability and mutation patterns, (3) understanding serialization approaches, (4) documenting state shape and lifecycle, or (5) comparing data modeling approaches across frameworks.

Example output

A structured workflow result with clearer steps, more consistent formatting, and an output that is easier to reuse in the next run.

When to use this skill

Use this skill when you want a reusable workflow rather than writing the same prompt again and again.

When not to use this skill

Do not use this when you only need a one-off answer and do not need a reusable workflow.
Do not use it if you cannot install or maintain the related files, repository context, or supporting tools.

Installation

Claude Code / Cursor / Codex

$curl -o ~/.claude/skills/data-substrate-analysis/SKILL.md --create-dirs "https://raw.githubusercontent.com/aiskillstore/marketplace/main/skills/dowwie/data-substrate-analysis/SKILL.md"

Manual Installation

Download SKILL.md from GitHub
Place it in .claude/skills/data-substrate-analysis/SKILL.md inside your project
Restart your AI agent — it will auto-discover the skill

How data-substrate-analysis Compares

Feature / Agent	data-substrate-analysis	Standard Approach
Platform Support	Not specified	Limited / Varies
Context Awareness	High	Baseline
Installation Complexity	Unknown	N/A

Frequently Asked Questions

What does this skill do?

Where can I find the source code?

You can find the source code on GitHub using the link provided at the top of the page.

Related Guides

AI Agent for Product Research

Browse AI agent skills for product research, competitive analysis, customer discovery, and structured product decision support.

AI Agent for SaaS Idea Validation

Use AI agent skills for SaaS idea validation, market research, customer discovery, competitor analysis, and documenting startup hypotheses.

SKILL.md Source

# Data Substrate Analysis

Analyzes the fundamental units of data and state management patterns.

## Process

1. **Locate type files** — Find types.py, schema.py, models.py, state.py
2. **Classify typing** — Strict (Pydantic), structural (TypedDict), loose (dict)
3. **Analyze mutation** — In-place modification vs. copy-on-write
4. **Document serialization** — json(), dict(), pickle, custom methods

## Typing Strategy Classification

### Detection Patterns

| Strategy | Indicators | Files to Check |
|----------|-----------|----------------|
| **Pydantic** | `BaseModel`, `Field()`, `validator` | models.py, schema.py |
| **Dataclass** | `@dataclass`, `field()` | types.py, models.py |
| **TypedDict** | `TypedDict`, `Required[]`, `NotRequired[]` | types.py |
| **NamedTuple** | `NamedTuple`, `typing.NamedTuple` | types.py |
| **Loose** | `Dict[str, Any]`, plain `dict` | Throughout |

### Analysis Questions

- Are boundaries validated (API ingress/egress)?
- Is nesting depth reasonable (<3 levels)?
- Are optional fields explicit or implicit None?
- Version migration path (Pydantic V1 → V2)?

## Immutability Analysis

### Mutable Patterns (Risk Indicators)

```python
# In-place list modification
state.messages.append(msg)
state.history.extend(new_items)

# Direct dict mutation
state['key'] = value
state.update(new_data)

# Object attribute mutation
state.status = 'complete'
```

### Immutable Patterns (Safer)

```python
# Pydantic copy
new_state = state.model_copy(update={'key': value})

# Dataclass replace
new_state = replace(state, messages=[*state.messages, msg])

# Spread operator style
new_state = {**state, 'key': value}

# Frozen dataclass
@dataclass(frozen=True)
class State: ...
```

## Serialization Strategy

### Common Patterns

| Method | Code Pattern | Trade-offs |
|--------|-------------|------------|
| Pydantic JSON | `.model_dump_json()` | Type-safe, automatic |
| Pydantic Dict | `.model_dump()` | For internal use |
| Dataclass | `asdict(obj)` | Manual, no validation |
| Custom | `to_dict()`, `from_dict()` | Full control |
| Pickle | `pickle.dumps()` | Fast, fragile, security risk |
| JSON | `json.dumps(obj, default=...)` | Requires encoder |

### Questions to Answer

- Is serialization implicit (automatic) or explicit (manual)?
- How are nested objects handled?
- Is deserialization validated?
- What happens with unknown fields?

## Output Template

```markdown
## Data Substrate Analysis: [Framework Name]

### Typing Strategy
- **Primary Approach**: [Pydantic/Dataclass/TypedDict/Loose]
- **Key Files**: [List of files]
- **Nesting Depth**: [Shallow/Medium/Deep]
- **Validation**: [At boundaries/Everywhere/None]

### Core Primitives

| Type | Location | Purpose | Mutability |
|------|----------|---------|------------|
| Message | schema.py:L15 | Chat message | Immutable |
| State | state.py:L42 | Agent state | Mutable ⚠️ |
| Result | types.py:L78 | Tool output | Immutable |

### Mutation Analysis
- **Pattern**: [In-place/Copy-on-write/Mixed]
- **Risk Areas**: [List of mutable state locations]
- **Concurrency Safe**: [Yes/No/Partial]

### Serialization
- **Method**: [Pydantic/Custom/JSON]
- **Implicit/Explicit**: [Description]
- **Round-trip Tested**: [Yes/No/Unknown]
```

## Integration

- **Prerequisite**: `codebase-mapping` to identify type files
- **Feeds into**: `comparative-matrix` for typing decisions
- **Related**: `resilience-analysis` for error handling in serialization

Related Skills

log-analysis

242

from aiskillstore/marketplace

Analyze application logs to identify errors, performance issues, and security anomalies. Use when debugging issues, monitoring system health, or investigating incidents. Handles various log formats including Apache, Nginx, application logs, and JSON logs.

wireshark-network-traffic-analysis

242

from aiskillstore/marketplace

This skill should be used when the user asks to "analyze network traffic with Wireshark", "capture packets for troubleshooting", "filter PCAP files", "follow TCP/UDP streams", "detect network anomalies", "investigate suspicious traffic", or "perform protocol analysis". It provides comprehensive techniques for network packet capture, filtering, and analysis using Wireshark.

wireshark-analysis

242

from aiskillstore/marketplace

This skill should be used when the user asks to "analyze network traffic with Wireshark", "capture packets for troubleshooting", "filter PCAP files", "follow TCP/UDP streams", "dete...

vector-database-engineer

242

from aiskillstore/marketplace

Expert in vector databases, embedding strategies, and semantic search implementation. Masters Pinecone, Weaviate, Qdrant, Milvus, and pgvector for RAG applications, recommendation systems, and similar

team-composition-analysis

242

from aiskillstore/marketplace

This skill should be used when the user asks to "plan team structure", "determine hiring needs", "design org chart", "calculate compensation", "plan equity allocation", or requests organizational design and headcount planning for a startup.

stride-analysis-patterns

242

from aiskillstore/marketplace

Apply STRIDE methodology to systematically identify threats. Use when analyzing system security, conducting threat modeling sessions, or creating security documentation.

sqlmap-database-pentesting

242

from aiskillstore/marketplace

This skill should be used when the user asks to "automate SQL injection testing," "enumerate database structure," "extract database credentials using sqlmap," "dump tables and columns...

sqlmap-database-penetration-testing

242

from aiskillstore/marketplace

This skill should be used when the user asks to "automate SQL injection testing," "enumerate database structure," "extract database credentials using sqlmap," "dump tables and columns from a vulnerable database," or "perform automated database penetration testing." It provides comprehensive guidance for using SQLMap to detect and exploit SQL injection vulnerabilities.

market-sizing-analysis

242

from aiskillstore/marketplace

This skill should be used when the user asks to "calculate TAM", "determine SAM", "estimate SOM", "size the market", "calculate market opportunity", "what's the total addressable market", or requests market sizing analysis for a startup or business opportunity.

gdpr-data-handling

242

from aiskillstore/marketplace

Implement GDPR-compliant data handling with consent management, data subject rights, and privacy by design. Use when building systems that process EU personal data, implementing privacy controls, or conducting GDPR compliance reviews.

error-diagnostics-error-analysis

242

from aiskillstore/marketplace

You are an expert error analysis specialist with deep expertise in debugging distributed systems, analyzing production incidents, and implementing comprehensive observability solutions.

error-debugging-error-analysis

242

from aiskillstore/marketplace

You are an expert error analysis specialist with deep expertise in debugging distributed systems, analyzing production incidents, and implementing comprehensive observability solutions.