AI Agent Skill HUB

Codex

voice-analyze

Reverse-engineer voice profiles from sample content by analyzing writing patterns

104 stars

Best use case

voice-analyze is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

It is a strong fit for teams already working in Codex.

Teams using voice-analyze should expect more consistent output, faster repeated execution, and less prompt rewriting.

When to use this skill

You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

You only need a quick one-off answer and do not need a reusable workflow.
You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$ curl -o ~/.claude/skills/voice-analyze/SKILL.md --create-dirs "https://raw.githubusercontent.com/jmagly/aiwg/main/.agents/skills/voice-analyze/SKILL.md"

Manual Installation

1. Download SKILL.md from GitHub.
2. Place it in .claude/skills/voice-analyze/SKILL.md inside your project.
3. Restart your AI agent; it will auto-discover the skill.

How voice-analyze Compares

| Feature / Agent | voice-analyze | Standard Approach |
|---------|-------------|---------|
| Platform Support | Codex | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |

Frequently Asked Questions

What does this skill do?

Reverse-engineer voice profiles from sample content by analyzing writing patterns

Which AI agents support this skill?

This skill is designed for Codex.

Where can I find the source code?

The source code is in the jmagly/aiwg repository on GitHub, at .agents/skills/voice-analyze/SKILL.md.

SKILL.md Source

# voice-analyze

Reverse-engineer voice profiles from sample content by analyzing writing patterns.

## Triggers

Alternate expressions and non-obvious activations (primary phrases are matched automatically from the skill description):

- "voice fingerprint" → voice profile extraction
- "what's my writing style" → voice analysis

## Behavior

When triggered, this skill:

1. **Analyzes text samples** for:
   - Sentence structure and length patterns
   - Vocabulary sophistication and domain
   - Tone markers (formality, confidence, warmth)
   - Structural patterns (lists, examples, questions)
   - Perspective and voice choices

2. **Extracts measurable features**:
   - Average sentence length
   - Vocabulary complexity (syllables, word length)
   - Contraction usage
   - Personal pronoun frequency
   - Question density
   - List/bullet usage

3. **Maps features to voice dimensions**:
   - Statistical analysis → tone scale values (0-1)
   - Pattern detection → structure preferences
   - Vocabulary extraction → prefer/avoid lists

4. **Generates voice profile** matching the analyzed style
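
The measurable features in step 2 can be approximated with plain string processing. The following is a minimal sketch, not the skill's actual implementation: the function name `extract_features`, the pronoun sets, and the contraction regex are all assumptions made for illustration.

```python
import re

# Hypothetical word lists; the skill's real lists are not documented here.
FIRST_PERSON = {"i", "we", "me", "us", "my", "our"}
SECOND_PERSON = {"you", "your", "yours"}
# Approximate: the 's suffix also matches possessives such as "team's".
CONTRACTION = re.compile(r"[a-z]+'(?:t|s|re|ve|ll|d|m)")


def extract_features(text: str) -> dict:
    """Compute a few surface features; all rates are per 100 words."""
    text = text.replace("\u2019", "'")  # normalize curly apostrophes
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    words = [w.strip("'") for w in re.findall(r"[a-z']+", text.lower())]
    words = [w for w in words if w]
    n_words = max(len(words), 1)          # avoid division by zero on empty input
    n_sentences = max(len(sentences), 1)

    def per_100(count: int) -> float:
        return 100 * count / n_words

    return {
        "avg_sentence_length": len(words) / n_sentences,
        "avg_word_length": sum(len(w) for w in words) / n_words,
        "contractions_per_100": per_100(
            sum(bool(CONTRACTION.fullmatch(w)) for w in words)
        ),
        "first_person_per_100": per_100(sum(w in FIRST_PERSON for w in words)),
        "second_person_per_100": per_100(sum(w in SECOND_PERSON for w in words)),
        "question_density": sum(s.endswith("?") for s in sentences) / n_sentences,
    }
```

Calling `extract_features("I don't know. Do you? We can't say.")` counts two contractions in eight words, so `contractions_per_100` is 25.0. Sentence splitting on terminal punctuation is crude and will miscount abbreviations such as "e.g.".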

## Usage Examples

### Analyze Existing Documentation
```
User: "Analyze this writing style" + [paste technical docs]

Analysis:
- Formality: 0.7 (no contractions, structured sentences)
- Confidence: 0.85 (direct statements, few hedges)
- Warmth: 0.25 (impersonal, third-person)
- Complexity: 0.8 (technical vocabulary, long sentences)

Output: analyzed-technical-docs.yaml
```

### Match Brand Voice
```
User: "Extract voice from our marketing copy" + [paste samples]

Analysis:
- Formality: 0.3 (conversational, contractions)
- Confidence: 0.7 (benefit claims, but some hedging)
- Warmth: 0.85 (second person, friendly tone)
- Energy: 0.8 (exclamation points, action verbs)

Output: brand-marketing-voice.yaml
```

### Capture Personal Style
```
User: "Create profile from my blog posts" + [paste samples]

Analysis:
- Identifies personal writing quirks
- Extracts signature phrases
- Maps to voice dimensions

Output: personal-blog-voice.yaml
```

## Analysis Methodology

### Feature Extraction

| Feature | Measurement | Maps To |
|---------|-------------|---------|
| Sentence length | Avg words/sentence | complexity |
| Contractions | Frequency per 100 words | formality (inverse) |
| First person ("I", "we") | Frequency | warmth |
| Second person ("you") | Frequency | warmth |
| Passive voice | Percentage of sentences | confidence (inverse) |
| Questions | Per paragraph | warmth, engagement |
| Hedging words | Frequency of "might", "perhaps", "could" | confidence (inverse) |
| Exclamation marks | Frequency | energy |
| Technical terms | Domain vocabulary density | complexity |

### Dimension Calibration

**Formality** (0-1):
- 0.0-0.3: Contractions frequent, casual language, fragments okay
- 0.4-0.6: Mixed style, professional but accessible
- 0.7-1.0: No contractions, complete sentences, formal structure

**Confidence** (0-1):
- 0.0-0.3: Many hedges ("might", "perhaps"), questions, qualifiers
- 0.4-0.6: Balanced certainty, occasional hedges
- 0.7-1.0: Direct statements, conclusions first, few qualifiers

**Warmth** (0-1):
- 0.0-0.3: Third person, passive voice, clinical tone
- 0.4-0.6: Professional but personable
- 0.7-1.0: Second person, inclusive language, empathetic

**Energy** (0-1):
- 0.0-0.3: Calm, measured, understated
- 0.4-0.6: Balanced engagement
- 0.7-1.0: Exclamation marks, action verbs, dynamic phrasing

**Complexity** (0-1):
- 0.0-0.3: Short sentences, simple vocabulary, accessible
- 0.4-0.6: Moderate complexity, clear but nuanced
- 0.7-1.0: Long sentences, technical vocabulary, layered ideas
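
The bands above leave small gaps (for example, 0.65 sits between the middle and upper ranges). A sketch of one way to look up a band, assuming the cutoffs are placed at the midpoints of those gaps (0.35 and 0.65); that choice, and the names `band` and `FORMALITY_NOTES`, are assumptions rather than documented behavior.

```python
def band(score: float) -> str:
    """Return 'low', 'mid', or 'high' for a 0-1 dimension score.

    Cutoffs at 0.35 and 0.65 are an assumption that splits the
    undocumented gaps between the listed ranges.
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"score must be within 0-1, got {score}")
    if score < 0.35:
        return "low"
    if score < 0.65:
        return "mid"
    return "high"


# Descriptions copied from the Formality calibration above.
FORMALITY_NOTES = {
    "low": "Contractions frequent, casual language, fragments okay",
    "mid": "Mixed style, professional but accessible",
    "high": "No contractions, complete sentences, formal structure",
}


def describe_formality(score: float) -> str:
    return FORMALITY_NOTES[band(score)]
```

With these cutoffs, every score in a documented range maps to that range's description, and in-between values fall to the nearer band.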

### Vocabulary Extraction

**Signature phrases** - Identified by:
- Repeated patterns across samples
- Distinctive constructions
- Opening/closing patterns

**Domain vocabulary** - Extracted by:
- Technical term frequency
- Specialized jargon
- Industry-specific language

**Avoid patterns** - Detected by:
- Conspicuous absence of common phrases
- Consistent avoidance of certain constructions
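
"Repeated patterns across samples" can be approximated by counting word n-grams that occur in more than one sample. This is a sketch under that assumption; `signature_phrases`, the default n-gram size, and the `min_samples` threshold are illustrative names and values, not part of the skill.

```python
import re
from collections import Counter


def ngrams(text: str, n: int) -> set:
    """Distinct word n-grams in one sample, lowercased."""
    words = re.findall(r"[a-z']+", text.lower())
    return {" ".join(words[i:i + n]) for i in range(len(words) - n + 1)}


def signature_phrases(samples: list, n: int = 3, min_samples: int = 2) -> list:
    """Return n-grams that appear in at least `min_samples` different samples."""
    counts = Counter()
    for sample in samples:
        # ngrams() returns a set, so each sample counts a phrase at most once.
        counts.update(ngrams(sample, n))
    return sorted(p for p, c in counts.items() if c >= min_samples)
```

For instance, given the samples "The key point is speed." and "The key point is cost.", the function returns `["key point is", "the key point"]`. Counting per sample rather than per occurrence keeps one repetitive document from dominating the result.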

## Output Format

```yaml
name: analyzed-sample-voice
version: 1.0.0
description: Voice profile extracted from sample content
analysis_source:
  sample_size: 1500  # words analyzed
  sample_count: 3    # number of samples
  confidence: 0.85   # analysis confidence score
tone:
  formality: 0.65
  confidence: 0.8
  warmth: 0.4
  energy: 0.5
  complexity: 0.7
vocabulary:
  prefer:
    - "extracted signature phrase 1"
    - "detected domain terminology"
  avoid:
    - "patterns not found in samples"
  signature_phrases:
    - "The key point is..."
    - "This demonstrates..."
structure:
  sentence_length: medium    # avg 15-20 words
  paragraph_length: medium   # avg 4-6 sentences
  sentence_variety: high     # varied structure detected
  use_lists: when-appropriate
  use_examples: frequently
  use_questions: rarely
perspective:
  person: third
  voice: active
  tense: present
extracted_patterns:
  opening_style: "context-first"
  closing_style: "conclusion-summary"
  transition_style: "logical-flow"
```
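
Before passing a profile on to another skill, it can help to sanity-check it once it has been loaded into a dictionary. The check below is a rough sketch and not a substitute for the referenced `voice-profile.schema.json`; which keys are treated as required is an assumption based only on the example above.

```python
TONE_KEYS = ("formality", "confidence", "warmth", "energy", "complexity")


def validate_profile(profile: dict) -> list:
    """Return a list of human-readable problems; empty means none found."""
    problems = []
    # Assumed required keys, taken from the example profile only.
    for key in ("name", "version", "tone"):
        if key not in profile:
            problems.append(f"missing top-level key: {key}")
    tone = profile.get("tone") or {}
    for key in TONE_KEYS:
        value = tone.get(key)
        is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
        if not is_number or not 0.0 <= value <= 1.0:
            problems.append(f"tone.{key} must be a number within 0-1, got {value!r}")
    return problems
```

A profile with the tone values from the example passes with an empty list, while a value such as `warmth: 1.4` or a missing dimension produces one message per problem.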

## CLI Usage

```bash
# Analyze from file
python voice_analyzer.py --input sample.txt

# Analyze from multiple files
python voice_analyzer.py --input "sample1.txt,sample2.txt,sample3.txt"

# Analyze from stdin (pipe content)
cat sample.txt | python voice_analyzer.py --stdin

# Specify output name
python voice_analyzer.py --input sample.txt --name my-extracted-voice

# Output to specific directory
python voice_analyzer.py --input sample.txt --output .aiwg/voices/

# JSON output for inspection
python voice_analyzer.py --input sample.txt --json
```
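
For readers who want to wrap or reimplement the command line, the flags shown above could be parsed roughly as follows. This is a sketch using `argparse`, not the actual source of `voice_analyzer.py`; making `--input` and `--stdin` mutually exclusive and splitting `--input` on commas are assumptions inferred from the examples.

```python
import argparse


def split_paths(value: str) -> list:
    """Split a comma-separated --input value into individual paths."""
    return [p.strip() for p in value.split(",") if p.strip()]


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="voice_analyzer.py")
    # Assumption: exactly one input source must be given.
    source = parser.add_mutually_exclusive_group(required=True)
    source.add_argument("--input", type=split_paths, help="file or comma-separated files")
    source.add_argument("--stdin", action="store_true", help="read sample from stdin")
    parser.add_argument("--name", help="name for the generated profile")
    parser.add_argument("--output", help="directory to write the profile to")
    parser.add_argument("--json", action="store_true", help="emit JSON for inspection")
    return parser


if __name__ == "__main__":
    print(build_parser().parse_args())
```

Parsing `--input "sample1.txt,sample2.txt" --json` yields a two-element `input` list with `json` set to true.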

## Integration

- **Output**: Creates profiles usable by `voice-apply`
- **Chain**: `voice-analyze` → `voice-create` (to refine) → `voice-apply`
- **Chain**: `voice-analyze` + `voice-analyze` → `voice-blend` (combine styles)

## Accuracy Considerations

- **Minimum sample**: 500+ words for reliable analysis
- **Multiple samples**: 3+ samples improve accuracy
- **Consistent genre**: Mixing genres reduces accuracy
- **Confidence score**: Output includes analysis confidence (0-1)
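
These thresholds are easy to check before running an analysis. The sketch below assumes the 500-word minimum applies to the combined word count of all samples, which the source does not state explicitly; `sample_warnings` is a hypothetical helper name.

```python
def sample_warnings(samples: list, min_words: int = 500, min_samples: int = 3) -> list:
    """Return warnings when the input falls short of the suggested minimums."""
    total_words = sum(len(s.split()) for s in samples)
    warnings = []
    if total_words < min_words:
        warnings.append(
            f"only {total_words} words supplied; {min_words}+ recommended"
        )
    if len(samples) < min_samples:
        warnings.append(
            f"only {len(samples)} sample(s) supplied; {min_samples}+ recommended"
        )
    return warnings
```

An empty list means both minimums are met. The genre-consistency guideline is not checked here because it cannot be judged from word counts alone.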

## References

- Schema: `../../../schemas/voice-profile.schema.json`
- Dimensions guide: `../voice-apply/references/voice-dimensions.md`
- Generator: `../voice-create/scripts/voice_generator.py`

Related Skills

repo-analyzer

from jmagly/aiwg

Analyze GitHub repositories for structure, documentation, dependencies, and contribution patterns. Use for codebase understanding and health assessment.

voice-to-soul

from jmagly/aiwg

Generate a SOUL.md from an existing AIWG voice profile

voice-create

from jmagly/aiwg

Generate custom voice profiles from natural language descriptions by mapping tone, formality, and domain to voice dimensions

voice-blend

from jmagly/aiwg

Combine multiple voice profiles with weighted mixing to create hybrid voices

voice-apply

from jmagly/aiwg

Applies a voice profile to transform content. Use when user asks to write in a specific voice, match a tone, apply a style, or transform content to sound like a particular voice profile.

soul-to-voice

from jmagly/aiwg

Generate an AIWG voice profile from an existing SOUL.md identity file

roko-voice

from jmagly/aiwg

Transform standard technical content into ROKO voice — dense technical mythology wrapped in cypherpunk cultural narrative

retrospective-analyzer

from jmagly/aiwg

Analyze team retrospectives for insights

analyze-artist

from jmagly/aiwg

Analyze an artist's discography to identify eras, catalog structure, and collection plan

aiwg-orchestrate

from jmagly/aiwg

Route structured artifact work to AIWG workflows via MCP with zero parent context cost

venv-manager

from jmagly/aiwg

Create, manage, and validate Python virtual environments. Use for project isolation and dependency management.

pytest-runner

from jmagly/aiwg

Execute Python tests with pytest, supporting fixtures, markers, coverage, and parallel execution. Use for Python test automation.