voice-analyze
Reverse-engineer voice profiles from sample content by analyzing writing patterns
Best use case
voice-analyze is best used when you need a repeatable AI agent workflow instead of a one-off prompt.
It is a strong fit for teams already working in Codex.
Teams using voice-analyze should expect more consistent output, faster repeated execution, and less prompt rewriting.
When to use this skill
- You want a reusable workflow that can be run more than once with consistent structure.
When not to use this skill
- You only need a quick one-off answer and do not need a reusable workflow.
- You cannot install or maintain the underlying files, dependencies, or repository context.
Installation
Claude Code / Cursor / Codex
Manual Installation
- Download SKILL.md from GitHub
- Place it in `.claude/skills/voice-analyze/SKILL.md` inside your project
- Restart your AI agent; it will auto-discover the skill
How voice-analyze Compares
| Feature / Agent | voice-analyze | Standard Approach |
|---|---|---|
| Platform Support | Codex | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |
Frequently Asked Questions
What does this skill do?
It reverse-engineers voice profiles from sample content by analyzing writing patterns.
Which AI agents support this skill?
This skill is designed for Codex.
Where can I find the source code?
You can find the source code on GitHub using the link provided at the top of the page.
SKILL.md Source
# voice-analyze
Reverse-engineer voice profiles from sample content by analyzing writing patterns.
## Triggers
Alternate expressions and non-obvious activations (primary phrases are matched automatically from the skill description):
- "voice fingerprint" → voice profile extraction
- "what's my writing style" → voice analysis
## Behavior
When triggered, this skill:
1. **Analyzes text samples** for:
   - Sentence structure and length patterns
   - Vocabulary sophistication and domain
   - Tone markers (formality, confidence, warmth)
   - Structural patterns (lists, examples, questions)
   - Perspective and voice choices
2. **Extracts measurable features**:
   - Average sentence length
   - Vocabulary complexity (syllables, word length)
   - Contraction usage
   - Personal pronoun frequency
   - Question density
   - List/bullet usage
3. **Maps features to voice dimensions**:
   - Statistical analysis → tone scale values (0-1)
   - Pattern detection → structure preferences
   - Vocabulary extraction → prefer/avoid lists
4. **Generates voice profile** matching the analyzed style
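
As a rough illustration of step 2, the sketch below computes a few of the listed surface features with the Python standard library. It is not the code in `voice_analyzer.py`: the function name `extract_features`, the pronoun set, and the contraction pattern are all assumptions made for this example, and the sentence splitter is deliberately naive.

```python
import re

# Hypothetical helper for illustration only; not taken from voice_analyzer.py.
# Matches straight-apostrophe forms such as "don't" or "it's". Possessives like
# "John's" are also counted, which a real analyzer would need to handle.
CONTRACTION_RE = re.compile(r"\b\w+'(?:t|s|re|ve|ll|d|m)\b", re.IGNORECASE)

# Assumed first- and second-person pronoun set.
PRONOUNS = {"i", "we", "you", "me", "us", "my", "our", "your"}


def extract_features(text: str) -> dict:
    """Compute a few surface features of the kind listed above."""
    # Naive split: a sentence ends at '.', '!' or '?' followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    words = re.findall(r"[A-Za-z']+", text)
    n_words = len(words) or 1          # avoid division by zero on empty input
    n_sentences = len(sentences) or 1
    return {
        "avg_sentence_length": len(words) / n_sentences,
        "contractions_per_100_words": 100 * len(CONTRACTION_RE.findall(text)) / n_words,
        "pronouns_per_100_words": 100 * sum(w.lower() in PRONOUNS for w in words) / n_words,
        "question_ratio": sum(s.endswith("?") for s in sentences) / n_sentences,
    }


features = extract_features("We don't ship bugs. Do you agree? It's fine.")
print(features["avg_sentence_length"])  # 9 words over 3 sentences
```

Abbreviations ("e.g.") and curly apostrophes would throw off these counts, so a production analyzer would likely use a proper tokenizer.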
## Usage Examples
### Analyze Existing Documentation
```
User: "Analyze this writing style" + [paste technical docs]
Analysis:
- Formality: 0.7 (no contractions, structured sentences)
- Confidence: 0.85 (direct statements, few hedges)
- Warmth: 0.25 (impersonal, third-person)
- Complexity: 0.8 (technical vocabulary, long sentences)
Output: analyzed-technical-docs.yaml
```
### Match Brand Voice
```
User: "Extract voice from our marketing copy" + [paste samples]
Analysis:
- Formality: 0.3 (conversational, contractions)
- Confidence: 0.7 (benefit claims, but some hedging)
- Warmth: 0.85 (second person, friendly tone)
- Energy: 0.8 (exclamation points, action verbs)
Output: brand-marketing-voice.yaml
```
### Capture Personal Style
```
User: "Create profile from my blog posts" + [paste samples]
Analysis:
- Identifies personal writing quirks
- Extracts signature phrases
- Maps to voice dimensions
Output: personal-blog-voice.yaml
```
## Analysis Methodology
### Feature Extraction
| Feature | Measurement | Maps To |
|---------|-------------|---------|
| Sentence length | Avg words/sentence | complexity |
| Contractions | Frequency per 100 words | formality (inverse) |
| First person ("I", "we") | Frequency | warmth |
| Second person ("you") | Frequency | warmth |
| Passive voice | Percentage of sentences | confidence (inverse) |
| Questions | Per paragraph | warmth, engagement |
| Hedging words | Frequency of "might", "perhaps", "could" | confidence (inverse) |
| Exclamation marks | Frequency | energy |
| Technical terms | Domain vocabulary density | complexity |
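
One plausible way to turn a raw measurement from the table into a 0-1 dimension value is clamped linear scaling, with an `inverse` flag for rows marked "(inverse)". The skill does not document its actual formula, so the `scale` function and the calibration ranges below are assumptions chosen only to make the idea concrete.

```python
def scale(value: float, low: float, high: float, inverse: bool = False) -> float:
    """Linearly map value from [low, high] onto [0, 1], clamping outside the range."""
    if high <= low:
        raise ValueError("high must be greater than low")
    score = (value - low) / (high - low)
    score = min(1.0, max(0.0, score))
    # Inverse features (e.g. contractions -> formality) flip the scale.
    return 1.0 - score if inverse else score


# Assumed calibration: 0 contractions per 100 words is fully formal, 5+ is fully casual.
formality = scale(2.5, low=0.0, high=5.0, inverse=True)

# Assumed calibration: 10 words per sentence is simple, 30+ is complex.
complexity = scale(20.0, low=10.0, high=30.0)

print(formality, complexity)
```

When several features map to the same dimension (first person, second person, and questions all feed warmth), the individual scores would still need to be combined, for example by averaging.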
### Dimension Calibration
**Formality** (0-1):
- 0.0-0.3: Contractions frequent, casual language, fragments okay
- 0.4-0.6: Mixed style, professional but accessible
- 0.7-1.0: No contractions, complete sentences, formal structure
**Confidence** (0-1):
- 0.0-0.3: Many hedges ("might", "perhaps"), questions, qualifiers
- 0.4-0.6: Balanced certainty, occasional hedges
- 0.7-1.0: Direct statements, conclusions first, few qualifiers
**Warmth** (0-1):
- 0.0-0.3: Third person, passive voice, clinical tone
- 0.4-0.6: Professional but personable
- 0.7-1.0: Second person, inclusive language, empathetic
**Energy** (0-1):
- 0.0-0.3: Calm, measured, understated
- 0.4-0.6: Balanced engagement
- 0.7-1.0: Exclamation marks, action verbs, dynamic phrasing
**Complexity** (0-1):
- 0.0-0.3: Short sentences, simple vocabulary, accessible
- 0.4-0.6: Moderate complexity, clear but nuanced
- 0.7-1.0: Long sentences, technical vocabulary, layered ideas
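
The calibration bands can be read as a simple lookup from score to band. The listed ranges leave small gaps (0.3 to 0.4 and 0.6 to 0.7), so the sketch below assumes cut points at 0.35 and 0.65; both the cut points and the `band` function name are illustrative choices, not part of the skill.

```python
def band(score: float) -> str:
    """Classify a 0-1 dimension score as 'low', 'mid', or 'high'."""
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"score {score} is outside the 0-1 range")
    # Assumed cut points that split the gaps between the documented ranges.
    if score < 0.35:
        return "low"
    if score < 0.65:
        return "mid"
    return "high"


# Scores from the technical-docs usage example earlier in this document.
for name, score in {"formality": 0.7, "warmth": 0.25}.items():
    print(name, band(score))
```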
### Vocabulary Extraction
**Signature phrases** - Identified by:
- Repeated patterns across samples
- Distinctive constructions
- Opening/closing patterns
**Domain vocabulary** - Extracted by:
- Technical term frequency
- Specialized jargon
- Industry-specific language
**Avoid patterns** - Detected by:
- Conspicuous absence of common phrases
- Consistent avoidance of certain constructions
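
"Repeated patterns across samples" can be approximated by counting word n-grams that show up in more than one sample. The sketch below is one simple way to do that, assuming a hypothetical `repeated_phrases` helper; the skill may well use a different technique.

```python
import re
from collections import Counter


def repeated_phrases(samples: list[str], n: int = 3, min_samples: int = 2) -> list[str]:
    """Return word n-grams that occur in at least min_samples different samples."""
    sample_counts = Counter()
    for sample in samples:
        words = re.findall(r"[a-z']+", sample.lower())
        # Use a set so each sample contributes a given phrase at most once.
        ngrams = {" ".join(words[i:i + n]) for i in range(len(words) - n + 1)}
        sample_counts.update(ngrams)
    return sorted(p for p, count in sample_counts.items() if count >= min_samples)


samples = [
    "The key point is speed.",
    "Remember, the key point is clarity.",
    "Nothing shared here.",
]
print(repeated_phrases(samples))  # → ['key point is', 'the key point']
```

Counting per sample rather than per occurrence keeps one repetitive document from dominating the result. Very common n-grams ("one of the") would still need filtering before they are treated as signature phrases.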
## Output Format
```yaml
name: analyzed-sample-voice
version: 1.0.0
description: Voice profile extracted from sample content
analysis_source:
  sample_size: 1500    # words analyzed
  sample_count: 3      # number of samples
  confidence: 0.85     # analysis confidence score
tone:
  formality: 0.65
  confidence: 0.8
  warmth: 0.4
  energy: 0.5
  complexity: 0.7
vocabulary:
  prefer:
    - "extracted signature phrase 1"
    - "detected domain terminology"
  avoid:
    - "patterns not found in samples"
  signature_phrases:
    - "The key point is..."
    - "This demonstrates..."
structure:
  sentence_length: medium      # avg 15-20 words
  paragraph_length: medium     # avg 4-6 sentences
  sentence_variety: high       # varied structure detected
  use_lists: when-appropriate
  use_examples: frequently
  use_questions: rarely
perspective:
  person: third
  voice: active
  tense: present
extracted_patterns:
  opening_style: "context-first"
  closing_style: "conclusion-summary"
  transition_style: "logical-flow"
```
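
A quick sanity check on a generated profile is to confirm that every tone value is a number between 0 and 1. The sketch below works on a profile that has already been parsed into a Python dict (parsing the YAML itself would need a third-party library such as PyYAML). The `tone_errors` function is hypothetical, and the authoritative rules live in the schema listed under References, not here.

```python
# The five tone dimensions described in this document.
TONE_KEYS = ("formality", "confidence", "warmth", "energy", "complexity")


def tone_errors(profile: dict) -> list[str]:
    """Return a list of problems found in the profile's tone section."""
    errors = []
    tone = profile.get("tone") or {}
    for key in TONE_KEYS:
        value = tone.get(key)
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            errors.append(f"tone.{key} is missing or not a number")
        elif not 0.0 <= value <= 1.0:
            errors.append(f"tone.{key}={value} is outside 0-1")
    return errors


profile = {
    "name": "analyzed-sample-voice",
    "tone": {"formality": 0.65, "confidence": 0.8, "warmth": 0.4,
             "energy": 0.5, "complexity": 0.7},
}
print(tone_errors(profile))  # → []
```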
## CLI Usage
```bash
# Analyze from file
python voice_analyzer.py --input sample.txt
# Analyze from multiple files
python voice_analyzer.py --input "sample1.txt,sample2.txt,sample3.txt"
# Analyze from stdin (pipe content)
cat sample.txt | python voice_analyzer.py --stdin
# Specify output name
python voice_analyzer.py --input sample.txt --name my-extracted-voice
# Output to specific directory
python voice_analyzer.py --input sample.txt --output .aiwg/voices/
# JSON output for inspection
python voice_analyzer.py --input sample.txt --json
```
## Integration
- **Output**: Creates profiles usable by `voice-apply`
- **Chain**: `voice-analyze` → `voice-create` (to refine) → `voice-apply`
- **Chain**: `voice-analyze` (source A) + `voice-analyze` (source B) → `voice-blend` (combine the two extracted styles)
## Accuracy Considerations
- **Minimum sample**: 500+ words for reliable analysis
- **Multiple samples**: 3+ samples improve accuracy
- **Consistent genre**: Mixing genres reduces accuracy
- **Confidence score**: Output includes analysis confidence (0-1)
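
These guidelines are easy to check before running an analysis. The sketch below is a hypothetical pre-flight helper, not part of the skill; it assumes the 500-word minimum applies to the combined word count of all samples, which the text above does not specify.

```python
def check_samples(samples: list[str], min_words: int = 500, min_samples: int = 3) -> list[str]:
    """Return warnings for a sample set that falls below the guidelines above."""
    warnings = []
    # Assumption: the word minimum is measured across all samples together.
    total_words = sum(len(sample.split()) for sample in samples)
    if total_words < min_words:
        warnings.append(f"only {total_words} words; {min_words}+ recommended")
    if len(samples) < min_samples:
        warnings.append(f"only {len(samples)} sample(s); {min_samples}+ recommended")
    return warnings


# One long sample: enough words, but too few samples.
print(check_samples(["word " * 600]))
```

Genre consistency is harder to check mechanically and is left out of this sketch.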
## References
- Schema: `../../../schemas/voice-profile.schema.json`
- Dimensions guide: `../voice-apply/references/voice-dimensions.md`
- Generator: `../voice-create/scripts/voice_generator.py`

Related Skills
repo-analyzer
Analyze GitHub repositories for structure, documentation, dependencies, and contribution patterns. Use for codebase understanding and health assessment.
voice-to-soul
Generate a SOUL.md from an existing AIWG voice profile
voice-create
Generate custom voice profiles from natural language descriptions by mapping tone, formality, and domain to voice dimensions
voice-blend
Combine multiple voice profiles with weighted mixing to create hybrid voices
voice-apply
Applies a voice profile to transform content. Use when user asks to write in a specific voice, match a tone, apply a style, or transform content to sound like a particular voice profile.
soul-to-voice
Generate an AIWG voice profile from an existing SOUL.md identity file
roko-voice
Transform standard technical content into ROKO voice — dense technical mythology wrapped in cypherpunk cultural narrative
retrospective-analyzer
Analyze team retrospectives for insights
analyze-artist
Analyze an artist's discography to identify eras, catalog structure, and collection plan
aiwg-orchestrate
Route structured artifact work to AIWG workflows via MCP with zero parent context cost
venv-manager
Create, manage, and validate Python virtual environments. Use for project isolation and dependency management.
pytest-runner
Execute Python tests with pytest, supporting fixtures, markers, coverage, and parallel execution. Use for Python test automation.