ai-writing-detection

Comprehensive AI writing detection patterns and methodology. Provides vocabulary lists, structural patterns, model-specific fingerprints, and false positive prevention guidance. Use when analyzing text for AI authorship or understanding detection patterns.


by diegosouzapw


Best use case

ai-writing-detection is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Teams using ai-writing-detection can expect more consistent output, faster repeated execution, and less prompt rewriting.

When to use this skill

You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

You only need a quick one-off answer and do not need a reusable workflow.
You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$ curl -o ~/.claude/skills/ai-writing-detection/SKILL.md --create-dirs "https://raw.githubusercontent.com/diegosouzapw/awesome-omni-skill/main/skills/communication/ai-writing-detection/SKILL.md"

Manual Installation

1. Download SKILL.md from GitHub.
2. Place it in .claude/skills/ai-writing-detection/SKILL.md inside your project.
3. Restart your AI agent; it will auto-discover the skill.

How ai-writing-detection Compares

| Feature / Agent | ai-writing-detection | Standard Approach |
|-----------------|----------------------|-------------------|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |

Frequently Asked Questions

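What does this skill do?

It provides AI writing detection patterns and methodology: vocabulary lists, structural patterns, model-specific fingerprints, and false positive prevention guidance.

Where can I find the source code?

The source is in the diegosouzapw/awesome-omni-skill repository on GitHub, under skills/communication/ai-writing-detection.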

SKILL.md Source

# AI Writing Detection Reference

Expert-level knowledge base for detecting AI-generated text, compiled from academic research, commercial detection tools, and empirical analysis.

## Quick Reference: High-Confidence Signals

These indicators strongly suggest AI authorship when found together:

### Vocabulary Red Flags
**High-signal words** (50-700x more common in AI text):
- "delve", "tapestry", "nuanced", "multifaceted", "underscore"
- "intricate interplay", "played a crucial role", "complex and multifaceted"
- "paramount", "pivotal", "meticulous", "holistic", "robust"
- "stands/serves as", "marking a pivotal moment", "underscores its importance"

**Overused phrases**:
- "It's important to note that..."
- "In today's fast-paced world..."
- "At its core..."
- "Without further ado..."
- "Let me explain..."

See [reference/vocabulary-patterns.md](reference/vocabulary-patterns.md) for complete lists.
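
The vocabulary check above can be sketched as a simple frequency count. This is a minimal illustration, not the skill's own implementation: `FLAGGED_TERMS` is a small hypothetical subset of the terms listed above, and the function names and the per-1,000-words normalization are assumptions made for the example.

```python
import re
from collections import Counter

# Illustrative subset of the terms listed above; not the full reference list.
FLAGGED_TERMS = [
    "delve", "tapestry", "nuanced", "multifaceted", "underscore",
    "paramount", "pivotal", "meticulous", "holistic", "robust",
    "it's important to note that", "at its core",
    "in today's fast-paced world",
]


def flagged_term_counts(text: str) -> Counter:
    """Count case-insensitive matches for each flagged term.

    A trailing \\w* lets "underscore" also match "underscores".
    """
    lowered = text.lower().replace("\u2019", "'")  # normalize curly apostrophes
    counts = Counter()
    for term in FLAGGED_TERMS:
        hits = len(re.findall(r"\b" + re.escape(term) + r"\w*", lowered))
        if hits:
            counts[term] = hits
    return counts


def flagged_per_thousand_words(text: str) -> float:
    """Flagged-term hits per 1,000 words (0.0 for empty text)."""
    words = len(re.findall(r"\b\w+\b", text))
    total = sum(flagged_term_counts(text).values())
    return 1000 * total / words if words else 0.0
```

A high rate is only one signal; a cluster of hits should still be corroborated by the structural and content checks before it counts toward an assessment.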

### Structural Red Flags
- **Uniform sentence lengths**: 12-18 words consistently (low burstiness)
- **Tricolon structures**: "research, collaboration, and problem-solving"
- **Em dash overuse**: AI uses em dashes in a formulaic way to mimic "punched up" sales writing, especially in parallelisms ("it's not X — it's Y"); swapping punctuation doesn't fix the underlying emphasis pattern
- **Perfect paragraph uniformity**: All paragraphs same approximate length
- **Template conclusions**: "In summary...", "In conclusion..."
- **Negative parallelisms**: "It's not about X; it's about Y"
- **Elegant variation**: Cycling through synonyms to avoid repetition
- **False ranges**: "From X to Y" with incoherent endpoints

See [reference/structural-patterns.md](reference/structural-patterns.md) for details.
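
One way to put a number on "low burstiness" is the coefficient of variation of sentence length. The sketch below assumes that definition (standard deviation divided by mean), which is a choice made for this example rather than a metric taken from the reference files, and it uses a naive sentence splitter.

```python
import re
import statistics


def sentence_lengths(text: str) -> list[int]:
    """Split after ., ! or ? followed by whitespace; return words per sentence."""
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    return [len(s.split()) for s in sentences]


def burstiness(text: str) -> float:
    """Coefficient of variation of sentence length (stdev / mean).

    Lower values mean more uniform sentence lengths.
    Returns 0.0 when there are fewer than two sentences.
    """
    lengths = sentence_lengths(text)
    if len(lengths) < 2:
        return 0.0
    mean = statistics.mean(lengths)
    return statistics.pstdev(lengths) / mean if mean else 0.0
```

The splitter will miscount abbreviations such as "e.g." and decimal numbers, so treat the value as a rough indicator and compare it across texts of the same genre rather than against a fixed cutoff.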

### Content Red Flags
- **Importance puffery**: "marking a pivotal moment in history"
- **Ecosystem/conservation claims** without citations
- **"Challenges and Future" sections** following rigid formula
- **Promotional language**: "nestled in", "stunning natural beauty", "boasts"
- **Superficial analyses**: trailing "-ing" phrases that assert significance without support (e.g., "..., underscoring its importance")

See [reference/content-patterns.md](reference/content-patterns.md) for details.

### Formatting Red Flags
- **Title Case** in all section headings
- **Excessive boldface** (every key term bolded)
- **Inline-header lists**: `**Bold Header**: description` pattern
- **Emojis** in formal content or headings
- **Subject lines** in non-email contexts

See [reference/formatting-patterns.md](reference/formatting-patterns.md) for details.
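
The formatting checks above lend themselves to a few regular expressions over Markdown source. This is a rough sketch under stated assumptions: the `MINOR_WORDS` list, the Title Case heuristic, and all function names are hypothetical choices for the example, and emoji and subject-line checks are left out.

```python
import re

HEADING_RE = re.compile(r"^#{1,6}[ \t]+(.+)$", re.MULTILINE)
BOLD_RE = re.compile(r"\*\*[^*\n]+\*\*")
# List items that open with "**Header**: text" or "**Header:** text"
INLINE_HEADER_RE = re.compile(
    r"^[ \t]*[-*][ \t]+\*\*[^*\n]+(?::\*\*|\*\*[ \t]*:)", re.MULTILINE
)
# Hypothetical list of words that stay lowercase in Title Case
MINOR_WORDS = {"a", "an", "and", "as", "at", "but", "by", "for",
               "in", "of", "on", "or", "the", "to", "vs", "with"}


def is_title_case(heading: str) -> bool:
    """True if every major word after the first starts with a capital."""
    words = re.findall(r"[A-Za-z][A-Za-z'-]*", heading)
    major = [w for w in words[1:] if w.lower() not in MINOR_WORDS]
    return bool(major) and all(w[0].isupper() for w in major)


def formatting_signals(markdown: str) -> dict[str, int]:
    """Count the formatting patterns described above."""
    headings = HEADING_RE.findall(markdown)
    return {
        "total_headings": len(headings),
        "title_case_headings": sum(is_title_case(h) for h in headings),
        "bold_spans": len(BOLD_RE.findall(markdown)),
        "inline_header_items": len(INLINE_HEADER_RE.findall(markdown)),
    }
```

The Title Case heuristic will misfire on sentence-case headings that end in a proper noun or acronym, so the count is more useful as a ratio over many headings than as a per-heading verdict.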

### Markup Red Flags (Definitive)
- **turn0search0, turn0image0**: ChatGPT reference markers
- **contentReference[oaicite:]**: ChatGPT reference bugs
- **utm_source=chatgpt.com**: URL tracking (definitive)
- **Markdown in wikitext**: ## headers, **bold**, [text](url)
- **grok_card XML tags**: Grok/X specific

See [reference/markup-artifacts.md](reference/markup-artifacts.md) for details.
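
Because these artifacts are literal strings, they are the easiest signals to scan for mechanically. The sketch below covers the markers named above; generalizing `turn0search0` to other digits is an assumption, and the pattern names and `scan_artifacts` function are hypothetical.

```python
import re

ARTIFACT_PATTERNS = {
    # turn0search0, turn0image0; other digit values are assumed to occur too
    "chatgpt_turn_marker": re.compile(r"\bturn\d+(?:search|image)\d+\b"),
    "chatgpt_oaicite": re.compile(r"oaicite"),
    "chatgpt_utm": re.compile(r"utm_source=chatgpt\.com"),
    "grok_card": re.compile(r"</?grok_card\b"),
}


def scan_artifacts(text: str) -> dict[str, list[str]]:
    """Return every matched artifact, keyed by pattern name.

    Patterns with no match are omitted, so an empty dict means no hits.
    """
    found = {}
    for name, pattern in ARTIFACT_PATTERNS.items():
        matches = pattern.findall(text)
        if matches:
            found[name] = matches
    return found
```

Checking for Markdown inside wikitext is left out here because it depends on knowing the target format of the page being reviewed.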

### Citation Red Flags
- **Broken external links** that never existed (no archive)
- **Invalid DOIs/ISBNs**: DOIs that do not resolve; ISBNs that fail their check digit
- **Declared but unused references**: Cite errors
- **Placeholder values**: `url=URL`, `date=2025-XX-XX`

See [reference/citation-patterns.md](reference/citation-patterns.md) for details.
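
Two of the citation checks can be done offline. The sketch below validates ISBN-10 and ISBN-13 check digits and looks for the placeholder values quoted above. DOIs carry no check digit, so verifying them requires a resolution lookup, which is not shown. The function names and the placeholder regex are assumptions for the example.

```python
import re


def isbn10_valid(isbn: str) -> bool:
    """ISBN-10: weighted sum (weights 10 down to 1) must be divisible by 11."""
    s = re.sub(r"[\s-]", "", isbn).upper()
    if not re.fullmatch(r"\d{9}[\dX]", s):
        return False
    total = sum((10 - i) * (10 if ch == "X" else int(ch))
                for i, ch in enumerate(s))
    return total % 11 == 0


def isbn13_valid(isbn: str) -> bool:
    """ISBN-13: digits weighted 1, 3, 1, 3, ... must sum to a multiple of 10."""
    s = re.sub(r"[\s-]", "", isbn)
    if not re.fullmatch(r"\d{13}", s):
        return False
    total = sum(int(ch) * (1 if i % 2 == 0 else 3) for i, ch in enumerate(s))
    return total % 10 == 0


# Matches the two placeholder shapes quoted above: url=URL and date=YYYY-XX-XX
PLACEHOLDER_RE = re.compile(r"url\s*=\s*URL\b|date\s*=\s*\d{4}-XX(?:-XX)?")


def find_placeholders(citation: str) -> list[str]:
    """Return unfilled template values found in a citation string."""
    return PLACEHOLDER_RE.findall(citation)
```

A passing check digit only shows the number is well formed; it does not show that the book exists or matches the cited title.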

### Tone Red Flags
- Passive and detached voice throughout
- Absence of first-person pronouns where expected
- Consistent formality with no stylistic variation
- Over-politeness and excessive hedging

## Detection Methodology

### Multi-Layer Analysis Approach

**Layer 1: Technical Artifact Scan (Definitive)**
- Check for turn0search/oaicite markers (ChatGPT)
- Check for utm_source=chatgpt.com in URLs
- Check for grok_card tags (Grok)
- Check for Markdown in non-Markdown contexts
- If found: Definitive AI involvement

**Layer 2: Vocabulary Pattern Matching**
- Scan for overused AI words/phrases
- Count frequency of flagged terms
- Look for clusters of high-signal vocabulary
- Check for importance/symbolism phrases

**Layer 3: Structural Analysis**
- Observe sentence length variation (uniform = AI signal)
- Check paragraph uniformity
- Identify repetitive syntactic templates (tricolons, negative parallelisms)
- Look for elegant variation (synonym cycling)
- Check for false ranges
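
The syntactic templates in Layer 3 can be approximated with regular expressions. This is a deliberately rough sketch: both patterns are assumptions written for this example, they catch only the most literal forms ("X, Y, and Z"; "it's not X; it's Y"), and they will also match ordinary lists.

```python
import re

# A list item of one to three words, e.g. "problem-solving" or "clear next steps"
ITEM = r"[\w'-]+(?:\s+[\w'-]+){0,2}"
TRICOLON_RE = re.compile(
    rf"\b{ITEM},\s+{ITEM},\s+(?:and|or)\s+{ITEM}", re.IGNORECASE
)

# "it's not X; it's Y" and "it is not X, but Y"; \u2014 is the em dash
NEG_PARALLEL_RE = re.compile(
    r"\b(?:it's|it is)\s+not\s+[^.;,\u2014]{1,60}[;,\u2014]\s*(?:it's|it is|but)\b",
    re.IGNORECASE,
)


def structural_templates(text: str) -> dict[str, list[str]]:
    """Return matched tricolons and negative parallelisms."""
    normalized = text.replace("\u2019", "'")  # treat curly apostrophes as straight
    return {
        "tricolons": TRICOLON_RE.findall(normalized),
        "negative_parallelisms": NEG_PARALLEL_RE.findall(normalized),
    }
```

Human writers use both constructions, so the useful quantity is how often they recur relative to text length, not whether a single instance appears.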

**Layer 4: Content Pattern Analysis**
- Check for importance puffery and promotional language
- Look for "Challenges and Future" formula
- Check for ecosystem/conservation claims without citations
- Identify superficial analyses with "-ing" attributions

**Layer 5: Citation Verification**
- Test external links - do they exist?
- Verify that DOIs resolve and that ISBN check digits are valid
- Check for declared but unused references
- Look for placeholder values

**Layer 6: Formatting Analysis**
- Check heading capitalization (Title Case = signal)
- Count bold phrases per paragraph
- Look for inline-header list patterns
- Check for emojis in formal content

**Layer 7: Stylometric Observation**
- Pronoun usage patterns (missing first-person?)
- Tone consistency (too uniform = AI signal)
- Punctuation patterns (em dash overuse? curly quotes?)
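
The pronoun and punctuation observations in Layer 7 reduce to simple counts. The sketch below is one possible way to collect them; the `FIRST_PERSON` set, the per-1,000-words normalization, and the function name are assumptions for the example.

```python
import re

FIRST_PERSON = {"i", "me", "my", "mine", "we", "us", "our", "ours"}
EM_DASH = "\u2014"
CURLY_QUOTES = "\u201c\u201d\u2018\u2019"  # left/right double and single quotes


def stylometric_counts(text: str) -> dict[str, float]:
    """Return first-person and em dash rates plus a raw curly-quote count."""
    words = re.findall(r"[A-Za-z]+", text.lower())
    n = len(words) or 1  # avoid division by zero on empty input
    return {
        "first_person_per_1000": 1000 * sum(w in FIRST_PERSON for w in words) / n,
        "em_dashes_per_1000": 1000 * text.count(EM_DASH) / n,
        "curly_quotes": sum(text.count(c) for c in CURLY_QUOTES),
    }
```

Word processors also insert curly quotes automatically, so that count is weak on its own and is best read alongside the other layers.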

**Layer 8: Coherence Check**
- Do paragraphs build a coherent argument?
- Are concepts repeated with different words?
- Do transitions actually connect ideas?

**Layer 9: Confidence Scoring**
- Weight multiple signals together
- Require corroborating evidence (3+ signals minimum)
- Apply context-specific adjustments
- Check for mitigating factors (human signals)
- Exclude ineffective indicators from the score (see False Positive Prevention)
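
The scoring rules in Layer 9 can be sketched as a small function. Everything numeric here is a hypothetical choice for illustration: the weights, `LIKELY_AI_THRESHOLD`, and `MITIGATION_WEIGHT` are not taken from the reference files. Only the 3-signal minimum and the 200-word minimum come from this document.

```python
from dataclasses import dataclass

MIN_WORDS = 200            # from "False Positive Prevention"
MIN_SIGNALS = 3            # from Layer 9: 3+ corroborating signals
LIKELY_AI_THRESHOLD = 3.0  # hypothetical cutoff
MITIGATION_WEIGHT = 1.0    # hypothetical penalty per human signal


@dataclass
class Signal:
    category: str             # e.g. "vocabulary", "structure", "markup"
    weight: float             # hypothetical weight chosen by the analyst
    definitive: bool = False  # Layer 1 technical artifacts


def assess(signals: list[Signal], word_count: int, human_signals: int = 0) -> str:
    """Map collected signals to one of the four assessment labels."""
    if any(s.definitive for s in signals):
        return "Likely AI"      # Layer 1 artifacts indicate AI involvement
    if word_count < MIN_WORDS:
        return "Inconclusive"   # too short for reliable analysis
    if len(signals) < MIN_SIGNALS:
        return "Likely Human" if not signals else "Inconclusive"
    score = sum(s.weight for s in signals) - MITIGATION_WEIGHT * human_signals
    if score >= LIKELY_AI_THRESHOLD:
        return "Likely AI"
    if score > 0:
        return "Possibly AI"
    return "Inconclusive"
```

The labels stay probabilistic by design: even the top label says "likely", and mitigating human signals pull the score down instead of being ignored.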

## Model-Specific Patterns

Different AI models have distinct "fingerprints":

| Model | Key Tells | Technical Artifacts |
|-------|-----------|---------------------|
| ChatGPT/GPT-4 | "delve" (pre-2025), "tapestry", tricolons, em dashes, curly quotes | turn0search, oaicite, utm_source=chatgpt.com |
| Claude | Analytical structure, extended analogies, cautious qualifications | None (uses straight quotes, no tracking) |
| Gemini | Conversational synthesis, fact-dense paragraphs | None (uses straight quotes, no tracking) |
| DeepSeek | Similar to ChatGPT, curly quotes | Curly quotation marks |
| Grok | X/Twitter integration | `<grok_card>` XML tags |
| Perplexity | Source-focused output | `[attached_file:1]`, `[web:1]` tags |

**Important dates**:
- ChatGPT launched: **November 30, 2022** (text before this is almost certainly human)
- "delve" usage dropped: **2025** (still signals pre-2025 ChatGPT)

See [reference/model-fingerprints.md](reference/model-fingerprints.md) for detailed model patterns.

## False Positive Prevention

**Critical requirements**:
- Minimum 200 words for reliable analysis
- Never flag on single indicators alone
- Use ensemble scoring (multiple signals required)

**High false-positive risk groups**:
- Non-native English speakers (detectors misclassified about 61% of TOEFL essays as AI in a 2023 Stanford study)
- Technical/formal writing
- Neurodivergent writers
- Content using grammar correction tools

**Ineffective indicators** (do NOT rely on these):
- Perfect grammar alone
- "Bland" or "robotic" prose
- "Fancy" or unusual vocabulary
- Letter-like formatting alone
- Conjunctions starting sentences

**Signs of human writing**:
- Text from before November 30, 2022
- Ability to explain editorial choices
- Personal anecdotes with verifiable details
- Minor errors and natural quirks

See [reference/false-positive-prevention.md](reference/false-positive-prevention.md) for detailed guidance.

## Analysis Output Format

Structure findings as:

```
**Overall Assessment**: [Likely AI / Possibly AI / Likely Human / Inconclusive]
**Confidence**: [Low / Medium / High]

**Summary**: 2-3 sentence overview

**Evidence Found**:
- [Category]: [Specific indicator] - "[Quote from text]"
- [Category]: [Specific indicator] - "[Quote from text]"

**Mitigating Factors**: [Elements suggesting human authorship]

**Caveats**: [Limitations, alternative explanations]
```
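
If findings are collected programmatically, the template above can be filled from a small data structure. This is a minimal sketch; the `Evidence` and `Report` classes and their field names are hypothetical and simply mirror the fields of the template.

```python
from dataclasses import dataclass, field


@dataclass
class Evidence:
    category: str
    indicator: str
    quote: str


@dataclass
class Report:
    assessment: str  # Likely AI / Possibly AI / Likely Human / Inconclusive
    confidence: str  # Low / Medium / High
    summary: str
    evidence: list[Evidence] = field(default_factory=list)
    mitigating: str = "None noted"
    caveats: str = "None noted"

    def render(self) -> str:
        """Render the report in the output format shown above."""
        lines = [
            f"**Overall Assessment**: {self.assessment}",
            f"**Confidence**: {self.confidence}",
            "",
            f"**Summary**: {self.summary}",
            "",
            "**Evidence Found**:",
        ]
        lines += [f'- {e.category}: {e.indicator} - "{e.quote}"'
                  for e in self.evidence]
        lines += [
            "",
            f"**Mitigating Factors**: {self.mitigating}",
            "",
            f"**Caveats**: {self.caveats}",
        ]
        return "\n".join(lines)
```

Keeping the quote next to each indicator makes the report checkable: a reader can find the passage and judge the evidence without rerunning the analysis.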

## Key Principles

1. **No certainty claims** - AI detection is probabilistic
2. **Multiple signals required** - Single indicators prove nothing
3. **Context matters** - Academic writing differs from blogs
4. **Stakes awareness** - False accusations cause real harm
5. **Evolving field** - Detection methods require constant updates

## Reference Files

- [vocabulary-patterns.md](reference/vocabulary-patterns.md) - Complete word/phrase lists with frequencies
- [structural-patterns.md](reference/structural-patterns.md) - Sentence, paragraph, and discourse patterns
- [content-patterns.md](reference/content-patterns.md) - Importance puffery, promotional language, content tells
- [formatting-patterns.md](reference/formatting-patterns.md) - Title case, boldface, emojis, visual patterns
- [markup-artifacts.md](reference/markup-artifacts.md) - Technical artifacts: turn0search, oaicite, Markdown, tracking
- [citation-patterns.md](reference/citation-patterns.md) - Broken links, invalid identifiers, hallucinated references
- [model-fingerprints.md](reference/model-fingerprints.md) - GPT, Claude, Gemini, Grok, Perplexity specific tells
- [false-positive-prevention.md](reference/false-positive-prevention.md) - Avoiding false accusations, ineffective indicators

## Sources

This knowledge base synthesizes research from:
- Stanford HAI (DetectGPT, bias studies)
- GPTZero, Originality.ai, Turnitin, Pangram methodologies
- Academic papers on stylometry and discourse analysis
- Empirical studies on detection accuracy and limitations
- Wikipedia:WikiProject AI Cleanup field guide (2025)
- Community-documented patterns from Wikipedia editing
