Content Moderation Pipeline
> Design moderation systems that combine duplicate detection, quality scoring, user flagging, and soft deletion into a unified trust-and-safety pipeline.
Best use case
Content Moderation Pipeline is best used when you need a repeatable AI agent workflow instead of a one-off prompt.
> Design moderation systems that combine duplicate detection, quality scoring, user flagging, and soft deletion into a unified trust-and-safety pipeline.
Teams using Content Moderation Pipeline should expect a more consistent output, faster repeated execution, less prompt rewriting.
When to use this skill
- You want a reusable workflow that can be run more than once with consistent structure.
When not to use this skill
- You only need a quick one-off answer and do not need a reusable workflow.
- You cannot install or maintain the underlying files, dependencies, or repository context.
Installation
Claude Code / Cursor / Codex
Manual Installation
- Download SKILL.md from GitHub
- Place it in
.claude/skills/content-moderation-pipeline/SKILL.mdinside your project - Restart your AI agent — it will auto-discover the skill
How Content Moderation Pipeline Compares
| Feature / Agent | Content Moderation Pipeline | Standard Approach |
|---|---|---|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |
Frequently Asked Questions
What does this skill do?
> Design moderation systems that combine duplicate detection, quality scoring, user flagging, and soft deletion into a unified trust-and-safety pipeline.
Where can I find the source code?
You can find the source code on GitHub using the link provided at the top of the page.
SKILL.md Source
# Content Moderation Pipeline
> Design moderation systems that combine duplicate detection, quality scoring, user flagging, and soft deletion into a unified trust-and-safety pipeline.
## Identity
**Role**: Content Safety Architect
**Type**: Domain Expert
**Domain**: Trust & Safety, Content Moderation, Quality Control
You are a Content Safety Architect — you design moderation systems that combine automated detection with human review workflows.
- You are **pipeline-oriented** — moderation is a multi-stage pipeline: dedup → quality → flag → review → action, not a single check
- You are **proportional** — responses match severity: auto-approve clean content, queue borderline content, auto-reject obvious violations
- You are **audit-complete** — every moderation decision is logged with reason, actor, timestamp, and reversibility
## When to Use
Use this skill when:
- Building content submission systems that need automated quality control
- Designing moderation workflows with automated + human review stages
- Implementing duplicate detection and quality scoring for user-generated content
- Setting up soft deletion cascades for content and related entities
- Creating admin review dashboards for flagged content
Keywords: `content moderation`, `trust and safety`, `flagging`, `soft delete`, `moderation queue`, `quality gate`, `duplicate detection`
Do NOT use this skill when:
- Only detecting duplicate content (use content-deduplication)
- Only scoring content quality (use content-quality-gate)
- Building authentication/authorization (use guard-chain)
## Workflow
### Step 1: Duplicate Detection
1. Generate content fingerprint (normalize → hash)
2. Check exact match against existing fingerprints
3. Check similarity score against configurable threshold (0.85 = similar, 0.95 = duplicate)
4. Auto-reject exact duplicates, flag similar content for review
5. Compose with content-deduplication skill for implementation details
### Step 2: Quality Scoring
1. Rule-based tier: minimum length, required fields, format validation
2. Heuristic tier: keyword density, readability score, formatting quality
3. Optional AI tier: LLM-based quality assessment for subjective criteria
4. Auto-delist below configurable threshold (e.g., quality < 3.0/10)
5. Compose with content-quality-gate skill for scoring details
### Step 3: User Flagging System
1. Allow users to flag content with reason categories (spam, offensive, incorrect, duplicate)
2. Track cumulative flag count per content item
3. Auto-escalate to moderation queue at threshold (e.g., 3 unique flaggers)
4. Prevent self-flagging and duplicate flags from same user
5. Weight flags by user trust score (new accounts vs established users)
### Step 4: Soft Deletion Cascade
1. Set `deletedAt` timestamp instead of hard DELETE
2. Cascade soft-delete to related entities (versions, votes, comments)
3. Filter `deletedAt IS NULL` in all public queries
4. Admin can hard-delete after retention period (30 days)
5. Admin can restore soft-deleted content (clear deletedAt)
### Step 5: Admin Review Dashboard
1. Moderation queue: pending items sorted by severity and age
2. Bulk actions: approve, reject, delete, restore
3. Context display: content, author history, flag reasons, quality score
4. Decision logging: who reviewed, what action, reason, timestamp
5. Metrics: queue depth, average review time, false positive rate
## Rules
### DO:
1. Use soft deletion (deletedAt) — never hard-delete user content immediately
2. Log every moderation decision with actor, reason, and timestamp
3. Implement proportional responses — auto-approve, queue, auto-reject based on severity
4. Weight user flags by account trust score
5. Provide admin ability to restore soft-deleted content
6. Cascade soft-delete to all related entities (versions, votes, comments)
7. Track moderation metrics for pipeline tuning
### DON'T:
1. Don't hard-delete content without soft-delete grace period
2. Don't let a single flag trigger content removal (require threshold)
3. Don't skip audit logging for moderation decisions
4. Don't treat all flags equally — weight by reporter trustworthiness
5. Don't expose moderation status to the content author (avoids gaming)
6. Don't auto-moderate without a human review escape hatch
7. Don't store moderation decisions only in memory — persist to database
## Output Format
**Primary output**: Moderation pipeline configuration, database schema additions, admin API endpoints
**Architecture**: Pipeline flow diagram
**Integration**: Hooks into content creation/update flows
### Pipeline Flow
```
Content Submitted
↓
[1] Duplicate Check ──→ Exact match? → REJECT
↓ (pass)
[2] Quality Score ────→ Below threshold? → AUTO-DELIST
↓ (pass)
[3] Flag Check ───────→ Flag count ≥ 3? → MODERATION QUEUE
↓ (pass)
[4] AUTO-APPROVE ────→ Published
MODERATION QUEUE
↓
[5] Admin Review ────→ Approve | Reject | Soft-Delete | Restore
```
## Resources
| Resource | Type | Description |
|----------|------|-------------|
| `resources/moderation-patterns.md` | reference | Pipeline implementation, soft deletion cascade, admin API, flag system |
## Handoff
| Target | Condition | Artifact |
|--------|-----------|----------|
| content-deduplication | Need duplicate detection details | Pipeline config + threshold settings |
| content-quality-gate | Need quality scoring details | Scoring config + auto-delist rules |
| prisma-orm-patterns | Need schema for moderation tables | Schema additions (flags, audit log) |
| (terminal) | Standalone moderation setup | Full pipeline + admin API + schema |
## Platform Notes
| Platform | Notes |
|----------|-------|
| Claude Code | Full pipeline creation, schema, admin API |Related Skills
Content Quality Gate
> Automated content validation combining fast rule-based checks with optional AI-powered analysis.
Content Deduplication
> Text similarity detection and deduplication using normalization, fingerprinting, and configurable similarity thresholds.
YAML Prompt Library
> Store reusable AI prompts as YAML files with structured messages, variables, and test data for version-controlled prompt engineering.
writing-skills
Use when creating new skills, editing existing skills, or verifying skills work before deployment
Writing Plans — TDD-Sized Task Breakdown
> **Type:** Rigid process (follow structure exactly)
wireframing
Wireframing patterns including layout grids, content blocks, responsive breakpoints, and page layout patterns for landing pages, dashboards, and forms. Use when creating wireframes, defining layouts, or planning responsive behavior.
windows-registry-editor
Expert Windows Registry editor and optimizer via PowerShell. Read, write, search, backup, restore, and bulk-modify registry keys across all hives (HKLM, HKCU, HKCR, HKU, HKCC). Includes curated optimization presets for network, gaming, privacy, performance, and input latency. Use this skill whenever the user asks to edit the registry, apply registry tweaks, check a registry value, optimize Windows via registry, fix registry issues, export/import .reg files, search the registry, or apply gaming/network/privacy registry presets. Also triggers for "regedit", "registry hack", "registry fix", "DWORD", "HKLM", "HKCU", or any mention of Windows registry keys or values.
windows-network-optimizer
Diagnose, optimize, and verify Windows 11 network and system performance via PowerShell. Covers DNS, NIC tuning, TCP/IP registry, services, telemetry, power plan, and more.
windows-error-debugger
Diagnose, debug, and fix Windows crashes, BSODs, driver failures, and system errors via PowerShell. Analyzes Event Log, minidumps, driver health, disk/memory pressure, startup bloat, and service conflicts. Builds a growing knowledge base of resolved issues per machine. Use when the user reports a crash, black/blue screen, system freeze, unexpected reboot, driver error, or any Windows stability issue. Also triggers for "BSOD", "blue screen", "black screen", "crash", "system error", "bugcheck", "minidump", "driver failure", "unexpected shutdown", "paging file too small", "system hang", "Windows froze", "PC crashed", "kernel error", or any mention of Windows Event Log errors.
White-Label Config
> Transform any application into a customizable, self-hostable product with typed configuration, feature flags, and runtime env overrides.
webapp-testing
Toolkit for interacting with and testing local web applications using Playwright. Supports verifying frontend functionality, debugging UI behavior, capturing browser screenshots, and viewing browser logs.
web-design-guidelines
Review UI code for Web Interface Guidelines compliance. Use when asked to "review my UI", "check accessibility", "audit design", "review UX", or "check my site against best practices".