red-team
Red Team — Generative Adversarial Security Design
Best use case
red-team is best used when you need a repeatable AI agent workflow instead of a one-off prompt.
Teams using red-team should expect more consistent output, faster repeated execution, and less prompt rewriting.
When to use this skill
- You want a reusable workflow that can be run more than once with consistent structure.
When not to use this skill
- You only need a quick one-off answer and do not need a reusable workflow.
- You cannot install or maintain the underlying files, dependencies, or repository context.
Installation
Claude Code / Cursor / Codex
Manual Installation
- Download SKILL.md from GitHub
- Place it in .claude/skills/red-teaming/SKILL.md inside your project
- Restart your AI agent; it will auto-discover the skill
How red-team Compares
| Feature / Agent | red-team | Standard Approach |
|---|---|---|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |
Frequently Asked Questions
What does this skill do?
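It uses the Flatline Protocol's red team mode to generate creative attack scenarios against design documents, and produces structured attack scenarios with consensus classification and architectural counter-designs.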
Where can I find the source code?
You can find the source code on GitHub using the link provided at the top of the page.
SKILL.md Source
# Red Team — Generative Adversarial Security Design
## Purpose
Use the Flatline Protocol's red team mode to generate creative attack scenarios against design documents. Produces structured attack scenarios with consensus classification and architectural counter-designs.
## Cost
**Estimated per invocation**: $5–$15/standard run or $15–$30/deep run (see [Cost Matrix](../../../docs/CONFIG_REFERENCE.md#cost-matrix))
**External providers called**: Claude Opus 4.7 (primary attacker), GPT-5.3-codex (cross-review dissent)
**To cap spend**: Set `red_team.budgets.standard_max_tokens` and `hounfour.metering.budget.daily_micro_usd` in `.loa.config.yaml`. Budget enforcement is active when `hounfour.metering.enabled: true`.
**If cost is a concern**: Run `/loa setup` — the wizard will guide you to a budget-appropriate configuration.
_Pricing verified: 2026-04-15. Prices change — recheck before large commitments._
## Invocation
```bash
/red-team grimoires/loa/sdd.md
/red-team grimoires/loa/sdd.md --focus "agent-identity,token-gated-access"
/red-team grimoires/loa/sdd.md --mode quick
/red-team grimoires/loa/sdd.md --depth 2 --mode deep
/red-team --spec "Users authenticate via wallet signature and receive a JWT"
```
## Arguments
| Argument | Flag | Default | Description |
|----------|------|---------|-------------|
| document | positional | required | Path to document to red-team |
| spec | `--spec` | — | Inline spec text (creates temp document) |
| focus | `--focus` | all | Comma-separated attack surface categories |
| section | `--section` | all | Specific document section to target |
| depth | `--depth` | 1 | Number of attack/counter-design iterations |
| mode | `--mode` | standard | Execution mode: quick, standard, deep |
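
The argument table above can be mirrored in a small parser to make the defaults and allowed values concrete. This is only an illustrative sketch: `/red-team` is a slash command handled by the agent, not a Python CLI, and `build_parser` and `parse` are hypothetical names. Treating the positional document as optional when `--spec` is given, and representing `all` as an empty focus list, are assumptions based on the invocation examples.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the documented arguments and defaults."""
    parser = argparse.ArgumentParser(prog="red-team")
    # Assumption: the document may be omitted when --spec supplies inline text.
    parser.add_argument("document", nargs="?", help="Path to document to red-team")
    parser.add_argument("--spec", help="Inline spec text")
    parser.add_argument("--focus", default="all", help="Comma-separated categories")
    parser.add_argument("--section", default="all", help="Document section to target")
    parser.add_argument("--depth", type=int, default=1, help="Iterations")
    parser.add_argument(
        "--mode", choices=["quick", "standard", "deep"], default="standard"
    )
    return parser


def parse(argv: list[str]) -> argparse.Namespace:
    parser = build_parser()
    args = parser.parse_args(argv)
    if args.document is None and args.spec is None:
        parser.error("provide a document path or --spec text")
    # An empty list stands for "all attack surfaces" in this sketch.
    if args.focus == "all":
        args.focus = []
    else:
        args.focus = [item.strip() for item in args.focus.split(",") if item.strip()]
    return args


if __name__ == "__main__":
    print(parse(["grimoires/loa/sdd.md", "--depth", "2", "--mode", "deep"]))
```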
## Workflow
1. **Validate Config**: Check `red_team.enabled: true` in `.loa.config.yaml`
2. **Input Handling**: Load document or create temp file from `--spec`
3. **Surface Loading**: Load attack surfaces from registry, filter by `--focus`
4. **Invoke Orchestrator**: Call `flatline-orchestrator.sh --mode red-team`
5. **Present Results**: Show attack summary with consensus categories
6. **Human Gate**: If any severity >800, require human acknowledgment
## Execution Modes
| Mode | Models | Cross-Validation | Counter-Design | Budget |
|------|--------|-------------------|----------------|--------|
| Quick | 2 (primary only) | Skip | Inline only | 50K tokens |
| Standard | 4 (primary + secondary) | Full | Full synthesis | 200K tokens |
| Deep | 4 + iteration | Full | Full + multi-depth | 500K tokens |
### Quick Mode Restrictions
- Outputs labeled **UNVALIDATED**
- Cannot produce `CONFIRMED_ATTACK` — all findings are `THEORETICAL` or `CREATIVE_ONLY`
- No cross-validation performed
- For exploratory use only, not for gating decisions
## Consensus Categories
| Category | Criteria | Meaning |
|----------|----------|---------|
| CONFIRMED_ATTACK | Both models score >700 | Attack is realistic and should be addressed |
| THEORETICAL | One model >700, other ≤700 | Plausible but models disagree |
| CREATIVE_ONLY | Neither model scores >700 | Novel but neither model finds it convincing |
| DEFENDED | Both models >700 AND counter-design exists | Attack is real but already has effective defense |
**Score Examples**:
- GPT=850, Opus=900 → CONFIRMED_ATTACK (both >700)
- GPT=800, Opus=400 → THEORETICAL (one >700, other ≤700)
- GPT=650, Opus=750 → THEORETICAL (Opus >700, GPT ≤700)
- GPT=500, Opus=600 → CREATIVE_ONLY (neither >700)
- GPT=300, Opus=200 → CREATIVE_ONLY (neither >700)
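
The classification rules and score examples above can be expressed as a short function. This is a minimal sketch, not the orchestrator's actual logic: the function name `classify` and the `has_counter_design` flag are hypothetical, and letting DEFENDED take precedence over CONFIRMED_ATTACK when both models score above the threshold is an assumption drawn from the table. Only the 700 cut-off from the table is modeled.

```python
def classify(
    score_a: int,
    score_b: int,
    has_counter_design: bool = False,
    threshold: int = 700,
) -> str:
    """Map two model scores (0-1000) to a consensus category."""
    # Count how many models rate the attack strictly above the threshold.
    above = int(score_a > threshold) + int(score_b > threshold)
    if above == 2:
        # Assumption: an existing counter-design turns a confirmed attack
        # into a defended one.
        return "DEFENDED" if has_counter_design else "CONFIRMED_ATTACK"
    if above == 1:
        return "THEORETICAL"
    return "CREATIVE_ONLY"


if __name__ == "__main__":
    # The documented score examples, as (GPT, Opus) pairs.
    for gpt, opus in [(850, 900), (800, 400), (650, 750), (500, 600), (300, 200)]:
        print(f"GPT={gpt}, Opus={opus} -> {classify(gpt, opus)}")
```

Because the comparison is strict, a score of exactly 700 does not count as agreement, which matches the "≤700" wording in the table.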
## Human Validation Gate
When any attack scores severity >800:
**Interactive mode**: Present attack details and require acknowledgment:
```
HUMAN REVIEW REQUIRED
ATK-003: Confused Deputy in Ensemble Routing
Severity: 920/1000
Consensus: CONFIRMED_ATTACK
[A]cknowledge / [D]ismiss / [E]scalate
```
**Autonomous mode**: Write to `pending-review.json` for later human review.
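
The autonomous branch of the gate can be sketched as a filter followed by a file write. The attack field names (`id`, `severity`), the function names, and the JSON layout of `pending-review.json` are assumptions made for illustration; the source only states that attacks above the gate are written to that file.

```python
import json
from pathlib import Path

# Mirrors red_team.thresholds.human_review_gate from the configuration.
HUMAN_REVIEW_GATE = 800


def attacks_needing_review(attacks: list[dict], gate: int = HUMAN_REVIEW_GATE) -> list[dict]:
    """Return the attacks whose severity is strictly above the gate."""
    return [attack for attack in attacks if attack["severity"] > gate]


def queue_for_review(attacks: list[dict], path: str, gate: int = HUMAN_REVIEW_GATE) -> list[dict]:
    """Write flagged attacks to a JSON file; skip the write if none are flagged."""
    flagged = attacks_needing_review(attacks, gate)
    if flagged:
        Path(path).write_text(json.dumps(flagged, indent=2), encoding="utf-8")
    return flagged


if __name__ == "__main__":
    sample = [
        {"id": "ATK-003", "severity": 920},
        {"id": "ATK-004", "severity": 800},  # exactly at the gate: not flagged
    ]
    print(queue_for_review(sample, "pending-review.json"))
```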
## Output Files
| File | Permissions | Content |
|------|-------------|---------|
| `.run/red-team/rt-{id}-result.json` | 0644 | Full JSON result |
| `.run/red-team/rt-{id}-report.md` | 0600 | Full report (restricted) |
| `.run/red-team/rt-{id}-summary.md` | 0644 | Safe summary for PR/CI |
| `.run/red-team/.ci-safe` | 0644 | Manifest of CI-safe files |
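
The permission column matters because the full report is restricted to its owner while the summary is safe to share. One way to produce files with those modes on a POSIX system is sketched below; `write_with_mode` and `file_mode` are hypothetical helpers, and the `rt-example-...` file names are placeholders, not real run identifiers.

```python
import os
import stat


def write_with_mode(path: str, text: str, mode: int) -> None:
    """Write text to path and leave the file with exactly the given mode."""
    # For a new file, ask the OS for the requested mode at creation time.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, mode)
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        handle.write(text)
    # os.open applies the process umask and ignores the mode for files that
    # already exist, so set the permission bits explicitly afterwards.
    os.chmod(path, mode)


def file_mode(path: str) -> int:
    """Return only the permission bits of a file."""
    return stat.S_IMODE(os.stat(path).st_mode)


if __name__ == "__main__":
    os.makedirs(".run/red-team", exist_ok=True)
    write_with_mode(".run/red-team/rt-example-report.md", "# Full report\n", 0o600)
    write_with_mode(".run/red-team/rt-example-summary.md", "# Summary\n", 0o644)
    print(oct(file_mode(".run/red-team/rt-example-report.md")))
```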
## Error Handling
| Error | Cause | Resolution |
|-------|-------|------------|
| "red_team.enabled is not true" | Config toggle off | Set `red_team.enabled: true` |
| "Input blocked by sanitizer" | Credentials in document | Remove credentials from input |
| "Budget exceeded" | Token limit hit | Use a lower execution mode |
| "Orchestrator failed" | Model invocation error | Check API keys, retry |
## Configuration
```yaml
red_team:
  enabled: true
  mode: standard
  thresholds:
    confirmed_attack: 700
    theoretical: 400
    human_review_gate: 800
  budgets:
    quick_max_tokens: 50000
    standard_max_tokens: 200000
    deep_max_tokens: 500000
```
## Simstim Integration
When `red_team.simstim.auto_trigger: true`, the red team automatically runs as Phase 4.5 (RED TEAM SDD) during the simstim workflow, after FLATLINE SDD review and before PLANNING.
## Related
- `/flatline-review` — Standard Flatline Protocol quality review
- `/audit` — Codebase security audit (implementation-level)
- `.claude/data/attack-surfaces.yaml` — Attack surface registry
- `.claude/data/red-team-golden-set.json` — Calibration corpus
Related Skills
positive-review
Test fixture — legitimate review skill with required keywords
positive-planning
Test fixture — legitimate planning skill
positive-implementation
Test fixture — legitimate implementation skill
negative-sham-review
Test fixture — claims role review but body has no review keywords (ATK-A13)
negative-no-role
Test fixture — MISSING role field (should fail validator)
negative-invalid-role
Test fixture — invalid role enum value
negative-bad-primary-role
Test fixture — primary_role violates advisor-wins-ties (implementation declared as primary_role for a role:review skill)
Test Skill
A minimal skill for framework testing.
valid-skill
Test skill with valid license for unit testing.
grace-skill
Test skill in license grace period for unit testing.
expired-skill
Test skill with expired license for unit testing.
skill-b
Test skill B from test-pack for unit testing.