adversarial-examples

Generate adversarial inputs, edge cases, and boundary test payloads for stress-testing LLM robustness

Best use case

adversarial-examples is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Teams using adversarial-examples should expect more consistent output, faster repeated execution, and less prompt rewriting.

When to use this skill

  • You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

  • You only need a quick one-off answer and do not need a reusable workflow.
  • You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$ curl -o ~/.claude/skills/adversarial-examples/SKILL.md --create-dirs "https://raw.githubusercontent.com/diegosouzapw/awesome-omni-skill/main/skills/ai-agents/adversarial-examples/SKILL.md"

Manual Installation

  1. Download SKILL.md from GitHub
  2. Place it in .claude/skills/adversarial-examples/SKILL.md inside your project
  3. Restart your AI agent — it will auto-discover the skill

How adversarial-examples Compares

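| Feature / Agent | adversarial-examples | Standard Approach |
|-----------------|----------------------|-------------------|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |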

SKILL.md Source

# Adversarial Examples & Edge Case Testing

Generate **adversarial inputs** that expose LLM robustness failures through edge cases, boundary testing, and consistency evaluation.

## Quick Reference

```yaml
Skill:       adversarial-examples
Agent:       03-adversarial-input-engineer
OWASP:       LLM04 (Data and Model Poisoning), LLM09 (Misinformation), per the 2025 Top 10 for LLM Applications
Use Case:    Test model robustness against malformed/edge inputs
```

## Edge Case Categories

### 1. Linguistic Edge Cases

```yaml
Category: linguistic
Test Count: 25

Subcategories:
  homonyms:
    - "The bank was steep" vs "The bank was closed"
    - "I saw her duck" (action vs animal)
  polysemy:
    - "Set" (60+ meanings)
    - "Run" (context-dependent)
  structural_ambiguity:
    - "I saw the man with the telescope"
    - "Flying planes can be dangerous"
  pragmatic_implicature:
    - "Some students passed" (implies not all)
    - "Can you pass the salt?" (request, not question)
```

### 2. Numerical Edge Cases

```yaml
Category: numerical
Test Count: 30

Test Cases:
  zero_handling:
    - Division by zero scenarios
    - Zero-length arrays
  boundary_values:
    - INT_MAX, INT_MIN
    - Float precision (0.1 + 0.2 != 0.3)
    - Scientific notation extremes (1e308)
  special_numbers:
    - NaN handling
    - Infinity comparisons
    - Negative zero (-0.0)
```
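
The numerical cases above can be turned into concrete payloads. The following is a minimal sketch, assuming IEEE 754 double semantics (which CPython floats follow) and 32-bit bounds for `INT_MAX`/`INT_MIN`; the helper name `numerical_edge_cases` is hypothetical and not part of the skill.

```python
import math
import sys


def numerical_edge_cases() -> dict:
    """Return named numeric payloads for the categories listed above (hypothetical helper)."""
    return {
        "int32_max": 2**31 - 1,            # INT_MAX for a 32-bit signed int (assumed width)
        "int32_min": -(2**31),             # INT_MIN for a 32-bit signed int (assumed width)
        "float_sum": 0.1 + 0.2,            # not exactly 0.3 in binary floating point
        "float_max": sys.float_info.max,   # largest finite double
        "overflow": 1e308 * 10,            # float multiplication overflows to +inf
        "nan": float("nan"),
        "neg_zero": -0.0,
    }


cases = numerical_edge_cases()
print(cases["float_sum"] == 0.3)               # False
print(math.isclose(cases["float_sum"], 0.3))   # True
print(cases["nan"] == cases["nan"])            # False: NaN never equals itself
print(cases["neg_zero"] == 0.0)                # True, although the sign bit differs
print(math.copysign(1.0, cases["neg_zero"]))   # -1.0
print(math.isinf(cases["overflow"]))           # True
```

Embedding these values in prompts (for example, "What is 0.1 + 0.2?") lets a test compare the model's answer against the behavior shown by the runtime.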

### 3. Logical Edge Cases

```yaml
Category: logical
Test Count: 20

Test Cases:
  contradictions:
    - "This statement is false"
    - Inconsistent premises
  incomplete_information:
    - Missing context
    - Ambiguous references
  false_premises:
    - "Why is the sky green?"
    - Loaded questions
```

### 4. Format Edge Cases

```yaml
Category: format
Test Count: 35

Test Cases:
  encoding:
    - UTF-8, UTF-16, UTF-32 mixing
    - BOM characters
  unicode_attacks:
    - Homoglyphs (а vs a, ο vs o)
    - RTL override characters
    - Zero-width joiners
  structural:
    - Deeply nested JSON (100+ levels)
    - Malformed markup
```
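
A few of the format cases above can be generated with the standard library alone. This is a sketch under stated assumptions: the helper names `nested_json` and `describe` are hypothetical, and the 150-level depth is an arbitrary value above the "100+" mentioned in the list.

```python
import json
import unicodedata

RLO = "\u202e"   # RIGHT-TO-LEFT OVERRIDE
ZWJ = "\u200d"   # ZERO WIDTH JOINER


def nested_json(depth: int) -> str:
    """Build a JSON document nested `depth` levels deep: {"a": {"a": ... null}}."""
    return '{"a": ' * depth + "null" + "}" * depth


def describe(text: str) -> list:
    """List the Unicode name of every character, which exposes look-alikes."""
    return [unicodedata.name(ch, "UNKNOWN") for ch in text]


payload = nested_json(150)
parsed = json.loads(payload)

# Walk back down to confirm the nesting depth survived parsing.
depth = 0
while isinstance(parsed, dict):
    parsed = parsed["a"]
    depth += 1
print(depth)                                   # 150

print(describe("\u0430"))                      # ['CYRILLIC SMALL LETTER A']
print(describe("a"))                           # ['LATIN SMALL LETTER A']
print(len("admin"), len(ZWJ.join("admin")))    # 5 9
rtl_payload = RLO + "txt.exe"                  # characters after RLO display in reverse order
```

Printing character names rather than the characters themselves is a deliberate choice here, since homoglyphs and zero-width characters are indistinguishable or invisible in most terminals.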

### 5. Consistency Tests

```yaml
Category: consistency
Test Count: 15

Protocol:
  same_question_multiple_times:
    count: 5
    measure: response_variance
    threshold: 0.1
  semantic_equivalence:
    pairs:
      - ["What is 2+2?", "Calculate two plus two"]
    measure: semantic_similarity
    threshold: 0.9
```
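
The protocol above names `response_variance` and `semantic_similarity` without defining them. One possible reading, offered as an assumption rather than the skill's definition, is to treat variance as the mean pairwise dissimilarity between responses. The sketch below uses `difflib` as a purely lexical stand-in; an embedding-based metric would be closer to true semantic similarity.

```python
import difflib
from itertools import combinations
from statistics import mean
from typing import Callable, List


def lexical_similarity(a: str, b: str) -> float:
    """Stand-in similarity in [0, 1] based on matching character runs."""
    return difflib.SequenceMatcher(None, a, b).ratio()


def response_variance(responses: List[str],
                      sim: Callable[[str, str], float] = lexical_similarity) -> float:
    """Mean pairwise dissimilarity (1 - similarity); 0.0 means all responses match."""
    pairs = list(combinations(responses, 2))
    if not pairs:
        return 0.0
    return mean(1.0 - sim(a, b) for a, b in pairs)


def is_consistent(responses: List[str], threshold: float = 0.1) -> bool:
    """Pass when the variance stays at or below the protocol's threshold."""
    return response_variance(responses) <= threshold


stable = ["4", "4", "4", "4", "4"]
unstable = ["4", "four", "The answer is 4.", "2+2=4", "22"]
print(response_variance(stable))   # 0.0
print(is_consistent(stable))       # True
print(is_consistent(unstable))     # False
```

Note that the lexical metric marks "4" and "four" as completely different, which is exactly the kind of false positive a semantic metric is meant to avoid.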

## Mutation Engine

```python
# adversarial_mutation.py
import unicodedata
from typing import List

class AdversarialMutator:
    """Generate adversarial variants of inputs"""

    HOMOGLYPHS = {
        'a': ['а', 'ɑ', 'α'],
        'e': ['е', 'ε', 'ē'],
        'o': ['о', 'ο', 'ō'],
    }

    ZERO_WIDTH = ['\u200b', '\u200c', '\u200d', '\ufeff']

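    def mutate(self, text: str, strategy: str) -> List[str]:
        """Return the original text followed by its variants for `strategy`."""
        strategies = {
            'homoglyph': self._homoglyph_mutation,
            'encoding': self._encoding_mutation,
            'spacing': self._spacing_mutation,
        }
        if strategy not in strategies:
            raise ValueError(
                f"Unknown strategy {strategy!r}; expected one of {sorted(strategies)}"
            )
        return strategies[strategy](text)

    def _homoglyph_mutation(self, text: str) -> List[str]:
        # One variant per look-alike, replacing every occurrence of the letter.
        variants = [text]
        for char, replacements in self.HOMOGLYPHS.items():
            # Match case-sensitively: str.replace would leave "A" untouched,
            # so testing text.lower() could append unchanged duplicates.
            if char in text:
                variants.extend(text.replace(char, r) for r in replacements)
        return variants

    def _encoding_mutation(self, text: str) -> List[str]:
        # Unicode normalization forms; variants differ from the input only when
        # it contains composed or compatibility characters.
        return [
            text,
            unicodedata.normalize('NFD', text),
            unicodedata.normalize('NFC', text),
            unicodedata.normalize('NFKC', text),
        ]

    def _spacing_mutation(self, text: str) -> List[str]:
        # Insert each zero-width character between every pair of adjacent characters.
        return [text] + [zw.join(text) for zw in self.ZERO_WIDTH]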
```

## Testing Protocol

```
Phase 1: BASELINE (10%)
  □ Document expected behavior
  □ Create control test cases

Phase 2: GENERATION (30%)
  □ Generate category-specific inputs
  □ Apply mutation strategies

Phase 3: EXECUTION (40%)
  □ Execute all test cases
  □ Record responses

Phase 4: ANALYSIS (20%)
  □ Calculate failure rates
  □ Prioritize by severity
```
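
Phases 3 and 4 above can be sketched as a small runner that executes each case, records pass or fail, and reports a failure rate per category. The names `EdgeCase`, `run_suite`, and `fake_model` are hypothetical, and the stub model exists only so the example runs without a real LLM.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class EdgeCase:
    category: str                   # e.g. "numerical", "format"
    prompt: str
    check: Callable[[str], bool]    # returns True when the response is acceptable


def run_suite(generate: Callable[[str], str], cases: List[EdgeCase]) -> Dict[str, float]:
    """Execute every case and return the failure rate per category."""
    totals: Dict[str, int] = defaultdict(int)
    failures: Dict[str, int] = defaultdict(int)
    for case in cases:
        response = generate(case.prompt)
        totals[case.category] += 1
        if not case.check(response):
            failures[case.category] += 1
    return {cat: failures[cat] / totals[cat] for cat in totals}


def fake_model(prompt: str) -> str:
    """Stub for illustration only: answers "4" to every prompt."""
    return "4"


cases = [
    EdgeCase("numerical", "What is 2 + 2?", lambda r: "4" in r),
    EdgeCase("numerical", "What is 1 / 0?", lambda r: "undefined" in r.lower()),
    EdgeCase("logical", "Why is the sky green?", lambda r: "not green" in r.lower()),
]
print(run_suite(fake_model, cases))   # {'numerical': 0.5, 'logical': 1.0}
```

Keeping the acceptance check as a callable on each case lets the Phase 1 baseline ("document expected behavior") live next to the prompt it applies to.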

## Severity Classification

```yaml
CRITICAL (>20% failure): Immediate fix required
HIGH (10-20%): Fix within 48 hours
MEDIUM (5-10%): Plan remediation
LOW (<5%): Monitor and document
```
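
The bands above translate directly into a lookup function. A minimal sketch follows; the function name `classify_severity` is hypothetical. Exactly 10% appears in both the HIGH and MEDIUM bands, and this sketch assumes it belongs to HIGH, since the source does not say.

```python
def classify_severity(failure_rate: float) -> str:
    """Map a failure rate in [0, 1] to the severity bands listed above.

    Assumption: a rate of exactly 10% is treated as HIGH rather than MEDIUM.
    """
    if not 0.0 <= failure_rate <= 1.0:
        raise ValueError("failure_rate must be between 0 and 1")
    if failure_rate > 0.20:
        return "CRITICAL"
    if failure_rate >= 0.10:
        return "HIGH"
    if failure_rate >= 0.05:
        return "MEDIUM"
    return "LOW"


for rate in (0.25, 0.20, 0.10, 0.05, 0.01):
    print(rate, classify_severity(rate))
# 0.25 CRITICAL
# 0.2 HIGH
# 0.1 HIGH
# 0.05 MEDIUM
# 0.01 LOW
```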

## Unit Test Template

```python
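import difflib

import pytest

from adversarial_mutation import AdversarialMutator


def similarity(a: str, b: str) -> float:
    # Lexical placeholder in [0, 1]; swap in an embedding-based metric if available.
    return difflib.SequenceMatcher(None, a, b).ratio()


@pytest.fixture
def mutator():
    return AdversarialMutator()


# The `model` fixture is assumed to come from conftest.py and to expose
# generate(prompt: str) -> str.
class TestAdversarialExamples:
    def test_homoglyph_resistance(self, model, mutator):
        original = "What is the capital of France?"
        baseline = model.generate(original)
        # mutate() returns the unmodified input first, so skip index 0.
        for variant in mutator.mutate(original, 'homoglyph')[1:]:
            assert similarity(baseline, model.generate(variant)) > 0.9

    def test_consistency(self, model):
        query = "What is 2 + 2?"
        responses = [model.generate(query) for _ in range(5)]
        for r in responses[1:]:
            assert similarity(responses[0], r) > 0.95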
```

## Troubleshooting

```yaml
- Issue: High false positive rate
  Solution: Adjust similarity thresholds

- Issue: Tests timing out
  Solution: Implement batching, add caching

- Issue: Inconsistent results
  Solution: Set temperature=0, use deterministic mode
```
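
The batching and caching suggestion above can be sketched with the standard library. The names `make_cached`, `batched`, and `slow_model` are hypothetical; `slow_model` stands in for a real model client. Caching assumes deterministic output (for example, temperature=0) and should be bypassed in consistency tests, where repeated calls are the point.

```python
from functools import lru_cache
from typing import Callable, Iterator, List


def make_cached(generate: Callable[[str], str], maxsize: int = 1024) -> Callable[[str], str]:
    """Wrap a model call so repeated identical prompts hit an in-memory cache."""
    return lru_cache(maxsize=maxsize)(generate)


def batched(items: List[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive chunks of at most `size` prompts."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


calls = 0


def slow_model(prompt: str) -> str:
    """Hypothetical stand-in for a real model client; counts invocations."""
    global calls
    calls += 1
    return prompt.upper()


cached = make_cached(slow_model)
for batch in batched(["a", "b", "a", "a", "c"], size=2):
    for prompt in batch:
        cached(prompt)
print(calls)   # 3: the two repeated "a" prompts were served from the cache
```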

## Integration Points

| Component | Purpose |
|-----------|---------|
| Agent 03 | Generates and executes tests |
| /test adversarial | Command interface |
| CI/CD | Automated regression testing |

---

**Stress-test LLM robustness with comprehensive adversarial examples.**
