prompt-inspector

Detect prompt injection attacks and adversarial inputs in user text before passing it to your LLM. Use when you need to validate or screen user-provided text for jailbreak attempts, instruction overrides, role-play escapes, or other prompt manipulation techniques. Returns a safety verdict, risk score (0–1), and threat categories. Ideal for guarding AI pipelines, chatbots, and any application that feeds user input into a language model.

3,891 stars

Best use case

prompt-inspector is best used as a repeatable screening step in an AI agent workflow, not as a one-off prompt.

Teams using prompt-inspector should expect more consistent output, faster repeated execution, and less prompt rewriting.

When to use this skill

  • You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

  • You only need a quick one-off answer and do not need a reusable workflow.
  • You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$ curl -o ~/.claude/skills/prompt-inspector/SKILL.md --create-dirs "https://raw.githubusercontent.com/openclaw/skills/main/skills/aunicall/prompt-inspector/SKILL.md"

Manual Installation

  1. Download SKILL.md from GitHub
  2. Place it in .claude/skills/prompt-inspector/SKILL.md inside your project
  3. Restart your AI agent — it will auto-discover the skill

How prompt-inspector Compares

| Feature / Agent | prompt-inspector | Standard Approach |
| --- | --- | --- |
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |

Frequently Asked Questions

What does this skill do?

It screens user-provided text for prompt injection, jailbreak attempts, and other adversarial inputs before that text reaches your LLM, and returns a safety verdict, a risk score (0–1), and the matching threat categories.

Where can I find the source code?

The source code is on GitHub in the openclaw/skills repository, under skills/aunicall/prompt-inspector.

SKILL.md Source

# Prompt Inspector

**Prompt Inspector** is a production-grade API service that detects prompt injection attacks, jailbreak attempts, and adversarial manipulations in real time.

📖 **For detailed product information, features, and threat categories, see [references/product-info.md](./references/product-info.md)**

---

## Requirements

Provide your API key via either:

- Environment variable: `PMTINSP_API_KEY=your-api-key`, or
- `~/.openclaw/.env` line: `PMTINSP_API_KEY=your-api-key`

Get your API key at [promptinspector.io](https://promptinspector.io) by creating an app.

Manage custom sensitive words in your dashboard at [promptinspector.io](https://promptinspector.io).
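
If you call the service from your own code rather than through the bundled scripts, the key lookup described above might be reproduced roughly as follows. This is a minimal sketch, not the scripts' actual logic: the helper names `parse_env_line` and `load_api_key` are hypothetical, and the precedence (environment variable first, then `~/.openclaw/.env`) is an assumption.

```python
import os
from pathlib import Path
from typing import Mapping, Optional

KEY_NAME = "PMTINSP_API_KEY"


def parse_env_line(text: str, name: str) -> Optional[str]:
    """Return the value of `name` from KEY=value lines, or None if absent."""
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blanks, comments, and lines that are not KEY=value pairs.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        if key.strip() == name:
            return value.strip().strip('"').strip("'")
    return None


def load_api_key(
    environ: Mapping[str, str] = os.environ,
    env_file: Path = Path.home() / ".openclaw" / ".env",
) -> Optional[str]:
    """Check the environment first, then the .env file (assumed precedence)."""
    key = environ.get(KEY_NAME)
    if key:
        return key
    if env_file.is_file():
        return parse_env_line(env_file.read_text(encoding="utf-8"), KEY_NAME)
    return None
```

Calling `load_api_key()` with no arguments uses the real environment and the default file path; passing explicit arguments keeps the function easy to test.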

---

## Commands

### Detect a single text (Python)

```bash
# Basic detection — prints verdict and score
python3 {baseDir}/scripts/detect.py --text "..."

# JSON output
python3 {baseDir}/scripts/detect.py --text "..." --format json

# Override API key inline
python3 {baseDir}/scripts/detect.py --api-key pi_xxx --text "..."
```

### Detect a single text (Node.js)

```bash
# Basic detection
node {baseDir}/scripts/detect.js --text "..."

# JSON output
node {baseDir}/scripts/detect.js --text "..." --format json

# Override API key inline
node {baseDir}/scripts/detect.js --api-key pi_xxx --text "..."
```

### Batch detection from a file (Python)

```bash
# Each line in the file is treated as one text to inspect
python3 {baseDir}/scripts/detect.py --file inputs.txt

# JSON output for automation
python3 {baseDir}/scripts/detect.py --file inputs.txt --format json
```
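
Because each line of the input file is treated as one text, multi-line inputs need to be flattened before they are written out. The sketch below shows one way to prepare such a file. The helper names are hypothetical, and collapsing whitespace changes the text being inspected, which may matter for payloads that depend on line breaks.

```python
from pathlib import Path
from typing import Iterable, List


def to_batch_lines(texts: Iterable[str]) -> List[str]:
    """Collapse each text onto one line and drop texts that end up empty."""
    lines = []
    for text in texts:
        # str.split() with no arguments splits on any whitespace run,
        # including "\n" and "\r\n", so the joined result is a single line.
        flat = " ".join(text.split())
        if flat:
            lines.append(flat)
    return lines


def write_batch_file(texts: Iterable[str], path: Path) -> int:
    """Write one text per line to `path`; return the number of lines written."""
    lines = to_batch_lines(texts)
    path.write_text("\n".join(lines) + ("\n" if lines else ""), encoding="utf-8")
    return len(lines)
```

The resulting file can then be passed to `detect.py --file`.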

---

## Output

### Default (human-readable)

```
Request ID : a1b2c3d4-...
Is Safe    : False
Score      : 0.97
Category   : prompt_injection, jailbreak
Latency    : 34 ms
```

### JSON (`--format json`)

```json
{
  "request_id": "a1b2c3d4-...",
  "is_safe": false,
  "score": 0.97,
  "category": ["prompt_injection", "jailbreak"],
  "latency_ms": 34
}
```
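
When piping the JSON output into another tool, the verdict can be parsed and turned into an allow/block decision. The following is a sketch under stated assumptions: it handles the single-text JSON shape shown above, and `Verdict`, `parse_verdict`, `should_block`, and the `max_score` threshold are hypothetical names and policy choices, not part of the skill.

```python
import json
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Verdict:
    request_id: str
    is_safe: bool
    score: float
    category: List[str]
    latency_ms: int


def parse_verdict(stdout: str) -> Verdict:
    """Parse the JSON object printed by `--format json` for a single text."""
    data = json.loads(stdout)
    return Verdict(
        request_id=data["request_id"],
        is_safe=bool(data["is_safe"]),
        score=float(data["score"]),
        category=list(data.get("category") or []),
        latency_ms=int(data["latency_ms"]),
    )


def should_block(verdict: Verdict, max_score: float = 0.5) -> bool:
    """Block when the service reports unsafe or the risk score exceeds max_score.

    max_score is a hypothetical local policy knob, not a documented default.
    """
    return (not verdict.is_safe) or verdict.score > max_score
```

Combining the service's own `is_safe` flag with a local score threshold lets a pipeline be stricter than the service's default verdict without ever being more permissive.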

---

## Threat Categories

Prompt Inspector detects **10 threat categories**:
- instruction_override
- asset_extraction
- syntax_injection
- jailbreak
- response_forcing
- euphemism_bypass
- reconnaissance_probe
- parameter_injection
- encoded_payload
- custom_sensitive_word

📖 **For complete category descriptions, see [references/product-info.md](./references/product-info.md#threat-categories)**
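
Client code that branches on categories may want to keep the ten names in one place and tolerate labels outside that set. The sample outputs in this document include a `prompt_injection` label that is not among the ten listed here, so strict validation could reject real responses. A minimal sketch, with the hypothetical helper `split_categories`:

```python
from typing import Iterable, List, Tuple

# The ten category names listed above.
KNOWN_CATEGORIES = frozenset({
    "instruction_override",
    "asset_extraction",
    "syntax_injection",
    "jailbreak",
    "response_forcing",
    "euphemism_bypass",
    "reconnaissance_probe",
    "parameter_injection",
    "encoded_payload",
    "custom_sensitive_word",
})


def split_categories(reported: Iterable[str]) -> Tuple[List[str], List[str]]:
    """Split reported labels into (known, unknown), preserving order."""
    known: List[str] = []
    unknown: List[str] = []
    for name in reported:
        (known if name in KNOWN_CATEGORIES else unknown).append(name)
    return known, unknown
```

Unknown labels can then be logged for review instead of being treated as errors.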

---

## API at a Glance

```
POST /api/v1/detect/sdk
Header: X-App-Key: <your-api-key>
Body:   {"input_text": "<text to inspect>"}
```

**Response:**

```json
{
  "request_id": "string",
  "latency_ms": 34,
  "result": {
    "is_safe": false,
    "score": 0.97,
    "category": ["prompt_injection"]
  }
}
```

Full API reference: [docs.promptinspector.io](https://docs.promptinspector.io)
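
For callers that prefer to hit the endpoint directly, the request can be assembled with the Python standard library. This is a sketch, not the official client: the base URL is not stated in this document, so it is a required parameter here, the `Content-Type` header is an assumption, and `build_detect_request` and `flatten_response` are hypothetical helper names. `flatten_response` converts the nested API response into the flat shape printed by `--format json`.

```python
import json
import urllib.request
from typing import Any, Dict


def build_detect_request(base_url: str, api_key: str, text: str) -> urllib.request.Request:
    """Build (but do not send) a POST request for /api/v1/detect/sdk."""
    body = json.dumps({"input_text": text}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/v1/detect/sdk",
        data=body,
        method="POST",
        headers={
            "X-App-Key": api_key,
            # Assumption: the service expects a JSON content type.
            "Content-Type": "application/json",
        },
    )


def flatten_response(payload: Dict[str, Any]) -> Dict[str, Any]:
    """Lift the nested `result` fields to the top level."""
    result = payload.get("result") or {}
    return {
        "request_id": payload.get("request_id"),
        "is_safe": result.get("is_safe"),
        "score": result.get("score"),
        "category": result.get("category", []),
        "latency_ms": payload.get("latency_ms"),
    }
```

To send the request, pass it to `urllib.request.urlopen(req, timeout=10)` and decode the body with `json.load`. A `urllib.error.HTTPError` with code 413 would correspond to the oversized-input case mentioned in the Notes.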

---

## Notes

- Keep text under the limit for your plan tier. Very long inputs may be rejected with HTTP 413.
- Use `--format json` when piping output to other tools.
- For bulk workloads, batch requests with `--file` to minimise round-trip overhead.
- Contact [hello@promptinspector.io](mailto:hello@promptinspector.io) for enterprise plans and self-hosting support.
