llm-security

Security guidelines for LLM applications based on OWASP Top 10 for LLM 2025. Use when building LLM apps, reviewing AI security, implementing RAG systems, or asking about LLM vulnerabilities like 'prompt injection' or 'check LLM security'. IMPORTANT: Always consult this skill when building chatbots, AI agents, RAG pipelines, tool-using LLMs, agentic systems, or any application that calls an LLM API (OpenAI, Anthropic, Gemini, etc.) — even if the user doesn't explicitly mention security. Also use when users import 'openai', 'anthropic', 'langchain', 'llamaindex', or similar LLM libraries.

662 stars

Best use case

llm-security is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Teams using llm-security should expect more consistent output, faster repeated execution, and less prompt rewriting.

When to use this skill

  • You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

  • You only need a quick one-off answer and do not need a reusable workflow.
  • You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$ curl -o ~/.claude/skills/llm-security/SKILL.md --create-dirs "https://raw.githubusercontent.com/wimpysworld/nix-config/main/home-manager/_mixins/development/assistants/skills/llm-security/SKILL.md"

Manual Installation

  1. Download SKILL.md from GitHub
  2. Place it in .claude/skills/llm-security/SKILL.md inside your project
  3. Restart your AI agent — it will auto-discover the skill

How llm-security Compares

| Feature / Agent | llm-security | Standard Approach |
|-----------------|--------------|-------------------|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |

SKILL.md Source

# LLM Security Guidelines (OWASP Top 10 for LLM 2025)

Security rules for building secure LLM applications, based on the OWASP Top 10 for LLM Applications 2025.

## How to Use This Skill

**Proactive mode** — When building or reviewing LLM applications, automatically check for relevant security risks based on the application pattern. You don't need to wait for the user to ask about LLM security.

**Reactive mode** — When the user asks about LLM security, use the mapping below to find relevant rule files with detailed vulnerable/secure code examples.

### Workflow
1. Identify what the user is building (see "What Are You Building?" below)
2. Check the priority rules for that pattern
3. Read the specific rule files from `rules/` for code examples
4. Apply the secure patterns or flag vulnerable ones

## What Are You Building?

Use this to quickly identify which rules matter most for the user's task:

| Building... | Priority Rules |
|-------------|---------------|
| **Chatbot / conversational AI** | Prompt Injection (LLM01), System Prompt Leakage (LLM07), Output Handling (LLM05), Unbounded Consumption (LLM10) |
| **RAG system** | Vector/Embedding Weaknesses (LLM08), Prompt Injection (LLM01), Sensitive Disclosure (LLM02), Misinformation (LLM09) |
| **AI agent with tools** | Excessive Agency (LLM06), Prompt Injection (LLM01), Output Handling (LLM05), Sensitive Disclosure (LLM02) |
| **Fine-tuning / training** | Data Poisoning (LLM04), Supply Chain (LLM03), Sensitive Disclosure (LLM02) |
| **LLM-powered API** | Unbounded Consumption (LLM10), Prompt Injection (LLM01), Output Handling (LLM05), Sensitive Disclosure (LLM02) |
| **Content generation** | Misinformation (LLM09), Output Handling (LLM05), Prompt Injection (LLM01) |
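
The table above can be sketched as a small lookup, shown here only as an illustration. The short pattern keys (`"chatbot"`, `"rag"`, and so on), the `PRIORITY_RULES` dictionary, and the `rule_files` helper are hypothetical names introduced for this example; the file names follow the `rules/` paths listed under Categories.

```python
# Hypothetical lookup mirroring the "What Are You Building?" table.
# Keys and helper names are illustrative; they are not part of the skill itself.
PRIORITY_RULES: dict[str, list[str]] = {
    "chatbot": ["prompt-injection", "system-prompt-leakage",
                "output-handling", "unbounded-consumption"],
    "rag": ["vector-embedding", "prompt-injection",
            "sensitive-disclosure", "misinformation"],
    "agent": ["excessive-agency", "prompt-injection",
              "output-handling", "sensitive-disclosure"],
    "fine-tuning": ["data-poisoning", "supply-chain", "sensitive-disclosure"],
    "api": ["unbounded-consumption", "prompt-injection",
            "output-handling", "sensitive-disclosure"],
    "content": ["misinformation", "output-handling", "prompt-injection"],
}


def rule_files(pattern: str) -> list[str]:
    """Return the rule file paths for an application pattern, in priority order."""
    try:
        names = PRIORITY_RULES[pattern]
    except KeyError:
        raise ValueError(f"unknown pattern: {pattern!r}") from None
    return [f"rules/{name}.md" for name in names]
```

For example, `rule_files("rag")` lists `rules/vector-embedding.md` first, matching the RAG row of the table.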

## Categories

### Critical Impact
- **LLM01: Prompt Injection** (`rules/prompt-injection.md`) - Prevent direct and indirect prompt manipulation
- **LLM02: Sensitive Information Disclosure** (`rules/sensitive-disclosure.md`) - Protect PII, credentials, and proprietary data
- **LLM03: Supply Chain** (`rules/supply-chain.md`) - Secure model sources, training data, and dependencies
- **LLM04: Data and Model Poisoning** (`rules/data-poisoning.md`) - Prevent training data manipulation and backdoors
- **LLM05: Improper Output Handling** (`rules/output-handling.md`) - Sanitise LLM outputs before downstream use
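
As a minimal sketch of LLM05, assuming a web page that displays model text and a SQLite table that the model helps to query: the output is escaped before rendering and bound as a query parameter rather than formatted into SQL. The `render_reply` and `find_orders` functions and the `orders` table are hypothetical and are not taken from `rules/output-handling.md`.

```python
import html
import sqlite3


def render_reply(llm_text: str) -> str:
    """Escape model output before placing it inside an HTML page."""
    return f"<p>{html.escape(llm_text)}</p>"


def find_orders(conn: sqlite3.Connection, customer_name: str) -> list[tuple]:
    """Look up orders for a model-supplied customer name.

    The value is passed as a bound parameter, so quotes inside it are
    treated as data and cannot change the structure of the query.
    """
    cur = conn.execute(
        "SELECT id, item FROM orders WHERE customer = ?",  # hypothetical schema
        (customer_name,),
    )
    return cur.fetchall()
```

The same idea applies to shell commands, file paths, and templates: the model's text is handled like any other untrusted user input.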

### High Impact
- **LLM06: Excessive Agency** (`rules/excessive-agency.md`) - Limit LLM permissions, functionality, and autonomy
- **LLM07: System Prompt Leakage** (`rules/system-prompt-leakage.md`) - Protect system prompts from disclosure
- **LLM08: Vector and Embedding Weaknesses** (`rules/vector-embedding.md`) - Secure RAG systems and embeddings
- **LLM09: Misinformation** (`rules/misinformation.md`) - Mitigate hallucinations and false outputs
- **LLM10: Unbounded Consumption** (`rules/unbounded-consumption.md`) - Prevent DoS, cost attacks, and model theft

See `rules/_sections.md` for the full index with OWASP/MITRE references.
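
One way to picture LLM06 (least privilege plus human-in-the-loop) is a dispatcher that runs only allowlisted tools and asks for approval before high-impact ones. This is a sketch under assumptions: the `Tool` class, the two example tools, and the `approve` callback are all hypothetical and do not come from `rules/excessive-agency.md`.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Tool:
    func: Callable[..., str]
    needs_approval: bool  # True for high-impact actions


def read_ticket(ticket_id: str) -> str:
    """Hypothetical read-only tool."""
    return f"ticket {ticket_id}: (contents)"


def refund(order_id: str) -> str:
    """Hypothetical high-impact tool."""
    return f"refunded {order_id}"


# Only tools listed here can ever be invoked on the model's behalf.
ALLOWED_TOOLS: dict[str, Tool] = {
    "read_ticket": Tool(read_ticket, needs_approval=False),
    "refund": Tool(refund, needs_approval=True),
}


def dispatch(name: str, args: dict, approve: Callable[[str, dict], bool]) -> str:
    """Run a model-requested tool call if allowlisted and, where required, approved."""
    tool = ALLOWED_TOOLS.get(name)
    if tool is None:
        raise PermissionError(f"tool not allowlisted: {name}")
    if tool.needs_approval and not approve(name, args):
        raise PermissionError(f"approval denied for: {name}")
    return tool.func(**args)
```

In a real system `approve` would prompt a person or consult a policy service; here it is just a callable so the gate can be exercised.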

## Quick Reference

| Vulnerability | Key Prevention |
|--------------|----------------|
| Prompt Injection | Input validation, output filtering, privilege separation |
| Sensitive Disclosure | Data sanitisation, access controls, encryption |
| Supply Chain | Verify models, SBOM, trusted sources only |
| Data Poisoning | Data validation, anomaly detection, sandboxing |
| Output Handling | Treat LLM as untrusted, encode outputs, parameterise queries |
| Excessive Agency | Least privilege, human-in-the-loop, minimise extensions |
| System Prompt Leakage | No secrets in prompts, external guardrails |
| Vector/Embedding | Access controls, data validation, monitoring |
| Misinformation | RAG, fine-tuning, human oversight, cross-verification |
| Unbounded Consumption | Rate limiting, input validation, resource monitoring |
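
The "rate limiting" entry for Unbounded Consumption could look roughly like the per-user sliding-window limiter below. It is a sketch, not the skill's own implementation: the class name, the limits, and the injectable `clock` parameter are assumptions chosen for illustration.

```python
import time
from collections import defaultdict, deque
from typing import Callable


class SlidingWindowLimiter:
    """Allow at most `max_requests` per user within any `window_seconds` span."""

    def __init__(
        self,
        max_requests: int,
        window_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so the limiter can be tested without sleeping
        self._hits: dict[str, deque] = defaultdict(deque)

    def allow(self, user_id: str) -> bool:
        now = self.clock()
        hits = self._hits[user_id]
        # Drop timestamps that have aged out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False
        hits.append(now)
        return True
```

A caller would check `limiter.allow(user_id)` before forwarding a prompt to the model and reject the request when it returns `False`. In-memory state like this only suits a single process; a shared store would be needed across workers.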

## Key Principles

1. **Never trust LLM output** - Validate and sanitise all outputs before use
2. **Least privilege** - Grant minimum necessary permissions to LLM systems
3. **Defence in depth** - Layer multiple security controls
4. **Human oversight** - Require approval for high-impact actions
5. **Monitor and log** - Track all LLM interactions for anomaly detection
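
Logging every interaction (principle 5) can itself leak data, so one hedged approach is to redact obvious secrets before writing the log line. The patterns below are deliberately simple assumptions: `EMAIL` catches common address shapes, and `SECRET` assumes keys that start with `sk-`. Neither is exhaustive, and `log_interaction` is a hypothetical helper.

```python
import logging
import re

# Illustrative patterns only; real deployments need broader PII and secret detection.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
SECRET = re.compile(r"\bsk-[A-Za-z0-9]{16,}\b")  # assumed API key format

logger = logging.getLogger("llm_audit")


def redact(text: str) -> str:
    """Replace email addresses and key-like strings with placeholders."""
    text = EMAIL.sub("[EMAIL]", text)
    return SECRET.sub("[SECRET]", text)


def log_interaction(user_id: str, prompt: str, response: str) -> None:
    """Record one LLM exchange with sensitive values redacted."""
    logger.info(
        "user=%s prompt=%r response=%r",
        user_id,
        redact(prompt),
        redact(response),
    )
```

For instance, `redact("mail bob@example.com key sk-abcdefghijklmnop1234")` returns `"mail [EMAIL] key [SECRET]"`.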

## References

- [OWASP Top 10 for LLM Applications 2025](https://genai.owasp.org/llm-top-10/)
- [MITRE ATLAS - Adversarial Threat Landscape for AI Systems](https://atlas.mitre.org/)
- [NIST AI Risk Management Framework](https://www.nist.gov/itl/ai-risk-management-framework)

Related Skills

code-security

662
from wimpysworld/nix-config

Security guidelines for writing secure code. Use when writing code, reviewing code for vulnerabilities, or asking about secure coding practices like 'check for SQL injection' or 'review security'. IMPORTANT: Always consult this skill when writing or reviewing any code that handles user input, authentication, file operations, database queries, network requests, cryptography, or infrastructure configuration (Terraform, Kubernetes, Docker, GitHub Actions) — even if the user doesn't explicitly mention security. Also use when users ask to 'review my code', 'check this for bugs', or 'is this safe'.

writing-clearly-and-concisely

662
from wimpysworld/nix-config

Core writing rules for clear, concise prose. Load when writing any text a human will read.

gh

662
from wimpysworld/nix-config

Load when executing GitHub tasks via the gh CLI: creating or reviewing pull requests, managing issues, checking CI runs, creating releases, searching GitHub, or making raw GitHub API calls.

meet-the-agents

662
from wimpysworld/nix-config

Registry of available specialist agents and their task domains. Load when delegating a task, selecting an agent, or unsure which agent to use.

semgrep

662
from wimpysworld/nix-config

Run Semgrep static analysis scans and create custom detection rules. Use when asked to scan code with Semgrep, find security vulnerabilities, write custom YAML rules, or detect specific bug patterns. IMPORTANT: Also use this skill when users ask to 'scan for bugs', 'check code quality', 'find vulnerabilities', 'static analysis', 'lint for security', 'audit this code', or want to enforce coding standards — even if they don't mention Semgrep by name. Semgrep is the right tool for pattern-based code scanning across 30+ languages.

prose-style-reference

662
from wimpysworld/nix-config

Extended writing reference for documentation and content creation. Load for blog posts, READMEs, technical guides, and long-form writing.

perl-security

144923
from affaan-m/everything-claude-code

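A comprehensive Perl security guide covering taint mode, input validation, safe process execution, DBI parameterised queries, web security (XSS/SQLi/CSRF), and perlcritic security policies.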

Security, Claude

laravel-security

144923
from affaan-m/everything-claude-code

Laravel security best practices for authn/authz, validation, CSRF, mass assignment, file uploads, secrets, rate limiting, and secure deployment.

Development, Claude

springboot-security

144923
from affaan-m/everything-claude-code

Spring Security best practices for authn/authz, validation, CSRF, secrets, headers, rate limiting, and dependency security in Java Spring Boot services.

Development, Claude

security-scan

144923
from affaan-m/everything-claude-code

Uses AgentShield to scan Claude Code configuration (the .claude/ directory) for security vulnerabilities, misconfigurations, and injection risks. Checks CLAUDE.md, settings.json, MCP servers, hooks, and agent definitions.

Security, Claude

django-security

144923
from affaan-m/everything-claude-code

Django security best practices, authentication, authorization, CSRF protection, SQL injection prevention, XSS prevention, and secure deployment configurations.

Development, Claude

security-review

144923
from affaan-m/everything-claude-code

Use this skill when adding authentication, handling user input, working with secrets, creating API endpoints, or implementing payment/sensitive features. Provides comprehensive security checklist and patterns.

Security, Claude