performing-yara-rule-development-for-detection

Develop precise YARA rules for malware detection by identifying unique byte patterns, strings, and behavioral indicators in executable files while minimizing false positives.

4,032 stars

bymukul975

View on GitHub Installation ↓

Best use case

performing-yara-rule-development-for-detection is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Develop precise YARA rules for malware detection by identifying unique byte patterns, strings, and behavioral indicators in executable files while minimizing false positives.

Teams using performing-yara-rule-development-for-detection should expect a more consistent output, faster repeated execution, less prompt rewriting.

When to use this skill

You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

You only need a quick one-off answer and do not need a reusable workflow.
You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$curl -o ~/.claude/skills/performing-yara-rule-development-for-detection/SKILL.md --create-dirs "https://raw.githubusercontent.com/mukul975/Anthropic-Cybersecurity-Skills/main/skills/performing-yara-rule-development-for-detection/SKILL.md"

Manual Installation

Download SKILL.md from GitHub
Place it in .claude/skills/performing-yara-rule-development-for-detection/SKILL.md inside your project
Restart your AI agent — it will auto-discover the skill

How performing-yara-rule-development-for-detection Compares

Feature / Agent	performing-yara-rule-development-for-detection	Standard Approach
Platform Support	Not specified	Limited / Varies
Context Awareness	High	Baseline
Installation Complexity	Unknown	N/A

Frequently Asked Questions

What does this skill do?

Develop precise YARA rules for malware detection by identifying unique byte patterns, strings, and behavioral indicators in executable files while minimizing false positives.

Where can I find the source code?

You can find the source code on GitHub using the link provided at the top of the page.

Related Guides

AI Agents for Coding

Browse AI agent skills for coding, debugging, testing, refactoring, code review, and developer workflows across Claude, Cursor, and Codex.

Cursor vs Codex for AI Workflows

Compare Cursor and Codex for AI coding workflows, repository assistance, debugging, refactoring, and reusable developer skills.

AI Agents for Marketing

Discover AI agents for marketing workflows, from SEO and content production to campaign research, outreach, and analytics.

SKILL.md Source

# Performing YARA Rule Development for Detection

## Overview

YARA is the pattern matching swiss knife for malware researchers, enabling identification and classification of malware based on textual or binary patterns. Effective YARA rules combine unique string patterns, byte sequences, PE header characteristics, import table analysis, and conditional logic to detect malware families while avoiding false positives. Modern YARA-X (rewritten in Rust, stable since June 2025) brings improved performance and new modules. Rules should target unpacked malware artifacts like hardcoded stack strings, C2 URLs, mutex names, encryption constants, and unique code sequences rather than packer signatures.


## When to Use

- When conducting security assessments that involve performing yara rule development for detection
- When following incident response procedures for related security events
- When performing scheduled security testing or auditing activities
- When validating security controls through hands-on testing

## Prerequisites

- Python 3.9+ with `yara-python` library
- YARA 4.5+ or YARA-X 0.10+
- PE analysis tools (`pefile`, `pestudio`)
- Hex editor for identifying unique byte patterns
- Access to malware samples (VirusTotal, MalwareBazaar)
- Understanding of PE file format, strings, and import tables

## Key Concepts

### Rule Structure

Every YARA rule consists of three sections: `meta` (optional descriptive metadata), `strings` (pattern definitions), and `condition` (matching logic). String types include text strings (ASCII/wide/nocase), hex patterns with wildcards and jumps, and regular expressions. Conditions combine string matches with file properties using boolean operators.

### String Selection Strategy

Effective rules target patterns that are unique to the malware family and survive recompilation. Hardcoded stack strings are excellent choices because compilers embed them consistently. C2 domain patterns, custom encryption routines, unique error messages, and specific API call sequences provide stable detection anchors. Avoid compiler-generated boilerplate and common library strings.

### Performance Optimization

YARA evaluates conditions short-circuit style. Place the most discriminating and cheapest-to-evaluate conditions first. Use `filesize` limits to skip irrelevant files quickly. Minimize regex usage in favor of hex patterns. Use `private` rules as building blocks for complex detection logic without generating standalone matches.

## Workflow

### Step 1: Analyze Sample for Unique Patterns

```python
#!/usr/bin/env python3
"""Extract candidate strings and byte patterns for YARA rule creation."""
import pefile
import re
import sys
from collections import Counter


def extract_strings(filepath, min_length=6):
    """Extract ASCII and wide strings from binary."""
    with open(filepath, 'rb') as f:
        data = f.read()

    # ASCII strings
    ascii_strings = re.findall(
        rb'[\x20-\x7e]{' + str(min_length).encode() + rb',}', data
    )

    # Wide (UTF-16LE) strings
    wide_strings = re.findall(
        rb'(?:[\x20-\x7e]\x00){' + str(min_length).encode() + rb',}', data
    )

    return {
        'ascii': [s.decode('ascii') for s in ascii_strings],
        'wide': [s.decode('utf-16-le') for s in wide_strings],
    }


def analyze_pe_imports(filepath):
    """Extract import table for API-based detection."""
    try:
        pe = pefile.PE(filepath)
    except pefile.PEFormatError:
        return []

    imports = []
    if hasattr(pe, 'DIRECTORY_ENTRY_IMPORT'):
        for entry in pe.DIRECTORY_ENTRY_IMPORT:
            dll_name = entry.dll.decode('utf-8', errors='replace')
            for imp in entry.imports:
                if imp.name:
                    func_name = imp.name.decode('utf-8', errors='replace')
                    imports.append(f"{dll_name}!{func_name}")
    return imports


def find_unique_byte_patterns(filepath, pattern_length=16):
    """Find unique byte sequences suitable for YARA hex patterns."""
    with open(filepath, 'rb') as f:
        data = f.read()

    try:
        pe = pefile.PE(filepath)
        # Focus on code section
        for section in pe.sections:
            if section.Characteristics & 0x20000000:  # IMAGE_SCN_MEM_EXECUTE
                code_start = section.PointerToRawData
                code_end = code_start + section.SizeOfRawData
                code_data = data[code_start:code_end]
                break
        else:
            code_data = data
    except Exception:
        code_data = data

    # Find byte patterns that appear exactly once
    patterns = []
    for i in range(0, len(code_data) - pattern_length, 4):
        pattern = code_data[i:i+pattern_length]
        if pattern.count(b'\x00') < pattern_length // 3:  # Skip null-heavy
            hex_pattern = ' '.join(f'{b:02X}' for b in pattern)
            patterns.append(hex_pattern)

    # Count frequency and return unique ones
    freq = Counter(patterns)
    unique = [p for p, count in freq.items() if count == 1]

    return unique[:20]  # Top 20 candidates


def suggest_rule_strings(filepath):
    """Suggest strings and patterns for YARA rule."""
    print(f"[+] Analyzing: {filepath}")

    # Extract strings
    strings = extract_strings(filepath)

    # Filter for suspicious/unique strings
    suspicious_keywords = [
        'http', 'https', 'cmd', 'powershell', 'mutex', 'pipe',
        'password', 'credential', 'inject', 'hook', 'debug',
        'sandbox', 'virtual', 'vmware', 'vbox',
    ]

    print("\n[+] Suspicious ASCII strings:")
    for s in strings['ascii']:
        if any(kw in s.lower() for kw in suspicious_keywords):
            print(f"  $ = \"{s}\" ascii")

    print("\n[+] Suspicious wide strings:")
    for s in strings['wide']:
        if any(kw in s.lower() for kw in suspicious_keywords):
            print(f"  $ = \"{s}\" wide")

    # Import analysis
    imports = analyze_pe_imports(filepath)
    suspicious_apis = [
        'VirtualAlloc', 'VirtualProtect', 'WriteProcessMemory',
        'CreateRemoteThread', 'NtUnmapViewOfSection', 'RtlMoveMemory',
        'OpenProcess', 'CreateToolhelp32Snapshot',
        'InternetOpenA', 'HttpSendRequestA',
        'CryptEncrypt', 'CryptDecrypt',
    ]

    print("\n[+] Suspicious imports:")
    for imp in imports:
        func = imp.split('!')[-1]
        if func in suspicious_apis:
            print(f"  {imp}")

    # Byte patterns
    print("\n[+] Candidate hex patterns:")
    patterns = find_unique_byte_patterns(filepath)
    for p in patterns[:5]:
        print(f"  $hex = {{ {p} }}")


if __name__ == "__main__":
    if len(sys.argv) < 2:
        print(f"Usage: {sys.argv[0]} <sample_path>")
        sys.exit(1)
    suggest_rule_strings(sys.argv[1])
```

### Step 2: Write and Test YARA Rules

```python
import yara
import os

def create_yara_rule(rule_name, meta, strings, condition):
    """Generate a YARA rule from components."""
    meta_str = "\n".join(f'        {k} = "{v}"' for k, v in meta.items())
    strings_str = "\n".join(f"        {s}" for s in strings)

    rule = f"""rule {rule_name} {{
    meta:
{meta_str}

    strings:
{strings_str}

    condition:
        {condition}
}}"""
    return rule


def test_yara_rule(rule_text, test_dir):
    """Compile and test YARA rule against sample directory."""
    try:
        rules = yara.compile(source=rule_text)
    except yara.SyntaxError as e:
        print(f"[-] YARA syntax error: {e}")
        return None

    results = {"matches": [], "no_match": []}

    for filename in os.listdir(test_dir):
        filepath = os.path.join(test_dir, filename)
        if not os.path.isfile(filepath):
            continue

        matches = rules.match(filepath)
        if matches:
            results["matches"].append({
                "file": filename,
                "rules": [m.rule for m in matches],
            })
        else:
            results["no_match"].append(filename)

    print(f"[+] Matches: {len(results['matches'])}")
    print(f"[-] No match: {len(results['no_match'])}")
    return results


# Example: Create a rule for a hypothetical malware family
example_rule = create_yara_rule(
    rule_name="MalwareFamily_Variant_A",
    meta={
        "description": "Detects MalwareFamily Variant A",
        "author": "Malware Analysis Team",
        "date": "2025-01-01",
        "hash": "abc123...",
        "tlp": "WHITE",
    },
    strings=[
        '$mutex = "Global\\\\UniqueM4lwareMutex" ascii wide',
        '$c2_pattern = /https?:\\/\\/[a-z]{5,10}\\.(xyz|top|buzz)\\/gate\\.php/',
        '$api1 = "VirtualAllocEx" ascii',
        '$api2 = "WriteProcessMemory" ascii',
        '$api3 = "CreateRemoteThread" ascii',
        '$hex_decrypt = { 8B 45 ?? 33 C1 89 45 ?? 83 C1 04 }',
        '$pdb = "C:\\\\Users\\\\" ascii',
    ],
    condition=(
        'uint16(0) == 0x5A4D and filesize < 2MB and '
        '($mutex or $c2_pattern) and '
        '2 of ($api*) and '
        '$hex_decrypt'
    ),
)

print(example_rule)
```

### Step 3: Performance Testing and Optimization

```python
import time

def benchmark_rule(rule_text, scan_directory, iterations=3):
    """Benchmark YARA rule scan performance."""
    rules = yara.compile(source=rule_text)

    files = []
    for root, _, filenames in os.walk(scan_directory):
        for f in filenames:
            files.append(os.path.join(root, f))

    print(f"[+] Benchmarking against {len(files)} files "
          f"({iterations} iterations)")

    times = []
    for i in range(iterations):
        start = time.perf_counter()
        matches = 0
        for filepath in files:
            try:
                result = rules.match(filepath)
                if result:
                    matches += 1
            except Exception:
                pass
        elapsed = time.perf_counter() - start
        times.append(elapsed)
        print(f"  Iteration {i+1}: {elapsed:.3f}s ({matches} matches)")

    avg_time = sum(times) / len(times)
    files_per_sec = len(files) / avg_time
    print(f"\n[+] Average: {avg_time:.3f}s ({files_per_sec:.0f} files/sec)")
    return avg_time
```

## Validation Criteria

- YARA rules compile without syntax errors
- Rules detect target malware family samples with zero false negatives
- False positive rate below 0.1% when scanned against clean file corpus
- Rule performance allows scanning 1000+ files per second
- Rules survive minor malware modifications (recompilation, string changes)
- Metadata includes hash, author, date, description, and TLP marking

## References

- [YARA Official Documentation](https://virustotal.github.io/yara/)
- [YARA-X Rewrite in Rust](https://github.com/VirusTotal/yara-x)
- [Yara-Rules Community Repository](https://github.com/Yara-Rules/rules)
- [ReversingLabs - Writing Detailed YARA Rules](https://www.reversinglabs.com/blog/writing-detailed-yara-rules-for-malware-detection)
- [YARA Rule Crafting Deep Dive](https://cyberthreatintelligencenetwork.com/index.php/2024/09/11/yara-rule-crafting-a-deep-dive-into-signature-based-threat-hunting-strategies/)

Related Skills

performing-wireless-security-assessment-with-kismet

4032

from mukul975/Anthropic-Cybersecurity-Skills

Conduct wireless network security assessments using Kismet to detect rogue access points, hidden SSIDs, weak encryption, and unauthorized clients through passive RF monitoring.

performing-wireless-network-penetration-test

4032

from mukul975/Anthropic-Cybersecurity-Skills

Execute a wireless network penetration test to assess WiFi security by capturing handshakes, cracking WPA2/WPA3 keys, detecting rogue access points, and testing wireless segmentation using Aircrack-ng and related tools.

performing-windows-artifact-analysis-with-eric-zimmerman-tools

4032

from mukul975/Anthropic-Cybersecurity-Skills

Perform comprehensive Windows forensic artifact analysis using Eric Zimmerman's open-source EZ Tools suite including KAPE, MFTECmd, PECmd, LECmd, JLECmd, and Timeline Explorer for parsing registry hives, prefetch files, event logs, and file system metadata.

performing-wifi-password-cracking-with-aircrack

4032

from mukul975/Anthropic-Cybersecurity-Skills

Captures WPA/WPA2 handshakes and performs offline password cracking using aircrack-ng, hashcat, and dictionary attacks during authorized wireless security assessments to evaluate passphrase strength and wireless network security posture.

performing-web-cache-poisoning-attack

4032

from mukul975/Anthropic-Cybersecurity-Skills

Exploiting web cache mechanisms to serve malicious content to other users by poisoning cached responses through unkeyed headers and parameters during authorized security tests.

performing-web-cache-deception-attack

4032

from mukul975/Anthropic-Cybersecurity-Skills

Execute web cache deception attacks by exploiting path normalization discrepancies between CDN caching layers and origin servers to cache and retrieve sensitive authenticated content.

performing-web-application-vulnerability-triage

4032

from mukul975/Anthropic-Cybersecurity-Skills

Triage web application vulnerability findings from DAST/SAST scanners using OWASP risk rating methodology to separate true positives from false positives and prioritize remediation.

performing-web-application-scanning-with-nikto

4032

from mukul975/Anthropic-Cybersecurity-Skills

Nikto is an open-source web server and web application scanner that tests against over 7,000 potentially dangerous files/programs, checks for outdated versions of over 1,250 servers, and identifies ve

performing-web-application-penetration-test

4032

from mukul975/Anthropic-Cybersecurity-Skills

Performs systematic security testing of web applications following the OWASP Web Security Testing Guide (WSTG) methodology to identify vulnerabilities in authentication, authorization, input validation, session management, and business logic. The tester uses Burp Suite as the primary interception proxy alongside manual testing techniques to find flaws that automated scanners miss. Activates for requests involving web app pentest, OWASP testing, application security assessment, or web vulnerability testing.

performing-web-application-firewall-bypass

4032

from mukul975/Anthropic-Cybersecurity-Skills

Bypass Web Application Firewall protections using encoding techniques, HTTP method manipulation, parameter pollution, and payload obfuscation to deliver SQL injection, XSS, and other attack payloads past WAF detection rules.

performing-vulnerability-scanning-with-nessus

4032

from mukul975/Anthropic-Cybersecurity-Skills

Performs authenticated and unauthenticated vulnerability scanning using Tenable Nessus to identify known vulnerabilities, misconfigurations, default credentials, and missing patches across network infrastructure, servers, and applications. The scanner correlates findings with CVE databases and CVSS scores to produce prioritized remediation guidance. Activates for requests involving vulnerability scanning, Nessus assessment, patch compliance checking, or automated vulnerability detection.

performing-vlan-hopping-attack

4032

from mukul975/Anthropic-Cybersecurity-Skills

Simulates VLAN hopping attacks using switch spoofing and double tagging techniques in authorized environments to test VLAN segmentation effectiveness and validate switch port security configurations against Layer 2 bypass attacks.