when-profiling-performance-use-performance-profiler

Comprehensive performance profiling, bottleneck detection, and optimization system

25 stars

Best use case

when-profiling-performance-use-performance-profiler is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Comprehensive performance profiling, bottleneck detection, and optimization system

Teams using when-profiling-performance-use-performance-profiler should expect a more consistent output, faster repeated execution, less prompt rewriting, better workflow continuity with your supporting tools.

When to use this skill

  • You want a reusable workflow that can be run more than once with consistent structure.
  • You already have the supporting tools or dependencies needed by this skill.

When not to use this skill

  • You only need a quick one-off answer and do not need a reusable workflow.
  • You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$curl -o ~/.claude/skills/when-profiling-performance-use-performance-profiler/SKILL.md --create-dirs "https://raw.githubusercontent.com/ComeOnOliver/skillshub/main/skills/aiskillstore/marketplace/dnyoussef/when-profiling-performance-use-performance-profiler/SKILL.md"

Manual Installation

  1. Download SKILL.md from GitHub
  2. Place it in .claude/skills/when-profiling-performance-use-performance-profiler/SKILL.md inside your project
  3. Restart your AI agent — it will auto-discover the skill

How when-profiling-performance-use-performance-profiler Compares

Feature / Agentwhen-profiling-performance-use-performance-profilerStandard Approach
Platform SupportNot specifiedLimited / Varies
Context Awareness High Baseline
Installation ComplexityUnknownN/A

Frequently Asked Questions

What does this skill do?

Comprehensive performance profiling, bottleneck detection, and optimization system

Where can I find the source code?

You can find the source code on GitHub using the link provided at the top of the page.

SKILL.md Source

# Performance Profiler Skill

## Overview

**When profiling performance, use performance-profiler** to measure, analyze, and optimize application performance across CPU, memory, I/O, and network dimensions.

## MECE Breakdown

### Mutually Exclusive Components:
1. **Baseline Phase**: Establish current performance metrics
2. **Detection Phase**: Identify bottlenecks and hot paths
3. **Analysis Phase**: Root cause analysis and impact assessment
4. **Optimization Phase**: Generate and prioritize recommendations
5. **Implementation Phase**: Apply optimizations with agent assistance
6. **Validation Phase**: Benchmark improvements and verify gains

### Collectively Exhaustive Coverage:
- **CPU Profiling**: Function execution time, hot paths, call graphs
- **Memory Profiling**: Heap usage, allocations, leaks, garbage collection
- **I/O Profiling**: File system, database, network latency
- **Network Profiling**: Request timing, bandwidth, connection pooling
- **Concurrency**: Thread utilization, lock contention, async operations
- **Algorithm Analysis**: Time complexity, space complexity
- **Cache Analysis**: Hit rates, cache misses, invalidation patterns
- **Database**: Query performance, N+1 problems, index usage

## Features

### Core Capabilities:
- Multi-dimensional performance profiling (CPU, memory, I/O, network)
- Automated bottleneck detection with prioritization
- Real-time profiling and historical analysis
- Flame graph generation for visual analysis
- Memory leak detection and heap snapshots
- Database query optimization
- Algorithmic complexity analysis
- A/B comparison of before/after optimizations
- Production-safe profiling with minimal overhead
- Integration with APM tools (New Relic, DataDog, etc.)

### Profiling Modes:
- **Quick Scan**: 30-second lightweight profiling
- **Standard**: 5-minute comprehensive analysis
- **Deep**: 30-minute detailed investigation
- **Continuous**: Long-running production monitoring
- **Stress Test**: Load-based profiling under high traffic

## Usage

### Slash Command:
```bash
/profile [path] [--mode quick|standard|deep] [--target cpu|memory|io|network|all]
```

### Subagent Invocation:
```javascript
Task("Performance Profiler", "Profile ./app with deep CPU and memory analysis", "performance-analyzer")
```

### MCP Tool:
```javascript
mcp__performance-profiler__analyze({
  project_path: "./app",
  profiling_mode: "standard",
  targets: ["cpu", "memory", "io"],
  generate_optimizations: true
})
```

## Architecture

### Phase 1: Baseline Measurement
1. Establish current performance metrics
2. Define performance budgets
3. Set up monitoring infrastructure
4. Capture baseline snapshots

### Phase 2: Bottleneck Detection
1. CPU profiling (sampling or instrumentation)
2. Memory profiling (heap analysis)
3. I/O profiling (syscall tracing)
4. Network profiling (packet analysis)
5. Database profiling (query logs)

### Phase 3: Root Cause Analysis
1. Correlate metrics across dimensions
2. Identify causal relationships
3. Calculate performance impact
4. Prioritize issues by severity

### Phase 4: Optimization Generation
1. Algorithmic improvements
2. Caching strategies
3. Parallelization opportunities
4. Database query optimization
5. Memory optimization
6. Network optimization

### Phase 5: Implementation
1. Generate optimized code with coder agent
2. Apply database optimizations
3. Configure caching layers
4. Implement parallelization

### Phase 6: Validation
1. Run benchmark suite
2. Compare before/after metrics
3. Verify no regressions
4. Generate performance report

## Output Formats

### Performance Report:
```json
{
  "project": "my-app",
  "profiling_mode": "standard",
  "duration_seconds": 300,
  "baseline": {
    "requests_per_second": 1247,
    "avg_response_time_ms": 123,
    "p95_response_time_ms": 456,
    "p99_response_time_ms": 789,
    "cpu_usage_percent": 67,
    "memory_usage_mb": 512,
    "error_rate_percent": 0.1
  },
  "bottlenecks": [
    {
      "type": "cpu",
      "severity": "high",
      "function": "processData",
      "time_percent": 34.5,
      "calls": 123456,
      "avg_time_ms": 2.3,
      "recommendation": "Optimize algorithm complexity from O(n²) to O(n log n)"
    }
  ],
  "optimizations": [...],
  "estimated_improvement": {
    "throughput_increase": "3.2x",
    "latency_reduction": "68%",
    "memory_reduction": "45%"
  }
}
```

### Flame Graph:
Interactive SVG flame graph showing call stack with time proportions

### Heap Snapshot:
Memory allocation breakdown with retention paths

### Optimization Report:
Prioritized list of actionable improvements with code examples

## Examples

### Example 1: Quick CPU Profiling
```bash
/profile ./my-app --mode quick --target cpu
```

### Example 2: Deep Memory Analysis
```bash
/profile ./my-app --mode deep --target memory --detect-leaks
```

### Example 3: Full Stack Optimization
```bash
/profile ./my-app --mode standard --target all --optimize --benchmark
```

### Example 4: Database Query Optimization
```bash
/profile ./my-app --mode standard --target io --database --explain-queries
```

## Integration with Claude-Flow

### Coordination Pattern:
```javascript
// Step 1: Initialize profiling swarm
mcp__claude-flow__swarm_init({ topology: "star", maxAgents: 5 })

// Step 2: Spawn specialized agents
[Parallel Execution]:
  Task("CPU Profiler", "Profile CPU usage and identify hot paths in ./app", "performance-analyzer")
  Task("Memory Profiler", "Analyze heap usage and detect memory leaks", "performance-analyzer")
  Task("I/O Profiler", "Profile file system and database operations", "performance-analyzer")
  Task("Network Profiler", "Analyze network requests and identify slow endpoints", "performance-analyzer")
  Task("Optimizer", "Generate optimization recommendations based on profiling data", "optimizer")

// Step 3: Implementation agent applies optimizations
[Sequential Execution]:
  Task("Coder", "Implement recommended optimizations from profiling analysis", "coder")
  Task("Benchmarker", "Run benchmark suite and validate improvements", "performance-benchmarker")
```

## Configuration

### Default Settings:
```json
{
  "profiling": {
    "sampling_rate_hz": 99,
    "stack_depth": 128,
    "include_native_code": false,
    "track_allocations": true
  },
  "thresholds": {
    "cpu_hot_path_percent": 10,
    "memory_leak_growth_mb": 10,
    "slow_query_ms": 100,
    "slow_request_ms": 1000
  },
  "optimization": {
    "auto_apply": false,
    "require_approval": true,
    "run_tests_before": true,
    "run_benchmarks_after": true
  },
  "output": {
    "flame_graph": true,
    "heap_snapshot": true,
    "call_tree": true,
    "recommendations": true
  }
}
```

## Profiling Techniques

### CPU Profiling:
- **Sampling**: Periodic stack sampling (low overhead)
- **Instrumentation**: Function entry/exit hooks (accurate but higher overhead)
- **Tracing**: Event-based profiling

### Memory Profiling:
- **Heap Snapshots**: Point-in-time memory state
- **Allocation Tracking**: Record all allocations
- **Leak Detection**: Compare snapshots over time
- **GC Analysis**: Garbage collection patterns

### I/O Profiling:
- **Syscall Tracing**: Track system calls (strace, dtrace)
- **File System**: Monitor read/write operations
- **Database**: Query logging and EXPLAIN ANALYZE
- **Network**: Packet capture and request timing

### Concurrency Profiling:
- **Thread Analysis**: CPU utilization per thread
- **Lock Contention**: Identify blocking operations
- **Async Operations**: Promise/callback timing

## Performance Optimization Strategies

### Algorithmic:
- Reduce time complexity (O(n²) → O(n log n))
- Use appropriate data structures
- Eliminate unnecessary work
- Memoization and dynamic programming

### Caching:
- In-memory caching (Redis, Memcached)
- CDN for static assets
- HTTP caching headers
- Query result caching

### Parallelization:
- Multi-threading
- Worker pools
- Async I/O
- Batching operations

### Database:
- Add missing indexes
- Optimize queries
- Reduce N+1 queries
- Connection pooling
- Read replicas

### Memory:
- Object pooling
- Reduce allocations
- Stream processing
- Compression

### Network:
- Connection keep-alive
- HTTP/2 or HTTP/3
- Compression
- Request batching
- Rate limiting

## Performance Budgets

### Frontend:
- Time to First Byte (TTFB): < 200ms
- First Contentful Paint (FCP): < 1.8s
- Largest Contentful Paint (LCP): < 2.5s
- Time to Interactive (TTI): < 3.8s
- Total Blocking Time (TBT): < 200ms
- Cumulative Layout Shift (CLS): < 0.1

### Backend:
- API Response Time (p50): < 100ms
- API Response Time (p95): < 500ms
- API Response Time (p99): < 1000ms
- Throughput: > 1000 req/s
- Error Rate: < 0.1%
- CPU Usage: < 70%
- Memory Usage: < 80%

### Database:
- Query Time (p50): < 10ms
- Query Time (p95): < 50ms
- Query Time (p99): < 100ms
- Connection Pool Utilization: < 80%

## Best Practices

1. Profile production workloads when possible
2. Use production-like data volumes
3. Profile under realistic load
4. Measure multiple times for consistency
5. Focus on p95/p99, not just averages
6. Optimize bottlenecks in order of impact
7. Always benchmark before and after
8. Monitor for regressions in CI/CD
9. Set up continuous profiling
10. Track performance over time

## Troubleshooting

### Issue: High CPU usage but no obvious hot path
**Solution**: Check for excessive small function calls, increase sampling rate, or use instrumentation

### Issue: Memory grows continuously
**Solution**: Run heap snapshot comparison to identify leak sources

### Issue: Slow database queries
**Solution**: Use EXPLAIN ANALYZE, check for missing indexes, analyze query plans

### Issue: High latency but low CPU
**Solution**: Profile I/O operations, check for blocking synchronous calls

## See Also

- PROCESS.md - Detailed step-by-step profiling workflow
- README.md - Quick start guide
- subagent-performance-profiler.md - Agent implementation details
- slash-command-profile.sh - Command-line interface
- mcp-performance-profiler.json - MCP tool schema

Related Skills

validating-performance-budgets

25
from ComeOnOliver/skillshub

Validate application performance against defined budgets to identify regressions early. Use when checking page load times, bundle sizes, or API response times against thresholds. Trigger with phrases like "validate performance budget", "check performance metrics", or "detect performance regression".

analyzing-query-performance

25
from ComeOnOliver/skillshub

This skill enables Claude to analyze and optimize database query performance. It activates when the user discusses query performance issues, provides an EXPLAIN plan, or asks for optimization recommendations. The skill leverages the query-performance-analyzer plugin to interpret EXPLAIN plans, identify performance bottlenecks (e.g., slow queries, missing indexes), and suggest specific optimization strategies. It is useful for improving database query execution speed and resource utilization.

providing-performance-optimization-advice

25
from ComeOnOliver/skillshub

Provide comprehensive prioritized performance optimization recommendations for frontend, backend, and infrastructure. Use when analyzing bottlenecks or seeking improvement strategies. Trigger with phrases like "optimize performance", "improve speed", or "performance recommendations".

profiling-application-performance

25
from ComeOnOliver/skillshub

Execute this skill enables AI assistant to profile application performance, analyzing cpu usage, memory consumption, and execution time. it is triggered when the user requests performance analysis, bottleneck identification, or optimization recommendations. the... Use when optimizing performance. Trigger with phrases like 'optimize', 'performance', or 'speed up'.

performance-testing

25
from ComeOnOliver/skillshub

This skill enables Claude to design, execute, and analyze performance tests using the performance-test-suite plugin. It is activated when the user requests load testing, stress testing, spike testing, or endurance testing, and when discussing performance metrics such as response time, throughput, and error rates. It identifies performance bottlenecks related to CPU, memory, database, or network issues. The plugin provides comprehensive reporting, including percentiles, graphs, and recommendations.

detecting-performance-regressions

25
from ComeOnOliver/skillshub

This skill enables Claude to automatically detect performance regressions in a CI/CD pipeline. It analyzes performance metrics, such as response time and throughput, and compares them against baselines or thresholds. Use this skill when the user requests to "detect performance regressions", "analyze performance metrics for regressions", or "find performance degradation" in a CI/CD environment. The skill is also triggered when the user mentions "baseline comparison", "statistical significance analysis", or "performance budget violations". It helps identify and report performance issues early in the development cycle.

performance-lighthouse-runner

25
from ComeOnOliver/skillshub

Performance Lighthouse Runner - Auto-activating skill for Frontend Development. Triggers on: performance lighthouse runner, performance lighthouse runner Part of the Frontend Development skill category.

performance-baseline-creator

25
from ComeOnOliver/skillshub

Performance Baseline Creator - Auto-activating skill for Performance Testing. Triggers on: performance baseline creator, performance baseline creator Part of the Performance Testing skill category.

optimizing-cache-performance

25
from ComeOnOliver/skillshub

Execute this skill enables AI assistant to analyze and improve application caching strategies. it optimizes cache hit rates, ttl configurations, cache key design, and invalidation strategies. use this skill when the user requests to "optimize cache performance"... Use when optimizing performance. Trigger with phrases like 'optimize', 'performance', or 'speed up'.

aggregating-performance-metrics

25
from ComeOnOliver/skillshub

This skill enables Claude to aggregate and centralize performance metrics from various sources. It is used when the user needs to consolidate metrics from applications, systems, databases, caches, queues, and external services into a central location for monitoring and analysis. The skill is triggered by requests to "aggregate metrics", "centralize performance metrics", or similar phrases related to metrics aggregation and monitoring. It facilitates designing a metrics taxonomy, choosing appropriate aggregation tools, and setting up dashboards and alerts.

memory-profiler-setup

25
from ComeOnOliver/skillshub

Memory Profiler Setup - Auto-activating skill for Performance Testing. Triggers on: memory profiler setup, memory profiler setup Part of the Performance Testing skill category.

inference-latency-profiler

25
from ComeOnOliver/skillshub

Inference Latency Profiler - Auto-activating skill for ML Deployment. Triggers on: inference latency profiler, inference latency profiler Part of the ML Deployment skill category.