prometheus-monitoring
Set up Prometheus monitoring for applications with custom metrics, scraping configurations, and service discovery. Use when implementing time-series metrics collection, monitoring applications, or building observability infrastructure.
Best use case
prometheus-monitoring is best used when you need a repeatable AI agent workflow instead of a one-off prompt.
Teams using prometheus-monitoring should expect more consistent output, faster repeated execution, and less prompt rewriting.
When to use this skill
- You want a reusable workflow that can be run more than once with consistent structure.
When not to use this skill
- You only need a quick one-off answer and do not need a reusable workflow.
- You cannot install or maintain the underlying files, dependencies, or repository context.
Installation
Claude Code / Cursor / Codex
Manual Installation
- Download SKILL.md from GitHub
- Place it in .claude/skills/prometheus-monitoring/SKILL.md inside your project
- Restart your AI agent; it will auto-discover the skill
How prometheus-monitoring Compares
| Feature / Agent | prometheus-monitoring | Standard Approach |
|---|---|---|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |
Frequently Asked Questions
What does this skill do?
Set up Prometheus monitoring for applications with custom metrics, scraping configurations, and service discovery. Use when implementing time-series metrics collection, monitoring applications, or building observability infrastructure.
Where can I find the source code?
You can find the source code on GitHub using the link provided at the top of the page.
SKILL.md Source
# Prometheus Monitoring
## Overview
Implement comprehensive Prometheus monitoring infrastructure for collecting, storing, and querying time-series metrics from applications and infrastructure.
## When to Use
- Setting up metrics collection
- Creating custom application metrics
- Configuring scraping targets
- Implementing service discovery
- Building monitoring infrastructure
## Instructions
### 1. **Prometheus Configuration**
```yaml
# /etc/prometheus/prometheus.yml
global:
  scrape_interval: 15s
  evaluation_interval: 15s
  external_labels:
    cluster: production

alerting:
  alertmanagers:
    - static_configs:
        - targets: ["localhost:9093"]

rule_files:
  - "/etc/prometheus/alert_rules.yml"

scrape_configs:
  - job_name: "prometheus"
    static_configs:
      - targets: ["localhost:9090"]

  - job_name: "node"
    static_configs:
      - targets: ["localhost:9100"]

  - job_name: "api-service"
    scrape_interval: 10s
    # Targets take host:port only; the path is configured separately
    metrics_path: /metrics
    static_configs:
      - targets: ["localhost:8080"]

  - job_name: "kubernetes-pods"
    kubernetes_sd_configs:
      - role: pod
    relabel_configs:
      # Scrape only pods annotated with prometheus.io/scrape: "true"
      - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
        action: keep
        regex: "true"
      # Use the prometheus.io/path annotation as the metrics path when it is set
      - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path]
        action: replace
        target_label: __metrics_path__
        regex: (.+)
```
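The `kubernetes-pods` job depends on relabeling: `keep` drops every discovered target whose source label does not match the regex, and `replace` copies a matched value into a target label. The Python sketch below imitates those two actions on plain dictionaries to show the effect. `apply_keep`, `apply_replace`, and the sample pods are hypothetical names for illustration, not Prometheus code, and the sketch assumes a single source label with fully anchored regex matching.

```python
import re


def apply_keep(targets, source_label, regex):
    """Keep only targets whose source label value fully matches the regex."""
    pattern = re.compile(regex)
    return [t for t in targets if pattern.fullmatch(t.get(source_label, ""))]


def apply_replace(targets, source_label, target_label, regex):
    """Copy the first capture group into target_label when the regex matches."""
    pattern = re.compile(regex)
    result = []
    for t in targets:
        updated = dict(t)  # leave the input untouched
        match = pattern.fullmatch(t.get(source_label, ""))
        if match:
            updated[target_label] = match.group(1)
        result.append(updated)
    return result


SCRAPE = "__meta_kubernetes_pod_annotation_prometheus_io_scrape"
PATH = "__meta_kubernetes_pod_annotation_prometheus_io_path"

# Made-up discovery results for three pods
pods = [
    {"__address__": "10.0.0.1:8080", SCRAPE: "true", PATH: "/internal/metrics"},
    {"__address__": "10.0.0.2:8080", SCRAPE: "true"},
    {"__address__": "10.0.0.3:8080", SCRAPE: "false"},
]

kept = apply_keep(pods, SCRAPE, "true")
relabeled = apply_replace(kept, PATH, "__metrics_path__", "(.+)")

for target in relabeled:
    # Targets without the annotation fall back to the default path
    print(target["__address__"], target.get("__metrics_path__", "/metrics"))
```

The third pod is dropped by `keep`, the first gets its annotated path, and the second keeps the default because `(.+)` does not match an empty value.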
### 2. **Node.js Metrics Implementation**
```javascript
// metrics.js
const express = require("express");
const promClient = require("prom-client");

// Dedicated registry so that only the metrics defined here are exposed
const register = new promClient.Registry();
promClient.collectDefaultMetrics({ register });

const httpRequestDuration = new promClient.Histogram({
  name: "http_request_duration_seconds",
  help: "HTTP request duration in seconds",
  labelNames: ["method", "route", "status_code"],
  buckets: [0.1, 0.5, 1, 2, 5],
  registers: [register],
});

const requestsTotal = new promClient.Counter({
  name: "requests_total",
  help: "Total requests",
  labelNames: ["method", "route", "status_code"],
  registers: [register],
});

const app = express();

// Registered before the timing middleware, so scrapes are not measured themselves
app.get("/metrics", async (req, res) => {
  res.set("Content-Type", register.contentType);
  // register.metrics() returns a Promise in current prom-client releases
  res.end(await register.metrics());
});

// Express middleware: record duration and count once the response has been sent
app.use((req, res, next) => {
  const start = Date.now();
  res.on("finish", () => {
    const duration = (Date.now() - start) / 1000;
    // Use the matched route template rather than the raw path to bound label cardinality
    const route = req.route ? req.route.path : "unmatched";
    httpRequestDuration.labels(req.method, route, res.statusCode).observe(duration);
    requestsTotal.labels(req.method, route, res.statusCode).inc();
  });
  next();
});

module.exports = { app, register, httpRequestDuration, requestsTotal };
```
### 3. **Python Prometheus Integration**
```python
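import time

from flask import Flask, g, request
from prometheus_client import Counter, Histogram, start_http_server

app = Flask(__name__)

# status_code is included so the 5xx filter in the alert rules has a label to match
request_count = Counter(
    "requests_total", "Total requests", ["method", "endpoint", "status_code"]
)
request_duration = Histogram(
    "request_duration_seconds", "Request duration in seconds", ["method", "endpoint"]
)

@app.before_request
def before():
    # Store the start time on the per-request context
    g.start_time = time.perf_counter()

@app.after_request
def after(response):
    duration = time.perf_counter() - g.start_time
    # Use the matched URL rule rather than the raw path to bound label cardinality
    endpoint = request.url_rule.rule if request.url_rule else "unmatched"
    request_count.labels(request.method, endpoint, str(response.status_code)).inc()
    request_duration.labels(request.method, endpoint).observe(duration)
    return response

if __name__ == "__main__":
    # Metrics are served on :8000/metrics, separate from the app on :5000
    start_http_server(8000)
    app.run(port=5000)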
```
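A Prometheus histogram is exposed as cumulative bucket counters: each observation increments every bucket whose upper bound (`le`) is at least the observed value, including the implicit `+Inf` bucket, and also updates a running sum and count. The pure-Python sketch below shows that bookkeeping. `SketchHistogram` is a hypothetical class that only mirrors the exposed series; it is not how `prometheus_client` is implemented.

```python
import math


class SketchHistogram:
    """Toy model of the series a Prometheus histogram exposes."""

    def __init__(self, buckets):
        # Upper bounds, with the implicit +Inf bucket appended
        self.bounds = sorted(buckets) + [math.inf]
        self.bucket_counts = [0] * len(self.bounds)
        self.total = 0.0  # corresponds to the _sum series
        self.count = 0    # corresponds to the _count series

    def observe(self, value):
        # Buckets are cumulative: bump every bucket with le >= value
        for i, bound in enumerate(self.bounds):
            if value <= bound:
                self.bucket_counts[i] += 1
        self.total += value
        self.count += 1

    def samples(self):
        """Return (le, cumulative_count) pairs, one per _bucket series."""
        return list(zip(self.bounds, self.bucket_counts))


h = SketchHistogram([0.1, 0.5, 1, 2, 5])
for duration in [0.05, 0.3, 0.3, 1.5, 7.0]:
    h.observe(duration)

for le, cumulative in h.samples():
    print(f"le={le}: {cumulative}")
```

Because the counts are cumulative, the `+Inf` bucket always equals the total number of observations, which is why quantile queries can work from the bucket series alone.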
### 4. **Alert Rules**
```yaml
# /etc/prometheus/alert_rules.yml
groups:
  - name: application
    rules:
      - alert: HighErrorRate
        # Share of requests answered with a 5xx status over the last 5 minutes
        expr: sum(rate(requests_total{status_code=~"5.."}[5m])) / sum(rate(requests_total[5m])) > 0.05
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "High error rate: {{ $value }}"

      - alert: HighLatency
        # histogram_quantile needs per-bucket rates aggregated by le
        expr: histogram_quantile(0.95, sum by (le) (rate(request_duration_seconds_bucket[5m]))) > 1
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "p95 latency: {{ $value }}s"

      - alert: HighMemoryUsage
        expr: node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes < 0.1
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Low memory: {{ $value }}"
```
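A latency alert built on `histogram_quantile` works from cumulative bucket counts: the function locates the bucket that contains the target rank and interpolates linearly inside it. The Python sketch below follows that idea under simplifying assumptions (classic buckets, a lower bound of 0 for the first bucket, and the highest finite bound returned when the rank falls into `+Inf`). `estimate_quantile` and the sample counts are hypothetical and are not the Prometheus implementation.

```python
import math


def estimate_quantile(q, buckets):
    """Estimate quantile q from (upper_bound, cumulative_count) pairs.

    buckets must be sorted by upper bound and end with the +Inf bucket.
    """
    if not 0 < q < 1:
        raise ValueError("this sketch only handles 0 < q < 1")
    total = buckets[-1][1]
    if total == 0:
        return math.nan

    rank = q * total
    # Index of the first bucket whose cumulative count reaches the rank
    index = next(i for i, (_, cumulative) in enumerate(buckets) if cumulative >= rank)
    upper, cumulative = buckets[index]

    if math.isinf(upper):
        # No upper edge to interpolate towards: fall back to the last finite bound
        return buckets[-2][0]

    lower = buckets[index - 1][0] if index > 0 else 0.0
    below = buckets[index - 1][1] if index > 0 else 0
    in_bucket = cumulative - below
    # Assume observations are spread evenly inside the bucket
    return lower + (upper - lower) * (rank - below) / in_bucket


# Made-up cumulative counts for buckets 0.1, 0.5, 1, 2, 5, +Inf
sample = [(0.1, 10), (0.5, 60), (1, 90), (2, 98), (5, 100), (math.inf, 100)]
p95 = estimate_quantile(0.95, sample)
p50 = estimate_quantile(0.5, sample)
print(p95, p50)
```

The estimate can never be more precise than the bucket layout, so bucket bounds are worth placing close to the thresholds used in alerts.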
### 5. **Docker Compose Setup**
```yaml
version: "3.8"
services:
  prometheus:
    image: prom/prometheus:latest
    ports:
      - "9090:9090"
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml
      - ./alert_rules.yml:/etc/prometheus/alert_rules.yml
      - prometheus_data:/prometheus
    command:
      - "--config.file=/etc/prometheus/prometheus.yml"
      - "--storage.tsdb.path=/prometheus"
      - "--storage.tsdb.retention.time=30d"

  # From inside the prometheus container this exporter is reachable as
  # node-exporter:9100, not localhost:9100
  node-exporter:
    image: prom/node-exporter:latest
    ports:
      - "9100:9100"

volumes:
  prometheus_data:
```
## Best Practices
### ✅ DO
- Use consistent metric naming conventions
- Add labels that support filtering and aggregation, keeping their value sets bounded
- Set appropriate scrape intervals (10-60s)
- Implement retention policies
- Monitor Prometheus itself
- Test alert rules before deployment
- Document metric meanings
### ❌ DON'T
- Add labels with unbounded cardinality (user IDs, raw URL paths, timestamps)
- Scrape too frequently (< 10s)
- Ignore metric naming conventions
- Create alerts without runbooks
- Store raw event data in Prometheus
- Use counters for gauge-like values
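The cardinality warning can be made concrete: every distinct combination of label values is a separate time series, so a single label with unbounded values (a raw URL path containing IDs, for example) multiplies the series count. A rough Python sketch of that arithmetic follows, using made-up request data and a hypothetical `series_count` helper.

```python
def series_count(samples):
    """Count distinct label combinations, i.e. distinct time series."""
    return len({tuple(sorted(labels.items())) for labels in samples})


# Hypothetical traffic: 1000 users requesting their own profile page
raw_paths = [
    {"method": "GET", "route": f"/users/{user_id}", "status_code": "200"}
    for user_id in range(1000)
]
# The same traffic labeled with the route template instead of the raw path
templated = [
    {"method": "GET", "route": "/users/:id", "status_code": "200"}
    for _ in range(1000)
]

raw_series = series_count(raw_paths)
templated_series = series_count(templated)
print(raw_series, templated_series)
```

For a histogram the effect is larger still, since each label combination is stored once per bucket plus the sum and count series.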
## Key Prometheus Queries
```promql
sum(rate(requests_total[5m]))                                                      # Request rate (req/s)
histogram_quantile(0.95, sum by (le) (rate(request_duration_seconds_bucket[5m])))  # p95 latency
sum(rate(requests_total{status_code=~"5.."}[5m]))                                  # 5xx error rate (req/s)
```