collecting-infrastructure-metrics
This skill enables Claude to collect comprehensive infrastructure performance metrics across compute, storage, network, containers, load balancers, and databases. It is triggered when the user requests "collect infrastructure metrics", "monitor server performance", "set up performance dashboards", or needs to analyze system resource utilization. The skill configures metrics collection, sets up aggregation, and helps create infrastructure dashboards for health monitoring and capacity tracking. It supports configuration for Prometheus, Datadog, and CloudWatch.
Best use case
collecting-infrastructure-metrics is best used when you need a repeatable AI agent workflow instead of a one-off prompt.
This skill enables Claude to collect comprehensive infrastructure performance metrics across compute, storage, network, containers, load balancers, and databases. It is triggered when the user requests "collect infrastructure metrics", "monitor server performance", "set up performance dashboards", or needs to analyze system resource utilization. The skill configures metrics collection, sets up aggregation, and helps create infrastructure dashboards for health monitoring and capacity tracking. It supports configuration for Prometheus, Datadog, and CloudWatch.
Teams using collecting-infrastructure-metrics should expect a more consistent output, faster repeated execution, less prompt rewriting.
When to use this skill
- You want a reusable workflow that can be run more than once with consistent structure.
When not to use this skill
- You only need a quick one-off answer and do not need a reusable workflow.
- You cannot install or maintain the underlying files, dependencies, or repository context.
Installation
Claude Code / Cursor / Codex
Manual Installation
- Download SKILL.md from GitHub
- Place it in
.claude/skills/infrastructure-metrics-collector/SKILL.mdinside your project - Restart your AI agent — it will auto-discover the skill
How collecting-infrastructure-metrics Compares
| Feature / Agent | collecting-infrastructure-metrics | Standard Approach |
|---|---|---|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |
Frequently Asked Questions
What does this skill do?
This skill enables Claude to collect comprehensive infrastructure performance metrics across compute, storage, network, containers, load balancers, and databases. It is triggered when the user requests "collect infrastructure metrics", "monitor server performance", "set up performance dashboards", or needs to analyze system resource utilization. The skill configures metrics collection, sets up aggregation, and helps create infrastructure dashboards for health monitoring and capacity tracking. It supports configuration for Prometheus, Datadog, and CloudWatch.
Where can I find the source code?
You can find the source code on GitHub using the link provided at the top of the page.
SKILL.md Source
## Overview This skill automates the process of setting up infrastructure metrics collection. It identifies key performance indicators (KPIs) across various infrastructure layers, configures agents to collect these metrics, and assists in setting up central aggregation and visualization. ## How It Works 1. **Identify Infrastructure Layers**: Determines the infrastructure layers to monitor (compute, storage, network, containers, load balancers, databases). 2. **Configure Metrics Collection**: Sets up agents (Prometheus, Datadog, CloudWatch) to collect metrics from the identified layers. 3. **Aggregate Metrics**: Configures central aggregation of the collected metrics for analysis and visualization. 4. **Create Dashboards**: Generates infrastructure dashboards for health monitoring, performance analysis, and capacity tracking. ## When to Use This Skill This skill activates when you need to: - Monitor the performance of your infrastructure. - Identify bottlenecks in your system. - Set up dashboards for real-time monitoring. ## Examples ### Example 1: Setting up basic monitoring User request: "Collect infrastructure metrics for my web server." The skill will: 1. Identify compute, storage, and network layers relevant to the web server. 2. Configure Prometheus to collect CPU, memory, disk I/O, and network bandwidth metrics. ### Example 2: Troubleshooting database performance User request: "I'm seeing slow database queries. Can you help me monitor the database performance?" The skill will: 1. Identify the database layer and relevant metrics such as connection pool usage, replication lag, and cache hit rates. 2. Configure Datadog to collect these metrics and create a dashboard to visualize performance trends. ## Best Practices - **Agent Selection**: Choose the appropriate agent (Prometheus, Datadog, CloudWatch) based on your existing infrastructure and monitoring tools. - **Metric Granularity**: Balance the granularity of metrics collection with the storage and processing overhead. Collect only the essential metrics for your use case. - **Alerting**: Configure alerts based on thresholds for key metrics to proactively identify and address performance issues. ## Integration This skill can be integrated with other Claude Code plugins for deployment, configuration management, and alerting to provide a comprehensive infrastructure management solution. For example, it can be used with a deployment plugin to automatically configure metrics collection after deploying new infrastructure.
Related Skills
model-evaluation-metrics
Model Evaluation Metrics - Auto-activating skill for ML Training. Triggers on: model evaluation metrics, model evaluation metrics Part of the ML Training skill category.
aggregating-performance-metrics
This skill enables Claude to aggregate and centralize performance metrics from various sources. It is used when the user needs to consolidate metrics from applications, systems, databases, caches, queues, and external services into a central location for monitoring and analysis. The skill is triggered by requests to "aggregate metrics", "centralize performance metrics", or similar phrases related to metrics aggregation and monitoring. It facilitates designing a metrics taxonomy, choosing appropriate aggregation tools, and setting up dashboards and alerts.
detecting-infrastructure-drift
This skill enables Claude to detect infrastructure drift from a desired state. It uses the `drift-detect` command to identify discrepancies between the current infrastructure configuration and the intended configuration, as defined in infrastructure-as-code tools like Terraform. Use this skill when the user asks to check for infrastructure drift, identify configuration changes, or ensure that the current infrastructure matches the desired state. It is particularly useful in DevOps workflows for maintaining infrastructure consistency and preventing configuration errors. Trigger this skill when the user mentions "drift detection," "infrastructure changes," "configuration drift," or requests a "drift report."
generating-infrastructure-as-code
This skill enables Claude to generate Infrastructure as Code (IaC) configurations. It uses the infrastructure-as-code-generator plugin to create production-ready IaC for Terraform, CloudFormation, Pulumi, ARM Templates, and CDK. Use this skill when the user requests IaC configurations for cloud infrastructure, specifying the platform (e.g., Terraform, CloudFormation) and cloud provider (e.g., AWS, Azure, GCP), or when the user needs help automating infrastructure deployment. Trigger terms include: "generate IaC", "create Terraform", "CloudFormation template", "Pulumi program", "infrastructure code".
checking-infrastructure-compliance
Execute use when you need to work with compliance checking. This skill provides compliance monitoring and validation with comprehensive guidance and automation. Trigger with phrases like "check compliance", "validate policies", or "audit compliance".
import-infrastructure-as-code
Import existing Azure resources into Terraform using Azure CLI discovery and Azure Verified Modules (AVM). Use when asked to reverse-engineer live Azure infrastructure, generate Infrastructure as Code from existing subscriptions/resource groups/resource IDs, map dependencies, derive exact import addresses from downloaded module source, prevent configuration drift, and produce AVM-based Terraform files ready for validation and planning across any Azure resource type.
copilot-usage-metrics
Retrieve and display GitHub Copilot usage metrics for organizations and enterprises using the GitHub CLI and REST API.
saas-metrics-coach
SaaS financial health advisor. Use when a user shares revenue or customer numbers, or mentions ARR, MRR, churn, LTV, CAC, NRR, or asks how their SaaS business is doing.
terraform-infrastructure
Terraform infrastructure as code workflow for provisioning cloud resources, creating reusable modules, and managing infrastructure at scale.
startup-metrics-framework
This skill should be used when the user asks about "key startup metrics", "SaaS metrics", "CAC and LTV", "unit economics", "burn multiple", "rule of 40", "marketplace metrics", or requests guidance on tracking and optimizing business performance metrics.
risk-metrics-calculation
Calculate portfolio risk metrics including VaR, CVaR, Sharpe, Sortino, and drawdown analysis. Use when measuring portfolio risk, implementing risk limits, or building risk monitoring systems.
infrastructure-reporting
Generate comprehensive network infrastructure reports including health status, performance analysis, security audits, and capacity planning recommendations.