error-diagnostics-error-trace

You are an error tracking and observability expert specializing in implementing comprehensive error monitoring solutions. Set up error tracking systems, configure alerts, and implement structured logging.

31,392 stars

by sickn33

Complexity: easy

View on GitHub

About this skill

This skill has the AI agent act as an error tracking and observability expert. It gives guidance, best practices, and actionable advice on setting up error tracking systems, configuring alerts, and implementing structured logging, so that development teams can quickly identify and resolve production issues.

Best use case

Users need expert advice or guidance on establishing or improving error monitoring systems, configuring alerts, implementing structured logging, or understanding observability best practices to resolve production issues.

Users receive best practices, actionable checklists, and guidance on error tracking, monitoring, structured logging, and observability, so they can make informed decisions and implement effective solutions.

Practical example

Example input

As the error tracking expert, what are the essential steps to set up a comprehensive error monitoring system for a new microservices application using modern best practices?

Example output

To set up a comprehensive error monitoring system for a new microservices application, follow these essential steps:

1. **Choose a Monitoring Stack:** Select tools like Prometheus/Grafana, ELK Stack (Elasticsearch, Logstash, Kibana), Datadog, New Relic, Sentry, or Honeycomb. Consider your budget, existing infrastructure, and team's familiarity.
2. **Implement Structured Logging:** Ensure all services emit logs in a structured format (e.g., JSON). Include critical metadata like `timestamp`, `service_name`, `environment`, `log_level`, `request_id`, `user_id`, `error_code`, `stack_trace` (if an error), etc. This makes logs searchable and analyzable.
3. **Centralized Log Aggregation:** Use a log aggregator (e.g., Logstash, Fluentd, Vector) to collect logs from all microservices and forward them to your chosen monitoring stack's central storage (e.g., Elasticsearch, S3, cloud logging services).
4. **Error Tracking Integration:** Integrate a dedicated error tracking service (e.g., Sentry, Bugsnag, Rollbar) into each microservice. This captures unhandled exceptions, provides stack traces, and aggregates similar errors.
5. **Metrics Collection:** Collect key performance indicators (KPIs) and operational metrics from each service, including request rates, error rates, latency, CPU/memory usage, and database query times. Tools like Prometheus or a cloud-native monitoring service are well suited to this.
6. **Distributed Tracing:** Implement distributed tracing (e.g., OpenTelemetry, Jaeger, Zipkin) to visualize the flow of requests across multiple microservices. This is crucial for diagnosing latency issues and understanding service dependencies.
7. **Alerting & Notifications:** Configure intelligent alerts based on critical metrics and error thresholds. Examples: 'high error rate on Service X', 'latency spike on API Y', 'database connection errors'. Route notifications to appropriate teams via Slack, PagerDuty, email, etc.
8. **Dashboards & Visualization:** Create intuitive dashboards (e.g., Grafana, Kibana, Datadog) that provide a holistic view of your system's health, key metrics, error trends, and service dependencies.
9. **Runbooks & Playbooks:** Document clear runbooks for common alerts and incidents, outlining steps for diagnosis and resolution.
10. **Regular Review & Iteration:** Continuously review your monitoring setup. Are alerts effective? Are dashboards providing useful insights? Adjust based on incidents and evolving application needs.
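
As an illustration of step 2, here is a minimal sketch of structured JSON logging using only the Python standard library. The `JsonFormatter` class, the `build_logger` helper, and the example service name and field values are assumptions made for this sketch, not part of any particular monitoring stack; real services would typically add further fields such as `user_id`.

```python
import json
import logging
from datetime import datetime, timezone


class JsonFormatter(logging.Formatter):
    """Render each log record as one JSON object per line (hypothetical helper)."""

    def __init__(self, service_name: str, environment: str) -> None:
        super().__init__()
        self.service_name = service_name
        self.environment = environment

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "timestamp": datetime.fromtimestamp(record.created, tz=timezone.utc).isoformat(),
            "service_name": self.service_name,
            "environment": self.environment,
            "log_level": record.levelname,
            "message": record.getMessage(),
            # Optional fields supplied via `extra=`; missing ones are emitted as null.
            "request_id": getattr(record, "request_id", None),
            "error_code": getattr(record, "error_code", None),
        }
        if record.exc_info:
            # Only present when the record carries an exception.
            payload["stack_trace"] = self.formatException(record.exc_info)
        return json.dumps(payload)


def build_logger(service_name: str, environment: str, stream=None) -> logging.Logger:
    """Return a logger that writes JSON lines to `stream` (stderr by default)."""
    logger = logging.getLogger(service_name)
    logger.setLevel(logging.INFO)
    logger.handlers.clear()  # avoid duplicate handlers on repeated calls
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter(service_name, environment))
    logger.addHandler(handler)
    logger.propagate = False
    return logger


if __name__ == "__main__":
    log = build_logger("checkout", "staging")  # assumed example values
    try:
        1 / 0
    except ZeroDivisionError:
        log.exception(
            "payment failed",
            extra={"request_id": "req-123", "error_code": "E_PAYMENT"},
        )
```

Because every record is a single JSON line with consistent keys, a log aggregator can index fields such as `request_id` without custom parsing rules.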

By following these steps, you'll establish a robust error monitoring system capable of providing deep insights into your microservices application's health and performance.
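
The "high error rate on Service X" alert from step 7 could be sketched roughly as a sliding-window check. This is an illustrative sketch only: the `ErrorRateAlert` class, its default threshold of 5%, the 60-second window, and the minimum sample size are all assumptions, and it expects timestamps to arrive in non-decreasing order. In practice this logic usually lives in the monitoring tool's alert rules and not in application code.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class ErrorRateAlert:
    """Signal when the error rate over a recent time window exceeds a threshold."""

    window_seconds: float = 60.0  # assumed window length
    threshold: float = 0.05       # assumed: alert above 5% failing requests
    min_requests: int = 20        # assumed: skip alerts on very small samples
    _events: deque = field(default_factory=deque)  # (timestamp, is_error) pairs

    def record(self, timestamp: float, is_error: bool) -> None:
        """Store one request outcome and drop events older than the window."""
        self._events.append((timestamp, is_error))
        self._evict(timestamp)

    def _evict(self, now: float) -> None:
        # Events are assumed to be appended in timestamp order.
        while self._events and self._events[0][0] <= now - self.window_seconds:
            self._events.popleft()

    def error_rate(self, now: float) -> float:
        """Fraction of requests in the current window that failed."""
        self._evict(now)
        if not self._events:
            return 0.0
        errors = sum(1 for _, is_error in self._events if is_error)
        return errors / len(self._events)

    def should_alert(self, now: float) -> bool:
        """True when there are enough samples and the error rate is too high."""
        self._evict(now)
        enough_samples = len(self._events) >= self.min_requests
        return enough_samples and self.error_rate(now) > self.threshold
```

The `min_requests` guard reflects a common design choice: a single failed request on a quiet service should not page anyone, so the check waits for a meaningful sample before comparing against the threshold.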

When to use this skill

When working on error tracking and monitoring tasks or workflows.
When needing guidance, best practices, or checklists for error tracking and monitoring.
When seeking expert advice on system observability and reliability.
When designing or refining error management strategies for software applications.

When not to use this skill

This skill is not designed for direct execution of code, interaction with external APIs to set up monitoring tools, or real-time diagnostics on live systems. Its purpose is advisory and informational, not for performing direct system operations.

Installation

Claude Code / Cursor / Codex

$ curl -o ~/.claude/skills/error-diagnostics-error-trace/SKILL.md --create-dirs "https://raw.githubusercontent.com/sickn33/antigravity-awesome-skills/main/plugins/antigravity-awesome-skills-claude/skills/error-diagnostics-error-trace/SKILL.md"

Manual Installation

1. Download SKILL.md from GitHub.
2. Place it in .claude/skills/error-diagnostics-error-trace/SKILL.md inside your project.
3. Restart your AI agent; it will auto-discover the skill.

How error-diagnostics-error-trace Compares

| Feature / Agent | error-diagnostics-error-trace | Standard Approach |
| --- | --- | --- |
| Platform Support | Claude | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | easy | N/A |

Frequently Asked Questions

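What does this skill do?

It has the agent act as an error tracking and observability expert, advising on setting up error tracking systems, configuring alerts, and implementing structured logging.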

Which AI agents support this skill?

This skill is designed for Claude.

How difficult is it to install?

The installation complexity is rated as easy. You can find the installation instructions above.

Where can I find the source code?

You can find the source code on GitHub using the link provided at the top of the page.

SKILL.md Source

# Error Tracking and Monitoring

## Use this skill when

- Working on error tracking and monitoring tasks or workflows
- Needing guidance, best practices, or checklists for error tracking and monitoring

## Do not use this skill when

- The task is unrelated to error tracking and monitoring
- You need a different domain or tool outside this scope

## Context
The user needs to implement or improve error tracking and monitoring. Focus on real-time error detection, meaningful alerts, error grouping, performance monitoring, and integration with popular error tracking services.

## Requirements
$ARGUMENTS

## Instructions

- Clarify goals, constraints, and required inputs.
- Apply relevant best practices and validate outcomes.
- Provide actionable steps and verification.
- If detailed examples are required, open `resources/implementation-playbook.md`.

## Output Format
