tracking-service-reliability

Define and track SLAs, SLIs, and SLOs for service reliability including availability, latency, and error rates. Use when establishing reliability targets or monitoring service health. Trigger with phrases like "define SLOs", "track SLI metrics", or "calculate error budget".

25 stars

Best use case

tracking-service-reliability is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Teams using tracking-service-reliability can expect more consistent output, faster repeated execution, and less prompt rewriting.

When to use this skill

  • You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

  • You only need a quick one-off answer and do not need a reusable workflow.
  • You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$ curl -o ~/.claude/skills/tracking-service-reliability/SKILL.md --create-dirs "https://raw.githubusercontent.com/ComeOnOliver/skillshub/main/skills/jeremylongshore/claude-code-plugins-plus-skills/tracking-service-reliability/SKILL.md"

Manual Installation

  1. Download SKILL.md from GitHub
  2. Place it in .claude/skills/tracking-service-reliability/SKILL.md inside your project
  3. Restart your AI agent — it will auto-discover the skill

How tracking-service-reliability Compares

| Feature / Agent | tracking-service-reliability | Standard Approach |
| --- | --- | --- |
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |

Frequently Asked Questions

What does this skill do?

Define and track SLAs, SLIs, and SLOs for service reliability including availability, latency, and error rates. Use when establishing reliability targets or monitoring service health. Trigger with phrases like "define SLOs", "track SLI metrics", or "calculate error budget".

Where can I find the source code?

You can find the source code on GitHub using the link provided at the top of the page.

SKILL.md Source

# SLA SLI Tracker

Define and track SLAs, SLIs, and SLOs for service reliability including availability targets, latency budgets, error rate thresholds, and error budget burn rates.

## Overview

This skill provides a structured approach to defining and tracking SLAs, SLIs, and SLOs, which are essential for ensuring service reliability. It automates the process of setting performance targets and monitoring actual performance, enabling proactive identification and resolution of potential issues.

## How It Works

1. **SLI Definition**: The skill guides the user to define Service Level Indicators (SLIs) such as availability, latency, error rate, and throughput.
2. **SLO Target Setting**: The skill assists in setting Service Level Objectives (SLOs) by establishing target values for the defined SLIs (e.g., 99.9% availability).
3. **SLA Establishment**: The skill helps in formalizing Service Level Agreements (SLAs), which are customer-facing commitments based on the defined SLOs.
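
The relationship between the three terms can be sketched as a small data model. This is an illustrative sketch only: the class names, fields, and the 99.5% committed value are assumptions, not part of the skill. It also assumes the common practice of committing to customers a value no stricter than the internal objective.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SLI:
    """What is measured, e.g. the percentage of successful requests."""
    name: str
    unit: str


@dataclass(frozen=True)
class SLO:
    """Internal target for an SLI over a rolling window."""
    sli: SLI
    target: float       # e.g. 99.9 for 99.9% availability
    window_days: int


@dataclass(frozen=True)
class SLA:
    """Customer-facing commitment based on an SLO."""
    slo: SLO
    committed: float    # value promised to customers (hypothetical field)

    def has_safety_margin(self) -> bool:
        # Assumes a "higher is better" SLI such as availability:
        # the promise to customers should not be stricter than the SLO.
        return self.committed <= self.slo.target


availability = SLI(name="availability", unit="percent")
slo = SLO(sli=availability, target=99.9, window_days=30)
sla = SLA(slo=slo, committed=99.5)
print(sla.has_safety_margin())
```

Keeping the SLA as a wrapper around the SLO makes the dependency explicit: changing the objective forces a review of the commitment built on it.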

## When to Use This Skill

This skill activates when you need to:
- Define SLAs, SLIs, and SLOs for a service.
- Track service performance against defined objectives.
- Calculate error budgets based on SLOs.

## Examples

### Example 1: Defining SLOs for a New Service

User request: "Create SLOs for our new payment processing service."

The skill will:
1. Prompt the user to define SLIs (e.g., latency, error rate).
2. Assist in setting target values for each SLI (e.g., p99 latency < 100ms, error rate < 0.01%).
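
As a rough illustration of checking the example targets above, the following sketch computes a nearest-rank p99 and an error rate from raw samples. The function names and sample data are hypothetical; a real setup would more likely read these values from a monitoring system.

```python
import math


def percentile(values, pct):
    """Nearest-rank percentile: smallest sample with at least pct% of samples at or below it."""
    if not values:
        raise ValueError("no samples")
    ordered = sorted(values)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


def meets_targets(latencies_ms, errors, total,
                  p99_target_ms=100.0, error_rate_target_pct=0.01):
    """Compare measured p99 latency and error rate with the example targets."""
    p99 = percentile(latencies_ms, 99)
    error_rate_pct = 100.0 * errors / total
    return {
        "p99_ms": p99,
        "error_rate_pct": error_rate_pct,
        "latency_ok": p99 < p99_target_ms,
        "errors_ok": error_rate_pct < error_rate_target_pct,
    }


# Hypothetical sample: 100 requests, 98 fast and 2 slower ones
latencies = [40.0] * 98 + [80.0, 250.0]
report = meets_targets(latencies, errors=0, total=100)
print(report)
```

Note that an error rate target of 0.01% allows only one failure per 10,000 requests, so small samples cannot meaningfully confirm it.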

### Example 2: Tracking Availability

User request: "Track the availability SLI for the database service."

The skill will:
1. Guide the user in setting up the tracking of the availability SLI.
2. Visualize availability performance against the defined SLO.
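
A minimal sketch of the underlying calculation, assuming availability is measured as the share of successful events. The function names and the event counts are hypothetical, and treating a window with no traffic as fully available is an assumption that may not suit every service.

```python
def availability_pct(good_events: int, total_events: int) -> float:
    """Availability as the percentage of events that succeeded."""
    if total_events == 0:
        return 100.0  # assumption: no traffic counts as fully available
    return 100.0 * good_events / total_events


def compare_to_slo(good_events: int, total_events: int, slo_target_pct: float):
    """Return the measured availability and whether it meets the SLO."""
    measured = availability_pct(good_events, total_events)
    return measured, measured >= slo_target_pct


measured, ok = compare_to_slo(good_events=99_950, total_events=100_000,
                              slo_target_pct=99.9)
print(f"availability {measured:.2f}% (SLO met: {ok})")
```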

## Best Practices

- **Granularity**: Define SLIs that are specific and measurable.
- **Realism**: Set SLOs that are challenging but achievable.
- **Alignment**: Ensure SLAs align with the defined SLOs and business requirements.

## Integration

This skill can be integrated with monitoring tools to automatically collect SLI data and track performance against SLOs. It can also be used in conjunction with alerting systems to trigger notifications when SLO violations occur.
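
One way an alerting integration might decide when to notify is a burn-rate check: the observed error ratio divided by the error ratio the SLO allows. The sketch below is illustrative only. The function names are hypothetical, and the default threshold of 14.4 is an assumption borrowed from commonly cited fast-burn alerting guidance, not a value defined by this skill.

```python
def burn_rate(bad_events: int, total_events: int, slo_target_pct: float) -> float:
    """How fast the error budget is being consumed; 1.0 means exactly on budget."""
    if total_events == 0:
        return 0.0
    observed_error_ratio = bad_events / total_events
    allowed_error_ratio = (100.0 - slo_target_pct) / 100.0
    return observed_error_ratio / allowed_error_ratio


def should_alert(bad_events: int, total_events: int, slo_target_pct: float,
                 threshold: float = 14.4) -> bool:
    """Fire when the budget is burning at or above the threshold (assumed default)."""
    return burn_rate(bad_events, total_events, slo_target_pct) >= threshold


# Hypothetical one-hour window: 20 failures out of 1,000 requests, 99.9% SLO
print(should_alert(bad_events=20, total_events=1000, slo_target_pct=99.9))
```

Alerting on burn rate rather than on every SLO miss keeps short, harmless blips from paging anyone while still catching sustained budget consumption.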

## Prerequisites

- SLI definitions stored in ${CLAUDE_SKILL_DIR}/slos/sli-definitions.yaml
- Access to monitoring and metrics systems
- Historical performance data for baseline
- Business requirements for service reliability

## Instructions

1. Define Service Level Indicators (availability, latency, error rate, throughput)
2. Set Service Level Objectives with target values (e.g., 99.9% availability)
3. Formalize Service Level Agreements with customer commitments
4. Configure automated SLI data collection
5. Calculate error budgets based on SLOs
6. Track performance and alert on SLO violations
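
Step 5 can be made concrete with a short calculation. This sketch assumes a time-based availability SLO over a rolling window; the function names and the 30-day default are assumptions for illustration.

```python
def error_budget_minutes(slo_target_pct: float, window_days: int = 30) -> float:
    """Minutes of allowed unavailability in the window for a given SLO."""
    if not 0.0 < slo_target_pct < 100.0:
        raise ValueError("SLO target must be between 0 and 100 (exclusive)")
    window_minutes = window_days * 24 * 60
    return window_minutes * (100.0 - slo_target_pct) / 100.0


def budget_remaining_pct(slo_target_pct: float, bad_minutes: float,
                         window_days: int = 30) -> float:
    """Share of the error budget still unspent; negative once the SLO is violated."""
    budget = error_budget_minutes(slo_target_pct, window_days)
    return 100.0 * (budget - bad_minutes) / budget


# A 99.9% target over 30 days allows roughly 43.2 minutes of downtime
print(round(error_budget_minutes(99.9), 1))
print(round(budget_remaining_pct(99.9, bad_minutes=21.6), 1))
```

A remaining budget that goes negative is the signal step 6 would alert on.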

## Output

- SLI/SLO/SLA definition documents
- Real-time SLI metric dashboards
- Error budget calculations and burn rate
- SLO compliance reports
- Alerting configurations for violations

## Error Handling

If SLI/SLO tracking fails:
- Verify SLI definition completeness
- Check metric collection infrastructure
- Validate data accuracy and granularity
- Ensure alerting system connectivity
- Review error budget calculation logic
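
The first check, verifying that an SLI definition is complete, might look like the sketch below. The required field names are a hypothetical schema chosen for illustration; the source does not specify the structure of `sli-definitions.yaml`. The definition is shown as a plain dict, as it would appear after the file has been parsed.

```python
from typing import List

# Hypothetical schema: these field names are assumptions, not the skill's format
REQUIRED_FIELDS = ("name", "metric", "target", "window_days")


def find_problems(definition: dict) -> List[str]:
    """Return human-readable problems found in one SLI definition."""
    problems = [f"missing field: {field}"
                for field in REQUIRED_FIELDS if field not in definition]
    window = definition.get("window_days")
    if window is not None and (not isinstance(window, int) or window <= 0):
        problems.append("window_days must be a positive integer")
    return problems


complete = {"name": "availability", "metric": "http_success_ratio",
            "target": 99.9, "window_days": 30}
incomplete = {"name": "latency"}

print(find_problems(complete))
print(find_problems(incomplete))
```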

## Resources

- Google SRE book on SLIs and SLOs
- Error budget implementation guides
- Service reliability engineering practices
- SLO definition templates and examples
