Gatus — Lightweight Health Check Dashboard

## Overview

25 stars

Best use case

Gatus — Lightweight Health Check Dashboard is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

## Overview

Teams using Gatus — Lightweight Health Check Dashboard should expect a more consistent output, faster repeated execution, less prompt rewriting.

When to use this skill

  • You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

  • You only need a quick one-off answer and do not need a reusable workflow.
  • You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$curl -o ~/.claude/skills/gatus/SKILL.md --create-dirs "https://raw.githubusercontent.com/ComeOnOliver/skillshub/main/skills/TerminalSkills/skills/gatus/SKILL.md"

Manual Installation

  1. Download SKILL.md from GitHub
  2. Place it in .claude/skills/gatus/SKILL.md inside your project
  3. Restart your AI agent — it will auto-discover the skill

How Gatus — Lightweight Health Check Dashboard Compares

Feature / AgentGatus — Lightweight Health Check DashboardStandard Approach
Platform SupportNot specifiedLimited / Varies
Context Awareness High Baseline
Installation ComplexityUnknownN/A

Frequently Asked Questions

What does this skill do?

## Overview

Where can I find the source code?

You can find the source code on GitHub using the link provided at the top of the page.

SKILL.md Source

# Gatus — Lightweight Health Check Dashboard


## Overview


Gatus, the lightweight, self-hosted health check and status page tool written in Go. Helps developers set up endpoint monitoring with conditions, alerting, and a beautiful status page — all configured via a single YAML file with no database required.


## Instructions

### Configuration

```yaml
# config.yaml — Complete Gatus configuration
# Single YAML file defines all endpoints, conditions, and alerts.

# Storage (optional — for persistence across restarts)
storage:
  type: sqlite
  path: /data/gatus.db

# Web UI settings
web:
  port: 8080

# Alert providers
alerting:
  slack:
    webhook-url: "${SLACK_WEBHOOK_URL}"
    default-alert:
      enabled: true
      failure-threshold: 3             # Alert after 3 consecutive failures
      success-threshold: 2             # Recover after 2 consecutive successes
      send-on-resolved: true
  
  pagerduty:
    integration-key: "${PAGERDUTY_KEY}"
    default-alert:
      enabled: true
      failure-threshold: 5
      send-on-resolved: true

  email:
    from: "gatus@example.com"
    host: "smtp.example.com"
    port: 587
    username: "${SMTP_USER}"
    password: "${SMTP_PASS}"
    default-alert:
      enabled: false                    # Only enable on critical endpoints

# Endpoints to monitor
endpoints:
  # --- API Health ---
  - name: API Gateway
    group: backend
    url: "https://api.example.com/health"
    interval: 30s
    conditions:
      - "[STATUS] == 200"
      - "[RESPONSE_TIME] < 2000"       # Under 2 seconds
      - "[BODY].status == healthy"     # JSON body check
    alerts:
      - type: slack
      - type: pagerduty

  - name: Auth Service
    group: backend
    url: "https://api.example.com/auth/health"
    interval: 30s
    conditions:
      - "[STATUS] == 200"
      - "[RESPONSE_TIME] < 1000"

  # --- Frontend ---
  - name: Website
    group: frontend
    url: "https://example.com"
    interval: 60s
    conditions:
      - "[STATUS] == 200"
      - "[RESPONSE_TIME] < 3000"
      - "[BODY] contains Welcome"      # Verify page renders
    alerts:
      - type: slack

  # --- Database ---
  - name: PostgreSQL
    group: infrastructure
    url: "tcp://db.example.com:5432"
    interval: 30s
    conditions:
      - "[CONNECTED] == true"
    alerts:
      - type: slack
      - type: pagerduty

  # --- Redis ---
  - name: Redis
    group: infrastructure
    url: "tcp://redis.example.com:6379"
    interval: 15s
    conditions:
      - "[CONNECTED] == true"

  # --- DNS ---
  - name: DNS Resolution
    group: infrastructure
    url: "dns://8.8.8.8"
    dns:
      query-name: "example.com"
      query-type: "A"
    conditions:
      - "[DNS_RCODE] == NOERROR"
      - "[RESPONSE_TIME] < 500"

  # --- SSL Certificate ---
  - name: SSL Certificate
    group: security
    url: "https://example.com"
    interval: 1h
    conditions:
      - "[CERTIFICATE_EXPIRATION] > 720h"    # Alert if < 30 days

  # --- External Dependencies ---
  - name: Stripe API
    group: external
    url: "https://api.stripe.com/v1"
    interval: 5m
    conditions:
      - "[STATUS] == 401"               # Unauthenticated is expected (API is up)
      - "[RESPONSE_TIME] < 3000"

  # --- GraphQL ---
  - name: GraphQL API
    group: backend
    url: "https://api.example.com/graphql"
    method: POST
    headers:
      Content-Type: application/json
    body: '{"query": "{ __typename }"}'
    interval: 30s
    conditions:
      - "[STATUS] == 200"
      - "[BODY].data.__typename == Query"
```

### Deployment

```yaml
# docker-compose.yml — Self-hosted Gatus
version: "3.8"
services:
  gatus:
    image: twinproduction/gatus:latest
    ports:
      - "8080:8080"
    volumes:
      - ./config.yaml:/config/config.yaml
      - gatus-data:/data
    environment:
      - SLACK_WEBHOOK_URL=${SLACK_WEBHOOK_URL}
      - PAGERDUTY_KEY=${PAGERDUTY_KEY}
    restart: unless-stopped

volumes:
  gatus-data:
```

```yaml
# Kubernetes deployment
apiVersion: apps/v1
kind: Deployment
metadata:
  name: gatus
spec:
  replicas: 1
  selector:
    matchLabels:
      app: gatus
  template:
    metadata:
      labels:
        app: gatus
    spec:
      containers:
        - name: gatus
          image: twinproduction/gatus:latest
          ports:
            - containerPort: 8080
          volumeMounts:
            - name: config
              mountPath: /config
          envFrom:
            - secretRef:
                name: gatus-secrets
      volumes:
        - name: config
          configMap:
            name: gatus-config
```

## Installation

```bash
# Docker
docker run -p 8080:8080 -v $(pwd)/config.yaml:/config/config.yaml twinproduction/gatus

# Binary
go install github.com/TwiN/gatus/v5@latest

# Helm
helm repo add gatus https://twin.github.io/helm-charts
helm install gatus gatus/gatus
```


## Examples


### Example 1: Setting up Gatus for a microservices project

**User request:**

```
I have a Node.js API and a React frontend running in Docker. Set up Gatus for monitoring/deployment.
```

The agent creates the necessary configuration files based on patterns like `# config.yaml — Complete Gatus configuration`, sets up the integration with the existing Docker setup, configures appropriate defaults for a Node.js + React stack, and provides verification commands to confirm everything is working.

### Example 2: Troubleshooting deployment issues

**User request:**

```
Gatus is showing errors in our deployment. Here are the logs: [error output]
```

The agent analyzes the error output, identifies the root cause by cross-referencing with common Gatus issues, applies the fix (updating configuration, adjusting resource limits, or correcting syntax), and verifies the resolution with appropriate health checks.


## Guidelines

1. **Single YAML** — Keep all monitoring in one config file; version control it alongside your infrastructure
2. **Group endpoints** — Use groups (backend, frontend, infrastructure) for organized status page display
3. **Multiple conditions** — Check status code AND response time AND body content; a 200 with wrong content is still broken
4. **Failure thresholds** — Set failure-threshold to 2-3 to avoid false alarms from transient network blips
5. **Check dependencies** — Monitor external services (Stripe, AWS) separately; know when the issue is upstream vs yours
6. **SSL monitoring** — Check certificate expiration weekly; alert at 30 days to give time for renewal
7. **Lightweight deployment** — Gatus uses ~15MB RAM; run it on any machine, even a Raspberry Pi
8. **SQLite for history** — Enable SQLite storage for uptime history across restarts; no external database needed

Related Skills

brightdata-prod-checklist

25
from ComeOnOliver/skillshub

Execute Bright Data production deployment checklist and rollback procedures. Use when deploying Bright Data integrations to production, preparing for launch, or implementing go-live procedures. Trigger with phrases like "brightdata production", "deploy brightdata", "brightdata go-live", "brightdata launch checklist".

bamboohr-prod-checklist

25
from ComeOnOliver/skillshub

Execute BambooHR production deployment checklist and rollback procedures. Use when deploying BambooHR integrations to production, preparing for launch, or implementing go-live procedures with BambooHR API. Trigger with phrases like "bamboohr production", "deploy bamboohr", "bamboohr go-live", "bamboohr launch checklist", "bamboohr prod ready".

attio-prod-checklist

25
from ComeOnOliver/skillshub

Production readiness checklist for Attio API integrations -- auth, error handling, rate limits, health checks, monitoring, and rollback. Trigger: "attio production", "deploy attio", "attio go-live", "attio launch checklist", "attio production ready".

assemblyai-prod-checklist

25
from ComeOnOliver/skillshub

Execute AssemblyAI production deployment checklist and rollback procedures. Use when deploying AssemblyAI integrations to production, preparing for launch, or implementing go-live procedures for transcription services. Trigger with phrases like "assemblyai production", "deploy assemblyai", "assemblyai go-live", "assemblyai launch checklist".

apple-notes-prod-checklist

25
from ComeOnOliver/skillshub

Production checklist for Apple Notes automation deployments. Trigger: "apple notes production checklist".

appfolio-prod-checklist

25
from ComeOnOliver/skillshub

Production readiness checklist for AppFolio integrations. Trigger: "appfolio production checklist".

apollo-prod-checklist

25
from ComeOnOliver/skillshub

Execute Apollo.io production deployment checklist. Use when preparing to deploy Apollo integrations to production, doing pre-launch verification, or auditing production readiness. Trigger with phrases like "apollo production checklist", "deploy apollo", "apollo go-live", "apollo production ready", "apollo launch checklist".

creating-apm-dashboards

25
from ComeOnOliver/skillshub

This skill enables Claude to create Application Performance Monitoring (APM) dashboards. It is triggered when the user requests the creation of a new APM dashboard, monitoring dashboard, or a dashboard for application performance. The skill helps define key metrics and visualizations for monitoring application health, performance, and user experience across multiple platforms like Grafana and Datadog. Use this skill when the user needs assistance setting up a new monitoring solution or expanding an existing one. The plugin supports the creation of dashboards focusing on golden signals, request metrics, resource utilization, database metrics, cache metrics, business metrics, and error tracking.

apify-prod-checklist

25
from ComeOnOliver/skillshub

Production readiness checklist for Apify Actor deployments. Use when deploying Actors to production, preparing for launch, or validating Actor configuration before going live. Trigger: "apify production", "deploy actor to prod", "apify go-live", "apify launch checklist", "actor production ready".

anth-prod-checklist

25
from ComeOnOliver/skillshub

Execute production deployment checklist for Claude API integrations. Use when deploying Claude-powered features to production, preparing for launch, or implementing go-live validation. Trigger with phrases like "anthropic production", "deploy claude", "claude go-live", "anthropic launch checklist", "production ready claude".

anima-prod-checklist

25
from ComeOnOliver/skillshub

Production readiness checklist for Anima design-to-code pipelines. Use when deploying automated design-to-code services, preparing CI/CD Figma-to-code automation, or validating output quality before production. Trigger: "anima production", "anima go-live", "anima prod checklist".

algolia-prod-checklist

25
from ComeOnOliver/skillshub

Execute Algolia production readiness checklist: index settings, key security, replica configuration, monitoring, and rollback procedures. Trigger: "algolia production", "deploy algolia", "algolia go-live", "algolia launch checklist", "algolia production ready".