Gatus — Lightweight Health Check Dashboard
Best use case
Gatus — Lightweight Health Check Dashboard is best used when you need a repeatable AI agent workflow instead of a one-off prompt.
Teams using this skill should expect more consistent output, faster repeated execution, and less prompt rewriting.
When to use this skill
- You want a reusable workflow that can be run more than once with consistent structure.
When not to use this skill
- You only need a quick one-off answer and do not need a reusable workflow.
- You cannot install or maintain the underlying files, dependencies, or repository context.
Installation
Claude Code / Cursor / Codex
Manual Installation
- Download SKILL.md from GitHub
- Place it in `.claude/skills/gatus/SKILL.md` inside your project
- Restart your AI agent; it will auto-discover the skill
How Gatus — Lightweight Health Check Dashboard Compares
| Feature / Agent | Gatus — Lightweight Health Check Dashboard | Standard Approach |
|---|---|---|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |
Frequently Asked Questions
What does this skill do?
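It helps developers set up Gatus, a lightweight, self-hosted health check and status page tool written in Go. The skill covers endpoint monitoring with conditions, alerting, and a status page, all configured through a single YAML file.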
Where can I find the source code?
You can find the source code on GitHub using the link provided at the top of the page.
SKILL.md Source
# Gatus — Lightweight Health Check Dashboard
## Overview
Gatus is a lightweight, self-hosted health check and status page tool written in Go. This skill helps developers set up endpoint monitoring with conditions, alerting, and a status page, all configured via a single YAML file with no external database required.
## Instructions
### Configuration
```yaml
# config.yaml — Complete Gatus configuration
# Single YAML file defines all endpoints, conditions, and alerts.

# Storage (optional; keeps history across restarts)
storage:
  type: sqlite
  path: /data/gatus.db

# Web UI settings
web:
  port: 8080

# Alert providers
alerting:
  slack:
    webhook-url: "${SLACK_WEBHOOK_URL}"
    default-alert:
      enabled: true
      failure-threshold: 3   # Alert after 3 consecutive failures
      success-threshold: 2   # Recover after 2 consecutive successes
      send-on-resolved: true
  pagerduty:
    integration-key: "${PAGERDUTY_KEY}"
    default-alert:
      enabled: true
      failure-threshold: 5
      send-on-resolved: true
  email:
    from: "gatus@example.com"
    host: "smtp.example.com"
    port: 587
    username: "${SMTP_USER}"
    password: "${SMTP_PASS}"
    to: "oncall@example.com"   # Placeholder recipient (assumed); the email provider needs a "to" address
    default-alert:
      enabled: false           # Only enable on critical endpoints

# Endpoints to monitor
endpoints:
  # --- API Health ---
  - name: API Gateway
    group: backend
    url: "https://api.example.com/health"
    interval: 30s
    conditions:
      - "[STATUS] == 200"
      - "[RESPONSE_TIME] < 2000"       # Under 2 seconds
      - "[BODY].status == healthy"     # JSON body check
    alerts:
      - type: slack
      - type: pagerduty

  - name: Auth Service
    group: backend
    url: "https://api.example.com/auth/health"
    interval: 30s
    conditions:
      - "[STATUS] == 200"
      - "[RESPONSE_TIME] < 1000"

  # --- Frontend ---
  - name: Website
    group: frontend
    url: "https://example.com"
    interval: 60s
    conditions:
      - "[STATUS] == 200"
      - "[RESPONSE_TIME] < 3000"
      - "[BODY] == pat(*Welcome*)"     # Verify page renders (pattern match; there is no "contains" operator)
    alerts:
      - type: slack

  # --- Database ---
  - name: PostgreSQL
    group: infrastructure
    url: "tcp://db.example.com:5432"
    interval: 30s
    conditions:
      - "[CONNECTED] == true"
    alerts:
      - type: slack
      - type: pagerduty

  # --- Redis ---
  - name: Redis
    group: infrastructure
    url: "tcp://redis.example.com:6379"
    interval: 15s
    conditions:
      - "[CONNECTED] == true"

  # --- DNS ---
  - name: DNS Resolution
    group: infrastructure
    url: "8.8.8.8"                     # Address of the DNS server to query (no scheme)
    dns:
      query-name: "example.com"
      query-type: "A"
    conditions:
      - "[DNS_RCODE] == NOERROR"
      - "[RESPONSE_TIME] < 500"

  # --- SSL Certificate ---
  - name: SSL Certificate
    group: security
    url: "https://example.com"
    interval: 1h
    conditions:
      - "[CERTIFICATE_EXPIRATION] > 720h"   # Fail if fewer than 30 days remain

  # --- External Dependencies ---
  - name: Stripe API
    group: external
    url: "https://api.stripe.com/v1"
    interval: 5m
    conditions:
      - "[STATUS] == 401"              # Unauthenticated is expected (API is up)
      - "[RESPONSE_TIME] < 3000"

  # --- GraphQL ---
  - name: GraphQL API
    group: backend
    url: "https://api.example.com/graphql"
    method: POST
    headers:
      Content-Type: application/json
    body: '{"query": "{ __typename }"}'
    interval: 30s
    conditions:
      - "[STATUS] == 200"
      - "[BODY].data.__typename == Query"
```
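The email provider above is disabled by default with the note "Only enable on critical endpoints", but the configuration does not show how an endpoint opts in. The fragment below is a minimal sketch of that pattern, assuming (as I understand Gatus's alert configuration) that fields set on an endpoint's `alerts` entry override the provider's `default-alert`. The threshold and description text are illustrative, not values from the original.

```yaml
# Sketch: endpoint-level alert entries override the provider's default-alert.
endpoints:
  - name: API Gateway
    group: backend
    url: "https://api.example.com/health"
    interval: 30s
    conditions:
      - "[STATUS] == 200"
    alerts:
      - type: slack
        failure-threshold: 2     # stricter than the provider default of 3 (illustrative)
        description: "API Gateway health check failing"   # illustrative text
      - type: email
        enabled: true            # opt in; the provider default is disabled
```

Keeping the provider default disabled and opting in per endpoint limits email noise to the checks that matter most.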
### Deployment
```yaml
# docker-compose.yml — Self-hosted Gatus
version: "3.8"
services:
  gatus:
    image: twinproduction/gatus:latest
    ports:
      - "8080:8080"
    volumes:
      - ./config.yaml:/config/config.yaml
      - gatus-data:/data
    environment:
      - SLACK_WEBHOOK_URL=${SLACK_WEBHOOK_URL}
      - PAGERDUTY_KEY=${PAGERDUTY_KEY}
      # Also referenced by the email provider in config.yaml
      - SMTP_USER=${SMTP_USER}
      - SMTP_PASS=${SMTP_PASS}
    restart: unless-stopped
volumes:
  gatus-data:
```
```yaml
# Kubernetes deployment
apiVersion: apps/v1
kind: Deployment
metadata:
  name: gatus
spec:
  replicas: 1
  selector:
    matchLabels:
      app: gatus
  template:
    metadata:
      labels:
        app: gatus
    spec:
      containers:
        - name: gatus
          image: twinproduction/gatus:latest
          ports:
            - containerPort: 8080
          volumeMounts:
            - name: config
              mountPath: /config
          envFrom:
            - secretRef:
                name: gatus-secrets
      volumes:
        - name: config
          configMap:
            name: gatus-config
```
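The Deployment references a ConfigMap named `gatus-config` and a Secret named `gatus-secrets` that are not defined anywhere in this document. The manifests below are a sketch of what they could look like. The embedded config is trimmed to one endpoint, and every secret value is a placeholder to replace.

```yaml
# Sketch of the objects the Deployment refers to; all values are placeholders.
apiVersion: v1
kind: ConfigMap
metadata:
  name: gatus-config
data:
  # The key becomes the file name under the /config mount path.
  config.yaml: |
    endpoints:
      - name: API Gateway
        url: "https://api.example.com/health"
        interval: 30s
        conditions:
          - "[STATUS] == 200"
---
apiVersion: v1
kind: Secret
metadata:
  name: gatus-secrets
type: Opaque
stringData:
  # envFrom exposes each key as an environment variable in the container.
  SLACK_WEBHOOK_URL: "https://hooks.slack.com/services/REPLACE/ME"   # placeholder
  PAGERDUTY_KEY: "replace-me"                                        # placeholder
```

Naming the ConfigMap key `config.yaml` places the file at `/config/config.yaml`, the same path used by the Docker volume mounts.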
## Installation
```bash
# Docker (quote the path so directories with spaces work)
docker run -p 8080:8080 -v "$(pwd)/config.yaml:/config/config.yaml" twinproduction/gatus
# Binary
go install github.com/TwiN/gatus/v5@latest
# Helm
helm repo add gatus https://twin.github.io/helm-charts
helm repo update
helm install gatus gatus/gatus
```
## Examples
### Example 1: Setting up Gatus for a microservices project
**User request:**
```
I have a Node.js API and a React frontend running in Docker. Set up Gatus for monitoring/deployment.
```
The agent creates the configuration files following the `config.yaml` pattern shown in the Configuration section, integrates Gatus with the existing Docker setup, configures sensible defaults for a Node.js + React stack, and provides verification commands to confirm everything is working.
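As an illustration of what the generated configuration for this request might contain, here is a minimal sketch. The service names `api` and `web`, the ports, and the `/health` route are assumptions about the user's Docker setup, not details from the request.

```yaml
# Sketch only: service names, ports, and paths are assumed, not taken from a real project.
endpoints:
  - name: Node API
    group: backend
    url: "http://api:3000/health"   # hypothetical Compose service name and health route
    interval: 30s
    conditions:
      - "[STATUS] == 200"
      - "[RESPONSE_TIME] < 1000"

  - name: React Frontend
    group: frontend
    url: "http://web:80"            # hypothetical service serving the built frontend
    interval: 60s
    conditions:
      - "[STATUS] == 200"
      - "[RESPONSE_TIME] < 3000"
```

Using Compose service names as hostnames only works if Gatus runs on the same Docker network as the services it checks.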
### Example 2: Troubleshooting deployment issues
**User request:**
```
Gatus is showing errors in our deployment. Here are the logs: [error output]
```
The agent analyzes the error output, identifies the root cause by cross-referencing with common Gatus issues, applies the fix (updating configuration, adjusting resource limits, or correcting syntax), and verifies the resolution with appropriate health checks.
## Guidelines
1. **Single YAML** — Keep all monitoring in one config file; version control it alongside your infrastructure
2. **Group endpoints** — Use groups (backend, frontend, infrastructure) for organized status page display
3. **Multiple conditions** — Check status code AND response time AND body content; a 200 with wrong content is still broken
4. **Failure thresholds** — Set failure-threshold to 2-3 to avoid false alarms from transient network blips
5. **Check dependencies** — Monitor external services (Stripe, AWS) separately; know when the issue is upstream vs yours
6. **SSL monitoring** — Check certificate expiration on a regular interval (the example config checks hourly); alert at 30 days to give time for renewal
7. **Lightweight deployment** — Gatus uses ~15MB RAM; run it on any machine, even a Raspberry Pi
8. **SQLite for history** — Enable SQLite storage for uptime history across restarts; no external database needed