canary-watch
Use this skill to monitor a deployed URL for regressions after deploys, merges, or dependency upgrades.
About this skill
The `canary-watch` skill helps keep an application stable after a change ships. Designed for the Claude AI agent, it monitors a specified deployed URL for regressions across several dimensions: HTTP status (e.g., 200 OK), new console errors, network failures (such as failed API calls or 5xx responses), performance metrics (LCP, CLS, INP) compared against a baseline, and the presence of critical content elements. By default it runs a single pass; in sustained mode it repeats the checks in a loop for a defined watch window or until manually stopped. This makes it suited to verifying fixes, monitoring during launch windows, and catching unintended side effects of deploys, risky merges, or dependency upgrades before they reach end users.
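The loop described above can be sketched in a few lines. This is a minimal illustration, not the skill's implementation: `fetch_status` and `watch` are hypothetical names, the interval and duration parameters are assumptions, and only the HTTP status check is covered (console, performance, and content checks would need a browser).

```python
import time
import urllib.error
import urllib.request
from typing import Callable, List


def fetch_status(url: str, timeout: float = 10.0) -> int:
    """Return the HTTP status code for url, or 0 if the request fails outright."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code  # 4xx/5xx responses still carry a status code
    except (urllib.error.URLError, OSError):
        return 0  # DNS failure, refused connection, timeout, and similar


def watch(
    url: str,
    interval_s: float,
    duration_s: float,
    fetch: Callable[[str], int] = fetch_status,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> List[int]:
    """Poll url until the watch window expires; return each non-200 status seen."""
    failures: List[int] = []
    deadline = clock() + duration_s
    while True:
        status = fetch(url)
        if status != 200:
            failures.append(status)
        # Stop once the next check would fall outside the watch window.
        if clock() + interval_s > deadline:
            break
        sleep(interval_s)
    return failures
```

The `fetch`, `sleep`, and `clock` parameters are injectable so the loop can be exercised without network access or real waiting.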
Best use case
Proactive detection of regressions in web applications or services after deployment, code merges, or infrastructure changes to ensure stability and performance.
The skill produces a monitoring report on the health of the deployed URL. The report highlights any detected regressions, such as non-200 HTTP status codes, newly introduced console errors, network request failures, significant performance degradations (LCP, CLS, INP) relative to a baseline, or missing or altered critical content elements, so that you can act on them immediately.
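If the report is available as JSON, a deploy pipeline could turn it into a pass/fail decision. The sketch below assumes the `regressions_found` and `summary` fields used in this page's example output; those field names are illustrative rather than a documented schema, and `gate` is a hypothetical helper.

```python
import json
from typing import Tuple


def gate(report_json: str) -> Tuple[int, str]:
    """Map a report to (exit_code, message): 0 when clean, 1 when regressions were found."""
    report = json.loads(report_json)
    summary = report.get("summary", "no summary provided")
    # Fail closed: a report without the flag is treated as a regression.
    if report.get("regressions_found", True):
        return 1, "FAIL: " + summary
    return 0, "OK: " + summary
```

Treating a missing `regressions_found` flag as a failure is a deliberate choice here, so that a malformed report cannot silently pass.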
Practical example
Example input
To start monitoring `https://your-production-app.com` for 60 minutes:
`use_skill('canary-watch', url='https://your-production-app.com', duration_minutes=60)`
To compare against a baseline URL:
`use_skill('canary-watch', url='https://your-production-app.com', duration_minutes=30, baseline_url='https://your-staging-app.com')`
Example output
```json
{
  "status": "monitoring_complete",
  "url": "https://your-production-app.com",
  "duration_minutes": 60,
  "regressions_found": false,
  "summary": "No regressions detected across HTTP status, console errors, network failures, performance, or content.",
  "details": {
    "http_status": "200 OK (stable)",
    "console_errors": "No new errors detected.",
    "network_failures": "0 failures.",
    "performance": {
      "LCP_change": "+0ms (stable)",
      "CLS_change": "+0 (stable)",
      "INP_change": "+0ms (stable)"
    },
    "content_check": "Key elements present and stable."
  }
}
```
Or, if regressions are found:
```json
{
  "status": "monitoring_complete_with_issues",
  "url": "https://your-production-app.com",
  "duration_minutes": 60,
  "regressions_found": true,
  "summary": "Regressions detected in console errors, network requests, performance, and content.",
  "details": {
    "http_status": "200 OK (stable)",
    "console_errors": "New error detected: 'Uncaught ReferenceError: foo is not defined' on line 12.",
    "network_failures": "1 failure: GET /api/data returned 500 Internal Server Error.",
    "performance": {
      "LCP_change": "+1500ms (significant degradation)",
      "CLS_change": "+0.15 (degradation)",
      "INP_change": "+50ms (stable)"
    },
    "content_check": "Missing element: 'Product Price' with ID 'product-price'."
  }
}
```
When to use this skill
- After deploying to production or staging environments.
- Following the merge of a high-risk Pull Request (PR).
- To verify that a specific bug fix has effectively resolved the issue.
- For continuous monitoring during a product launch window.
When not to use this skill
- For non-web-based applications or services that don't expose a monitored URL.
- During very early development stages where a stable deployed URL isn't available or expected.
- When comprehensive, dedicated external monitoring systems are already in place and fully integrated.
- If the primary goal is functional testing rather than regression monitoring of existing metrics.
Installation
Claude Code / Cursor / Codex
Manual Installation
- Download SKILL.md from GitHub
- Place it in `.claude/skills/canary-watch/SKILL.md` inside your project
- Restart your AI agent; it will auto-discover the skill
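The manual steps can also be scripted. The sketch below assumes you have already downloaded `SKILL.md` to a local path; `install_skill` is a hypothetical helper, and only the target location `.claude/skills/canary-watch/SKILL.md` comes from the instructions above.

```python
import shutil
from pathlib import Path


def install_skill(skill_md: Path, project_root: Path) -> Path:
    """Copy a downloaded SKILL.md into <project_root>/.claude/skills/canary-watch/."""
    target_dir = project_root / ".claude" / "skills" / "canary-watch"
    # Create the skills directory tree if the project does not have one yet.
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / "SKILL.md"
    shutil.copyfile(skill_md, target)
    return target
```

For example, `install_skill(Path("SKILL.md"), Path("."))` would place the file in the current project; the agent still needs a restart afterwards.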
How canary-watch Compares
| Feature / Agent | canary-watch | Standard Approach |
|---|---|---|
| Platform Support | Claude | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Easy | N/A |
Frequently Asked Questions
What does this skill do?
It monitors a deployed URL for regressions after deploys, merges, or dependency upgrades, checking HTTP status, console errors, network failures, performance metrics, and key content.
Which AI agents support this skill?
This skill is designed for Claude.
How difficult is it to install?
The installation complexity is rated as easy. You can find the installation instructions above.
Where can I find the source code?
You can find the source code on GitHub using the link provided at the top of the page.
Related Guides
AI Agents for Coding
Browse AI agent skills for coding, debugging, testing, refactoring, code review, and developer workflows across Claude, Cursor, and Codex.
Best AI Skills for Claude
Explore the best AI skills for Claude and Claude Code across coding, research, workflow automation, documentation, and agent operations.
ChatGPT vs Claude for Agent Skills
Compare ChatGPT and Claude for AI agent skills across coding, writing, research, and reusable workflow execution.
SKILL.md Source
# Canary Watch — Post-Deploy Monitoring

## When to Use

- After deploying to production or staging
- After merging a risky PR
- When you want to verify a fix actually fixed it
- Continuous monitoring during a launch window
- After dependency upgrades

## How It Works

Monitors a deployed URL for regressions. Runs in a loop until stopped or until the watch window expires.

### What It Watches

```
1. HTTP Status — is the page returning 200?
2. Console Errors — new errors that weren't there before?
3. Network Failures — failed API calls, 5xx responses?
4. Performance — LCP/CLS/INP regression vs baseline?
5. Content — did key elements disappear? (h1, nav, footer, CTA)
6. API Health — are critical endpoints responding within SLA?
```

### Watch Modes

**Quick check** (default): single pass, report results

```
/canary-watch https://myapp.com
```

**Sustained watch**: check every N minutes for M hours

```
/canary-watch https://myapp.com --interval 5m --duration 2h
```

**Diff mode**: compare staging vs production

```
/canary-watch --compare https://staging.myapp.com https://myapp.com
```

### Alert Thresholds

```yaml
critical:  # immediate alert
  - HTTP status != 200
  - Console error count > 5 (new errors only)
  - LCP > 4s
  - API endpoint returns 5xx
warning:   # flag in report
  - LCP increased > 500ms from baseline
  - CLS > 0.1
  - New console warnings
  - Response time > 2x baseline
info:      # log only
  - Minor performance variance
  - New network requests (third-party scripts added?)
```

### Notifications

When a critical threshold is crossed:

- Desktop notification (macOS/Linux)
- Optional: Slack/Discord webhook
- Log to `~/.claude/canary-watch.log`

## Output

```markdown
## Canary Report — myapp.com — 2026-03-23 03:15 PST

### Status: HEALTHY ✓

| Check | Result | Baseline | Delta |
|-------|--------|----------|-------|
| HTTP | 200 ✓ | 200 | — |
| Console errors | 0 ✓ | 0 | — |
| LCP | 1.8s ✓ | 1.6s | +200ms |
| CLS | 0.01 ✓ | 0.01 | — |
| API /health | 145ms ✓ | 120ms | +25ms |

### No regressions detected. Deploy is clean.
```

## Integration

Pair with:

- `/browser-qa` for pre-deploy verification
- Hooks: add as a PostToolUse hook on `git push` to auto-check after deploys
- CI: run in GitHub Actions after deploy step
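The alert thresholds in the SKILL.md source can be expressed as a small classifier. This is a minimal sketch, not the skill's implementation: the `Sample` dataclass, its field names, and `classify` are hypothetical, and only the numeric limits are taken from the thresholds listed in the source.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    """One measurement pass. Field names are hypothetical, chosen for this sketch."""
    http_status: int
    new_console_errors: int
    new_console_warnings: int
    lcp_ms: float
    cls: float
    api_status: int
    response_ms: float


def classify(current: Sample, baseline: Sample) -> str:
    """Return 'critical', 'warning', or 'info' for a sample measured against a baseline."""
    # Critical: page not returning 200, more than 5 new errors, LCP over 4s, or API 5xx.
    if (
        current.http_status != 200
        or current.new_console_errors > 5
        or current.lcp_ms > 4000
        or 500 <= current.api_status <= 599
    ):
        return "critical"
    # Warning: LCP up more than 500ms, CLS over 0.1, new warnings, or response time over 2x baseline.
    if (
        current.lcp_ms - baseline.lcp_ms > 500
        or current.cls > 0.1
        or current.new_console_warnings > 0
        or current.response_ms > 2 * baseline.response_ms
    ):
        return "warning"
    # Anything else is minor variance and would only be logged.
    return "info"
```

With the figures from the sample report in the source (LCP 1.8s against a 1.6s baseline, API response 145ms against 120ms), this sketch returns `info`, which matches the report's healthy status.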
Related Skills
workspace-surface-audit
Audit the active repo, MCP servers, plugins, connectors, env surfaces, and harness setup, then recommend the highest-value ECC-native skills, hooks, agents, and operator workflows. Use when the user wants help setting up Claude Code or understanding what capabilities are actually available in their environment.
safety-guard
Use this skill to prevent destructive operations when working on production systems or running agents autonomously.
repo-scan
Cross-stack source code asset audit — classifies every file, detects embedded third-party libraries, and delivers actionable four-level verdicts per module with interactive HTML reports.
project-flow-ops
Operate execution flow across GitHub and Linear by triaging issues and pull requests, linking active work, and keeping GitHub public-facing while Linear remains the internal execution layer. Use when the user wants backlog control, PR triage, or GitHub-to-Linear coordination.
manim-video
Build reusable Manim explainers for technical concepts, graphs, system diagrams, and product walkthroughs, then hand off to the wider ECC video stack if needed. Use when the user wants a clean animated explainer rather than a generic talking-head script.
laravel-plugin-discovery
Discover and evaluate Laravel packages via LaraPlugins.io MCP. Use when the user wants to find plugins, check package health, or assess Laravel/PHP compatibility.
design-system
Use this skill to generate or audit design systems, check visual consistency, and review PRs that touch styling.
click-path-audit
Trace every user-facing button/touchpoint through its full state change sequence to find bugs where functions individually work but cancel each other out, produce wrong final state, or leave the UI in an inconsistent state. Use when: systematic debugging found no bugs but users report broken buttons, or after any major refactor touching shared state stores.
ck
Persistent per-project memory for Claude Code. Auto-loads project context on session start, tracks sessions with git activity, and writes to native memory. Commands run deterministic Node.js scripts — behavior is consistent across model versions.
benchmark
Use this skill to measure performance baselines, detect regressions before/after PRs, and compare stack alternatives.
swiftui-patterns
SwiftUI architecture patterns: state management with @Observable, view composition, navigation, performance optimization, and modern iOS/macOS UI best practices.
swift-protocol-di-testing
Protocol-based dependency injection for testable Swift code: mock the file system, network, and external APIs using focused protocols and Swift Testing.