deployment-procedures

Production deployment principles and decision-making. Safe deployment workflows, rollback strategies, and verification. Teaches thinking, not scripts.

31,392 stars
Complexity: easy

About this skill

The 'deployment-procedures' skill gives AI agents a framework for reasoning about production deployments. Rather than prescriptive scripts, it teaches the principles behind safe deployment workflows, rollback strategies, and verification steps. By explaining the 'why' behind each deployment choice, it lets the agent adapt its advice to a project's requirements and technical stack. It includes a decision tree for platform selection, which makes the agent useful for deployment planning and problem-solving.

Best use case

An AI agent can use this skill when asked to advise on deployment strategies, evaluate existing deployment plans, suggest rollback mechanisms, or help select a deployment platform for a new application. It is particularly useful for generating high-level guidance or detailed frameworks for deploying software safely and efficiently.

The AI agent will provide well-reasoned, principled advice on deployment strategies, covering aspects like platform selection, safe workflow design, and rollback planning. It will help users understand the implications of different deployment choices and guide them toward robust, production-ready solutions.

Practical example

Example input

I need to deploy a new microservice written in Python, using a serverless architecture. Can you help me think through a safe deployment strategy, including platform choice and rollback plans?

Example output

Certainly! Let's break down a safe deployment strategy for your Python microservice. 

**1. Platform Selection:** Given a serverless architecture, common choices include AWS Lambda, Google Cloud Functions, or Azure Functions. Considerations like existing cloud infrastructure, specific service integrations, and cost models will be key. 

**2. Safe Deployment Workflow:** We'd aim for a phased rollout. Two common approaches are canary and blue/green deployments.
   *   **Canary:** Slowly direct a small percentage of traffic to the new version, monitoring key metrics (errors, latency, resource usage) before fully shifting. 
   *   **Blue/Green:** Deploy the new version ('green') alongside the old ('blue'), test thoroughly, then instantly switch traffic. This offers immediate rollback by switching back to 'blue'.

**3. Rollback Strategy:** For serverless, rollbacks typically involve pointing traffic back to the previous version or redeploying the older function code. 
   *   Automated rollback triggers based on monitoring alerts (e.g., elevated error rates). 
   *   Pre-tested rollback procedures. 

**4. Verification:** Key verification steps include automated tests (unit, integration, end-to-end), monitoring of application health and performance metrics, and post-deployment sanity checks.

Which cloud provider are you leaning towards, and do you have specific metrics you'd prioritize monitoring?

When to use this skill

  • Use this skill when an AI agent needs to provide strategic guidance on deploying a new application, analyze risks in a current deployment process, recommend verification steps, or design a resilient rollback strategy. It's ideal for conceptual discussions and planning phases rather than direct execution.

When not to use this skill

  • Do not use this skill when the AI agent is expected to directly execute deployment commands, write specific infrastructure-as-code scripts (e.g., Terraform, Ansible playbooks) without further context, or debug live deployment failures at a code level. This skill focuses on principles and strategy, not low-level operational tasks.

Installation

Claude Code / Cursor / Codex

$ curl -o ~/.claude/skills/deployment-procedures/SKILL.md --create-dirs "https://raw.githubusercontent.com/sickn33/antigravity-awesome-skills/main/plugins/antigravity-awesome-skills-claude/skills/deployment-procedures/SKILL.md"

Manual Installation

  1. Download SKILL.md from GitHub
  2. Place it in .claude/skills/deployment-procedures/SKILL.md inside your project
  3. Restart your AI agent — it will auto-discover the skill

How deployment-procedures Compares

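| Feature / Agent | deployment-procedures | Standard Approach |
|-----------------|-----------------------|-------------------|
| Platform Support | Claude | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | easy | N/A |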

Frequently Asked Questions

What does this skill do?

Production deployment principles and decision-making. Safe deployment workflows, rollback strategies, and verification. Teaches thinking, not scripts.

Which AI agents support this skill?

This skill is designed for Claude.

How difficult is it to install?

The installation complexity is rated as easy. You can find the installation instructions above.

Where can I find the source code?

You can find the source code on GitHub using the link provided at the top of the page.

SKILL.md Source

# Deployment Procedures

> Deployment principles and decision-making for safe production releases.
> **Learn to THINK, not memorize scripts.**

---

## ⚠️ How to Use This Skill

This skill teaches **deployment principles**, not bash scripts to copy.

- Every deployment is unique
- Understand the WHY behind each step
- Adapt procedures to your platform

---

## 1. Platform Selection

### Decision Tree

```
What are you deploying?
│
├── Static site / JAMstack
│   └── Vercel, Netlify, Cloudflare Pages
│
├── Simple web app
│   ├── Managed → Railway, Render, Fly.io
│   └── Control → VPS + PM2/Docker
│
├── Microservices
│   └── Container orchestration
│
└── Serverless
    └── Edge functions, Lambda
```
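As a rough illustration, the decision tree above can be read as a lookup from workload type to candidate platforms. The sketch below is only one way to encode it; the names `Workload` and `suggest_platforms` and the `want_control` flag are hypothetical and not part of the skill.

```python
from enum import Enum


class Workload(Enum):
    STATIC_SITE = "static site / JAMstack"
    SIMPLE_WEB_APP = "simple web app"
    MICROSERVICES = "microservices"
    SERVERLESS = "serverless"


def suggest_platforms(workload: Workload, want_control: bool = False) -> list[str]:
    """Return candidate platforms for a workload, mirroring the decision tree."""
    if workload is Workload.STATIC_SITE:
        return ["Vercel", "Netlify", "Cloudflare Pages"]
    if workload is Workload.SIMPLE_WEB_APP:
        # "Control" branch: a self-managed server; otherwise a managed platform.
        if want_control:
            return ["VPS + PM2/Docker"]
        return ["Railway", "Render", "Fly.io"]
    if workload is Workload.MICROSERVICES:
        return ["Container orchestration"]
    return ["Edge functions", "Lambda"]


print(suggest_platforms(Workload.SIMPLE_WEB_APP, want_control=True))
```

The output is a shortlist to reason about, not a final answer; cost, team experience, and existing infrastructure still decide the choice.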

### Each Platform Has Different Procedures

| Platform | Deployment Method |
|----------|------------------|
| **Vercel/Netlify** | Git push, auto-deploy |
| **Railway/Render** | Git push or CLI |
| **VPS + PM2** | SSH + manual steps |
| **Docker** | Image push + orchestration |
| **Kubernetes** | kubectl apply |

---

## 2. Pre-Deployment Principles

### The 4 Verification Categories

| Category | What to Check |
|----------|--------------|
| **Code Quality** | Tests passing, linting clean, reviewed |
| **Build** | Production build works, no warnings |
| **Environment** | Env vars set, secrets current |
| **Safety** | Backup done, rollback plan ready |

### Pre-Deployment Checklist

- [ ] All tests passing
- [ ] Code reviewed and approved
- [ ] Production build successful
- [ ] Environment variables verified
- [ ] Database migrations ready (if any)
- [ ] Rollback plan documented
- [ ] Team notified
- [ ] Monitoring ready
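One possible way to turn the checklist above into a gate is to treat every unchecked item as a blocker. This is a minimal sketch under that assumption; `CHECKLIST`, `blockers`, and `ready_to_deploy` are hypothetical names, and the `has_migrations` flag is an illustrative way to model the "(if any)" qualifier.

```python
MIGRATIONS_ITEM = "Database migrations ready"

# Items copied from the checklist above, in the same order.
CHECKLIST = [
    "All tests passing",
    "Code reviewed and approved",
    "Production build successful",
    "Environment variables verified",
    MIGRATIONS_ITEM,
    "Rollback plan documented",
    "Team notified",
    "Monitoring ready",
]


def blockers(completed: set[str], has_migrations: bool = True) -> list[str]:
    """Return the checklist items that still block deployment, in checklist order."""
    required = [
        item for item in CHECKLIST
        if has_migrations or item != MIGRATIONS_ITEM
    ]
    return [item for item in required if item not in completed]


def ready_to_deploy(completed: set[str], has_migrations: bool = True) -> bool:
    """A deployment is ready only when nothing blocks it."""
    return not blockers(completed, has_migrations)
```

Returning the list of blockers, rather than a bare boolean, keeps the reason for a refusal visible to whoever is deploying.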

---

## 3. Deployment Workflow Principles

### The 5-Phase Process

```
1. PREPARE
   └── Verify code, build, env vars

2. BACKUP
   └── Save current state before changing

3. DEPLOY
   └── Execute with monitoring open

4. VERIFY
   └── Health check, logs, key flows

5. CONFIRM or ROLLBACK
   └── All good? Confirm. Issues? Rollback.
```

### Phase Principles

| Phase | Principle |
|-------|-----------|
| **Prepare** | Never deploy untested code |
| **Backup** | Can't roll back without a backup |
| **Deploy** | Watch it happen, don't walk away |
| **Verify** | Trust but verify |
| **Confirm** | Have rollback trigger ready |
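The five phases can be sketched as a small driver that takes each phase as a callable. This is an illustration of the control flow only, not a deployment tool: `run_deployment` and its parameters are hypothetical, and what each callable does depends entirely on the platform.

```python
from typing import Any, Callable


def run_deployment(
    prepare: Callable[[], bool],
    backup: Callable[[], Any],
    deploy: Callable[[], None],
    verify: Callable[[], bool],
    rollback: Callable[[Any], None],
) -> str:
    """Run the five phases in order and report the outcome."""
    # 1. PREPARE: never deploy untested code.
    if not prepare():
        return "aborted"
    # 2. BACKUP: save the current state before changing anything.
    snapshot = backup()
    # 3. DEPLOY and 4. VERIFY: a failed deploy counts as a failed verification.
    try:
        deploy()
        healthy = verify()
    except Exception:
        healthy = False
    # 5. CONFIRM or ROLLBACK.
    if healthy:
        return "confirmed"
    rollback(snapshot)
    return "rolled back"
```

Passing the backup's result to `rollback` makes the dependency explicit: without a snapshot there is nothing to restore.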

---

## 4. Post-Deployment Verification

### What to Verify

| Check | Why |
|-------|-----|
| **Health endpoint** | Service is running |
| **Error logs** | No new errors |
| **Key user flows** | Critical features work |
| **Performance** | Response times acceptable |

### Verification Window

- **First 5 minutes**: Active monitoring
- **15 minutes**: Confirm stable
- **1 hour**: Final verification
- **Next day**: Review metrics
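The verification window above can be read as a schedule keyed on time since deployment. The sketch below assumes one interpretation: each entry names the stage you are working toward until its deadline passes, and "next day" is taken as 24 hours. `VERIFICATION_WINDOW` and `current_stage` are hypothetical names.

```python
from datetime import timedelta

# (deadline since deploy, stage). "Next day" is assumed to mean 24 hours.
VERIFICATION_WINDOW = [
    (timedelta(minutes=5), "Active monitoring"),
    (timedelta(minutes=15), "Confirm stable"),
    (timedelta(hours=1), "Final verification"),
    (timedelta(days=1), "Review metrics"),
]


def current_stage(elapsed: timedelta) -> str:
    """Return the verification stage that applies at a given time after deploy."""
    for deadline, stage in VERIFICATION_WINDOW:
        if elapsed <= deadline:
            return stage
    return "Window closed"


print(current_stage(timedelta(minutes=10)))
```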

---

## 5. Rollback Principles

### When to Rollback

| Symptom | Action |
|---------|--------|
| Service down | Roll back immediately |
| Critical errors | Roll back |
| Performance degraded by more than 50% | Consider rollback |
| Minor issues | Fix forward if quick |
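The symptom table can be expressed as a decision function that checks the most severe symptom first. This is a sketch under stated assumptions: `Symptoms`, `Action`, and `decide` are hypothetical names, and the 50% degradation threshold is modelled here as latency above 1.5 times the baseline, which is only one possible reading.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    ROLLBACK_NOW = "roll back immediately"
    ROLLBACK = "roll back"
    CONSIDER_ROLLBACK = "consider rollback"
    FIX_FORWARD = "fix forward if quick"
    NONE = "no action"


@dataclass
class Symptoms:
    service_down: bool = False
    critical_errors: bool = False
    latency_ratio: float = 1.0  # current latency / baseline (assumed metric)
    minor_issues: bool = False


def decide(s: Symptoms) -> Action:
    """Map observed symptoms to an action, most severe symptom first."""
    if s.service_down:
        return Action.ROLLBACK_NOW
    if s.critical_errors:
        return Action.ROLLBACK
    if s.latency_ratio > 1.5:
        return Action.CONSIDER_ROLLBACK
    if s.minor_issues:
        return Action.FIX_FORWARD
    return Action.NONE
```

Ordering the checks by severity means a minor issue never masks an outage when several symptoms appear together.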

### Rollback Strategy by Platform

| Platform | Rollback Method |
|----------|----------------|
| **Vercel/Netlify** | Redeploy previous commit |
| **Railway/Render** | Rollback in dashboard |
| **VPS + PM2** | Restore backup, restart |
| **Docker** | Previous image tag |
| **K8s** | kubectl rollout undo |

### Rollback Principles

1. **Speed over perfection**: Roll back first, debug later
2. **Don't compound errors**: One rollback, not multiple changes
3. **Communicate**: Tell the team what happened
4. **Post-mortem**: Understand why once the service is stable

---

## 6. Zero-Downtime Deployment

### Strategies

| Strategy | How It Works |
|----------|--------------|
| **Rolling** | Replace instances one by one |
| **Blue-Green** | Switch traffic between environments |
| **Canary** | Gradual traffic shift |

### Selection Principles

| Scenario | Strategy |
|----------|----------|
| Standard release | Rolling |
| High-risk change | Blue-green (easy rollback) |
| Need validation | Canary (test with real traffic) |
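To make the canary strategy concrete, the sketch below shifts traffic in steps and backs out as soon as a health check fails. It assumes the platform exposes some way to set the canary's traffic share and to query health; `canary_rollout`, `set_traffic`, `healthy`, and the step percentages are all hypothetical.

```python
from typing import Callable, Sequence


def canary_rollout(
    set_traffic: Callable[[int], None],
    healthy: Callable[[], bool],
    steps: Sequence[int] = (5, 25, 50, 100),  # assumed percentages
) -> bool:
    """Shift traffic to the new version step by step.

    Returns True if the rollout reached the final step, or False if a
    health check failed and traffic was sent back to the old version.
    """
    for percent in steps:
        set_traffic(percent)
        if not healthy():
            # Back out: route all traffic to the old version again.
            set_traffic(0)
            return False
    return True
```

In practice each step would also wait for a monitoring interval before checking health; that delay is left out here to keep the control flow visible.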

---

## 7. Emergency Procedures

### Service Down Priority

1. **Assess**: What's the symptom?
2. **Quick fix**: Restart if the cause is unclear
3. **Rollback**: If the restart doesn't help
4. **Investigate**: Once the service is stable
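The priority order above can be sketched as an escalation: try the cheapest fix first and investigate only once the service is stable. `respond_to_outage` and its callables are hypothetical; the assessment step is represented only by the `symptom` string a person would record.

```python
from typing import Callable


def respond_to_outage(
    symptom: str,
    restart: Callable[[], None],
    is_healthy: Callable[[], bool],
    rollback: Callable[[], None],
) -> list[str]:
    """Follow the service-down priority order and return the actions taken."""
    taken = [f"assess: {symptom}"]
    # Quick fix: a restart is cheap when the cause is unclear.
    restart()
    taken.append("restart")
    # Roll back only if the restart did not restore the service.
    if not is_healthy():
        rollback()
        taken.append("rollback")
    # Investigation comes last, after the service is stable.
    taken.append("investigate")
    return taken
```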

### Investigation Order

| Check | Common Issues |
|-------|--------------|
| **Logs** | Errors, exceptions |
| **Resources** | Disk full, memory |
| **Network** | DNS, firewall |
| **Dependencies** | Database, APIs |

---

## 8. Anti-Patterns

| ❌ Don't | ✅ Do |
|----------|-------|
| Deploy on Friday | Deploy early in the week |
| Rush deployment | Follow the process |
| Skip staging | Always test first |
| Deploy without backup | Backup before deploy |
| Walk away after deploy | Monitor for 15+ min |
| Multiple changes at once | One change at a time |

---

## 9. Decision Checklist

Before deploying:

- [ ] **Platform-appropriate procedure?**
- [ ] **Backup strategy ready?**
- [ ] **Rollback plan documented?**
- [ ] **Monitoring configured?**
- [ ] **Team notified?**
- [ ] **Time to monitor after?**

---

## 10. Best Practices

1. **Small, frequent deploys** over big releases
2. **Feature flags** for risky changes
3. **Automate** repetitive steps
4. **Document** every deployment
5. **Review** what went wrong after issues
6. **Test rollback** before you need it

---

> **Remember:** Every deployment is a risk. Minimize risk through preparation, not speed.

## When to Use
Use this skill when you need to apply the workflow or actions described in the overview.

Related Skills

deployment-pipeline-design

31392
from sickn33/antigravity-awesome-skills

Architecture patterns for multi-stage CI/CD pipelines with approval gates and deployment strategies.

DevOps & Infrastructure · Claude

kubernetes-deployment

31355
from sickn33/antigravity-awesome-skills

Kubernetes deployment workflow for container orchestration, Helm charts, service mesh, and production-ready K8s configurations.

DevOps & Infrastructure · Claude

linux-shell-scripting

31392
from sickn33/antigravity-awesome-skills

Provide production-ready shell script templates for common Linux system administration tasks including backups, monitoring, user management, log analysis, and automation. These scripts serve as building blocks for security operations and penetration testing environments.

DevOps & Infrastructure · Claude

iterate-pr

31392
from sickn33/antigravity-awesome-skills

Iterate on a PR until CI passes. Use when you need to fix CI failures, address review feedback, or continuously push fixes until all checks are green. Automates the feedback-fix-push-wait cycle.

DevOps & Infrastructure · Claude

istio-traffic-management

31392
from sickn33/antigravity-awesome-skills

Comprehensive guide to Istio traffic management for production service mesh deployments.

DevOps & Infrastructure · Claude

incident-runbook-templates

31392
from sickn33/antigravity-awesome-skills

Production-ready templates for incident response runbooks covering detection, triage, mitigation, resolution, and communication.

DevOps & Infrastructure · Claude

incident-response-smart-fix

31392
from sickn33/antigravity-awesome-skills

[Extended thinking: This workflow implements a sophisticated debugging and resolution pipeline that leverages AI-assisted debugging tools and observability platforms to systematically diagnose and res

DevOps & Infrastructure · Claude · GitHub Copilot

incident-responder

31392
from sickn33/antigravity-awesome-skills

Expert SRE incident responder specializing in rapid problem resolution, modern observability, and comprehensive incident management.

DevOps & Infrastructure · Claude

expo-cicd-workflows

31392
from sickn33/antigravity-awesome-skills

Helps understand and write EAS workflow YAML files for Expo projects. Use this skill when the user asks about CI/CD or workflows in an Expo or EAS context, mentions .eas/workflows/, or wants help with EAS build pipelines or deployment automation.

DevOps & Infrastructure · Claude

error-diagnostics-error-trace

31392
from sickn33/antigravity-awesome-skills

You are an error tracking and observability expert specializing in implementing comprehensive error monitoring solutions. Set up error tracking systems, configure alerts, implement structured logging,

DevOps & Infrastructure · Claude

error-debugging-error-trace

31392
from sickn33/antigravity-awesome-skills

You are an error tracking and observability expert specializing in implementing comprehensive error monitoring solutions. Set up error tracking systems, configure alerts, implement structured logging, and ensure teams can quickly identify and resolve production issues.

DevOps & Infrastructure · Claude

error-debugging-error-analysis

31392
from sickn33/antigravity-awesome-skills

You are an expert error analysis specialist with deep expertise in debugging distributed systems, analyzing production incidents, and implementing comprehensive observability solutions.

DevOps & Infrastructure · Claude