aks-deployment

Deploying and debugging Toygres on AKS (Azure Kubernetes Service). Use when deploying, debugging pods, viewing logs, troubleshooting SSL, or managing Kubernetes resources.

Best use case

aks-deployment is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Teams using aks-deployment should expect more consistent output, faster repeated execution, and less prompt rewriting.

When to use this skill

  • You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

  • You only need a quick one-off answer and do not need a reusable workflow.
  • You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

curl -o ~/.claude/skills/aks-deployment/SKILL.md --create-dirs "https://raw.githubusercontent.com/diegosouzapw/awesome-omni-skill/main/skills/devops/aks-deployment/SKILL.md"

Manual Installation

  1. Download SKILL.md from GitHub
  2. Place it in .claude/skills/aks-deployment/SKILL.md inside your project
  3. Restart your AI agent — it will auto-discover the skill

How aks-deployment Compares

| Feature / Agent | aks-deployment | Standard Approach |
| --- | --- | --- |
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |

Frequently Asked Questions

Where can I find the source code?

The source code is on GitHub in the diegosouzapw/awesome-omni-skill repository, at skills/devops/aks-deployment/SKILL.md.

SKILL.md Source

# AKS Deployment & Debugging

## Deployment

```bash
# Full deploy with HTTPS
./deploy/deploy-to-aks.sh --https

# Just restart to pick up new images
kubectl rollout restart deployment/toygres-server -n toygres-system
kubectl rollout status deployment/toygres-server -n toygres-system
```

## Viewing Logs

```bash
# Server logs
kubectl logs -n toygres-system -l app.kubernetes.io/component=server -f

# UI logs
kubectl logs -n toygres-system -l app.kubernetes.io/component=ui -f

# Previous crashed pod
kubectl logs -n toygres-system <pod-name> --previous
```

## Pod Management

```bash
# List pods
kubectl get pods -n toygres-system

# Describe pod (see events, errors)
kubectl describe pod <pod-name> -n toygres-system

# Exec into pod
kubectl exec -it <pod-name> -n toygres-system -- /bin/sh

# Delete pod (will restart)
kubectl delete pod <pod-name> -n toygres-system
```

## Common Issues

### Pod CrashLoopBackOff
```bash
# Check logs for crash reason
kubectl logs <pod-name> -n toygres-system --previous

# Common causes:
# - DATABASE_URL not set or wrong
# - Missing secrets
# - Port already in use
```

### Image Not Updating
```bash
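# Restart the rollout. This pulls a new image only when imagePullPolicy is Always
# (the default for :latest tags); otherwise a cached image may be reused.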
kubectl rollout restart deployment/toygres-server -n toygres-system

# Or delete pod directly
kubectl delete pod -n toygres-system -l app.kubernetes.io/component=server
```

### SSL Certificate Issues
```bash
# Check cert-manager
kubectl get certificate -n toygres-system
kubectl describe certificate toygres-tls -n toygres-system

# Check ingress
kubectl get ingress -n toygres-system
kubectl describe ingress toygres-ingress -n toygres-system
```

### Azure Workload Identity / azcopy 403 Errors

If `azcopy login --identity` succeeds but operations fail with 403 AuthorizationPermissionMismatch:

**Root cause:** `azcopy login --identity` authenticates with VM-based managed identity (IMDS), not AKS workload identity.

**Fix:** Use `--login-type=workload` explicitly:
```bash
# Wrong (uses IMDS, fails on AKS)
azcopy login --identity

# Correct (uses federated token)
azcopy login --login-type=workload
```

**Debug workload identity:**
```bash
# Check env vars are injected (add -n <namespace> if the pod is not in the current namespace)
kubectl exec <pod> -- env | grep AZURE_

# Should see:
# AZURE_CLIENT_ID=...
# AZURE_TENANT_ID=...
# AZURE_FEDERATED_TOKEN_FILE=/var/run/secrets/azure/tokens/azure-identity-token

# Test with az cli from inside the pod (uses federated token correctly)
az login --federated-token "$(cat "$AZURE_FEDERATED_TOKEN_FILE")" \
  --service-principal -u "$AZURE_CLIENT_ID" -t "$AZURE_TENANT_ID"
az storage blob list --account-name <acct> --container-name <container> --auth-mode login
```
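
The environment check above can also be scripted so that a missing variable is reported by name. The following is a minimal sketch, not part of the original skill: the function name `check_workload_identity_env` is hypothetical, and it assumes it runs in a shell inside the pod where the workload identity webhook is expected to inject the three variables.

```bash
# Hypothetical helper: report which workload identity variables are missing.
# Returns 0 when all three are set and the token file is readable, 1 otherwise.
check_workload_identity_env() {
  _wi_status=0
  for _wi_name in AZURE_CLIENT_ID AZURE_TENANT_ID AZURE_FEDERATED_TOKEN_FILE; do
    # Indirect lookup of the variable named in $_wi_name (empty if unset)
    eval "_wi_value=\${$_wi_name:-}"
    if [ -z "$_wi_value" ]; then
      echo "missing: $_wi_name"
      _wi_status=1
    fi
  done
  # The token path must point at a readable file for federated login to work
  if [ -n "${AZURE_FEDERATED_TOKEN_FILE:-}" ] && [ ! -r "$AZURE_FEDERATED_TOKEN_FILE" ]; then
    echo "unreadable token file: $AZURE_FEDERATED_TOKEN_FILE"
    _wi_status=1
  fi
  return "$_wi_status"
}
```

If the function prints nothing and returns 0, the injection side is likely fine and the 403 is more plausibly a login-type or role-assignment problem.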

### Azure LoadBalancer DNS Propagation

**Problem:** Instance provisioning fails at test_connection even though service is created.

**Root cause:** Azure DNS propagation for LoadBalancer services takes 60-90+ seconds after IP is assigned.

**Timeline:**
1. LoadBalancer created → IP assigned (10-30s)
2. DNS record created → DNS propagates (30-60+ additional seconds)
3. Total wait time can be 60-90+ seconds

**Fix:** Use a 120s timeout for connection tests, not 60s:

```rust
// In orchestrations
RetryPolicy::new(5)
    .with_timeout(Duration::from_secs(120)) // Not 60s!
```

**Debug DNS propagation:**
```bash
# Check if service has external IP
kubectl get svc -n toygres-managed <svc-name>

# Test DNS resolution
nslookup <dns-label>.westus2.cloudapp.azure.com

# Watch for IP assignment
kubectl get svc -n toygres-managed -w
```
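
When checking propagation by hand, the same "keep trying for up to 120 seconds" idea can be expressed as a small polling loop. This is an illustrative sketch only: `wait_until` is a hypothetical helper name, not something the Toygres scripts are known to provide, and it simply retries any command until it succeeds or the deadline passes.

```bash
# Hypothetical helper: retry a command until it succeeds or a deadline passes.
# Usage: wait_until <timeout_seconds> <interval_seconds> <command> [args...]
wait_until() {
  _wu_timeout=$1
  _wu_interval=$2
  shift 2
  _wu_deadline=$(( $(date +%s) + $_wu_timeout ))
  while :; do
    # Success: stop polling immediately
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    # Deadline reached without success: give up
    if [ "$(date +%s)" -ge "$_wu_deadline" ]; then
      return 1
    fi
    sleep "$_wu_interval"
  done
}
```

For example, `wait_until 120 5 nslookup <dns-label>.westus2.cloudapp.azure.com` would poll every 5 seconds for up to 120 seconds, assuming the local `nslookup` exits non-zero when the name does not resolve.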

## Local Testing Before Deploy

```bash
# Pause AKS server
kubectl scale deployment toygres-server -n toygres-system --replicas=0

# Run locally
./scripts/start-control-plane.sh

# Test at http://localhost:3000

# Resume AKS
kubectl scale deployment toygres-server -n toygres-system --replicas=1
```
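
One risk with the manual sequence above is forgetting to scale the AKS deployment back up after a failed local run. The sketch below wraps the two `kubectl scale` commands around an arbitrary local command so the resume step always runs once that command returns. The function names and the `KUBECTL` override variable are assumptions made for illustration; only the deployment name and namespace come from the commands above.

```bash
# KUBECTL can be overridden (for example KUBECTL=echo) to do a dry run.
KUBECTL="${KUBECTL:-kubectl}"

# Scale the server deployment to the given replica count.
scale_server() {
  $KUBECTL scale deployment toygres-server -n toygres-system --replicas="$1"
}

# Hypothetical wrapper: pause AKS, run the given command, then resume AKS
# whether or not the command succeeded. Returns the command's exit status.
run_with_aks_paused() {
  scale_server 0 || return 1
  "$@"
  _rp_status=$?
  scale_server 1
  return "$_rp_status"
}
```

A call such as `run_with_aks_paused ./scripts/start-control-plane.sh` would then cover the pause, local run, and resume steps in one go. It does not handle an interrupted shell (Ctrl-C kills the wrapper too), so a manual `kubectl scale ... --replicas=1` may still be needed in that case.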

Related Skills

All from diegosouzapw/awesome-omni-skill:

  • arc-terraform-deployment: Deploy ARC (Actions Runner Controller) infrastructure using Terraform on Rackspace Spot. Handles CRD registration, ArgoCD installation, and namespace management. Use when deploying or troubleshooting ARC infrastructure.
  • app-store-deployment: Publishes mobile applications to iOS App Store and Google Play with code signing, versioning, and CI/CD automation. Use when preparing app releases, configuring signing certificates, or setting up automated deployment pipelines.
  • ansible-deployment: Automates server configuration and multi-server deployments. Use when writing Ansible playbooks, setting up SSH auth, or checking deployment diffs.
  • aks-deployment-troubleshooter: Diagnose and fix Kubernetes deployment failures, especially ImagePullBackOff, CrashLoopBackOff, and architecture mismatches. Battle-tested from a 4-hour AKS debugging session with 10+ failure modes resolved.
  • agentuity-cli-cloud-machine-deployments: List deployments running on a specific organization-managed machine. Requires authentication. Use for Agentuity cloud platform operations.
  • agentuity-cli-cloud-deployment-undeploy: Undeploy the latest deployment. Requires authentication. Use for Agentuity cloud platform operations.
  • agentuity-cli-cloud-deployment-show: Show details about a specific deployment. Requires authentication. Use for Agentuity cloud platform operations.
  • agentuity-cli-cloud-deployment-rollback: Roll back the latest deployment to the previous one. Requires authentication. Use for Agentuity cloud platform operations.
  • agentuity-cli-cloud-deployment-remove: Remove a specific deployment. Requires authentication. Use for Agentuity cloud platform operations.
  • agentuity-cli-cloud-deployment-logs: View logs for a specific deployment. Requires authentication. Use for Agentuity cloud platform operations.
  • agentuity-cli-cloud-deployment-list: List deployments. Requires authentication. Use for Agentuity cloud platform operations.
  • agent-deployment-pipeline: Implement CI/CD pipelines for AI agent deployment with evaluation gates. Use for GitHub Actions workflows, GitOps with ArgoCD, container image building, and automated testing. Triggers on "CI/CD", "pipeline", "GitHub Actions", "GitOps", "ArgoCD", "deployment automation", "continuous deployment", or when implementing safe agent release workflows.