aks-deployment
Deploying and debugging Toygres on AKS (Azure Kubernetes Service). Use when deploying, debugging pods, viewing logs, troubleshooting SSL, or managing Kubernetes resources.
Best use case
aks-deployment is best used when you need a repeatable AI agent workflow instead of a one-off prompt.
Teams using aks-deployment should expect more consistent output, faster repeated execution, and less prompt rewriting.
When to use this skill
- You want a reusable workflow that can be run more than once with consistent structure.
When not to use this skill
- You only need a quick one-off answer and do not need a reusable workflow.
- You cannot install or maintain the underlying files, dependencies, or repository context.
Installation
Claude Code / Cursor / Codex
Manual Installation
- Download SKILL.md from GitHub
- Place it in `.claude/skills/aks-deployment/SKILL.md` inside your project
- Restart your AI agent — it will auto-discover the skill
How aks-deployment Compares
| Feature / Agent | aks-deployment | Standard Approach |
|---|---|---|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |
Frequently Asked Questions
Where can I find the source code?
You can find the source code on GitHub using the link provided at the top of the page.
SKILL.md Source
# AKS Deployment & Debugging
## Deployment
```bash
# Full deploy with HTTPS
./deploy/deploy-to-aks.sh --https
# Just restart to pick up new images
kubectl rollout restart deployment/toygres-server -n toygres-system
kubectl rollout status deployment/toygres-server -n toygres-system
```
## Viewing Logs
```bash
# Server logs
kubectl logs -n toygres-system -l app.kubernetes.io/component=server -f
# UI logs
kubectl logs -n toygres-system -l app.kubernetes.io/component=ui -f
# Previous crashed pod
kubectl logs -n toygres-system <pod-name> --previous
```
## Pod Management
```bash
# List pods
kubectl get pods -n toygres-system
# Describe pod (see events, errors)
kubectl describe pod <pod-name> -n toygres-system
# Exec into pod
kubectl exec -it <pod-name> -n toygres-system -- /bin/sh
# Delete pod (will restart)
kubectl delete pod <pod-name> -n toygres-system
```
## Common Issues
### Pod CrashLoopBackOff
```bash
# Check logs for crash reason
kubectl logs <pod-name> -n toygres-system --previous
# Common causes:
# - DATABASE_URL not set or wrong
# - Missing secrets
# - Port already in use
```
### Image Not Updating
```bash
# Force pull latest image (re-pulls only when imagePullPolicy is Always, the default for :latest tags)
kubectl rollout restart deployment/toygres-server -n toygres-system
# Or delete pod directly
kubectl delete pod -n toygres-system -l app.kubernetes.io/component=server
```
### SSL Certificate Issues
```bash
# Check cert-manager
kubectl get certificate -n toygres-system
kubectl describe certificate toygres-tls -n toygres-system
# Check ingress
kubectl get ingress -n toygres-system
kubectl describe ingress toygres-ingress -n toygres-system
```
### Azure Workload Identity / azcopy 403 Errors
If `azcopy login --identity` succeeds but operations fail with 403 AuthorizationPermissionMismatch:
**Root cause:** `azcopy --identity` uses VM-based managed identity (IMDS), not AKS workload identity.
**Fix:** Use `--login-type=workload` explicitly:
```bash
# Wrong (uses IMDS, fails on AKS)
azcopy login --identity
# Correct (uses federated token)
azcopy login --login-type=workload
```
**Debug workload identity:**
```bash
# Check env vars are injected
kubectl exec <pod> -- env | grep AZURE_
# Should see:
# AZURE_CLIENT_ID=...
# AZURE_TENANT_ID=...
# AZURE_FEDERATED_TOKEN_FILE=/var/run/secrets/azure/tokens/azure-identity-token
# Test with az cli from inside the pod (uses federated token correctly)
az login --federated-token "$(cat "$AZURE_FEDERATED_TOKEN_FILE")" \
  --service-principal -u "$AZURE_CLIENT_ID" -t "$AZURE_TENANT_ID"
az storage blob list --account-name <acct> --container-name <container> --auth-mode login
```
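A preflight check for the injected workload identity settings, as an added example that is not part of the original SKILL.md or the Toygres repository. It confirms the three `AZURE_*` variables are present, then decodes the federated token's claims so `iss`, `sub` and `aud` can be compared with the federated credential configured on the identity; the function names are hypothetical, and `api://AzureADTokenExchange` is assumed as the audience (the Azure default).

```shell
#!/bin/sh
# Added example (not from the Toygres repo): workload identity preflight, run inside the pod.
# Prints selected claims only, never the token itself.

# Decode the JWT payload (second dot-separated segment) from base64url to JSON.
jwt_payload() {
  seg=$(cut -d. -f2 < "$1" | tr '_-' '/+' | tr -d '\n')
  # base64url omits padding; restore it so `base64 -d` accepts the input.
  while [ $(( ${#seg} % 4 )) -ne 0 ]; do seg="${seg}="; done
  printf '%s' "$seg" | base64 -d
}

# Extract a string claim (or the first element of an array claim such as aud) without jq.
jwt_claim() {
  jwt_payload "$1" | tr -d '\n' |
    sed -nE "s/.*\"$2\"[[:space:]]*:[[:space:]]*\[?[[:space:]]*\"([^\"]*)\".*/\1/p"
}

# Extract the numeric exp claim (seconds since the epoch).
jwt_exp() {
  jwt_payload "$1" | tr -d '\n' |
    sed -nE 's/.*"exp"[[:space:]]*:[[:space:]]*([0-9]+).*/\1/p'
}

check_workload_identity() {
  rc=0
  for v in AZURE_CLIENT_ID AZURE_TENANT_ID AZURE_FEDERATED_TOKEN_FILE; do
    eval "val=\${$v:-}"
    [ -n "$val" ] || { echo "FAIL: $v is not set in this container"; rc=1; }
  done
  [ "$rc" -eq 0 ] || return 1
  f=$AZURE_FEDERATED_TOKEN_FILE
  [ -r "$f" ] || { echo "FAIL: token file not readable: $f"; return 1; }
  # These three values must match the federated credential on the managed identity.
  echo "iss=$(jwt_claim "$f" iss)"
  echo "sub=$(jwt_claim "$f" sub)"
  aud=$(jwt_claim "$f" aud); echo "aud=$aud"
  # Assumption: default audience for Azure workload identity federation.
  [ "$aud" = "api://AzureADTokenExchange" ] || { echo "FAIL: unexpected audience"; rc=1; }
  exp=$(jwt_exp "$f")
  if [ -z "$exp" ] || [ "$exp" -le "$(date +%s)" ]; then
    echo "FAIL: token expired or has no exp claim"; rc=1
  fi
  return "$rc"
}
```

Paste the functions into the shell opened with `kubectl exec` and run `check_workload_identity`; a non-zero exit status means the pod is not set up for `azcopy login --login-type=workload`.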
### Azure LoadBalancer DNS Propagation
**Problem:** Instance provisioning fails at test_connection even though service is created.
**Root cause:** Azure DNS propagation for LoadBalancer services takes 60-90+ seconds after IP is assigned.
**Timeline:**
1. LoadBalancer created → IP assigned (10-30s)
2. DNS record created → DNS propagates (30-60+ additional seconds)
3. Total wait time can be 60-90+ seconds
**Fix:** Use 120s timeout for connection tests, not 60s:
```rust
// In orchestrations
RetryPolicy::new(5)
    .with_timeout(Duration::from_secs(120)) // Not 60s!
```
**Debug DNS propagation:**
```bash
# Check if service has external IP
kubectl get svc -n toygres-managed <svc-name>
# Test DNS resolution
nslookup <dns-label>.westus2.cloudapp.azure.com
# Watch for IP assignment
kubectl get svc -n toygres-managed -w
```
## Local Testing Before Deploy
```bash
# Pause AKS server
kubectl scale deployment toygres-server -n toygres-system --replicas=0
# Run locally
./scripts/start-control-plane.sh
# Test at http://localhost:3000
# Resume AKS
kubectl scale deployment toygres-server -n toygres-system --replicas=1
```