cloud-native-engineer

The definitive skill for building and deploying high-performance, distributed systems using Cloud Native standards (Dapr, Redis, Microservices). Use when a project requires professional-grade architecture, cross-service communication, elastic scaling, and sub-second agentic latency. Mandatory for flawless deployments on Kubernetes (Local or Cloud).

16 stars

Best use case

cloud-native-engineer is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Teams using cloud-native-engineer should expect more consistent output, faster repeated execution, and less prompt rewriting.

When to use this skill

  • You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

  • You only need a quick one-off answer and do not need a reusable workflow.
  • You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$ curl -o ~/.claude/skills/cloud-native-engineer/SKILL.md --create-dirs "https://raw.githubusercontent.com/diegosouzapw/awesome-omni-skill/main/skills/devops/cloud-native-engineer/SKILL.md"

Manual Installation

  1. Download SKILL.md from GitHub
  2. Place it in .claude/skills/cloud-native-engineer/SKILL.md inside your project
  3. Restart your AI agent — it will auto-discover the skill

How cloud-native-engineer Compares

| Feature / Agent | cloud-native-engineer | Standard Approach |
| :--- | :--- | :--- |
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |

Frequently Asked Questions

What does this skill do?

It guides an AI agent through building and deploying distributed systems with Cloud Native tooling (Dapr, Redis, microservices) on Kubernetes, local or cloud. It covers service architecture, cross-service communication, elastic scaling, and sub-second agentic latency.

Where can I find the source code?

The source code is on GitHub in the diegosouzapw/awesome-omni-skill repository, under skills/devops/cloud-native-engineer.

SKILL.md Source

# Cloud Native Engineer (Master Skill)

This skill transforms Claude into an elite Cloud Native Architect capable of delivering production-ready distributed systems.

## Core Capabilities

1. **Strategic Domain Decomposition**: Logic for splitting any monolith into clear microservice boundaries.
2. **Standardized Dapr Infrastructure**: Reliable, ready-to-use configurations for Pub/Sub, State, and Jobs.
3. **Flawless K8s Orchestration**: Deterministic deployment workflows that avoid DNS, Auth, and Probe failures.
4. **Agentic Performance (Sub-Second)**: Engineering patterns for ultra-fast AI interactions.

## Workflows and Procedures

### 1. Architectural Planning
Analyze service boundaries and define the shared communication backbone.
- **Guidance**: [MICROSERVICES.md](references/microservices.md) (Domain splitting & Payload standards).

### 2. Infrastructure & Containerization
Build elite, secure, and lean images before rolling out the backbone.
- **Docker Best Practices**: [DOCKER_BEST_PRACTICES.md](references/docker_best_practices.md).
- **Backbone Setup**: [DAPR_CONFIG.md](references/dapr_config.md) (Redis-first Pub/Sub and State yaml).
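
As a rough illustration of what a Redis-first backbone might look like, the sketch below builds two Dapr `Component` manifests (Pub/Sub and State) as Python dictionaries and prints them as JSON, which `kubectl apply -f` accepts alongside YAML. The component names `pubsub` and `statestore` and the Redis address `redis:6379` are assumptions for illustration; the actual values belong to DAPR_CONFIG.md.

```python
import json


def redis_component(name: str, component_type: str, redis_host: str) -> dict:
    """Build a Dapr Component manifest backed by Redis."""
    return {
        "apiVersion": "dapr.io/v1alpha1",
        "kind": "Component",
        "metadata": {"name": name},
        "spec": {
            "type": component_type,  # e.g. "pubsub.redis" or "state.redis"
            "version": "v1",
            "metadata": [
                {"name": "redisHost", "value": redis_host},
                # Empty password is only suitable for a local test cluster.
                {"name": "redisPassword", "value": ""},
            ],
        },
    }


# Hypothetical names and host; replace with the values from your own setup.
pubsub = redis_component("pubsub", "pubsub.redis", "redis:6379")
statestore = redis_component("statestore", "state.redis", "redis:6379")

if __name__ == "__main__":
    # Wrap both components in a List so one file can be applied in one step.
    manifest = {"apiVersion": "v1", "kind": "List", "items": [pubsub, statestore]}
    print(json.dumps(manifest, indent=2))
```

Redirecting the output to a file and applying it with `kubectl apply -f` is one way to use it. In a real cluster the password would normally come from a Kubernetes secret rather than a literal value.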

### 3. Fail-Proof Deployment (Mandatory Sequence)
Follow the steps in the exact order given; the sequence is mandatory, and skipping or reordering a step invalidates the guarantees of the guide.
- **Deployment Guide**: [DEPLOYMENT_GUIDE.md](references/deployment_guide.md).
- **Manifest Standards**: [MANIFEST_STANDARDS.md](references/manifest_standards.md).

### 4. Automated Orchestration
Use the "Low Freedom" scripts to automate repetitive tasks.
- **Deploy Minikube**: [scripts/deploy_minikube.ps1](../scripts/deploy_minikube.ps1).
- **Verify Cluster**: [scripts/verify_cluster.ps1](../scripts/verify_cluster.ps1).

### 5. Diagnostic Excellence
Diagnose "Silent Failures" using the troubleshooting matrix.
- **Matrix**: [TROUBLESHOOTING.md](references/troubleshooting.md).

### 6. AI Acceleration (SaaS Specialty)
Implement the persistent MCP pattern for sub-second chatbot responses.
- **Reference**: [LATENCY_OPTIMIZATION.md](references/latency_optimization.md).

## Elite Checklist for Agents
- [ ] **Network-Parity**: Internal calls use K8s service names (`http://service:port`).
- [ ] **Probe-Resilience**: Liveness probes have enough delay for sidecar startup.
- [ ] **Cluster-Auth**: `JWKS_URL` points to the internal ingress/service.
- [ ] **Warm-Start**: AI Tools are pre-initialized in the application lifespan.
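
The Warm-Start item can be sketched with a plain async context manager, assuming the application framework accepts a lifespan function of this shape (FastAPI, for example, takes one through its `lifespan` parameter). The `TOOLS` registry, `load_tools`, and `handle_request` are hypothetical stand-ins, not names from this skill.

```python
import asyncio
from contextlib import asynccontextmanager

# Hypothetical registry of pre-initialized tools; a real app might keep
# an MCP client, a model handle, or a connection pool here.
TOOLS: dict = {}


async def load_tools() -> dict:
    """Stand-in for slow, one-time setup work."""
    await asyncio.sleep(0.01)
    return {"search": object()}


@asynccontextmanager
async def lifespan(app):
    # Runs once at startup, before any request is served.
    TOOLS.update(await load_tools())
    try:
        yield
    finally:
        # Runs once at shutdown.
        TOOLS.clear()


async def handle_request(tool_name: str) -> bool:
    """A request handler only looks tools up; it never initializes them."""
    return tool_name in TOOLS


async def main() -> bool:
    async with lifespan(app=None):
        return await handle_request("search")


warm = asyncio.run(main())
```

The point of the design is that the setup cost is paid once at startup, so the first user request does not wait for tool initialization.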

## AI Toolbox & Standards

To ensure maximum "intelligence" and deterministic outcomes, use the following tools and patterns:

### 1. Diagnostic Tools
| Tool | Description | Parameters | When to Use |
| :--- | :--- | :--- | :--- |
| `kubectl logs` | Retrieve backend/frontend logs. | `-n <namespace> -c <container> --tail=<N>` | For 500 errors or startup failures. |
| `kubectl exec` | Run commands inside a pod. | `-n <namespace> -c <container> -- <command>` | For database checks, file verification. |
| `kubectl describe` | Detailed status of a resource. | `pod <name> -n <namespace>` | For `CrashLoopBackOff` or pending pods. |
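
For scripted diagnostics, the three commands in the table can be assembled as argument lists and passed to `subprocess`. This is a minimal sketch; the pod, namespace, and container names in the example are made up.

```python
import shlex
import subprocess


def logs_cmd(pod: str, namespace: str, container: str, tail: int = 100) -> list:
    """kubectl logs for one container, limited to the last `tail` lines."""
    return ["kubectl", "logs", pod, "-n", namespace, "-c", container, f"--tail={tail}"]


def exec_cmd(pod: str, namespace: str, container: str, command: list) -> list:
    """kubectl exec; everything after `--` runs inside the container."""
    return ["kubectl", "exec", pod, "-n", namespace, "-c", container, "--", *command]


def describe_cmd(pod: str, namespace: str) -> list:
    """kubectl describe for a single pod."""
    return ["kubectl", "describe", "pod", pod, "-n", namespace]


def run(cmd: list) -> str:
    """Run a command and return its stdout; raises if kubectl exits non-zero."""
    return subprocess.run(cmd, check=True, capture_output=True, text=True).stdout


# Hypothetical pod, namespace, and container names.
example = logs_cmd("backend-abc123", "todo", "backend", tail=50)
printable = shlex.join(example)
```

Building argument lists instead of shell strings avoids quoting problems when a command passed to `kubectl exec` contains spaces.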

### 2. Implementation Patterns
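- **Cascade Deletes**: Always set `ondelete="CASCADE"` on foreign keys referencing parent entities (e.g., `task_id` in `TaskTag`). `ondelete` is a `ForeignKey` argument, not a `Column` argument, so declare it with `Field(foreign_key=..., ondelete="CASCADE")` in recent SQLModel releases or with `sa_column=Column(ForeignKey(..., ondelete="CASCADE"))`, rather than through `sa_column_kwargs`.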
- **Path Resilience**: Use absolute paths (`/app/...`) inside containers to avoid "No such file" errors.
- **Dapr-First**: Use `event_publisher` for cross-service events to maintain decoupled architecture.
- **SQLModel Standards**: Use `Session(engine)` context managers for reliable database transactions and commits.
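
The Dapr-First pattern can be illustrated with the sidecar's HTTP publish endpoint (`/v1.0/publish/<pubsub>/<topic>`). The skill's own `event_publisher` is not shown in this document, so the helper below is only a guess at what such a wrapper might do; the component name `pubsub`, the topic `task-events`, and the payload are assumptions.

```python
import json
import os
import urllib.request
from typing import Optional


def publish_url(pubsub_name: str, topic: str, port: Optional[str] = None) -> str:
    """URL of the local Dapr sidecar's publish endpoint."""
    # Dapr exposes its HTTP port to the app via DAPR_HTTP_PORT; 3500 is the default.
    port = port or os.environ.get("DAPR_HTTP_PORT", "3500")
    return f"http://localhost:{port}/v1.0/publish/{pubsub_name}/{topic}"


def build_publish_request(pubsub_name: str, topic: str, payload: dict,
                          port: Optional[str] = None) -> urllib.request.Request:
    """Prepare the POST request without sending it."""
    return urllib.request.Request(
        publish_url(pubsub_name, topic, port),
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def publish(pubsub_name: str, topic: str, payload: dict) -> int:
    """Send the event through the sidecar and return the HTTP status code."""
    request = build_publish_request(pubsub_name, topic, payload)
    with urllib.request.urlopen(request, timeout=5) as response:
        return response.status


# Hypothetical component name, topic, and payload.
request = build_publish_request("pubsub", "task-events", {"task_id": 1}, port="3500")
```

Because the service talks only to its own sidecar on localhost, it never needs to know which services subscribe to the topic, which is the decoupling the pattern is after.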

### 3. Verification Protocol
1. **Logs**: Check backend logs for tracebacks.
2. **Reproduction**: Create a minimal python script inside the pod to isolate DB/Logic errors.
3. **Fix & Verify**: Apply the fix, re-run the reproduction script, and verify via frontend/CLI.
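
A reproduction script in the spirit of step 2 might look like the following. It uses the standard-library `sqlite3` module with an in-memory database as a stand-in for the application's real database and SQLModel models, and it checks one specific behavior: whether deleting a parent row removes its children. The table and column names are assumptions modeled on the `TaskTag` example above.

```python
import sqlite3
import traceback


def reproduce(conn: sqlite3.Connection) -> int:
    """Return how many tag rows survive after their parent task is deleted."""
    # SQLite enforces foreign keys only when this pragma is enabled.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.execute("CREATE TABLE task (id INTEGER PRIMARY KEY, title TEXT)")
    conn.execute(
        "CREATE TABLE tasktag ("
        " id INTEGER PRIMARY KEY,"
        " task_id INTEGER NOT NULL REFERENCES task(id) ON DELETE CASCADE,"
        " tag TEXT)"
    )
    conn.execute("INSERT INTO task (id, title) VALUES (1, 'demo')")
    conn.executemany(
        "INSERT INTO tasktag (task_id, tag) VALUES (?, ?)",
        [(1, "a"), (1, "b")],
    )
    conn.execute("DELETE FROM task WHERE id = 1")
    return conn.execute("SELECT COUNT(*) FROM tasktag").fetchone()[0]


def main() -> int:
    conn = sqlite3.connect(":memory:")
    try:
        remaining = reproduce(conn)
        print(f"orphaned tag rows: {remaining}")
        return remaining
    except Exception:
        # Print the full traceback so the failing statement is visible in pod logs.
        traceback.print_exc()
        raise
    finally:
        conn.close()


remaining = main()
```

Inside a pod, the same structure would connect to the application's actual database instead of `:memory:`. Keeping the script small makes it cheap to re-run after each fix.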

Related Skills

network-engineer

16
from diegosouzapw/awesome-omni-skill

Expert network engineer specializing in modern cloud networking, security architectures, and performance optimization.

multi-cloud-architecture

16
from diegosouzapw/awesome-omni-skill

Design multi-cloud architectures using a decision framework to select and integrate services across AWS, Azure, and GCP. Use when building multi-cloud systems, avoiding vendor lock-in, or leveragin...

mlops-engineer

16
from diegosouzapw/awesome-omni-skill

ML infrastructure automation and production ML lifecycle management. Use when building ML pipelines, setting up experiment tracking, implementing CI/CD for models, or managing model deployments.

jumpcloud-automation

16
from diegosouzapw/awesome-omni-skill

Automate Jumpcloud tasks via Rube MCP (Composio). Always search tools first for current schemas.

iot-engineer

16
from diegosouzapw/awesome-omni-skill

Expert IoT engineer specializing in connected device architectures, edge computing, and IoT platform development. Masters IoT protocols, device management, and data pipelines with focus on building scalable, secure, and reliable IoT solutions.

icims-talent-cloud-automation

16
from diegosouzapw/awesome-omni-skill

Automate Icims Talent Cloud tasks via Rube MCP (Composio). Always search tools first for current schemas.

hybrid-cloud-networking

16
from diegosouzapw/awesome-omni-skill

Configure secure, high-performance connectivity between on-premises infrastructure and cloud platforms using VPN and dedicated connections. Use when building hybrid cloud architectures, connecting ...

hybrid-cloud-architect

16
from diegosouzapw/awesome-omni-skill

Expert hybrid cloud architect specializing in complex multi-cloud solutions across AWS/Azure/GCP and private clouds (OpenStack/VMware).

google-cloud-vision-automation

16
from diegosouzapw/awesome-omni-skill

Automate Google Cloud Vision tasks via Rube MCP (Composio). Always search tools first for current schemas.

gcp-cloud

16
from diegosouzapw/awesome-omni-skill

Google Cloud Platform infrastructure patterns and best practices. Use when designing or implementing GCP solutions including Compute Engine, Cloud Functions, Cloud Storage, and BigQuery.

gcp-cloud-run

16
from diegosouzapw/awesome-omni-skill

Specialized skill for building production-ready serverless applications on GCP. Covers Cloud Run services (containerized), Cloud Run Functions (event-driven), cold start optimization, and event-dri...

faion-cicd-engineer

16
from diegosouzapw/awesome-omni-skill

CI/CD: GitHub Actions, GitLab CI, Jenkins, ArgoCD, GitOps, monitoring.