coreweave-sdk-patterns

Production-ready patterns for CoreWeave GPU workload management with kubectl and Python. Use when building inference clients, managing GPU deployments programmatically, or creating reusable CoreWeave deployment templates. Trigger with phrases like "coreweave patterns", "coreweave client", "coreweave Python", "coreweave deployment template".

25 stars

Best use case

coreweave-sdk-patterns is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Teams using coreweave-sdk-patterns can expect more consistent output, faster repeated execution, and less prompt rewriting.

When to use this skill

  • You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

  • You only need a quick one-off answer and do not need a reusable workflow.
  • You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$ curl -o ~/.claude/skills/coreweave-sdk-patterns/SKILL.md --create-dirs "https://raw.githubusercontent.com/ComeOnOliver/skillshub/main/skills/jeremylongshore/claude-code-plugins-plus-skills/coreweave-sdk-patterns/SKILL.md"

Manual Installation

  1. Download SKILL.md from GitHub
  2. Place it in .claude/skills/coreweave-sdk-patterns/SKILL.md inside your project
  3. Restart your AI agent — it will auto-discover the skill

How coreweave-sdk-patterns Compares

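| Feature / Agent | coreweave-sdk-patterns | Standard Approach |
|-----------------|------------------------|-------------------|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |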

Frequently Asked Questions

What does this skill do?

Production-ready patterns for CoreWeave GPU workload management with kubectl and Python. Use when building inference clients, managing GPU deployments programmatically, or creating reusable CoreWeave deployment templates. Trigger with phrases like "coreweave patterns", "coreweave client", "coreweave Python", "coreweave deployment template".

Where can I find the source code?

You can find the source code on GitHub using the link provided at the top of the page.

SKILL.md Source

# CoreWeave SDK Patterns

## Overview

CoreWeave is Kubernetes-native, so programmatic management goes through kubectl, the Kubernetes Python client, or Helm. These patterns cover GPU-aware deployment templates, inference client wrappers, and node affinity configurations.

## Instructions

### GPU Affinity Helper

```python
# coreweave_helpers.py
from dataclasses import dataclass

@dataclass
class GPUConfig:
    gpu_class: str        # A100_PCIE_80GB, H100_SXM5, L40, etc.
    gpu_count: int = 1
    memory_gb: int = 32
    cpu_cores: int = 4

GPU_CATALOG = {
    "a100-80gb": GPUConfig("A100_PCIE_80GB", memory_gb=48, cpu_cores=8),
    "h100-80gb": GPUConfig("H100_SXM5", memory_gb=64, cpu_cores=12),
    "l40":       GPUConfig("L40", memory_gb=24, cpu_cores=4),
    "a100-8x":   GPUConfig("A100_NVLINK_A100_SXM4_80GB", gpu_count=8, memory_gb=256, cpu_cores=64),
}

def gpu_affinity_block(gpu_class: str) -> dict:
    return {
        "nodeAffinity": {
            "requiredDuringSchedulingIgnoredDuringExecution": {
                "nodeSelectorTerms": [{
                    "matchExpressions": [{
                        "key": "gpu.nvidia.com/class",
                        "operator": "In",
                        "values": [gpu_class],
                    }]
                }]
            }
        }
    }

def gpu_resources(config: GPUConfig) -> dict:
    return {
        "limits": {
            "nvidia.com/gpu": str(config.gpu_count),
            "memory": f"{config.memory_gb}Gi",
            "cpu": str(config.cpu_cores),
        },
        "requests": {
            "nvidia.com/gpu": str(config.gpu_count),
            "memory": f"{config.memory_gb // 2}Gi",
            "cpu": str(config.cpu_cores // 2),
        },
    }
```
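
Looking up an unknown key in `GPU_CATALOG` raises a bare `KeyError`, which is easy to misread in deployment logs. The sketch below shows one possible way to validate a key first and suggest a close match on a typo. `resolve_gpu_key` is a hypothetical helper, not part of the skill, and `KNOWN_GPU_KEYS` is a stand-in for the keys of `GPU_CATALOG` so the block runs on its own.

```python
# Hypothetical helper (not part of the original skill): validate a catalog key
# before using it, and suggest a close match when the key is misspelled.
import difflib

# Stand-in for GPU_CATALOG.keys() so this sketch is self-contained.
KNOWN_GPU_KEYS = ("a100-80gb", "h100-80gb", "l40", "a100-8x")


def resolve_gpu_key(gpu_type: str, known=KNOWN_GPU_KEYS) -> str:
    """Return the normalized key, or raise ValueError naming the valid options."""
    key = gpu_type.strip().lower()
    if key in known:
        return key
    suggestions = difflib.get_close_matches(key, known, n=1)
    hint = f" Did you mean '{suggestions[0]}'?" if suggestions else ""
    raise ValueError(
        f"Unknown GPU type '{gpu_type}'. Valid types: {', '.join(known)}.{hint}"
    )
```

In practice the resolved key would be passed on as `GPU_CATALOG[resolve_gpu_key(gpu_type)]`.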

### Inference Client Wrapper

```python
# inference_client.py
import requests

class CoreWeaveInferenceClient:
    def __init__(self, endpoint: str, timeout: int = 30):
        self.endpoint = endpoint.rstrip("/")
        self.timeout = timeout
        self.session = requests.Session()

    def generate(self, prompt: str, max_tokens: int = 256, **kwargs) -> str:
        resp = self.session.post(
            f"{self.endpoint}/v1/completions",
            json={"prompt": prompt, "max_tokens": max_tokens, **kwargs},
            timeout=self.timeout,
        )
        resp.raise_for_status()
        return resp.json()["choices"][0]["text"]

    def chat(self, messages: list[dict], **kwargs) -> str:
        resp = self.session.post(
            f"{self.endpoint}/v1/chat/completions",
            json={"messages": messages, **kwargs},
            timeout=self.timeout,
        )
        resp.raise_for_status()
        return resp.json()["choices"][0]["message"]["content"]

    def health(self) -> bool:
        try:
            resp = self.session.get(f"{self.endpoint}/health", timeout=5)
            return resp.status_code == 200
        except requests.RequestException:  # connection errors, timeouts, etc.
            return False
```

### Deployment Template Generator

```python
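import yaml

# Helpers defined in coreweave_helpers.py (see "GPU Affinity Helper" above).
from coreweave_helpers import GPU_CATALOG, gpu_affinity_block, gpu_resources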

def generate_inference_deployment(
    name: str,
    image: str,
    gpu_type: str = "a100-80gb",
    replicas: int = 1,
    port: int = 8000,
) -> str:
    config = GPU_CATALOG[gpu_type]
    return yaml.dump({
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name},
        "spec": {
            "replicas": replicas,
            "selector": {"matchLabels": {"app": name}},
            "template": {
                "metadata": {"labels": {"app": name}},
                "spec": {
                    "containers": [{
                        "name": name,
                        "image": image,
                        "ports": [{"containerPort": port}],
                        "resources": gpu_resources(config),
                    }],
                    "affinity": gpu_affinity_block(config.gpu_class),
                },
            },
        },
    }, sort_keys=False)  # keep apiVersion/kind/metadata/spec in written order
```
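
The overview names kubectl as one route for programmatic management. As a minimal sketch under that assumption, a manifest dict can be serialized to JSON (which kubectl accepts alongside YAML) and piped to `kubectl apply -f -`. `build_apply_command`, `apply_manifest`, and the `runner` parameter are hypothetical names introduced here for illustration; the `runner` hook exists only so the call can be exercised without a cluster.

```python
import json
import subprocess


def build_apply_command(namespace=None, dry_run=False):
    """Build the kubectl argv for applying a manifest read from stdin."""
    cmd = ["kubectl", "apply", "-f", "-"]
    if namespace:
        cmd += ["--namespace", namespace]
    if dry_run:
        # Client-side dry run: prints the object that would be sent
        # without sending it to the cluster.
        cmd.append("--dry-run=client")
    return cmd


def apply_manifest(manifest: dict, namespace=None, dry_run=False, runner=subprocess.run):
    """Serialize the manifest to JSON and pipe it to kubectl.

    `runner` defaults to subprocess.run; pass a stub to test without a cluster.
    """
    return runner(
        build_apply_command(namespace, dry_run),
        input=json.dumps(manifest),
        text=True,
        capture_output=True,
        check=True,  # raise CalledProcessError if kubectl exits non-zero
    )
```

Since `generate_inference_deployment` returns a YAML string, one option is to parse it back with `yaml.safe_load` before passing it to `apply_manifest`, starting with `dry_run=True`.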

## Error Handling

| Error | Cause | Solution |
|-------|-------|----------|
| GPU class not found | Typo in node label value | Use the exact values shown by `kubectl get nodes -L gpu.nvidia.com/class` |
| OOM on inference | Model too large for GPU | Use larger GPU or quantized model |
| Connection refused | Service not ready | Check pod readiness probe |
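
For the "Connection refused" row, one option is to poll a health check until the service answers before sending traffic. The sketch below is a generic polling loop with exponential backoff. `wait_until_ready` is a hypothetical helper, and `check` is assumed to be any zero-argument callable returning a bool, such as the `health` method of `CoreWeaveInferenceClient`.

```python
import time


def wait_until_ready(check, attempts=10, delay=1.0, backoff=2.0, max_delay=30.0,
                     sleep=time.sleep):
    """Call `check()` until it returns True or `attempts` is exhausted.

    Waits `delay` seconds after the first failure and multiplies the wait by
    `backoff` after each further failure, capped at `max_delay`.
    Returns True if the check succeeded, False otherwise.
    """
    for attempt in range(attempts):
        if check():
            return True
        if attempt < attempts - 1:  # no need to sleep after the final attempt
            sleep(min(delay, max_delay))
            delay *= backoff
    return False
```

A typical call would be `wait_until_ready(client.health)` before the first `generate` or `chat` request.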

## Resources

- [CoreWeave GPU Instances](https://docs.coreweave.com/docs/platform/instances/gpu-instances)
- [Kubernetes Python Client](https://github.com/kubernetes-client/python)

## Next Steps

Apply these patterns in `coreweave-core-workflow-a` for KServe inference deployments.

Related Skills

exa-sdk-patterns

25
from ComeOnOliver/skillshub

Apply production-ready exa-js SDK patterns with type safety, singletons, and wrappers. Use when implementing Exa integrations, refactoring SDK usage, or establishing team coding standards for Exa. Trigger with phrases like "exa SDK patterns", "exa best practices", "exa code patterns", "idiomatic exa", "exa wrapper".

exa-reliability-patterns

25
from ComeOnOliver/skillshub

Implement Exa reliability patterns: query fallback chains, circuit breakers, and graceful degradation. Use when building fault-tolerant Exa integrations, implementing fallback strategies, or adding resilience to production search services. Trigger with phrases like "exa reliability", "exa circuit breaker", "exa fallback", "exa resilience", "exa graceful degradation".

evernote-sdk-patterns

25
from ComeOnOliver/skillshub

Advanced Evernote SDK patterns and best practices. Use when implementing complex note operations, batch processing, search queries, or optimizing SDK usage. Trigger with phrases like "evernote sdk patterns", "evernote best practices", "evernote advanced", "evernote batch operations".

elevenlabs-sdk-patterns

25
from ComeOnOliver/skillshub

Apply production-ready ElevenLabs SDK patterns for TypeScript and Python. Use when implementing ElevenLabs integrations, refactoring SDK usage, or establishing team coding standards for audio AI applications. Trigger: "elevenlabs SDK patterns", "elevenlabs best practices", "elevenlabs code patterns", "idiomatic elevenlabs", "elevenlabs typescript".

documenso-sdk-patterns

25
from ComeOnOliver/skillshub

Apply production-ready Documenso SDK patterns for TypeScript and Python. Use when implementing Documenso integrations, refactoring SDK usage, or establishing team coding standards for Documenso. Trigger with phrases like "documenso SDK patterns", "documenso best practices", "documenso code patterns", "idiomatic documenso".

deepgram-sdk-patterns

25
from ComeOnOliver/skillshub

Apply production-ready Deepgram SDK patterns for TypeScript and Python. Use when implementing Deepgram integrations, refactoring SDK usage, or establishing team coding standards for Deepgram. Trigger: "deepgram SDK patterns", "deepgram best practices", "deepgram code patterns", "idiomatic deepgram", "deepgram typescript".

databricks-sdk-patterns

25
from ComeOnOliver/skillshub

Apply production-ready Databricks SDK patterns for Python and REST API. Use when implementing Databricks integrations, refactoring SDK usage, or establishing team coding standards for Databricks. Trigger with phrases like "databricks SDK patterns", "databricks best practices", "databricks code patterns", "idiomatic databricks".

customerio-sdk-patterns

25
from ComeOnOliver/skillshub

Apply production-ready Customer.io SDK patterns. Use when implementing typed clients, retry logic, event batching, or singleton management for customerio-node. Trigger: "customer.io best practices", "customer.io patterns", "production customer.io", "customer.io architecture", "customer.io singleton".

customerio-reliability-patterns

25
from ComeOnOliver/skillshub

Implement Customer.io reliability and fault-tolerance patterns. Use when building circuit breakers, fallback queues, idempotency, or graceful degradation for Customer.io integrations. Trigger: "customer.io reliability", "customer.io resilience", "customer.io circuit breaker", "customer.io fault tolerance".

coreweave-webhooks-events

25
from ComeOnOliver/skillshub

Monitor CoreWeave cluster events and GPU workload status. Use when tracking pod lifecycle events, monitoring GPU utilization, or alerting on inference service health changes. Trigger with phrases like "coreweave events", "coreweave monitoring", "coreweave pod alerts", "coreweave gpu monitoring".

coreweave-upgrade-migration

25
from ComeOnOliver/skillshub

Upgrade CoreWeave deployments and migrate between GPU types. Use when migrating from A100 to H100, upgrading CUDA versions, or updating inference server versions. Trigger with phrases like "upgrade coreweave", "coreweave gpu migration", "coreweave cuda upgrade", "migrate coreweave".

coreweave-security-basics

25
from ComeOnOliver/skillshub

Secure CoreWeave deployments with RBAC, network policies, and secrets management. Use when hardening GPU workloads, managing model access, or configuring namespace isolation. Trigger with phrases like "coreweave security", "coreweave rbac", "secure coreweave", "coreweave secrets".