flux-troubleshooting
Use when Flux resources show Ready False, reconciliation errors appear in logs, deployments fail to sync from Git, HelmRelease installations fail, source artifacts are not being fetched, or image automation is not updating tags
Best use case
flux-troubleshooting is best used when you need a repeatable AI agent workflow instead of a one-off prompt.
Use when Flux resources show Ready False, reconciliation errors appear in logs, deployments fail to sync from Git, HelmRelease installations fail, source artifacts are not being fetched, or image automation is not updating tags
Teams using flux-troubleshooting should expect a more consistent output, faster repeated execution, less prompt rewriting.
When to use this skill
- You want a reusable workflow that can be run more than once with consistent structure.
When not to use this skill
- You only need a quick one-off answer and do not need a reusable workflow.
- You cannot install or maintain the underlying files, dependencies, or repository context.
Installation
Claude Code / Cursor / Codex
Manual Installation
- Download SKILL.md from GitHub
- Place it in
.claude/skills/flux-troubleshooting/SKILL.mdinside your project - Restart your AI agent — it will auto-discover the skill
How flux-troubleshooting Compares
| Feature / Agent | flux-troubleshooting | Standard Approach |
|---|---|---|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |
Frequently Asked Questions
What does this skill do?
Use when Flux resources show Ready False, reconciliation errors appear in logs, deployments fail to sync from Git, HelmRelease installations fail, source artifacts are not being fetched, or image automation is not updating tags
Where can I find the source code?
You can find the source code on GitHub using the link provided at the top of the page.
SKILL.md Source
# Flux CD Troubleshooting
Diagnose Flux CD reconciliation failures and errors in GitOps environments.
## Keywords
flux, fluxcd, troubleshooting, debug, error, failed, failure, reconciliation, kustomization, helmrelease, source, gitrepository, ocirepository, artifact, health check, diagnose, resolve, fix
## When to Use This Skill
- Flux resources show `Ready: False` status
- Reconciliation errors appear in logs
- Deployments are not syncing from Git
- HelmRelease installations are failing
- Source artifacts are not being fetched
- Image automation is not updating tags
## Related Skills
- [flux-operations](../flux-operations) - Suspend/resume/reconcile operations
- [flux-gitops-patterns](../flux-gitops-patterns) - Architecture and patterns
- [k8s-platform-operations](../k8s-platform-operations) - Incident response procedures
## Quick Reference
| Task | Command |
|------|---------|
| Check Flux health | `flux check` |
| View all errors | `flux logs -A --level=error` |
| Get all status | `flux get all -A` |
| Warning events | `kubectl get events -n flux-system --field-selector type=Warning` |
## Diagnostic Workflow
```
Is the resource Ready?
├─ Yes → Check if correct version/revision deployed
└─ No → Is the source Ready?
├─ Yes → Check Kustomization/HelmRelease logs
│ ├─ dry-run failed → Fix YAML syntax in Git
│ ├─ health check timeout → Check pod logs/resources
│ └─ dependency not ready → Fix parent first
└─ No → Check source credentials/connectivity
├─ authentication failed → Update deploy key/secret
├─ checkout failed → Verify branch/tag exists
└─ artifact not found → Check repository URL
```
## Diagnostic Commands
### 1. Check Controller Health
```bash
flux check
```
### 2. View Error Logs
```bash
flux logs --all-namespaces --level=error
```
### 3. Get Warning Events
```bash
kubectl get events -n flux-system --field-selector type=Warning
```
### 4. Inspect Resource Status
```bash
flux get kustomizations -A
flux get helmreleases -A
flux get sources all -A
```
### 5. Controller-Specific Logs
```bash
kubectl logs -n flux-system deploy/source-controller
kubectl logs -n flux-system deploy/kustomize-controller
kubectl logs -n flux-system deploy/helm-controller
kubectl logs -n flux-system deploy/notification-controller
kubectl logs -n flux-system deploy/image-reflector-controller
kubectl logs -n flux-system deploy/image-automation-controller
```
## Error Pattern Reference
| Error Pattern | Cause | GitOps Solution |
|---------------|-------|-----------------|
| `failed to checkout` | Git authentication or network | Verify deploy keys/credentials |
| `dry-run failed: Invalid` | Invalid manifest YAML | Fix syntax in Git repository |
| `health check timeout` | Pods not becoming ready | Check resource limits, images |
| `dependency not ready` | Parent Kustomization failed | Fix upstream dependency first |
| `artifact not found` | Source not synced | Check source status, reconcile |
| `Unsupported value` | Invalid field value | Correct the value in Git |
| `UNAUTHORIZED` | Registry auth failed | Check imagePullSecrets |
| `MANIFEST_UNKNOWN` | OCI tag doesn't exist | Verify tag in registry |
## OCI Repository Troubleshooting
### Common OCI Issues
```bash
flux get sources oci -A
kubectl logs -n flux-system deploy/source-controller | grep -i oci
kubectl get secret -n flux-system flux-system -o jsonpath='{.data.\.dockerconfigjson}' | base64 -d
```
### OCI Error Patterns
| Error | Cause | Diagnostic |
|-------|-------|------------|
| `UNAUTHORIZED` | Invalid credentials | Check docker-registry secret exists and is not expired; recommend updating secret |
| `MANIFEST_UNKNOWN` | Tag not found | Verify tag exists in registry |
| `DENIED` | Permission denied | Check registry permissions for the service account |
| `NAME_UNKNOWN` | Repository not found | Verify repository path in OCIRepository spec |
## Image Automation Troubleshooting
### Check Image Policies
```bash
flux get images policy -A
flux get images repository -A
flux get images update -A
```
### Image Automation Errors
| Error | Cause | Diagnostic |
|-------|-------|------------|
| `no new images found` | Filter too restrictive | Check semver/regex pattern against available tags; recommend adjusting filter |
| `access denied` | Registry auth | Check image pull secrets configuration |
| `failed to push` | Git write access | Check if deploy key has write permission |
## Webhook/Notification Debugging
```bash
kubectl logs -n flux-system deploy/notification-controller
flux get alerts -A
flux get alert-providers -A
kubectl logs -n flux-system deploy/notification-controller | grep -i receiver
```
## Structured Log Analysis
```json
{
"level": "error",
"ts": "2024-01-15T09:36:41.286Z",
"controllerGroup": "kustomize.toolkit.fluxcd.io",
"controllerKind": "Kustomization",
"name": "resource-name",
"namespace": "namespace",
"msg": "Reconciliation failed after 2s, next try in 5m0s",
"revision": "main@sha1:abc123",
"error": "specific error message"
}
```
## GitOps Principles for Troubleshooting
1. **Never fix directly in cluster** - Identify root cause, fix in Git
2. **Suspend before debugging** - Prevent Flux from reverting test changes
3. **Check the full dependency chain** - Infrastructure before apps
4. **Verify source availability** - Git repos and registries must be accessible
5. **Review recent commits** - Issues often correlate with recent changes
## MCP Tools Available
When the Flux MCP server is connected:
- `mcp__flux-operator-mcp__get_flux_instance` - Get Flux installation status
- `mcp__flux-operator-mcp__get_kubernetes_logs` - Get pod logs
- `mcp__flux-operator-mcp__get_kubernetes_resources` - Query resources
- `mcp__flux-operator-mcp__search_flux_docs` - Search Flux documentationRelated Skills
linux-troubleshooting
Linux system troubleshooting workflow for diagnosing and resolving system issues, performance problems, and service failures.
kubernetes-troubleshooting
Debug Kubernetes pods, services, networking, and scaling issues. Use this skill when troubleshooting K8s deployments, investigating pod failures, or diagnosing cluster problems.
fluxcd
GitOps toolkit with Flux CD for Kubernetes continuous delivery. Use when implementing GitOps workflows, Helm releases, Kustomize deployments, or image automation. Triggers: fluxcd, flux, gitops, gitrepository, kustomization, helmrelease, image automation, source controller.
arc-runner-troubleshooting
Troubleshoot ARC (Actions Runner Controller) runners on Rackspace Spot Kubernetes. Diagnose stuck jobs, scaling issues, and cluster access. Activates on "runner", "ARC", "stuck job", "queued", "GitHub Actions", or "CI stuck".
troubleshooting
Common ComfyUI errors and fixes — OOM, missing nodes, dtype mismatches, black images, and debugging strategies
terway-troubleshooting
Troubleshoot Terway CNI issues in Kubernetes using Kubernetes events and Terway logs. Use when diagnosing "cni plugin not initialized", Pod create/delete failures, or ENI/IPAM problems in Terway (centralized or non-centralized IPAM).
assertion-troubleshooting
Phylax Credible Layer assertions troubleshooting. Diagnoses common assertion failures and non-triggering issues. Use when phylax/credible layer assertions fail unexpectedly or do not execute.
influxdb-cloud-automation
Automate Influxdb Cloud tasks via Rube MCP (Composio). Always search tools first for current schemas.
fluxguard-automation
Automate Fluxguard tasks via Rube MCP (Composio). Always search tools first for current schemas.
flux-txt2img
Build Flux txt2img workflows — Flux.1 Dev (SRPO), Flux 2 Klein 9B, Turbo LoRAs, FluxGuidance, and DualCLIPLoader patterns
bgo
Automates the complete Blender build-go workflow, from building and packaging your extension/add-on to removing old versions, installing, enabling, and launching Blender for quick testing and iteration.
terraform-engineer
Use when implementing infrastructure as code with Terraform across AWS, Azure, or GCP. Invoke for module development, state management, provider configuration, multi-environment workflows, infrastructure testing.