linux-troubleshooting
Linux system troubleshooting workflow for diagnosing and resolving system issues, performance problems, and service failures.
Best use case
linux-troubleshooting is best used when you need a repeatable AI agent workflow instead of a one-off prompt.
Linux system troubleshooting workflow for diagnosing and resolving system issues, performance problems, and service failures.
Teams using linux-troubleshooting should expect a more consistent output, faster repeated execution, less prompt rewriting.
When to use this skill
- You want a reusable workflow that can be run more than once with consistent structure.
When not to use this skill
- You only need a quick one-off answer and do not need a reusable workflow.
- You cannot install or maintain the underlying files, dependencies, or repository context.
Installation
Claude Code / Cursor / Codex
Manual Installation
- Download SKILL.md from GitHub
- Place it in
.claude/skills/linux-troubleshooting/SKILL.mdinside your project - Restart your AI agent — it will auto-discover the skill
How linux-troubleshooting Compares
| Feature / Agent | linux-troubleshooting | Standard Approach |
|---|---|---|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |
Frequently Asked Questions
What does this skill do?
Linux system troubleshooting workflow for diagnosing and resolving system issues, performance problems, and service failures.
Where can I find the source code?
You can find the source code on GitHub using the link provided at the top of the page.
SKILL.md Source
# Linux Troubleshooting Workflow ## Overview Specialized workflow for diagnosing and resolving Linux system issues including performance problems, service failures, network issues, and resource constraints. ## When to Use This Workflow Use this workflow when: - Diagnosing system performance issues - Troubleshooting service failures - Investigating network problems - Resolving disk space issues - Debugging application errors ## Workflow Phases ### Phase 1: Initial Assessment #### Skills to Invoke - `bash-linux` - Linux commands - `devops-troubleshooter` - Troubleshooting #### Actions 1. Check system uptime 2. Review recent changes 3. Identify symptoms 4. Gather error messages 5. Document findings #### Commands ```bash uptime hostnamectl cat /etc/os-release dmesg | tail -50 ``` #### Copy-Paste Prompts ``` Use @bash-linux to gather system information ``` ### Phase 2: Resource Analysis #### Skills to Invoke - `bash-linux` - Resource commands - `performance-engineer` - Performance analysis #### Actions 1. Check CPU usage 2. Analyze memory 3. Review disk space 4. Monitor I/O 5. Check network #### Commands ```bash top -bn1 | head -20 free -h df -h iostat -x 1 5 ``` #### Copy-Paste Prompts ``` Use @performance-engineer to analyze system resources ``` ### Phase 3: Process Investigation #### Skills to Invoke - `bash-linux` - Process commands - `server-management` - Process management #### Actions 1. List running processes 2. Identify resource hogs 3. Check process status 4. Review process trees 5. Analyze strace output #### Commands ```bash ps aux --sort=-%cpu | head -10 pstree -p lsof -p PID strace -p PID ``` #### Copy-Paste Prompts ``` Use @server-management to investigate processes ``` ### Phase 4: Log Analysis #### Skills to Invoke - `bash-linux` - Log commands - `error-detective` - Error detection #### Actions 1. Check system logs 2. Review application logs 3. Search for errors 4. Analyze log patterns 5. Correlate events #### Commands ```bash journalctl -xe tail -f /var/log/syslog grep -i error /var/log/* ``` #### Copy-Paste Prompts ``` Use @error-detective to analyze log files ``` ### Phase 5: Network Diagnostics #### Skills to Invoke - `bash-linux` - Network commands - `network-engineer` - Network troubleshooting #### Actions 1. Check network interfaces 2. Test connectivity 3. Analyze connections 4. Review firewall rules 5. Check DNS resolution #### Commands ```bash ip addr show ss -tulpn curl -v http://target dig domain ``` #### Copy-Paste Prompts ``` Use @network-engineer to diagnose network issues ``` ### Phase 6: Service Troubleshooting #### Skills to Invoke - `server-management` - Service management - `systematic-debugging` - Debugging #### Actions 1. Check service status 2. Review service logs 3. Test service restart 4. Verify dependencies 5. Check configuration #### Commands ```bash systemctl status service journalctl -u service -f systemctl restart service ``` #### Copy-Paste Prompts ``` Use @systematic-debugging to troubleshoot service issues ``` ### Phase 7: Resolution #### Skills to Invoke - `incident-responder` - Incident response - `bash-pro` - Fix implementation #### Actions 1. Implement fix 2. Verify resolution 3. Monitor stability 4. Document solution 5. Create prevention plan #### Copy-Paste Prompts ``` Use @incident-responder to implement resolution ``` ## Troubleshooting Checklist - [ ] System information gathered - [ ] Resources analyzed - [ ] Logs reviewed - [ ] Network tested - [ ] Services verified - [ ] Issue resolved - [ ] Documentation created ## Quality Gates - [ ] Root cause identified - [ ] Fix verified - [ ] Monitoring in place - [ ] Documentation complete ## Related Workflow Bundles - `os-scripting` - OS scripting - `bash-scripting` - Bash scripting - `cloud-devops` - DevOps
Related Skills
kubernetes-troubleshooting
Debug Kubernetes pods, services, networking, and scaling issues. Use this skill when troubleshooting K8s deployments, investigating pod failures, or diagnosing cluster problems.
flux-troubleshooting
Use when Flux resources show Ready False, reconciliation errors appear in logs, deployments fail to sync from Git, HelmRelease installations fail, source artifacts are not being fetched, or image automation is not updating tags
arc-runner-troubleshooting
Troubleshoot ARC (Actions Runner Controller) runners on Rackspace Spot Kubernetes. Diagnose stuck jobs, scaling issues, and cluster access. Activates on "runner", "ARC", "stuck job", "queued", "GitHub Actions", or "CI stuck".
administering-linux
Manage Linux systems covering systemd services, process management, filesystems, networking, performance tuning, and troubleshooting. Use when deploying applications, optimizing server performance, diagnosing production issues, or managing users and security on Linux servers.
troubleshooting
Common ComfyUI errors and fixes — OOM, missing nodes, dtype mismatches, black images, and debugging strategies
terway-troubleshooting
Troubleshoot Terway CNI issues in Kubernetes using Kubernetes events and Terway logs. Use when diagnosing "cni plugin not initialized", Pod create/delete failures, or ENI/IPAM problems in Terway (centralized or non-centralized IPAM).
assertion-troubleshooting
Phylax Credible Layer assertions troubleshooting. Diagnoses common assertion failures and non-triggering issues. Use when phylax/credible layer assertions fail unexpectedly or do not execute.
linux-shell-scripting
This skill should be used when the user asks to "create bash scripts", "automate Linux tasks", "monitor system resources", "backup files", "manage users", or "write production she...
bash-linux
Bash/Linux terminal patterns. Critical commands, piping, error handling, scripting. Use when working on macOS or Linux systems.
bgo
Automates the complete Blender build-go workflow, from building and packaging your extension/add-on to removing old versions, installing, enabling, and launching Blender for quick testing and iteration.
terraform-engineer
Use when implementing infrastructure as code with Terraform across AWS, Azure, or GCP. Invoke for module development, state management, provider configuration, multi-environment workflows, infrastructure testing.
terraform-diagrams
Generates architecture diagrams from Terraform code. Use when user has .tf files or asks to visualize Terraform infrastructure.