linux-troubleshooting

Linux system troubleshooting workflow for diagnosing and resolving system issues, performance problems, and service failures.

16 stars

Best use case

linux-troubleshooting is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Teams using linux-troubleshooting can expect more consistent output, faster repeated execution, and less prompt rewriting.

When to use this skill

  • You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

  • You only need a quick one-off answer and do not need a reusable workflow.
  • You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$ curl -o ~/.claude/skills/linux-troubleshooting/SKILL.md --create-dirs "https://raw.githubusercontent.com/diegosouzapw/awesome-omni-skill/main/skills/devops/linux-troubleshooting/SKILL.md"

Manual Installation

  1. Download SKILL.md from GitHub
  2. Place it in .claude/skills/linux-troubleshooting/SKILL.md inside your project
  3. Restart your AI agent — it will auto-discover the skill

How linux-troubleshooting Compares

| Feature / Agent | linux-troubleshooting | Standard Approach |
|---|---|---|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |

Frequently Asked Questions

What does this skill do?

Linux system troubleshooting workflow for diagnosing and resolving system issues, performance problems, and service failures.

Where can I find the source code?

You can find the source code on GitHub using the link provided at the top of the page.

SKILL.md Source

# Linux Troubleshooting Workflow

## Overview

Specialized workflow for diagnosing and resolving Linux system issues including performance problems, service failures, network issues, and resource constraints.

## When to Use This Workflow

Use this workflow when:
- Diagnosing system performance issues
- Troubleshooting service failures
- Investigating network problems
- Resolving disk space issues
- Debugging application errors

## Workflow Phases

### Phase 1: Initial Assessment

#### Skills to Invoke
- `bash-linux` - Linux commands
- `devops-troubleshooter` - Troubleshooting

#### Actions
1. Check system uptime
2. Review recent changes
3. Identify symptoms
4. Gather error messages
5. Document findings

#### Commands
```bash
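uptime                   # time since boot and load averages
hostnamectl              # hostname, kernel, and virtualization details
cat /etc/os-release      # distribution name and version
dmesg -T | tail -n 50    # latest kernel messages, readable timestamps (may need root)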
```
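
The commands above can be bundled into a small snapshot helper so that the findings from step 5 end up in one file. This is a minimal sketch, assuming a bash-compatible shell; the function name `collect_snapshot` and the report path in the usage line are hypothetical, and a tool that is not installed is noted in the report instead of aborting the run.

```bash
# Hypothetical helper: run each command passed as an argument and append
# its labelled output (last 50 lines) to the report file given first.
collect_snapshot() {
  local out="$1"; shift
  local cmd
  : > "$out"                                  # start with an empty report
  for cmd in "$@"; do
    printf '===== %s =====\n' "$cmd" >> "$out"
    # ${cmd%% *} is the first word, i.e. the program name.
    if command -v "${cmd%% *}" >/dev/null 2>&1; then
      # Left unquoted on purpose so "cat /etc/os-release" splits into words.
      $cmd 2>&1 | tail -n 50 >> "$out"
    else
      printf '(command not found)\n' >> "$out"
    fi
  done
}
```

A possible invocation is `collect_snapshot /tmp/assessment.txt uptime hostnamectl "cat /etc/os-release" "dmesg -T"`. Each entry is a simple command without pipes or quoting, which is all this sketch supports.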

#### Copy-Paste Prompts
```
Use @bash-linux to gather system information
```

### Phase 2: Resource Analysis

#### Skills to Invoke
- `bash-linux` - Resource commands
- `performance-engineer` - Performance analysis

#### Actions
1. Check CPU usage
2. Analyze memory
3. Review disk space
4. Monitor I/O
5. Check network

#### Commands
```bash
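top -bn1 | head -n 20    # one batch-mode snapshot of load and top processes
free -h                  # memory and swap usage
df -h                    # disk space per filesystem
iostat -x 1 5            # extended I/O stats, 5 samples at 1 s (needs the sysstat package)
ip -s link               # per-interface packet, error, and drop counters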
```
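
For the disk space step, the `df` output can be filtered so that only filesystems above a threshold are reported. This is a minimal sketch, assuming POSIX-format output from `df -P` and mount points without spaces; the function name `flag_full_filesystems` and the default threshold of 90 percent are hypothetical choices.

```bash
# Hypothetical helper: read `df -P` output on stdin and print each mount
# point whose usage is at or above the threshold percentage (default 90).
flag_full_filesystems() {
  local threshold="${1:-90}"
  awk -v limit="$threshold" '
    NR > 1 {                      # skip the header row
      use = $5
      sub(/%/, "", use)           # turn "93%" into "93"
      if (use + 0 >= limit + 0) print $6 " " use "%"
    }'
}
```

It would typically be used as `df -P | flag_full_filesystems 85`. Empty output means no filesystem reached the threshold.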

#### Copy-Paste Prompts
```
Use @performance-engineer to analyze system resources
```

### Phase 3: Process Investigation

#### Skills to Invoke
- `bash-linux` - Process commands
- `server-management` - Process management

#### Actions
1. List running processes
2. Identify resource hogs
3. Check process status
4. Review process trees
5. Analyze strace output

#### Commands
```bash
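ps aux --sort=-%cpu | head -n 10   # header plus the top CPU consumers
pstree -p                          # process tree with PIDs
lsof -p PID                        # open files for one process; replace PID
strace -p PID                      # attach and trace system calls; replace PID (often needs root)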
```
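
`lsof -p` and `strace -p` need a real PID in place of the `PID` placeholder. The sketch below shows one way to pick the busiest process and confirm it still exists before attaching to it. The function names `top_pid` and `pid_exists` are hypothetical, and the parsing assumes the PID is in the first column of the `ps` output.

```bash
# Hypothetical helpers for choosing and validating a PID before inspecting it.

# Read `ps -eo pid,pcpu,comm --sort=-pcpu` style output on stdin and print
# the PID from the first data row (the row after the header).
top_pid() {
  awk 'NR == 2 { print $1; exit }'
}

# Succeed only if the argument is numeric and refers to a live process.
pid_exists() {
  case "$1" in
    ''|*[!0-9]*) return 1 ;;        # reject empty or non-numeric input
  esac
  # kill -0 sends no signal; it fails for processes owned by other users,
  # so fall back to checking the /proc entry.
  kill -0 "$1" 2>/dev/null || [ -d "/proc/$1" ]
}
```

A typical use would be `pid="$(ps -eo pid,pcpu,comm --sort=-pcpu | top_pid)"` followed by `pid_exists "$pid" && lsof -p "$pid"`.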

#### Copy-Paste Prompts
```
Use @server-management to investigate processes
```

### Phase 4: Log Analysis

#### Skills to Invoke
- `bash-linux` - Log commands
- `error-detective` - Error detection

#### Actions
1. Check system logs
2. Review application logs
3. Search for errors
4. Analyze log patterns
5. Correlate events

#### Commands
```bash
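journalctl -xe                                      # latest journal entries with explanatory text
tail -f /var/log/syslog                             # Debian/Ubuntu; RHEL-based systems use /var/log/messages
grep -rIi error /var/log/ 2>/dev/null | tail -n 50  # recurse, skip binary files, hide permission errors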
```
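
For the "analyze log patterns" step, it often helps to collapse near-identical messages and count them. This is a minimal sketch under a simple assumption: replacing every run of digits with `N` is enough to make timestamps, PIDs, and durations compare equal. The function name `summarize_errors` is hypothetical.

```bash
# Hypothetical helper: read log lines on stdin, keep those mentioning
# "error" (case-insensitive), normalise digits, and print the most
# frequent messages first, each prefixed with its count.
summarize_errors() {
  local top="${1:-10}"
  grep -i 'error' \
    | sed -E 's/[0-9]+/N/g' \
    | sort | uniq -c | sort -rn \
    | head -n "$top"
}
```

For example, `journalctl -p err -b --no-pager | summarize_errors 5` would list the five most common error shapes since boot. Digit normalisation is crude, so distinct messages that differ only in numbers are merged.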

#### Copy-Paste Prompts
```
Use @error-detective to analyze log files
```

### Phase 5: Network Diagnostics

#### Skills to Invoke
- `bash-linux` - Network commands
- `network-engineer` - Network troubleshooting

#### Actions
1. Check network interfaces
2. Test connectivity
3. Analyze connections
4. Review firewall rules
5. Check DNS resolution

#### Commands
```bash
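ip addr show             # interfaces and their addresses
ss -tulpn                # listening TCP/UDP sockets with owning processes
curl -v http://target    # HTTP connectivity; replace target with the host to test
dig domain               # DNS resolution; replace domain with the name to look up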
```
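
When analyzing connections, a common question is whether anything is listening on a given port. The sketch below answers it from `ss` output. It assumes the column layout printed by `ss -tuln`, where the Netid column is present and the local address is the fifth field; the function name `port_is_listening` is hypothetical.

```bash
# Hypothetical helper: read `ss -tuln` output on stdin and succeed if any
# socket is bound to the given local port.
port_is_listening() {
  local port="$1"
  awk -v p="$port" '
    NR > 1 {                             # skip the header row
      n = split($5, parts, ":")          # field 5 is Local Address:Port
      if (parts[n] == p) found = 1       # port is the text after the last colon
    }
    END { exit (found ? 0 : 1) }'
}
```

Usage could look like `ss -tuln | port_is_listening 443 && echo "port 443 is bound"`. Taking the text after the last colon also handles IPv6 entries such as `[::]:8080`.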

#### Copy-Paste Prompts
```
Use @network-engineer to diagnose network issues
```

### Phase 6: Service Troubleshooting

#### Skills to Invoke
- `server-management` - Service management
- `systematic-debugging` - Debugging

#### Actions
1. Check service status
2. Review service logs
3. Test service restart
4. Verify dependencies
5. Check configuration

#### Commands
```bash
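systemctl status service              # replace service with the unit name
journalctl -u service -f              # follow the unit's logs
systemctl list-dependencies service   # units this service requires or wants
systemctl restart service             # restart once the cause is understood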
```
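
The restart test is more useful when it is followed by a check that the unit actually came back. This is a minimal sketch, assuming a systemd host; the function name `restart_and_verify` and the `SETTLE_SECONDS` variable are hypothetical.

```bash
# Hypothetical helper: restart a unit, wait briefly, and confirm it reports
# active. On any failure, print the unit's recent journal entries to stderr.
restart_and_verify() {
  local unit="$1"
  if systemctl restart "$unit" \
     && sleep "${SETTLE_SECONDS:-2}" \
     && systemctl is-active --quiet "$unit"; then
    echo "$unit is active"
    return 0
  fi
  echo "$unit did not come up cleanly; recent log entries:" >&2
  journalctl -u "$unit" -n 20 --no-pager >&2
  return 1
}
```

It would be called as `restart_and_verify service`, with `service` replaced by the unit name, usually as root. A unit that starts and then crashes after the settle delay would still pass, so this complements the monitoring in Phase 7 and does not replace it.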

#### Copy-Paste Prompts
```
Use @systematic-debugging to troubleshoot service issues
```

### Phase 7: Resolution

#### Skills to Invoke
- `incident-responder` - Incident response
- `bash-pro` - Fix implementation

#### Actions
1. Implement fix
2. Verify resolution
3. Monitor stability
4. Document solution
5. Create prevention plan
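
This phase lists no commands, so the "verify resolution" and "monitor stability" steps can be approximated by repeating a health check over a period of time. This is a minimal sketch; the function name `monitor_stability` is hypothetical, and the check command is whatever signals health for the system in question.

```bash
# Hypothetical helper: run a check command repeatedly and stop at the first
# failure. Arguments: number of attempts, seconds between attempts, then
# the check command and its arguments.
monitor_stability() {
  local attempts="$1" interval="$2"
  shift 2
  local i=1
  while [ "$i" -le "$attempts" ]; do
    if ! "$@" >/dev/null 2>&1; then
      echo "check failed on attempt $i of $attempts" >&2
      return 1
    fi
    # Sleep between attempts, but not after the last one.
    if [ "$i" -lt "$attempts" ]; then sleep "$interval"; fi
    i=$((i + 1))
  done
  echo "stable across $attempts checks"
}
```

For example, `monitor_stability 5 60 systemctl is-active --quiet service` checks a unit once a minute for five rounds, with `service` replaced by the unit name.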

#### Copy-Paste Prompts
```
Use @incident-responder to implement resolution
```

## Troubleshooting Checklist

- [ ] System information gathered
- [ ] Resources analyzed
- [ ] Logs reviewed
- [ ] Network tested
- [ ] Services verified
- [ ] Issue resolved
- [ ] Documentation created

## Quality Gates

- [ ] Root cause identified
- [ ] Fix verified
- [ ] Monitoring in place
- [ ] Documentation complete

## Related Workflow Bundles

- `os-scripting` - OS scripting
- `bash-scripting` - Bash scripting
- `cloud-devops` - DevOps

Related Skills

kubernetes-troubleshooting

16
from diegosouzapw/awesome-omni-skill

Debug Kubernetes pods, services, networking, and scaling issues. Use this skill when troubleshooting K8s deployments, investigating pod failures, or diagnosing cluster problems.

flux-troubleshooting

16
from diegosouzapw/awesome-omni-skill

Use when Flux resources show Ready False, reconciliation errors appear in logs, deployments fail to sync from Git, HelmRelease installations fail, source artifacts are not being fetched, or image automation is not updating tags

arc-runner-troubleshooting

16
from diegosouzapw/awesome-omni-skill

Troubleshoot ARC (Actions Runner Controller) runners on Rackspace Spot Kubernetes. Diagnose stuck jobs, scaling issues, and cluster access. Activates on "runner", "ARC", "stuck job", "queued", "GitHub Actions", or "CI stuck".

administering-linux

16
from diegosouzapw/awesome-omni-skill

Manage Linux systems covering systemd services, process management, filesystems, networking, performance tuning, and troubleshooting. Use when deploying applications, optimizing server performance, diagnosing production issues, or managing users and security on Linux servers.

troubleshooting

16
from diegosouzapw/awesome-omni-skill

Common ComfyUI errors and fixes — OOM, missing nodes, dtype mismatches, black images, and debugging strategies

terway-troubleshooting

16
from diegosouzapw/awesome-omni-skill

Troubleshoot Terway CNI issues in Kubernetes using Kubernetes events and Terway logs. Use when diagnosing "cni plugin not initialized", Pod create/delete failures, or ENI/IPAM problems in Terway (centralized or non-centralized IPAM).

assertion-troubleshooting

16
from diegosouzapw/awesome-omni-skill

Phylax Credible Layer assertions troubleshooting. Diagnoses common assertion failures and non-triggering issues. Use when phylax/credible layer assertions fail unexpectedly or do not execute.

linux-shell-scripting

16
from diegosouzapw/awesome-omni-skill

This skill should be used when the user asks to "create bash scripts", "automate Linux tasks", "monitor system resources", "backup files", "manage users", or "write production she...

bash-linux

16
from diegosouzapw/awesome-omni-skill

Bash/Linux terminal patterns. Critical commands, piping, error handling, scripting. Use when working on macOS or Linux systems.

bgo

10
from diegosouzapw/awesome-omni-skill

Automates the complete Blender build-go workflow, from building and packaging your extension/add-on to removing old versions, installing, enabling, and launching Blender for quick testing and iteration.

Coding & Development

terraform-engineer

16
from diegosouzapw/awesome-omni-skill

Use when implementing infrastructure as code with Terraform across AWS, Azure, or GCP. Invoke for module development, state management, provider configuration, multi-environment workflows, infrastructure testing.

terraform-diagrams

16
from diegosouzapw/awesome-omni-skill

Generates architecture diagrams from Terraform code. Use when user has .tf files or asks to visualize Terraform infrastructure.