Node Tuning Helper Scripts
Generate tuned manifests and evaluate node tuning snapshots
Best use case
Node Tuning Helper Scripts is best used when you need a repeatable AI agent workflow instead of a one-off prompt.
Teams using Node Tuning Helper Scripts can expect more consistent output, faster repeated execution, and less prompt rewriting.
When to use this skill
- You want a reusable workflow that can be run more than once with consistent structure.
When not to use this skill
- You only need a quick one-off answer and do not need a reusable workflow.
- You cannot install or maintain the underlying files, dependencies, or repository context.
Installation
Claude Code / Cursor / Codex
Manual Installation
- Download SKILL.md from GitHub
- Place it in .claude/skills/node-tuning-helper-scripts/SKILL.md inside your project
- Restart your AI agent; it will auto-discover the skill
How Node Tuning Helper Scripts Compares
| Feature / Agent | Node Tuning Helper Scripts | Standard Approach |
|---|---|---|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |
Frequently Asked Questions
What does this skill do?
Generate tuned manifests and evaluate node tuning snapshots
Where can I find the source code?
You can find the source code on GitHub using the link provided at the top of the page.
SKILL.md Source
# Node Tuning Helper Scripts
Detailed instructions for invoking the helper utilities that back `/node-tuning` commands:
- `generate_tuned_profile.py` renders Tuned manifests (`tuned.openshift.io/v1`).
- `analyze_node_tuning.py` inspects live nodes or sosreports for tuning gaps.
## When to Use These Scripts
- Translate structured command inputs into Tuned manifests for the Node Tuning Operator.
- Iterate on generated YAML outside the assistant or integrate the generator into automation.
- Analyze CPU isolation, IRQ affinity, huge pages, sysctl values, and networking counters from live clusters or archived sosreports.
## Prerequisites
- Python 3.8 or newer (`python3 --version`).
- Repository checkout so the scripts under `plugins/node-tuning/skills/scripts/` are accessible.
- Optional: `oc` CLI when validating or applying manifests.
- Optional: Extracted sosreport directory when running the analysis script offline.
- Optional (remote analysis): `oc` CLI access plus a valid `KUBECONFIG` when capturing `/proc`/`/sys` or sosreport via `oc debug node/<name>`. The sosreport workflow pulls the `registry.redhat.io/rhel9/support-tools` image (override with `--toolbox-image` or `TOOLBOX_IMAGE`) and requires registry access. HTTP(S) proxy env vars from the host are forwarded automatically when present, but using a proxy is optional.
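The required prerequisites can be checked up front before invoking either script. The following is a minimal sketch, not part of the repository: the function name `preflight`, its parameters, and the message strings are illustrative assumptions.

```python
import shutil
import sys
from pathlib import Path

# Path taken from the prerequisites list; adjust if your checkout differs.
SCRIPTS_DIR = Path("plugins/node-tuning/skills/scripts")


def preflight(repo_root=".", need_oc=False):
    """Return a list of human-readable problems; an empty list means ready."""
    problems = []
    if sys.version_info < (3, 8):
        problems.append("Python 3.8 or newer is required")
    for script in ("generate_tuned_profile.py", "analyze_node_tuning.py"):
        if not (Path(repo_root) / SCRIPTS_DIR / script).is_file():
            problems.append(f"missing script: {SCRIPTS_DIR / script}")
    # `oc` is optional: only needed to validate/apply manifests or for remote analysis.
    if need_oc and shutil.which("oc") is None:
        problems.append("`oc` CLI not found on PATH")
    return problems
```

Calling `preflight(need_oc=True)` from the repository root before a remote analysis run gives a quick list of anything that is missing.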
---
## Script: `generate_tuned_profile.py`
### Implementation Steps
1. **Collect Inputs**
- `--profile-name`: Tuned resource name.
- `--summary`: `[main]` section summary.
- Repeatable options: `--include`, `--main-option`, `--variable`, `--sysctl`, `--section` (`SECTION:KEY=VALUE`).
- Target selectors: `--machine-config-label key=value`, `--match-label key[=value]`.
- Optional: `--priority` (default 20), `--namespace`, `--output`, `--dry-run`.
- Use `--list-nodes`/`--node-selector` to inspect nodes and `--label-node NODE:KEY[=VALUE]` (plus `--overwrite-labels`) to tag machines.
2. **Inspect or Label Nodes (optional)**
```bash
# List all worker nodes
python3 plugins/node-tuning/skills/scripts/generate_tuned_profile.py --list-nodes --node-selector "node-role.kubernetes.io/worker" --skip-manifest
# Label a specific node for the worker-hp pool
python3 plugins/node-tuning/skills/scripts/generate_tuned_profile.py \
--label-node ip-10-0-1-23.ec2.internal:node-role.kubernetes.io/worker-hp= \
--overwrite-labels \
--skip-manifest
```
3. **Render the Manifest**
```bash
python3 plugins/node-tuning/skills/scripts/generate_tuned_profile.py \
--profile-name "$PROFILE" \
--summary "$SUMMARY" \
--sysctl net.core.netdev_max_backlog=16384 \
--match-label tuned.openshift.io/custom-net \
--output ".work/node-tuning/$PROFILE/tuned.yaml"
```
- Omit `--output` to write `<profile-name>.yaml` in the current directory.
- Add `--dry-run` to print the manifest to stdout.
4. **Review Output**
- Inspect the generated YAML for accuracy.
- Optionally format with `yq` or open in an editor for readability.
5. **Validate and Apply**
- Dry-run: `oc apply --dry-run=client -f <manifest>` (use `--dry-run=server` for server-side validation).
- Apply: `oc apply -f <manifest>`.
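To make the steps above concrete, the sketch below shows how the collected inputs could map onto a Tuned resource. It is an illustration under stated assumptions, not the actual implementation of `generate_tuned_profile.py`: the function `build_tuned_manifest`, its parameters, and the default namespace `openshift-cluster-node-tuning-operator` (the namespace the Node Tuning Operator normally uses) are assumptions. The result is printed as JSON, which is also valid YAML, to avoid a third-party dependency.

```python
import json


def build_tuned_manifest(profile_name, summary, sysctls=None, includes=None,
                         match_labels=None, machine_config_labels=None,
                         priority=20,
                         namespace="openshift-cluster-node-tuning-operator"):
    """Assemble a Tuned resource (tuned.openshift.io/v1) as a plain dict."""
    # The profile body is INI-style text stored under spec.profile[].data.
    lines = ["[main]", f"summary={summary}"]
    if includes:
        lines.append("include=" + ",".join(includes))
    if sysctls:
        lines.append("[sysctl]")
        lines += [f"{key}={value}" for key, value in sysctls.items()]

    recommend = {"priority": priority, "profile": profile_name}
    if machine_config_labels:
        recommend["machineConfigLabels"] = dict(machine_config_labels)
    if match_labels:
        # A label given without a value matches on the key alone.
        recommend["match"] = [
            {"label": key} if value is None else {"label": key, "value": value}
            for key, value in match_labels.items()
        ]
    return {
        "apiVersion": "tuned.openshift.io/v1",
        "kind": "Tuned",
        "metadata": {"name": profile_name, "namespace": namespace},
        "spec": {
            "profile": [{"name": profile_name, "data": "\n".join(lines) + "\n"}],
            "recommend": [recommend],
        },
    }


# Hypothetical inputs mirroring the render command in step 3.
manifest = build_tuned_manifest(
    "custom-net", "Custom network tuning",
    sysctls={"net.core.netdev_max_backlog": "16384"},
    match_labels={"tuned.openshift.io/custom-net": None},
)
print(json.dumps(manifest, indent=2))
```

Comparing a structure like this against the generated file is a quick way to confirm that the selectors and priority ended up where you expect during review.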
### Error Handling
- Missing required options raise `ValueError` with descriptive messages.
- The script exits non-zero when no target selectors (`--machine-config-label` or `--match-label`) are supplied.
- Invalid key/value or section inputs identify the failing argument explicitly.
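The error-handling behaviour described above could look roughly like the following. This is a hedged sketch rather than the script's real code: `parse_key_value`, `parse_section`, and `require_selector` are hypothetical helper names, and the exact error messages are invented for illustration.

```python
def parse_key_value(arg, option):
    """Split KEY=VALUE on the first '=', naming the offending option on failure."""
    key, sep, value = arg.partition("=")
    if not sep or not key.strip():
        raise ValueError(f"{option} expects KEY=VALUE, got {arg!r}")
    return key.strip(), value


def parse_section(arg):
    """Split SECTION:KEY=VALUE as accepted by --section."""
    section, sep, rest = arg.partition(":")
    if not sep or not section.strip():
        raise ValueError(f"--section expects SECTION:KEY=VALUE, got {arg!r}")
    key, value = parse_key_value(rest, "--section")
    return section.strip(), key, value


def require_selector(machine_config_labels, match_labels):
    """Fail when neither --machine-config-label nor --match-label was given."""
    if not machine_config_labels and not match_labels:
        raise ValueError(
            "at least one of --machine-config-label or --match-label is required"
        )
```

Splitting on the first `=` only keeps values that themselves contain `=`, such as `hugepagesz=2M hugepages=50`, intact.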
### Examples
```bash
python3 plugins/node-tuning/skills/scripts/generate_tuned_profile.py \
--profile-name realtime-worker \
--summary "Realtime tuned profile" \
--include openshift-node --include realtime \
--variable isolated_cores=1 \
--section 'bootloader:cmdline_ocp_realtime=+systemd.cpu_affinity=${not_isolated_cores_expanded}' \
--machine-config-label machineconfiguration.openshift.io/role=worker-rt \
--priority 25 \
--output .work/node-tuning/realtime-worker/tuned.yaml
```
```bash
python3 plugins/node-tuning/skills/scripts/generate_tuned_profile.py \
--profile-name openshift-node-hugepages \
--summary "Boot time configuration for hugepages" \
--include openshift-node \
--section bootloader:cmdline_openshift_node_hugepages="hugepagesz=2M hugepages=50" \
--machine-config-label machineconfiguration.openshift.io/role=worker-hp \
--priority 30 \
--output .work/node-tuning/openshift-node-hugepages/hugepages-tuned-boottime.yaml
```
---
## Script: `analyze_node_tuning.py`
### Purpose
Inspect either a live node (`/proc`, `/sys`) or an extracted sosreport snapshot for tuning signals (CPU isolation, IRQ affinity, huge pages, sysctl state, networking counters) and emit actionable recommendations.
### Usage Patterns
- **Live node analysis**
```bash
python3 plugins/node-tuning/skills/scripts/analyze_node_tuning.py --format markdown
```
- **Remote analysis via oc debug**
```bash
python3 plugins/node-tuning/skills/scripts/analyze_node_tuning.py \
--node worker-rt-0 \
--kubeconfig ~/.kube/prod \
--format markdown
```
- **Collect sosreport via oc debug and analyze locally**
```bash
python3 plugins/node-tuning/skills/scripts/analyze_node_tuning.py \
--node worker-rt-0 \
--toolbox-image registry.example.com/support-tools:latest \
--sosreport-arg "--case-id=01234567" \
--sosreport-output .work/node-tuning/sosreports \
--format json
```
- **Offline sosreport analysis**
```bash
python3 plugins/node-tuning/skills/scripts/analyze_node_tuning.py \
--sosreport /path/to/sosreport-2025-10-20
```
- **Automation-friendly JSON**
```bash
python3 plugins/node-tuning/skills/scripts/analyze_node_tuning.py \
--sosreport /path/to/sosreport \
--format json --output .work/node-tuning/node-analysis.json
```
### Implementation Steps
1. **Select data source**
- Provide `--node <name>` (with optional `--kubeconfig` / `--oc-binary`). By default the helper runs `sosreport` remotely from inside the RHCOS toolbox container (`registry.redhat.io/rhel9/support-tools`). Override the image with `--toolbox-image`, extend the sosreport command with `--sosreport-arg`, or disable the curated OpenShift flags via `--skip-default-sosreport-flags`. Pass `--no-collect-sosreport` to fall back to the direct `/proc` snapshot mode.
- Provide `--sosreport <dir>` for archived diagnostics; detection finds embedded `proc/` and `sys/`.
- Omit both switches to query the live filesystem (defaults to `/proc` and `/sys`).
- Override paths with `--proc-root` or `--sys-root` when the layout differs.
2. **Run analysis**
- The script parses `cpuinfo`, kernel cmdline parameters (`isolcpus`, `nohz_full`, `tuned.non_isolcpus`), default IRQ affinities, huge page counters, sysctl values (net, vm, kernel), transparent hugepage settings, `netstat`/`sockstat` counters, and `ps` snapshots (when available in sosreport).
3. **Review the report**
- Markdown output groups findings by section (System Overview, CPU & Isolation, Huge Pages, Sysctl Highlights, Network Signals, IRQ Affinity, Process Snapshot) and lists recommendations.
- JSON output contains the same information in structured form for pipelines or dashboards.
4. **Act on recommendations**
- Apply Tuned profiles, MachineConfig updates, or manual sysctl/irqbalance adjustments.
- Feed actionable items back into `/node-tuning:generate-tuned-profile` to codify desired state.
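For pipelines, the JSON report can be post-processed with a few lines of Python. The report schema is not documented here, so the sketch below assumes a top-level `recommendations` list of strings. That key name and the helpers `collect_recommendations` and `shared_recommendations` are assumptions to adapt to the real output.

```python
import json
from pathlib import Path


def collect_recommendations(report_paths):
    """Map each report file to its list of recommendations."""
    result = {}
    for path in report_paths:
        data = json.loads(Path(path).read_text())
        # ASSUMPTION: the report has a top-level "recommendations" list.
        result[str(path)] = list(data.get("recommendations", []))
    return result


def shared_recommendations(per_node):
    """Return recommendations that appear in every report, sorted."""
    sets = [set(items) for items in per_node.values()]
    return sorted(set.intersection(*sets)) if sets else []
```

Recommendations shared by every node usually point at a pool-wide Tuned profile, while node-specific ones are better candidates for manual follow-up.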
### Error Handling
- Missing `proc/` or `sys/` directories trigger descriptive errors.
- Unreadable files are skipped gracefully and noted in observations where relevant.
- Non-numeric sysctl values are flagged for manual investigation.
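A sketch of how a sysctl value might be read from a `/proc` root while tolerating unreadable files and non-numeric values, mirroring the behaviour listed above. `read_sysctl` and the `(value, note)` return shape are illustrative assumptions, not the script's API.

```python
from pathlib import Path


def read_sysctl(proc_root, name):
    """Return (value, note); value is an int, or None when unusable."""
    # net.core.netdev_max_backlog -> <proc_root>/sys/net/core/netdev_max_backlog
    path = Path(proc_root) / "sys" / name.replace(".", "/")
    try:
        raw = path.read_text().strip()
    except OSError as exc:
        # Missing or unreadable file: skip it, but record why.
        return None, f"{name}: unreadable ({exc.__class__.__name__})"
    try:
        return int(raw), None
    except ValueError:
        return None, f"{name}: non-numeric value {raw!r}, review manually"
```

The same function works for a live node (`proc_root="/proc"`) and for the `proc/` directory embedded in an extracted sosreport.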
### Example Output (Markdown excerpt)
```
# Node Tuning Analysis
## System Overview
- Hostname: worker-rt-1
- Kernel: 4.18.0-477.el8
- NUMA nodes: 2
- Kernel cmdline: `BOOT_IMAGE=... isolcpus=2-15 tuned.non_isolcpus=0-1`
## CPU & Isolation
- Logical CPUs: 32
- Physical cores: 16 across 2 socket(s)
- SMT detected: yes
- Isolated CPUs: 2-15
...
## Recommended Actions
- Configure net.core.netdev_max_backlog (>=32768) to accommodate bursty NIC traffic.
- Transparent Hugepages are not disabled (`[never]` not selected). Consider setting to `never` for latency-sensitive workloads.
- 4 IRQs overlap isolated CPUs. Relocate interrupt affinities using tuned profiles or irqbalance.
```
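The recommendations in the excerpt rest on a few simple checks. The sketch below shows, under stated assumptions, how a CPU list such as `2-15` can be expanded, how the selected Transparent Hugepages mode can be read from the bracketed entry in `/sys/kernel/mm/transparent_hugepage/enabled`, and how IRQ affinity lists (as found in `/proc/irq/<n>/smp_affinity_list`) can be compared against isolated CPUs. The helper names and the `irq_affinity` input mapping are hypothetical; the real script may work differently.

```python
import re


def expand_cpu_list(spec):
    """Expand a kernel CPU list such as '0-1,4,6-7' into a set of ints."""
    cpus = set()
    for part in filter(None, (p.strip() for p in spec.split(","))):
        lo, _, hi = part.partition("-")
        if not lo.isdigit():
            continue  # skip flag words such as 'managed_irq' in isolcpus=
        cpus.update(range(int(lo), int(hi or lo) + 1))
    return cpus


def cmdline_param(cmdline, name):
    """Return the value of name=... from a kernel command line, or None."""
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        if sep and key == name:
            return value
    return None


def thp_mode(enabled_text):
    """Extract the selected mode, e.g. 'never' from 'always madvise [never]'."""
    match = re.search(r"\[(\w+)\]", enabled_text)
    return match.group(1) if match else None


def irqs_on_isolated(irq_affinity, isolated):
    """Return IRQ numbers whose affinity list intersects the isolated CPUs."""
    return sorted(
        irq for irq, cpus in irq_affinity.items()
        if expand_cpu_list(cpus) & isolated
    )


# Kernel command line copied from the excerpt above.
cmdline = "BOOT_IMAGE=... isolcpus=2-15 tuned.non_isolcpus=0-1"
isolated = expand_cpu_list(cmdline_param(cmdline, "isolcpus") or "")
print(sorted(isolated))
```

Feeding `irqs_on_isolated` a mapping of IRQ number to affinity list yields the IRQs that would need to be moved off the isolated set.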
### Follow-up Automation Ideas
- Persist JSON results in `.work/node-tuning/<host>/analysis.json` for historical tracing.
- Gate upgrades by comparing recommendations across nodes.
- Integrate with CI jobs that validate cluster tuning post-change.