training-hub
Fine-tune LLMs using Red Hat training-hub library with SFT, LoRA, and OSFT algorithms. Use when preparing JSONL datasets, running training jobs, configuring hardware, scaling to clusters, evaluating models, or deploying with vLLM.
Best use case
training-hub is best used when you need a repeatable AI agent workflow instead of a one-off prompt.
Teams using training-hub should expect more consistent output, faster repeated execution, and less prompt rewriting.
When to use this skill
- You want a reusable workflow that can be run more than once with consistent structure.
When not to use this skill
- You only need a quick one-off answer and do not need a reusable workflow.
- You cannot install or maintain the underlying files, dependencies, or repository context.
Installation
Claude Code / Cursor / Codex
Manual Installation
- Download SKILL.md from GitHub
- Place it in `.claude/skills/training-hub/SKILL.md` inside your project
- Restart your AI agent; it will auto-discover the skill
SKILL.md Source
# Training Hub
Red Hat's unified library for LLM post-training: SFT, LoRA, and OSFT (continual learning).
## Quick Reference
| Task | Command |
|------|---------|
| Recommend config | `python scripts/recommend_config.py --model <model> --hardware <hw>` |
| Estimate memory | `python scripts/estimate_memory.py --model <model> --method sft --hardware h100` |
| Validate dataset | `python scripts/validate_dataset.py data.jsonl` |
| Full fine-tuning | `from training_hub import sft` |
| LoRA training | `from training_hub import lora_sft` |
| OSFT (continual) | `from training_hub import osft` |
## Installation
```bash
pip install training-hub                               # Basic
pip install "training-hub[lora]"                       # LoRA with Unsloth (2x faster)
pip install "training-hub[cuda]" --no-build-isolation  # CUDA support
```
## Get Started Fast
```bash
# Get optimal config for your hardware
python scripts/recommend_config.py \
    --model meta-llama/Llama-3.1-8B-Instruct \
    --hardware rtx-5090
```
## Data Format
Training data must be JSONL with message structure:
```json
{"messages": [{"role": "user", "content": "Q"}, {"role": "assistant", "content": "A"}]}
```
**Validate before training:**
```bash
python scripts/validate_dataset.py ./training_data.jsonl
```
For data preparation details, see [DATA-FORMATS.md](DATA-FORMATS.md).
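As a rough illustration of what the message structure above implies, the following sketch checks a single JSONL line using only the standard library. It is not the logic of `scripts/validate_dataset.py`; the function name `check_jsonl_line` and the accepted role set are assumptions made for this example.

```python
import json

# Assumed role set for this sketch; the library may accept others.
ALLOWED_ROLES = {"system", "user", "assistant"}


def check_jsonl_line(line: str) -> list[str]:
    """Return a list of problems found in one JSONL line (empty if none)."""
    try:
        record = json.loads(line)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc.msg}"]

    if not isinstance(record, dict):
        return ["record is not a JSON object"]
    messages = record.get("messages")
    if not isinstance(messages, list) or not messages:
        return ["'messages' must be a non-empty list"]

    problems = []
    for i, msg in enumerate(messages):
        if not isinstance(msg, dict):
            problems.append(f"message {i} is not an object")
            continue
        if msg.get("role") not in ALLOWED_ROLES:
            problems.append(f"message {i} has unexpected role {msg.get('role')!r}")
        if not isinstance(msg.get("content"), str):
            problems.append(f"message {i} is missing string 'content'")
    return problems


good = '{"messages": [{"role": "user", "content": "Q"}, {"role": "assistant", "content": "A"}]}'
bad = '{"messages": [{"role": "user"}]}'
print(check_jsonl_line(good))  # []
print(check_jsonl_line(bad))   # one problem: missing 'content'
```

Running a check like this over every line of a file gives a quick sanity pass before handing the dataset to the bundled validator.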
## Training Methods
### Supervised Fine-Tuning (SFT)
Full-parameter fine-tuning. Requires significant VRAM.
```python
from training_hub import sft
result = sft(
    model_path="Qwen/Qwen2.5-7B-Instruct",
    data_path="./training_data.jsonl",
    ckpt_output_dir="./checkpoints",
    num_epochs=3,
    effective_batch_size=8,
    learning_rate=2e-5,
    max_seq_len=2048,
    max_tokens_per_gpu=45000,
)
```
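The two length-related settings in the call above interact. If `max_tokens_per_gpu` is read as a cap on the tokens one GPU processes per step (an assumption about its meaning, though consistent with the advice to reduce it on CUDA OOM), dividing it by `max_seq_len` gives a worst-case count of full-length sequences per GPU. The helper name below is hypothetical.

```python
def worst_case_sequences_per_gpu(max_tokens_per_gpu: int, max_seq_len: int) -> int:
    """Full-length sequences that fit in one GPU's token budget."""
    if max_seq_len <= 0:
        raise ValueError("max_seq_len must be positive")
    return max_tokens_per_gpu // max_seq_len


print(worst_case_sequences_per_gpu(45000, 2048))  # 21
```

Shorter samples let more sequences share the same budget, so under this reading the result is a floor rather than a fixed batch size.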
### LoRA Fine-Tuning
Memory-efficient adaptation (up to 2x faster, 70% less VRAM):
```python
from training_hub import lora_sft
result = lora_sft(
    model_path="Qwen/Qwen2.5-7B-Instruct",
    data_path="./training_data.jsonl",
    ckpt_output_dir="./outputs",
    lora_r=16,
    lora_alpha=32,
    num_epochs=3,
    learning_rate=2e-4,
)
```
**QLoRA (4-bit):** Add `load_in_4bit=True` for large models on limited VRAM.
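To see why a low rank cuts memory, consider a single weight matrix of shape d_out × d_in. Standard LoRA trains two small factors with r · (d_in + d_out) parameters in total and scales their product by `lora_alpha / lora_r`. The sketch below works this through for the values used above; the 4096 × 4096 layer size is an assumed example, not a property of any particular model, and the library's backend may apply a different scaling rule.

```python
def lora_param_count(d_in: int, d_out: int, r: int) -> int:
    """Trainable parameters LoRA adds to one d_out x d_in weight matrix."""
    # Factor A has shape (r, d_in) and factor B has shape (d_out, r).
    return r * (d_in + d_out)


d_in = d_out = 4096          # assumed layer size, for illustration only
lora_r, lora_alpha = 16, 32  # values from the example above

full = d_in * d_out
added = lora_param_count(d_in, d_out, lora_r)
scaling = lora_alpha / lora_r

print(f"full matrix: {full:,} params")            # 16,777,216
print(f"LoRA factors: {added:,} params")          # 131,072
print(f"trainable fraction: {added / full:.2%}")  # 0.78%
print(f"update scaling alpha/r: {scaling}")       # 2.0
```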
### OSFT (Continual Learning)
Adapt without catastrophic forgetting:
```python
from training_hub import osft
result = osft(
    model_path="meta-llama/Llama-3.1-8B-Instruct",
    data_path="./domain_data.jsonl",
    ckpt_output_dir="./checkpoints",
    unfreeze_rank_ratio=0.25,
    effective_batch_size=16,
    learning_rate=2e-5,
)
```
For all parameters, see [ALGORITHMS.md](ALGORITHMS.md).
## Hardware Support
| Hardware | VRAM | Best For |
|----------|------|----------|
| RTX 5090 | 32GB | 8B LoRA, 70B QLoRA |
| DGX Spark | 128GB | 70B SFT |
| H100 | 80GB | 14B SFT, 70B LoRA |
| 8×H100 | 640GB | 70B SFT |
```bash
# Check if your config fits
python scripts/estimate_memory.py \
    --model meta-llama/Llama-3.1-70B-Instruct \
    --method lora \
    --hardware h100 \
    --num-gpus 8
```
For hardware-specific configs, see [HARDWARE.md](HARDWARE.md).
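For a quick back-of-the-envelope check before running the estimator, the memory taken by the weights alone follows from the parameter count and the bytes per parameter. The sketch below is a lower bound only and does not reproduce `scripts/estimate_memory.py`; the function name and the precision labels are assumptions for this example.

```python
# Bytes per parameter for common weight precisions.
BYTES_PER_PARAM = {"fp32": 4.0, "bf16": 2.0, "int8": 1.0, "4bit": 0.5}


def weight_memory_gb(num_params: float, precision: str = "bf16") -> float:
    """Memory needed for the model weights alone, in GB (10^9 bytes)."""
    return num_params * BYTES_PER_PARAM[precision] / 1e9


params_8b = 8e9
print(weight_memory_gb(params_8b, "bf16"))  # 16.0
print(weight_memory_gb(params_8b, "4bit"))  # 4.0
```

Full fine-tuning also has to hold gradients and optimizer state, and every method needs room for activations, so real requirements are higher than this figure.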
## Scaling
**Multi-GPU:**
```python
result = sft(..., nproc_per_node=8)
```
**Multi-node:**
```python
# Run on every node with its own node_rank; rdzv_endpoint must be an address all nodes can reach.
result = sft(..., nnodes=2, node_rank=0, nproc_per_node=8, rdzv_endpoint="0.0.0.0:29500")
```
For Slurm, Kubernetes, and datacenter deployments, see [SCALE.md](SCALE.md).
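In a multi-node run, each node typically makes the same call with the same `nnodes` and `rdzv_endpoint` but its own `node_rank`. One way to keep the launch script identical across nodes is to build those arguments from environment variables, as sketched below. The helper `distributed_kwargs`, the environment variable names, and the host name are assumptions for this example, not part of training-hub.

```python
import os


def distributed_kwargs(env=None, gpus_per_node: int = 8) -> dict:
    """Build the scaling keyword arguments shown above from environment variables."""
    env = os.environ if env is None else env
    host = env.get("MASTER_ADDR", "127.0.0.1")
    port = env.get("MASTER_PORT", "29500")
    return {
        "nnodes": int(env.get("NNODES", "1")),
        "node_rank": int(env.get("NODE_RANK", "0")),
        "nproc_per_node": gpus_per_node,
        "rdzv_endpoint": f"{host}:{port}",
    }


# Example: the second of two nodes (hypothetical host name).
kwargs = distributed_kwargs(
    {"NNODES": "2", "NODE_RANK": "1", "MASTER_ADDR": "node0.example.com"}
)
print(kwargs)
# The result would then be passed along as: sft(..., **kwargs)
```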
## Algorithm Selection
| Scenario | Method |
|----------|--------|
| First-time fine-tuning, large dataset | SFT |
| Memory constrained | LoRA |
| Very large model (70B+), limited VRAM | QLoRA (LoRA with `load_in_4bit=True`) |
| Preserve existing capabilities | OSFT |
| Domain adaptation, small dataset | OSFT |
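
The table above can be read as a small decision procedure. The sketch below encodes it as a function; the function name, the argument names, and the order in which the scenarios are checked are assumptions for this example, since the table itself does not rank them.

```python
def choose_method(
    preserve_capabilities: bool = False,
    small_domain_dataset: bool = False,
    memory_constrained: bool = False,
    model_size_b: float = 8.0,
) -> str:
    """Map the scenarios in the table above to a training method."""
    # Assumed priority: capability preservation first, then memory limits.
    if preserve_capabilities or small_domain_dataset:
        return "OSFT"
    if memory_constrained:
        # The table suggests 4-bit LoRA (QLoRA) for very large models (70B+).
        return "QLoRA" if model_size_b >= 70 else "LoRA"
    return "SFT"


print(choose_method())                                            # SFT
print(choose_method(memory_constrained=True))                     # LoRA
print(choose_method(memory_constrained=True, model_size_b=70.0))  # QLoRA
print(choose_method(preserve_capabilities=True))                  # OSFT
```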
## Documentation
| Topic | File |
|-------|------|
| Hardware profiles & configs | [HARDWARE.md](HARDWARE.md) |
| All algorithm parameters | [ALGORITHMS.md](ALGORITHMS.md) |
| Data formats & conversion | [DATA-FORMATS.md](DATA-FORMATS.md) |
| Datacenter & cluster setup | [SCALE.md](SCALE.md) |
| Model evaluation | [EVALUATION.md](EVALUATION.md) |
| vLLM inference & serving | [INFERENCE.md](INFERENCE.md) |
| Advanced techniques | [ADVANCED.md](ADVANCED.md) |
| Model-specific configs | [MODELS.md](MODELS.md) |
| Troubleshooting | [TROUBLESHOOTING.md](TROUBLESHOOTING.md) |
| Distributed training | [DISTRIBUTED.md](DISTRIBUTED.md) |
## Utility Scripts
| Script | Purpose |
|--------|---------|
| `recommend_config.py` | Generate optimal config for model + hardware |
| `estimate_memory.py` | Estimate GPU memory requirements |
| `validate_dataset.py` | Validate JSONL dataset format |
| `convert_to_jsonl.py` | Convert CSV, Alpaca, ShareGPT to JSONL |
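
To give a sense of what such a conversion involves, the sketch below maps one Alpaca-style record (with `instruction`, optional `input`, and `output` fields) to the messages format. It is not the bundled `convert_to_jsonl.py`; the function name and the choice to join instruction and input with a blank line are assumptions for this example.

```python
import json


def alpaca_to_messages(record: dict) -> dict:
    """Convert one Alpaca-style record to the messages format."""
    prompt = record["instruction"]
    # Alpaca's optional "input" field carries extra context for the instruction.
    if record.get("input"):
        prompt = f"{prompt}\n\n{record['input']}"
    return {
        "messages": [
            {"role": "user", "content": prompt},
            {"role": "assistant", "content": record["output"]},
        ]
    }


alpaca = {"instruction": "Translate to French.", "input": "Hello", "output": "Bonjour"}
converted = alpaca_to_messages(alpaca)
line = json.dumps(converted, ensure_ascii=False)
print(line)  # one JSONL line, ready to append to a training file
```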
## Troubleshooting
**CUDA OOM:** Reduce `max_tokens_per_gpu`, use LoRA + QLoRA, or add GPUs
**Dataset errors:** Run `python scripts/validate_dataset.py` first
**LoRA multi-GPU:** Requires `torchrun --nproc-per-node=N script.py`
**Training diverges:** Lower `learning_rate` (try 1e-5 for SFT, 1e-4 for LoRA)
For more, see [TROUBLESHOOTING.md](TROUBLESHOOTING.md).