## Best use case
The PyTorch skill is best used when you need a repeatable AI agent workflow for deep learning tasks instead of a one-off prompt.
## Overview
Teams using this skill should expect more consistent output, faster repeated execution, and less prompt rewriting.
## When to use this skill
- You want a reusable workflow that can be run more than once with consistent structure.
## When not to use this skill
- You only need a quick one-off answer and do not need a reusable workflow.
- You cannot install or maintain the underlying files, dependencies, or repository context.
## Installation

### Claude Code / Cursor / Codex

### Manual Installation
- Download `SKILL.md` from GitHub
- Place it at `.claude/skills/pytorch/SKILL.md` inside your project
- Restart your AI agent; it will auto-discover the skill
## How PyTorch Compares
| Feature / Agent | PyTorch | Standard Approach |
|---|---|---|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |
## Frequently Asked Questions
### What does this skill do?
It gives your AI agent a structured PyTorch workflow: defining models with `nn.Module`, writing training loops, loading data with `DataLoader`, and exporting trained models for production.
### Where can I find the source code?
You can find the source code on GitHub using the link provided at the top of the page.
## SKILL.md Source
# PyTorch
## Overview
PyTorch is a deep learning framework for building and training neural networks with dynamic computation graphs and automatic differentiation. It provides tensor operations with GPU acceleration, `nn.Module` for defining architectures, DataLoader for efficient data loading, mixed precision training for performance, and export tools (TorchScript, ONNX) for production deployment.
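The dynamic graph and automatic differentiation mentioned above can be seen in a minimal sketch (assuming `torch` is installed):

```python
import torch

# Tensors record the operations applied to them when requires_grad=True.
x = torch.tensor([2.0, 3.0], requires_grad=True)
y = (x ** 2).sum()

# Autograd walks the recorded graph backward: dy/dx = 2x.
y.backward()
print(x.grad)  # tensor([4., 6.])
```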
## Instructions
- When defining models, subclass `nn.Module` with `__init__` for layers and `forward` for computation, using `nn.Sequential` for simple stacks and custom forward logic for complex architectures.
- When training, implement the standard loop: forward pass, loss computation, `loss.backward()`, `optimizer.step()`, `optimizer.zero_grad()`, with gradient clipping via `clip_grad_norm_` for stability.
- When loading data, subclass `Dataset` with `__len__` and `__getitem__`, then use `DataLoader` with `num_workers=4` and `pin_memory=True` for GPU training throughput.
- When optimizing performance, use `torch.compile(model)` on PyTorch 2.0+ for 20-50% speedup, mixed precision with `torch.amp.autocast()` for halved memory and doubled throughput, and `DistributedDataParallel` for multi-GPU training.
- When doing transfer learning, load pretrained models from `torchvision.models` or Hugging Face, freeze the backbone, and replace the classifier head for your task.
- When deploying, use `torch.export()` or `torch.jit.trace()` for production, `torch.onnx.export()` for cross-framework compatibility, and `torch.quantization` for INT8 inference speedup.
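The model, data, and training-loop instructions above can be combined into one minimal CPU-only sketch; the toy dataset and layer sizes are illustrative, not part of the skill:

```python
import torch
from torch import nn
from torch.utils.data import Dataset, DataLoader

class ToyDataset(Dataset):
    """Random features and binary labels, standing in for real data."""
    def __init__(self, n=64):
        self.x = torch.randn(n, 10)
        self.y = torch.randint(0, 2, (n,))
    def __len__(self):
        return len(self.x)
    def __getitem__(self, i):
        return self.x[i], self.y[i]

model = nn.Sequential(nn.Linear(10, 16), nn.ReLU(), nn.Linear(16, 2))
loader = DataLoader(ToyDataset(), batch_size=16, shuffle=True)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

for xb, yb in loader:
    logits = model(xb)                 # forward pass
    loss = loss_fn(logits, yb)         # loss computation
    loss.backward()                    # backward pass
    nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
    optimizer.step()
    optimizer.zero_grad()
```

For real GPU training you would add `num_workers` and `pin_memory=True` on the `DataLoader` and move each batch to the device inside the loop, as the instructions describe.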
## Examples
### Example 1: Fine-tune a vision model for image classification
**User request:** "Fine-tune a pretrained ResNet for classifying product images"
**Actions:**
1. Load `resnet50(weights=ResNet50_Weights.DEFAULT)` and freeze all layers except the final classifier
2. Replace the classifier head with `nn.Linear(2048, num_classes)`
3. Set up DataLoader with image augmentation transforms (RandomCrop, ColorJitter, Normalize)
4. Train with AdamW, CosineAnnealingLR scheduler, and mixed precision
**Output:** A fine-tuned image classifier with production-quality accuracy and efficient mixed-precision training.
### Example 2: Train a text classification model with Hugging Face
**User request:** "Build a sentiment analysis model using a pretrained transformer"
**Actions:**
1. Load `AutoModel.from_pretrained("bert-base-uncased")` with a classification head
2. Tokenize the dataset using `AutoTokenizer` and create a DataLoader
3. Fine-tune with AdamW, linear warmup scheduler, and gradient clipping
4. Export the trained model with `torch.export()` for production serving
**Output:** A sentiment analysis model fine-tuned on custom data and exported for production inference.
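The optimization recipe in step 3 (AdamW, linear warmup, gradient clipping) can be sketched without the transformer itself; a toy linear classifier stands in for the pretrained model, and the step counts are illustrative:

```python
import torch
from torch import nn

model = nn.Linear(32, 2)  # stand-in for the transformer + classification head
optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)

warmup_steps, total_steps = 10, 100

def lr_lambda(step):
    if step < warmup_steps:
        return step / max(1, warmup_steps)  # linear warmup from 0
    # linear decay back to 0 over the remaining steps
    return max(0.0, (total_steps - step) / (total_steps - warmup_steps))

scheduler = torch.optim.lr_scheduler.LambdaLR(optimizer, lr_lambda)

for step in range(total_steps):
    x, y = torch.randn(8, 32), torch.randint(0, 2, (8,))
    loss = nn.functional.cross_entropy(model(x), y)
    loss.backward()
    nn.utils.clip_grad_norm_(model.parameters(), 1.0)
    optimizer.step()
    optimizer.zero_grad()
    scheduler.step()
```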
## Guidelines
- Use `torch.compile(model)` on PyTorch 2.0+ for a free 20-50% speedup with one line.
- Use `AdamW` over `Adam` for correct weight decay implementation with modern architectures.
- Use mixed precision (`torch.amp`) for any GPU training to halve memory and double throughput.
- Move data to device in the training loop, not in the Dataset, to keep Dataset device-agnostic.
- Use `model.eval()` and `torch.no_grad()` during inference to prevent unnecessary gradient computation.
- Use `pin_memory=True` in DataLoader when training on GPU to speed up CPU-to-GPU data transfer.
- Save `model.state_dict()`, not the full model, since state dicts are portable across code changes.
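The checkpointing and inference guidelines above fit in one short sketch (an in-memory buffer stands in for a file path):

```python
import io
import torch
from torch import nn

model = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 2))

# Save only the state dict: portable across refactors of the module code.
buf = io.BytesIO()
torch.save(model.state_dict(), buf)

# Reload into a freshly constructed model of the same architecture.
buf.seek(0)
restored = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 2))
restored.load_state_dict(torch.load(buf))

# Inference: eval() fixes dropout/batch-norm behavior; no_grad() skips
# gradient bookkeeping entirely.
restored.eval()
with torch.no_grad():
    out = restored(torch.randn(1, 4))
```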
## Related Skills

### pytorch-model-trainer
Auto-activating skill for ML training. Triggers on "pytorch model trainer". Part of the ML Training skill category.
### pytorch-patterns
PyTorch deep learning patterns and best practices for building robust, efficient, and reproducible training pipelines, model architectures, and data loading.
### TorchTitan - PyTorch Native Distributed LLM Pretraining
### torchforge: PyTorch-Native Agentic RL Library
torchforge is Meta's PyTorch-native RL library that separates infrastructure concerns from algorithm concerns. It enables rapid RL research by letting you focus on algorithms while handling distributed training, inference, and weight sync automatically.
### Use PyTorch FSDP2 (`fully_shard`) correctly in a training script
This skill teaches a coding agent how to **add PyTorch FSDP2** to a training loop with correct initialization, sharding, mixed precision/offload configuration, and checkpointing.
### PyTorch Geometric (PyG)
### PyTorch Lightning