explore-dnn-model

Manual invocation only; use only when the user explicitly requests `explore-dnn-model` by name. Explore how to run a given DNN model checkpoint in the current Python environment by locating weights + upstream source code, resolving dependencies with user confirmation, running reproducible experiments under `tmp/`, and producing reports about I/O contracts, timing, and profiling.

Best use case

explore-dnn-model is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Teams using explore-dnn-model should expect more consistent output, faster repeated execution, and less prompt rewriting.

When to use this skill

  • You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

  • You only need a quick one-off answer and do not need a reusable workflow.
  • You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$ curl -o ~/.claude/skills/explore-dnn-model/SKILL.md --create-dirs "https://raw.githubusercontent.com/igamenovoer/magic-context/main/skills/research/explore-dnn-model/SKILL.md"

Manual Installation

  1. Download SKILL.md from GitHub
  2. Place it in .claude/skills/explore-dnn-model/SKILL.md inside your project
  3. Restart your AI agent — it will auto-discover the skill

How explore-dnn-model Compares

| Feature / Agent | explore-dnn-model | Standard Approach |
| --- | --- | --- |
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |

Frequently Asked Questions

What does this skill do?

It guides an agent through running a given DNN model checkpoint in the current Python environment: locating the weights and upstream source code, resolving dependencies with user confirmation, running reproducible experiments under `tmp/`, and producing reports on I/O contracts, timing, and profiling. It is invoked manually, only when the user explicitly requests `explore-dnn-model` by name.

Where can I find the source code?

The source code is on GitHub in the `igamenovoer/magic-context` repository, under `skills/research/explore-dnn-model/`.

SKILL.md Source

# Explore DNN Model

## Minimum Required Inputs (Hard Requirement)

To use this skill, the user must provide:
- A model checkpoint / model file(s) as a **local** file or directory path (it may be outside the workspace).

If the user provides only the checkpoint path (no model name, repo link, or source code), proceed by:
1) Attempting to identify the model name/family from the checkpoint file/dir itself (filenames, adjacent configs/README, embedded metadata, `state_dict` key patterns, etc.).
2) Searching for the implementation in the workspace and/or alongside the checkpoint directory (e.g., nearby Python packages, inference scripts, config files).
3) If still not found, using the best-guess model name/family to search online for the canonical implementation, then cloning the upstream source into `tmp/<experiment-dir>/refs/` for investigation (prefer shallow clone; record URL + commit/tag used).
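
One way to read `state_dict` key patterns is to count parameter names by their leading dotted component. The sketch below assumes the key names have already been extracted (for a PyTorch checkpoint they would typically be the keys of the loaded `state_dict`); the function name and the example keys are hypothetical.

```python
from collections import Counter


def summarize_key_prefixes(keys, depth=1):
    """Count parameter names by their first `depth` dotted components."""
    counts = Counter(".".join(key.split(".")[:depth]) for key in keys)
    return counts.most_common()


# Hypothetical key names, standing in for the keys of a loaded state_dict.
example_keys = [
    "backbone.layer1.0.conv1.weight",
    "backbone.layer1.0.bn1.weight",
    "backbone.layer2.0.conv1.weight",
    "head.fc.weight",
    "head.fc.bias",
]

print(summarize_key_prefixes(example_keys))
# → [('backbone', 3), ('head', 2)]
```

Prefixes such as `backbone`/`head` only hint at a model family; the upstream source remains the place to confirm it.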

## Goals

This skill has three goals:

1) Verify that the given DNN model can work (inference or training; default focus is **inference**) in the *current* Python environment of the workspace.
2) Determine how to use it (inference or training; default is **inference**) by reading the upstream source code and producing minimal, reproducible runs.
3) Produce two reports:
   - **Experiment report** (programmatic): generated from `tmp/<experiment-dir>/outputs/` with minimal/no reasoning.
   - **Stakeholder report** (agent-written): generated by the agent from the experiment report + outputs/logs, with deeper analysis and recommendations.

The reports cover:
   - Input and output contracts (formats, shapes, dtypes, preprocessing/postprocessing)
   - Benchmarks and performance profiling (latency/throughput/memory, device details)
   - User-provided metrics/targets (e.g., accuracy, mAP, IoU, F1, latency budget), and whether/how they are met

Before changing anything, detect how the environment is managed by checking for:
- `pixi.toml`, or a `pyproject.toml` that contains `[tool.pixi.*]` tables (Pixi-managed project)
- `.venv/` (venv-managed project)
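
The marker-file check could be sketched as follows. The function name is hypothetical, and treating a `pyproject.toml` as Pixi-managed only when it mentions `[tool.pixi` is a text heuristic assumed here, not a rule stated by the skill.

```python
from pathlib import Path


def detect_env_managers(workspace):
    """Return the environment managers suggested by marker files in `workspace`."""
    root = Path(workspace)
    found = []

    pyproject = root / "pyproject.toml"
    # Heuristic (assumption): a pyproject.toml counts as Pixi-managed only if
    # it contains a [tool.pixi...] table header.
    has_pixi_pyproject = pyproject.is_file() and "[tool.pixi" in pyproject.read_text(
        encoding="utf-8"
    )

    if (root / "pixi.toml").is_file() or has_pixi_pyproject:
        found.append("pixi")
    if (root / ".venv").is_dir():
        found.append("venv")
    return found
```

If the result contains both `"pixi"` and `"venv"`, the choice goes back to the user rather than being made automatically.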

## Dependency Policy (Ask Once, Then Apply)

If any dependency is missing:
- Do **not** install it automatically *without user confirmation*.
- List the missing packages (and versions/constraints if known) and ask the developer how to proceed.
- Provide clear options, let the developer choose, then proceed with the chosen approach.
- Once the developer confirms an approach, apply it for **all** newly required packages (no need to ask approval per package).

### Version Strategy

- First attempt: use the **latest versions** resolved by the selected package manager (`pixi`, `pip`, `uv`).
- If that fails (import/runtime errors, incompatibilities): fall back to the **specific versions/constraints** documented by the model’s upstream source code or docs.

### Preferred Options (in order)

**Pixi-managed env**
- Ask the user to choose one:
  - Modify the current Pixi environment by adding deps to the relevant manifest (`pixi.toml` / `pyproject.toml`).
  - Create a new Pixi environment specifically to test this model.
- Then use `pixi install`/`pixi run ...` to execute.
- Prefer **PyPI** packages over **conda-forge** when both are available.
- Avoid direct `pip install ...` into the Pixi environment unless the developer explicitly requests it.

**`.venv`-managed env**
- Ask the user to choose one:
  - Install deps via `pip` (or `uv pip`) into the current `.venv`.
  - Create a new venv specifically for this model (keeps the repo venv clean).

## Inputs to Collect (ask if missing)

- Model name and/or upstream repo link and/or source code path (optional but speeds up identification)
- Model task/modality if unclear (classification/detection/segmentation/embedding/audio/video/etc.)
- Checkpoint path (file/dir) and format (`.pt`, `.pth`, `.onnx`, `.engine`, etc.)
- Any known I/O contract details (expected resolution, channel order, normalization, label mapping), if the user has them
- CPU-only requirement (only if the user explicitly requests CPU-only)
- Optional: user-provided metrics/targets to evaluate (quality and/or performance)

Notes:
- Determine framework/runtime automatically from checkpoint type + upstream code/docs + what’s available in the current Python environment.
- If hardware is unspecified, default to using hardware acceleration when available (CUDA GPU, ROCm GPU, Apple MPS, etc.). Use CPU-only only if the user requested it.
- If unspecified, the default objective is to confirm the model runs end-to-end from input → output (prefer real inputs found in the workspace; synthesize as a fallback) and record end-to-end timing.
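
For PyTorch checkpoints, the device default described above might look like the sketch below. `pick_device` is a hypothetical helper; it falls back to CPU when `torch` is not installed, and it relies on ROCm builds of PyTorch reporting through `torch.cuda`.

```python
def pick_device(cpu_only=False):
    """Return a device string: 'cuda', 'mps', or 'cpu'."""
    if cpu_only:
        return "cpu"  # only when the user explicitly asked for CPU-only
    try:
        import torch  # optional; only relevant for PyTorch checkpoints
    except ImportError:
        return "cpu"
    if torch.cuda.is_available():  # also true on ROCm builds of PyTorch
        return "cuda"
    mps = getattr(torch.backends, "mps", None)  # absent on older PyTorch
    if mps is not None and mps.is_available():
        return "mps"
    return "cpu"


print(pick_device(cpu_only=True))
# → cpu
```

Other runtimes (ONNX Runtime, TensorRT) select devices through their own mechanisms, so this covers only the PyTorch case.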

## Core Workflow

### 0) Confirm artifacts and pick the target environment

- Confirm the minimum required inputs are present:
  - Checkpoint/model path is accessible locally (file/dir exists). It may be outside the workspace.
  - If model name/repo/source path is not provided, start by inferring it from the checkpoint and nearby files; if needed, locate it online and clone into `tmp/<experiment-dir>/refs/`.
- Detect environment type:
  - If both Pixi and `.venv` exist, ask the user which one should be treated as the “current” environment for this exploration.
- Device default:
  - If the user did not request CPU-only, use hardware acceleration when available (CUDA/ROCm/MPS/etc.).

### 1) Locate and read the upstream source code/docs

- First try to find the implementation locally:
  - Search the workspace and the checkpoint directory for source code, inference scripts, configs, and docs.
  - Prefer local source if it appears to be the canonical/official implementation for the checkpoint.
- If local source is not available or is clearly incomplete, use online search to find the canonical implementation:
  - Official GitHub repo, paper, model card, or vendor docs.
  - Check out the upstream repo under `tmp/<experiment-dir>/refs/<repo-name>` using a shallow clone (`--depth=1`), pinning a tag/commit when possible.
- In the chosen source tree (local or freshly checked out, pinned to a tag/commit when possible), identify:
  - The exact inference entrypoints (scripts/modules), model class, preprocessing, postprocessing, and label mapping.
  - Any config files required to construct the model (YAML/JSON/TOML).
- Do not “guess” preprocessing/postprocessing: confirm from code and/or reference examples.

### 2) Derive required dependencies

Before running the model or changing the environment, determine the minimal dependencies required to run the model by using (in priority order):
- Upstream source code (setup files, `requirements*.txt`, `pyproject.toml`, import graph).
- Upstream docs/model card (pinned versions, known-good combos).
- Checkpoint type (e.g., `.onnx` implies ONNX Runtime; `.pt/.pth` implies PyTorch; `.engine` implies TensorRT).
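
The checkpoint-type hint can be expressed as a small lookup table. The mapping below contains only the suffixes named above; the helper name is hypothetical, and an unknown suffix returns `None` so that upstream code and docs decide instead.

```python
from pathlib import Path

# Suffix -> Python import name of the implied runtime (from the hints above).
RUNTIME_BY_SUFFIX = {
    ".onnx": "onnxruntime",
    ".pt": "torch",
    ".pth": "torch",
    ".engine": "tensorrt",
}


def guess_runtime(checkpoint_path):
    """Return the implied runtime module name, or None if the suffix is unknown."""
    suffix = Path(checkpoint_path).suffix.lower()
    return RUNTIME_BY_SUFFIX.get(suffix)


print(guess_runtime("checkpoints/model.onnx"))
# → onnxruntime
```

The suffix is the lowest-priority signal in the list above, so a result here is a starting guess to confirm against the upstream source.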

Make a concise dependency list covering:
- Runtime/framework (e.g., `torch`, `onnxruntime`, `opencv-python`)
- Model-specific libs (e.g., `ultralytics`, `timm`, `transformers`, `mmengine`, etc.)
- Utility deps used by the official inference path (e.g., `numpy`, `Pillow`, `pyyaml`)
- Optional acceleration deps (CUDA/TensorRT) separated from the CPU baseline

### 3) Resolve missing dependencies (with user choice)

- Check whether each required dependency is available in the current environment.
- If anything is missing, ask the user which path to take:
  - **Pixi:** modify current manifest to add deps, or create a new Pixi env for this model.
  - **Venv:** install into current `.venv`, or create a new venv for this model.
- After the user confirms, apply the decision for all required packages (no per-package prompts).
- Use the **Version Strategy** above (latest first; fall back to pinned versions if needed).
- After dependency changes, run a quick smoke test:
  - Imports for the core runtime stack
  - Minimal “load model” path (without a full benchmark yet)
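
A minimal sketch of the availability check, using the standard-library `importlib.util.find_spec`. The helper name is hypothetical. Note that the check takes import names, which can differ from package names (`opencv-python` imports as `cv2`, `Pillow` as `PIL`, `pyyaml` as `yaml`).

```python
import importlib.util


def find_missing(import_names):
    """Return the top-level import names that cannot be found in this environment."""
    return [name for name in import_names if importlib.util.find_spec(name) is None]


# "json" is in the standard library; the second name is deliberately bogus.
print(find_missing(["json", "no_such_module_xyz_123"]))
# → ['no_such_module_xyz_123']
```

The returned list is what gets presented to the user in a single confirmation prompt; nothing is installed by this check.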

### 4) Ensure the checkpoint exists locally

- Do **not** download checkpoints automatically.
- Developers must provide checkpoints/model files (local file/dir paths).
- If the checkpoint is missing or only a URL is provided, ask the developer to download it and provide the local path.
- If the developer wants a conventional location, prefer `checkpoints/` (gitignored).
- Record provenance in a short note (based on what the developer provides):
  - Claimed source URL(s) or repo, version/commit/tag (if known), file size, and (if feasible) SHA256.
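
File size and SHA256 for a single checkpoint file can be computed with the standard library, reading in chunks so large files are not loaded into memory at once. The function name and the returned field names are assumptions made for illustration.

```python
import hashlib
from pathlib import Path


def checkpoint_fingerprint(path, chunk_size=1 << 20):
    """Return path, size in bytes, and SHA256 hex digest for one file."""
    path = Path(path)
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        # Read fixed-size chunks until read() returns b"" at end of file.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return {
        "path": str(path),
        "size_bytes": path.stat().st_size,
        "sha256": digest.hexdigest(),
    }
```

For a checkpoint directory, the same function could be applied to each file inside it and the results listed in the provenance note.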

### 5) Create an experiment workspace under `tmp/`

Default experiment directory:

`<workspace>/tmp/<experiment-slug>-<time>`

If the user specifies a different location/name, use the user-provided one instead.

Create the standard directory layout:

```
tmp/<experiment-dir>/
  README.md     # experiment intent + directory guide (keep updated)
  refs/         # checked-out upstream repos (use shallow clone for online checkouts)
    README.md
  scripts/      # throwaway but reproducible scripts (committed if useful)
    README.md
  inputs/       # downloaded/synthesized test inputs
    README.md
  outputs/      # artifacts + machine-readable stats (e.g., `stats.json`)
    README.md
  logs/         # logs (stdout/stderr, profiling traces, command transcripts)
    README.md
  reports/      # markdown notes: what was tried, params, results
    README.md
    figures/    # images embedded in reports
    experiment-report.md
    stakeholder-report.md
```

Shell safety note (avoid accidental directory names):
- Do **not** use bash brace expansion to create these folders (e.g., `mkdir -p "$exp"/{refs,scripts,...}`), because quoting/spacing mistakes can create literal directories like `{refs,scripts,...}`.
- Prefer a simple loop or explicit `mkdir -p` calls, for example:

  ```
  exp="tmp/<experiment-dir>"
  mkdir -p "$exp"
  for d in refs scripts inputs outputs logs reports reports/figures; do
    mkdir -p "$exp/$d"
  done
  ```

Conventions:
- Use relative paths from `tmp/<experiment-dir>` in scripts so the folder is movable.
- Keep scripts small and single-purpose (`01_download_inputs.py`, `10_infer.py`, `20_visualize.py`, …).
- Run Python via the selected environment manager:
  - Pixi: `pixi run python ...`
  - Venv: use the venv’s Python (avoid system Python)

README requirements:
- Create `tmp/<experiment-dir>/README.md` to describe:
  - The intention of the experiment (what model, what checkpoint, what question you’re answering)
  - How to reproduce (one-line pointer to the primary script(s))
  - A brief map of what each top-level subdir contains
- Each top-level subdir must have its own `README.md` that:
  - Describes what belongs in the folder
  - Notes any important changes (append a short “Changes” section as you iterate)
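
The layout and README requirements above could also be set up from Python, as in this sketch. The timestamp format, the README stub text, and the function name are assumptions; the skill only specifies `<experiment-slug>-<time>` and what the READMEs should describe.

```python
from datetime import datetime
from pathlib import Path

# Top-level subdirs from the standard layout, each with a one-line purpose.
SUBDIRS = {
    "refs": "Checked-out upstream repos (shallow clones).",
    "scripts": "Small, single-purpose, reproducible scripts.",
    "inputs": "Test inputs found locally or synthesized.",
    "outputs": "Artifacts and machine-readable stats (e.g., stats.json).",
    "logs": "stdout/stderr, profiling traces, command transcripts.",
    "reports": "Markdown reports and notes.",
}


def create_experiment_dir(workspace, slug, now=None):
    """Create tmp/<slug>-<time>/ with the standard subdirs and README stubs."""
    stamp = (now or datetime.now()).strftime("%Y%m%d-%H%M%S")  # assumed format
    exp = Path(workspace) / "tmp" / f"{slug}-{stamp}"
    (exp / "reports" / "figures").mkdir(parents=True, exist_ok=True)
    (exp / "README.md").write_text(
        f"# {slug}\n\nIntent: TODO\n\nHow to reproduce: TODO\n", encoding="utf-8"
    )
    for name, purpose in SUBDIRS.items():
        sub = exp / name
        sub.mkdir(exist_ok=True)
        readme = sub / "README.md"
        if not readme.exists():  # do not overwrite notes from earlier iterations
            readme.write_text(f"# {name}/\n\n{purpose}\n\n## Changes\n", encoding="utf-8")
    return exp
```

Calling `create_experiment_dir(".", "my-model")` from the workspace root would create `tmp/my-model-<time>/`; the TODO placeholders are meant to be filled in as the experiment proceeds.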

### 6) Collect or synthesize inputs

- First try to find suitable inputs already present in the workspace (e.g., under `datasets/`, `downloads/`, or other project-specific data dirs) based on what you learned from the checkpoint/source code (task, modality, expected resolution, file types).
- If no suitable inputs exist locally, synthesize minimal inputs that satisfy the model contract (e.g., generated images, random tensors saved in the expected container format, short synthetic video).
- Save all chosen/generated inputs under `tmp/<experiment-dir>/inputs/`.
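
As one example of a synthesized input, the sketch below writes a random RGB image in binary PPM format using only the standard library. The choice of PPM, the function name, and the fixed seed are assumptions for illustration; the real resolution and file format should follow the model's confirmed input contract.

```python
import random
from pathlib import Path


def write_random_ppm(path, width, height, seed=0):
    """Write a width x height random RGB image as a binary PPM (P6) file."""
    rng = random.Random(seed)  # fixed seed keeps the input reproducible
    pixels = bytes(rng.randrange(256) for _ in range(width * height * 3))
    header = f"P6\n{width} {height}\n255\n".encode("ascii")
    out = Path(path)
    out.write_bytes(header + pixels)
    return out
```

A call such as `write_random_ppm("inputs/random_224.ppm", 224, 224)`, run from inside the experiment directory, would place the file under `inputs/`. Random pixels are enough to check that the pipeline runs end to end, but they say nothing about output quality.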

### 7) Run minimal, traceable inference experiments (default: inference + end-to-end timing)

- Start with a single known-good example (from upstream repo) if available.
- Save every “input → output” mapping:
  - Inputs: the exact file(s) used + preprocessing parameters.
  - Outputs: raw model outputs + any decoded/visualized artifacts.
  - Command line + environment notes (device, precision, batch size).
- Measure end-to-end timing by default:
  - At minimum: one cold run + a small number of warm runs (record mean/median).
- Persist stats that will appear in the report:
  - For any timing/profiling/memory/throughput numbers you plan to put into the report, also write a JSON version under `tmp/<experiment-dir>/outputs/` (e.g., `outputs/stats.json`).
- Capture logs by default:
  - Save stdout/stderr and command transcripts under `tmp/<experiment-dir>/logs/`.
- If the model is accessed via HTTP/gRPC, save request/response payloads (sanitized) under `reports/` and/or `outputs/`.
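
The cold-plus-warm timing described above can be sketched with `time.perf_counter`. The function name and JSON field names are hypothetical, and `fn` stands in for one complete input → output run. On a GPU, `fn` would also need to wait for device work to finish (for PyTorch, for example, by calling `torch.cuda.synchronize()`) before the clock stops.

```python
import json
import statistics
import time


def time_runs(fn, warm_runs=5):
    """Time one cold call and `warm_runs` (>= 1) warm calls of fn, in milliseconds."""

    def timed():
        start = time.perf_counter()
        fn()
        return (time.perf_counter() - start) * 1000.0

    cold_ms = timed()  # first call includes lazy initialization and cache misses
    warm_ms = [timed() for _ in range(warm_runs)]
    return {
        "cold_ms": cold_ms,
        "warm_runs": warm_runs,
        "warm_mean_ms": statistics.mean(warm_ms),
        "warm_median_ms": statistics.median(warm_ms),
    }


# Stand-in workload; a real script would call the model's inference path here.
stats = time_runs(lambda: sum(range(100_000)), warm_runs=3)
print(json.dumps(stats, indent=2))
```

Writing the same dictionary to `outputs/stats.json` keeps the numbers in the report traceable to a machine-readable source.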

### 7b) (Optional) Training sanity check

If the user asks to validate training (or if inference is insufficient to validate “works”):
- Start with a minimal configuration (single batch / tiny subset) to confirm the forward + backward pass runs.
- Record key configs (optimizer, LR, batch size, mixed precision) and any dataset assumptions.
- Do not run long trainings unless the user explicitly requests it.

### 8) Produce reports

#### 8a) Ensure machine-readable report inputs exist (in `outputs/`)

Write/collect machine-readable files in `tmp/<experiment-dir>/outputs/` that the report generator can consume, at minimum:
- `stats.json` (timing/throughput/memory/profile numbers)
- A JSON describing key parameters used (preprocess/postprocess/runtime thresholds)
- A JSON describing the I/O contract (input expectations + output structure)
- A JSON listing key artifacts produced (paths to representative inputs/outputs)

Keep these JSON files as the source of truth for anything that will appear as “final stats” in the experiment report.

#### 8b) Generate `reports/experiment-report.md` programmatically

- Generate `tmp/<experiment-dir>/reports/experiment-report.md` by reading only `tmp/<experiment-dir>/outputs/` (and optionally `logs/` for pointers), with minimal/no reasoning.
- If images are part of the inputs/outputs, copy representative images into `tmp/<experiment-dir>/reports/figures/` and embed them in the markdown via relative paths (e.g., `figures/<name>.png`).
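
A minimal sketch of such a generator: it reads every `*.json` file in `outputs/` and renders each top-level key as a bullet, with no interpretation. The function name and the Markdown layout are assumptions; image copying and `logs/` pointers are left out.

```python
import json
from pathlib import Path


def render_experiment_report(outputs_dir):
    """Render all JSON files in outputs_dir as a plain Markdown report string."""
    lines = ["# Experiment report", ""]
    for json_path in sorted(Path(outputs_dir).glob("*.json")):
        data = json.loads(json_path.read_text(encoding="utf-8"))
        lines.append(f"## {json_path.name}")
        lines.append("")
        # Non-dict payloads (lists, scalars) are shown under a single "value" key.
        items = data.items() if isinstance(data, dict) else [("value", data)]
        for key, value in items:
            lines.append(f"- `{key}`: {json.dumps(value)}")
        lines.append("")
    return "\n".join(lines)
```

The returned string would be written to `reports/experiment-report.md`; because it is derived only from the JSON files, regenerating it after a rerun keeps the report and the stats in sync.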

#### 8c) Write `reports/stakeholder-report.md` (agent-written)

- Read `reports/experiment-report.md` plus relevant `outputs/` and `logs/`.
- Produce `tmp/<experiment-dir>/reports/stakeholder-report.md` with deeper analysis that requires reasoning:
  - Interpret results vs expectations/targets
  - Call out risks, assumptions, and failure modes
  - Recommend next experiments and concrete integration guidance (if requested)
  - Summarize “go/no-go” criteria and what remains unknown

Also include:
- **Benchmark & profiling** results:
  - CPU/GPU model, RAM/VRAM, OS, Python version, key library versions
  - Latency breakdown if possible (preprocess / model / postprocess)
  - Throughput (items/s) and peak memory/VRAM
- **Stats JSON**:
  - For any stats included in the report, ensure the same values exist in a JSON file under `tmp/<experiment-dir>/outputs/` (e.g., `outputs/stats.json`).
- **User metrics** (if provided):
  - The metric definition + measurement method
  - Results on the chosen evaluation inputs
  - Any deltas vs the user’s targets and suggested next experiments
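
Deltas against user targets can be tabulated mechanically before the agent interprets them. In this sketch each target carries a direction (`"min"` for metrics that should be at least the target, `"max"` for budgets that should not be exceeded). The function name, the field names, and all numbers are made up for illustration.

```python
def compare_to_targets(measured, targets):
    """targets maps metric name -> (target value, 'min' or 'max')."""
    rows = []
    for name, (target, kind) in targets.items():
        value = measured.get(name)
        if value is None:
            rows.append({"metric": name, "status": "not measured"})
            continue
        met = value >= target if kind == "min" else value <= target
        rows.append(
            {"metric": name, "value": value, "target": target,
             "delta": value - target, "met": met}
        )
    return rows


# Made-up example values, not results from any real model.
measured = {"mAP": 0.5, "latency_ms": 30.0}
targets = {"mAP": (0.25, "min"), "latency_ms": (25.0, "max"), "f1": (0.9, "min")}
for row in compare_to_targets(measured, targets):
    print(row)
```

Metrics that were requested but not measured are reported explicitly rather than dropped, which feeds the "what remains unknown" part of the stakeholder report.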

## Guardrails

- Do not commit large checkpoints or huge outputs; keep them under gitignored paths (`checkpoints/`, `tmp/`).
- Respect upstream licenses; record the repo URL + commit/tag in `reports/`.
- Avoid modifying runtime code under `src/` unless the user explicitly requests integration; keep exploration isolated to `tmp/<experiment-dir>`.
