wandb-weave

Query and analyze W&B experiment data and Weave LLM traces using Python scripts. Use when working with Weights & Biases data, including (1) querying ML experiment runs, metrics, and hyperparameters, (2) analyzing LLM traces and evaluations, (3) creating W&B reports, (4) listing projects and entities.

16 stars

Best use case

wandb-weave is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Teams using wandb-weave should expect more consistent output, faster repeated execution, and less prompt rewriting.

When to use this skill

  • You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

  • You only need a quick one-off answer and do not need a reusable workflow.
  • You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$ curl -o ~/.claude/skills/wandb-weave/SKILL.md --create-dirs "https://raw.githubusercontent.com/diegosouzapw/awesome-omni-skill/main/skills/data-ai/wandb-weave/SKILL.md"

Manual Installation

  1. Download SKILL.md from GitHub
  2. Place it in .claude/skills/wandb-weave/SKILL.md inside your project
  3. Restart your AI agent — it will auto-discover the skill

How wandb-weave Compares

| Feature / Agent | wandb-weave | Standard Approach |
|-----------------|-------------|-------------------|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |

Frequently Asked Questions

What does this skill do?

Query and analyze W&B experiment data and Weave LLM traces using Python scripts. Use when working with Weights & Biases data, including (1) querying ML experiment runs, metrics, and hyperparameters, (2) analyzing LLM traces and evaluations, (3) creating W&B reports, (4) listing projects and entities.

Where can I find the source code?

The source code is on GitHub in the diegosouzapw/awesome-omni-skill repository, under skills/data-ai/wandb-weave/.

SKILL.md Source

# W&B & Weave Data Tools

Python scripts to query W&B experiment data and Weave LLM traces.

## Prerequisites

```bash
pip install wandb weave
export WANDB_API_KEY="your-api-key"
```
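
Before running any of the scripts, it can help to confirm that the API key is actually visible to Python. The following is a minimal sketch; the helper name `wandb_api_key_present` is hypothetical and not part of the skill.

```python
import os


def wandb_api_key_present(env=None):
    """Return True when WANDB_API_KEY is set to a non-empty value."""
    # Hypothetical helper: accepts a mapping so it can be checked without
    # touching the real environment.
    env = os.environ if env is None else env
    return bool(env.get("WANDB_API_KEY", "").strip())


if __name__ == "__main__":
    if wandb_api_key_present():
        print("WANDB_API_KEY is set")
    else:
        print("WANDB_API_KEY is missing; export it before running the scripts")
```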

## Workflow Decision Tree

```
What do you want to do?
│
├─ Query ML experiments (runs, metrics, sweeps)
│  └─ Run: scripts/query_runs.py
│
├─ Analyze LLM traces
│  ├─ Need trace data? → scripts/query_traces.py
│  └─ Just need count? → scripts/query_traces.py --count-only
│
├─ Create a report
│  └─ Run: scripts/create_report.py
│
└─ List projects
   └─ Run: scripts/list_projects.py
```
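
The decision tree above can also be expressed as a small lookup table, which may be convenient when an agent has to pick a script programmatically. This is only a sketch: the task names (`query_runs`, `count_traces`, and so on) are hypothetical labels, while the script paths and the `--count-only` flag are taken from the tree.

```python
# Hypothetical task labels mapped to the scripts named in the decision tree.
SCRIPT_FOR_TASK = {
    "query_runs": ["scripts/query_runs.py"],
    "query_traces": ["scripts/query_traces.py"],
    "count_traces": ["scripts/query_traces.py", "--count-only"],
    "create_report": ["scripts/create_report.py"],
    "list_projects": ["scripts/list_projects.py"],
}


def command_for(task, *args):
    """Build the argv list for a task; positional args go before the flags."""
    if task not in SCRIPT_FOR_TASK:
        raise ValueError(f"unknown task: {task!r}")
    script, *flags = SCRIPT_FOR_TASK[task]
    return ["python", script, *args, *flags]
```

For example, `command_for("count_traces", "my-team", "my-project")` yields an argv list that could be passed to `subprocess.run`.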

## Scripts

### query_runs.py

Query W&B experiment runs with filtering and sorting.

```bash
# List recent runs
python scripts/query_runs.py <entity> <project> --limit 10

# Filter by state
python scripts/query_runs.py my-team my-project --state finished

# Sort by metric (best first)
python scripts/query_runs.py my-team my-project --sort "-summary_metrics.accuracy"

# Custom filter
python scripts/query_runs.py my-team my-project --filter '{"config.model": "gpt-4"}'
```

| Option | Description |
|--------|-------------|
| `--limit N` | Max results (default: 20) |
| `--state` | Filter: running, finished, crashed, failed |
| `--sort` | Sort field (prefix `-` for desc) |
| `--filter` | JSON filter dict |
| `--output` | json or table |
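
The script's internals are not shown here, but the options above line up closely with the `filters` and `order` arguments of `wandb.Api().runs()`. The sketch below illustrates one plausible mapping; `build_run_query` and `fetch_runs` are hypothetical names, and combining `--state` with `--filter` through `$and` is an assumption rather than the script's documented behavior.

```python
import json
from itertools import islice


def build_run_query(state=None, filter_json=None, sort=None):
    """Translate CLI-style options into keyword arguments for Api.runs()."""
    clauses = []
    if state:
        clauses.append({"state": state})
    if filter_json:
        clauses.append(json.loads(filter_json))

    if not clauses:
        filters = {}
    elif len(clauses) == 1:
        filters = clauses[0]
    else:
        # Assumption: both conditions must hold, so join them with $and.
        filters = {"$and": clauses}

    kwargs = {"filters": filters}
    if sort:
        kwargs["order"] = sort  # e.g. "-summary_metrics.accuracy"
    return kwargs


def fetch_runs(entity, project, limit=20, **options):
    """Fetch up to `limit` runs; needs network access and WANDB_API_KEY."""
    import wandb  # imported lazily so build_run_query works without wandb

    runs = wandb.Api().runs(f"{entity}/{project}", **build_run_query(**options))
    return [
        {"id": run.id, "name": run.name, "state": run.state}
        for run in islice(runs, limit)
    ]
```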

### query_traces.py

Query Weave LLM traces with filtering.

```bash
# List recent traces
python scripts/query_traces.py <entity> <project> --limit 50

# Filter by status
python scripts/query_traces.py my-team my-project --status success

# Filter by model
python scripts/query_traces.py my-team my-project --model gpt-4o

# Find slow traces
python scripts/query_traces.py my-team my-project --min-latency 5000

# Count only
python scripts/query_traces.py my-team my-project --count-only
```

| Option | Description |
|--------|-------------|
| `--limit N` | Max results (default: 50) |
| `--status` | Filter: success, error, running |
| `--model` | Filter by model name |
| `--min-latency` | Min latency in ms |
| `--roots-only` | Only root traces |
| `--count-only` | Return count, not data |
| `--filter` | Custom JSON filter (advanced) |

For advanced filter syntax (when `--status`, `--model`, `--min-latency` are not enough), see [references/weave_filters.md](references/weave_filters.md).
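
To make the `--min-latency` option concrete, here is a minimal sketch of a client-side latency filter. It assumes each call object exposes `started_at` and `ended_at` datetimes and that the client returned by `weave.init()` offers `get_calls(limit=...)`; both should be checked against the installed Weave version. Whether `query_traces.py` filters on the server or the client is not stated, and the function names are hypothetical.

```python
from datetime import datetime, timedelta


def latency_ms(started_at, ended_at):
    """Elapsed time between two datetimes, in milliseconds."""
    return (ended_at - started_at).total_seconds() * 1000.0


def slow_calls(calls, min_latency_ms):
    """Keep finished calls whose latency is at least min_latency_ms."""
    slow = []
    for call in calls:
        if call.ended_at is None:  # still running, no latency yet
            continue
        if latency_ms(call.started_at, call.ended_at) >= min_latency_ms:
            slow.append(call)
    return slow


def fetch_slow_calls(entity, project, min_latency_ms, limit=50):
    """Fetch recent calls and filter them locally; needs network access."""
    import weave  # imported lazily so the helpers above work without weave

    client = weave.init(f"{entity}/{project}")
    return slow_calls(client.get_calls(limit=limit), min_latency_ms)
```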

### list_projects.py

List entities and projects.

```bash
# List all entities and projects
python scripts/list_projects.py

# List projects for specific entity
python scripts/list_projects.py my-team

# List entities only
python scripts/list_projects.py --entities-only
```
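
For readers who want the same listing from Python, the sketch below uses `wandb.Api().projects()`. The function name `project_names` is hypothetical, and passing the API object in as a parameter is a design choice made here so the helper can be exercised with a stand-in object. Listing entities is not covered.

```python
def project_names(api, entity):
    """Return sorted project names for an entity from a wandb.Api-like object."""
    return sorted(project.name for project in api.projects(entity=entity))


def main(entity):
    """Print project names for an entity; needs network access and WANDB_API_KEY."""
    import wandb  # imported lazily so project_names works without wandb

    for name in project_names(wandb.Api(), entity):
        print(name)
```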

### create_report.py

Create W&B reports programmatically.

```bash
# Create with inline content
python scripts/create_report.py my-team my-project "Weekly Summary" \
    --content "## Results\n\n- Accuracy: 95%\n- Loss: 0.05"

# Create from markdown file
python scripts/create_report.py my-team my-project "Analysis" --file report.md

# With description
python scripts/create_report.py my-team my-project "Q4 Report" \
    --content "..." --description "Quarterly analysis"
```
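
Note that bash passes `\n` inside double quotes as two literal characters (a backslash and an `n`), so something has to turn them into real newlines. How `create_report.py` handles this is not shown; the sketch below is one possible approach. `resolve_content` and `save_report` are hypothetical names, letting `--file` take precedence is an assumption, and the report call relies on the separate `wandb-workspaces` package, which is not listed in the prerequisites.

```python
from pathlib import Path


def resolve_content(content=None, file=None):
    """Return report markdown from a file path or an inline string."""
    if file is not None:  # assumption: a file takes precedence over inline text
        return Path(file).read_text(encoding="utf-8")
    if content is None:
        raise ValueError("provide either content or file")
    # Turn the literal two-character sequence \n into a real newline.
    return content.replace("\\n", "\n")


def save_report(entity, project, title, markdown, description=""):
    """Create and save a report; needs network access and WANDB_API_KEY."""
    import wandb_workspaces.reports.v2 as wr  # pip install wandb-workspaces

    report = wr.Report(
        entity=entity, project=project, title=title, description=description
    )
    report.blocks = [wr.MarkdownBlock(text=markdown)]
    report.save()
    return report
```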

## Common Workflows

### Analyze Experiment Performance

```bash
# 1. Find your project
python scripts/list_projects.py my-team

# 2. Query best runs
python scripts/query_runs.py my-team my-project \
    --state finished \
    --sort "-summary_metrics.accuracy" \
    --limit 10

# 3. Create summary report
python scripts/create_report.py my-team my-project "Best Runs" \
    --content "## Top 10 Runs by Accuracy\n\n..."
```
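
Step 3 leaves the report body elided. One way to generate it is to render the queried runs as a markdown table, as sketched below. The record shape (dicts with `name` and `summary` keys) is hypothetical and would need to match whatever `query_runs.py --output json` actually emits.

```python
def top_runs_markdown(runs, metric="accuracy", top=10):
    """Render the best runs by a summary metric as a markdown table."""
    # Skip runs that never logged the metric, then sort best-first.
    scored = [run for run in runs if metric in run["summary"]]
    ranked = sorted(scored, key=lambda run: run["summary"][metric], reverse=True)
    ranked = ranked[:top]

    lines = [
        f"## Top {len(ranked)} Runs by {metric.capitalize()}",
        "",
        f"| Run | {metric} |",
        "|-----|-----|",
    ]
    lines += [f"| {run['name']} | {run['summary'][metric]:.4f} |" for run in ranked]
    return "\n".join(lines)
```

The returned string could be written to a file and passed to `create_report.py` with `--file`.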

### Debug LLM Application

```bash
# 1. Count errors
python scripts/query_traces.py my-team my-project --status error --count-only

# 2. Get error details
python scripts/query_traces.py my-team my-project --status error --limit 20

# 3. Find slow traces
python scripts/query_traces.py my-team my-project --min-latency 5000
```
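
After step 2, it is often useful to see which errors dominate. The sketch below groups failed calls by the first line of their exception text. It assumes each call object has an `exception` attribute that is a string for failed calls and `None` otherwise; the function name `error_breakdown` is hypothetical.

```python
from collections import Counter


def error_breakdown(calls):
    """Count failed calls by the first line of their exception message."""
    counts = Counter()
    for call in calls:
        if call.exception:  # None or empty means the call did not fail
            counts[call.exception.splitlines()[0]] += 1
    # Most frequent error first.
    return counts.most_common()
```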

## Resources

- **Advanced trace filters**: Load [references/weave_filters.md](references/weave_filters.md) when the `--filter` option is needed for complex queries that the built-in options do not cover.
