braintrust-analyze

Analyze Claude Code sessions via Braintrust

422 stars

Best use case

braintrust-analyze is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Teams using braintrust-analyze should expect more consistent output, faster repeated execution, and less prompt rewriting.

When to use this skill

  • You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

  • You only need a quick one-off answer and do not need a reusable workflow.
  • You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$ curl -o ~/.claude/skills/braintrust-analyze/SKILL.md --create-dirs "https://raw.githubusercontent.com/vibeeval/vibecosystem/main/skills/braintrust-analyze/SKILL.md"

Manual Installation

  1. Download SKILL.md from GitHub
  2. Place it in .claude/skills/braintrust-analyze/SKILL.md inside your project
  3. Restart your AI agent; it will auto-discover the skill

How braintrust-analyze Compares

| Feature / Agent | braintrust-analyze | Standard Approach |
|---|---|---|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |

Frequently Asked Questions

What does this skill do?

It analyzes your Claude Code sessions for patterns, issues, and insights using Braintrust tracing data.

Where can I find the source code?

The source code is on GitHub in the vibeeval/vibecosystem repository, under skills/braintrust-analyze.

SKILL.md Source

# Braintrust Analysis

Analyze your Claude Code sessions for patterns, issues, and insights using Braintrust tracing data.

## When to Use

- After completing a complex task (retrospective)
- When debugging why something failed
- Weekly review of productivity patterns
- Finding opportunities to create new skills
- Understanding token usage trends

## Commands

Run from the project directory:

```bash
# Analyze last session - summary with tool/agent/skill breakdown
uv run python -m runtime.harness scripts/braintrust_analyze.py --last-session

# List recent sessions
uv run python -m runtime.harness scripts/braintrust_analyze.py --sessions 5

# Agent usage statistics (last 7 days)
uv run python -m runtime.harness scripts/braintrust_analyze.py --agent-stats

# Skill usage statistics (last 7 days)
uv run python -m runtime.harness scripts/braintrust_analyze.py --skill-stats

# Detect loops - find repeated tool patterns (more than 5 calls to the same tool)
uv run python -m runtime.harness scripts/braintrust_analyze.py --detect-loops

# Replay specific session - show full sequence of actions
uv run python -m runtime.harness scripts/braintrust_analyze.py --replay <session-id>

# Weekly summary - daily activity breakdown
uv run python -m runtime.harness scripts/braintrust_analyze.py --weekly-summary

# Token trends - usage over time
uv run python -m runtime.harness scripts/braintrust_analyze.py --token-trends
```

## Options

- `--project NAME` - Braintrust project name (default: agentica)

## What You'll Learn

### Session Analysis
- Tool usage breakdown
- Agent spawns (plan-agent, debug-agent, etc.)
- Skill activations (/commit, /research, etc.)
- Token consumption estimates
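
The tool usage breakdown amounts to counting tool calls by name. The sketch below is illustrative only: the `spans` records, their `type` and `name` fields, and the `tool_breakdown` helper are hypothetical and are not taken from `braintrust_analyze.py` or the Braintrust span schema.

```python
from collections import Counter

# Hypothetical span records; the real Braintrust span schema may differ.
spans = [
    {"type": "tool", "name": "Read"},
    {"type": "tool", "name": "Bash"},
    {"type": "llm", "name": "completion"},  # non-tool span, ignored below
    {"type": "tool", "name": "Read"},
    {"type": "tool", "name": "Edit"},
    {"type": "tool", "name": "Read"},
    {"type": "tool", "name": "Bash"},
    {"type": "tool", "name": "Edit"},
    {"type": "tool", "name": "Read"},
]


def tool_breakdown(spans):
    """Count tool spans by name, most frequent first."""
    counts = Counter(s["name"] for s in spans if s.get("type") == "tool")
    return counts.most_common()


for name, count in tool_breakdown(spans):
    print(f"- {name}: {count}")
```

With this sample data the loop prints one line per tool in the same `- Name: count` shape that the session summary uses.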

### Loop Detection
Find sessions where the same tool was called repeatedly, which may indicate:
- The agent is stuck in a search loop
- The approach is inefficient
- There is an opportunity for better tooling
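
One way to picture loop detection is as a scan for long runs of the same tool. The sketch below assumes "repeated" means consecutive calls and reuses the "more than 5" threshold from the command comment. The actual script may count differently, and `detect_loops` is a hypothetical helper, not part of it.

```python
from itertools import groupby


def detect_loops(tool_calls, threshold=5):
    """Return (tool, run_length) for every run of consecutive
    identical tool calls that is longer than `threshold`."""
    loops = []
    for tool, run in groupby(tool_calls):
        length = sum(1 for _ in run)
        if length > threshold:
            loops.append((tool, length))
    return loops


# Hypothetical call sequence: seven Grep calls in a row.
calls = ["Read"] + ["Grep"] * 7 + ["Edit", "Bash"]
print(detect_loops(calls))  # [('Grep', 7)]
```

Counting consecutive runs rather than per-session totals keeps ordinary heavy use of a tool (many `Read` calls spread across a session) from being flagged as a loop.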

### Usage Patterns
- Which agents you use most
- Which skills get activated
- Daily/weekly activity trends

## Examples

### Quick Retrospective
```bash
# What happened in my last session?
uv run python -m runtime.harness scripts/braintrust_analyze.py --last-session
```

Output:
```
## Session Analysis
**ID:** `92940b91...`
**Started:** 2025-12-24T01:31:05Z
**Spans:** 14

### Tool Usage
- Read: 4
- Bash: 2
- Edit: 2
...
```

### Find Loops
```bash
uv run python -m runtime.harness scripts/braintrust_analyze.py --detect-loops
```

### Weekly Review
```bash
uv run python -m runtime.harness scripts/braintrust_analyze.py --weekly-summary
```

## Requirements

- `BRAINTRUST_API_KEY` set in `~/.claude/.env` or the project's `.env`
- Braintrust tracing enabled (via braintrust-claude-plugin)
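
Before running the commands, it can help to confirm that the API key is present in one of the two locations listed above. This is a minimal sketch assuming plain `KEY=value` lines (optionally prefixed with `export`). `env_file_has_key` is a hypothetical helper, and the skill itself may load the files differently.

```python
from pathlib import Path


def env_file_has_key(path, key="BRAINTRUST_API_KEY"):
    """Return True if `path` exists and assigns a non-empty value to `key`."""
    path = Path(path)
    if not path.is_file():
        return False
    for line in path.read_text().splitlines():
        line = line.strip()
        if line.startswith("export "):
            line = line[len("export "):]
        if line.startswith("#") or "=" not in line:
            continue  # skip comments and lines that are not assignments
        name, _, value = line.partition("=")
        if name.strip() == key and value.strip().strip("'\""):
            return True
    return False


# The two locations named in the requirements list.
candidates = [Path.home() / ".claude" / ".env", Path.cwd() / ".env"]
found = [str(p) for p in candidates if env_file_has_key(p)]
print("Key found in:", found if found else "none of the checked .env files")
```

The check only reads the files and never prints the key's value, so its output is safe to paste into a bug report.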

Related Skills

braintrust-tracing

422
from vibeeval/vibecosystem

Braintrust tracing for Claude Code - hook architecture, sub-agent correlation, debugging

workflow-router

422
from vibeeval/vibecosystem

Goal-based workflow orchestration - routes tasks to specialist agents based on user goals

wiring

422
from vibeeval/vibecosystem

Wiring Verification

websocket-patterns

422
from vibeeval/vibecosystem

Connection management, room patterns, reconnection strategies, message buffering, and binary protocol design.

visual-verdict

422
from vibeeval/vibecosystem

Screenshot comparison QA for frontend development. Takes a screenshot of the current implementation, scores it across multiple visual dimensions, and returns a structured PASS/REVISE/FAIL verdict with concrete fixes. Use when implementing UI from a design reference or verifying visual correctness.

verification-loop

422
from vibeeval/vibecosystem

Comprehensive verification system covering build, types, lint, tests, security, and diff review before a PR.

vector-db-patterns

422
from vibeeval/vibecosystem

Embedding strategies, ANN algorithms, hybrid search, RAG chunking strategies, and reranking for semantic search and retrieval.

variant-analysis

422
from vibeeval/vibecosystem

Find similar vulnerabilities across a codebase after discovering one instance. Uses pattern matching, AST search, Semgrep/CodeQL queries, and manual tracing to propagate findings. Adapted from Trail of Bits. Use after finding a bug to check if the same pattern exists elsewhere.

validate-agent

422
from vibeeval/vibecosystem

Validation agent that validates plan tech choices against current best practices

tracing-patterns

422
from vibeeval/vibecosystem

OpenTelemetry setup, span context propagation, sampling strategies, Jaeger queries

tour

422
from vibeeval/vibecosystem

Friendly onboarding tour of Claude Code capabilities for users asking what it can do.

tldr-stats

422
from vibeeval/vibecosystem

Show full session token usage, costs, TLDR savings, and hook activity