retrieving-mlflow-traces

Retrieves MLflow traces using CLI or Python API. Use when the user asks to get a trace by ID, find traces, filter traces by status/tags/metadata/execution time, query traces, or debug failed traces. Triggers on "get trace", "search traces", "find failed traces", "filter traces by", "traces slower than", "query MLflow traces".

by msbaek

Best use case

retrieving-mlflow-traces is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Teams using retrieving-mlflow-traces should expect more consistent output, faster repeated execution, and less prompt rewriting.

When to use this skill

You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

You only need a quick one-off answer and do not need a reusable workflow.
You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$ curl -o ~/.claude/skills/retrieving-mlflow-traces/SKILL.md --create-dirs "https://raw.githubusercontent.com/msbaek/dotfiles/main/.claude/skills/retrieving-mlflow-traces/SKILL.md"

Manual Installation

1. Download SKILL.md from GitHub.
2. Place it in .claude/skills/retrieving-mlflow-traces/SKILL.md inside your project.
3. Restart your AI agent; it will auto-discover the skill.

How retrieving-mlflow-traces Compares

| Feature / Agent | retrieving-mlflow-traces | Standard Approach |
|-----------------|--------------------------|-------------------|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |

SKILL.md Source

# Retrieving MLflow Traces

## Single Fetch vs Search

Choose the right approach based on what you have:

| You have... | Use | Command |
|-------------|-----|---------|
| **Trace ID** | Single fetch | `mlflow traces get --trace-id <id>` |
| **Session/user/filters** | Search | `mlflow traces search --experiment-id <id> --filter-string "..."` |

**Single fetch** - Use when you have a specific trace ID (e.g., from UI, logs, or API response):
```bash
mlflow traces get --trace-id tr-69f72a3772570019f2f91b75b8b5ded9
```

**Search** - Use when you need to find traces by criteria (session, user, status, time range, etc.):
```bash
mlflow traces search --experiment-id 1 --filter-string "metadata.\`mlflow.trace.session\` = 'session_abc'"
```

---

## Trace Data Structure

- **TraceInfo**: `trace_id`, `status` (OK/ERROR/IN_PROGRESS), `timestamp_ms`, `execution_time_ms`, `tags`, `metadata`, `assessments` (human feedback, evaluation results)
- **Spans**: Tree of operations with `name`, `type`, `attributes`, `start_time`, `end_time`
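
To make the two-part shape concrete, here is a minimal sketch using Python dataclasses. The classes are illustrative stand-ins that mirror the field names listed above; they are not MLflow's own entity classes, whose attribute names may differ between versions. The `children` list is an assumption used only to model the span tree.

```python
from dataclasses import dataclass, field


@dataclass
class SpanSketch:
    """Illustrative span node; NOT MLflow's Span class."""
    name: str
    type: str
    start_time: int  # assumed to be an integer timestamp
    end_time: int
    attributes: dict = field(default_factory=dict)
    children: list = field(default_factory=list)  # hypothetical tree link


@dataclass
class TraceInfoSketch:
    """Illustrative trace summary; NOT MLflow's TraceInfo class."""
    trace_id: str
    status: str  # e.g. "OK" or "ERROR"
    timestamp_ms: int
    execution_time_ms: int
    tags: dict = field(default_factory=dict)
    metadata: dict = field(default_factory=dict)
    assessments: list = field(default_factory=list)


def walk(span, depth=0):
    """Yield (depth, name) pairs for a span tree, parents before children."""
    yield depth, span.name
    for child in span.children:
        yield from walk(child, depth + 1)


root = SpanSketch("agent", "AGENT", 0, 120, children=[
    SpanSketch("retrieve", "RETRIEVER", 5, 40),
    SpanSketch("generate", "LLM", 45, 115),
])
info = TraceInfoSketch("tr-example", "OK", 1_700_000_000_000, 120)
print(list(walk(root)))
```

The summary fields are what search filters operate on, while the span tree is what you inspect once a single trace has been fetched.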

## Workflow

1. **Check CLI usage** (required): `mlflow traces search --help`
2. **Build filter query** using syntax below
3. **Execute search** with appropriate flags
4. **Retrieve details** for specific traces if needed
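
The four steps can be sketched as a small planner that returns the commands in order. This is one possible way to script the workflow, not part of MLflow; `plan_commands` is a hypothetical helper, and it uses only the flags that appear in this document.

```python
import shlex


def plan_commands(experiment_id, filter_string, trace_ids=()):
    """Return the workflow's commands in order, each as an argv list."""
    commands = [
        # Step 1: confirm the flags supported by the installed version.
        ["mlflow", "traces", "search", "--help"],
        # Steps 2-3: run the search with the filter built beforehand.
        ["mlflow", "traces", "search",
         "--experiment-id", str(experiment_id),
         "--filter-string", filter_string],
    ]
    # Step 4: fetch details for any traces of interest.
    for trace_id in trace_ids:
        commands.append(["mlflow", "traces", "get", "--trace-id", trace_id])
    return commands


for argv in plan_commands(1, "trace.status = 'ERROR'", ["tr-example"]):
    print(shlex.join(argv))
```

Keeping each command as an argv list means it can be handed to `subprocess.run` without a shell, so backticks and quotes inside the filter string need no extra escaping.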

## Prerequisite: Check CLI Usage

```bash
mlflow traces search --help
```

Always run this first to get accurate flags for the installed MLflow version.

## Searching Traces

`mlflow traces search` finds traces in an MLflow experiment that match the filters you supply.

### By Run ID

Filter traces associated with a specific MLflow run:

```bash
# All traces for a run
mlflow traces search --run-id <run_id>

# Failed traces for a run
mlflow traces search --run-id <run_id> --filter-string "trace.status = 'ERROR'"

# Can combine with experiment-id
mlflow traces search --experiment-id 1 --run-id <run_id>
```

### By Session or User (Common for Debugging)

When debugging an issue from the MLflow UI, filter by session or user ID to get all related traces:

```bash
# All traces for a specific session (use backticks for special characters in key)
mlflow traces search --experiment-id 1 --filter-string "metadata.\`mlflow.trace.session\` = 'session_abc123'"

# All traces for a specific user
mlflow traces search --experiment-id 1 --filter-string "metadata.\`mlflow.trace.user\` = 'user_456'"

# Failed traces in a session (for root cause analysis)
mlflow traces search --experiment-id 1 --filter-string "metadata.\`mlflow.trace.session\` = 'session_abc123' AND trace.status = 'ERROR'"

# Session traces ordered by time (to see sequence of events)
mlflow traces search --experiment-id 1 --filter-string "metadata.\`mlflow.trace.session\` = 'session_abc123'" --order-by "timestamp_ms ASC"
```

### By Status

```bash
mlflow traces search --experiment-id 1 --filter-string "trace.status = 'ERROR'"
mlflow traces search --experiment-id 1 --filter-string "trace.status = 'OK'"
```

### By Time Range

```bash
# Timestamps are in milliseconds since epoch
# Get current time in ms: $(date +%s)000
# Last hour: $(( $(date +%s)000 - 3600000 ))

mlflow traces search --experiment-id 1 --filter-string "trace.timestamp_ms > $(( $(date +%s)000 - 3600000 ))"
```
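
The same millisecond arithmetic can be done in Python when the filter is built programmatically. This is a minimal sketch; `since_ms` and `recent_filter` are hypothetical helper names, and the optional `now_s` argument exists only to make the result reproducible.

```python
import time


def since_ms(seconds_ago, now_s=None):
    """Epoch-millisecond timestamp for `seconds_ago` seconds before now."""
    now_s = time.time() if now_s is None else now_s
    return int(now_s * 1000) - int(seconds_ago * 1000)


def recent_filter(seconds_ago, now_s=None):
    """Filter clause selecting traces that started within the window."""
    return f"trace.timestamp_ms > {since_ms(seconds_ago, now_s)}"


# Fixed clock for a reproducible example: one hour before 1_700_000_000 s.
print(recent_filter(3600, now_s=1_700_000_000))
# prints: trace.timestamp_ms > 1699996400000
```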

### By Execution Time (Slow Traces)

```bash
# Traces slower than 1 second
mlflow traces search --experiment-id 1 --filter-string "trace.execution_time_ms > 1000"
```

### By Tags and Metadata

```bash
# By tag
mlflow traces search --experiment-id 1 --filter-string "tag.environment = 'production'"

# By metadata
mlflow traces search --experiment-id 1 --filter-string "metadata.user_id = 'user_123'"

# Escape special characters in key names with backticks
mlflow traces search --experiment-id 1 --filter-string "tag.\`model-name\` = 'gpt-4'"
mlflow traces search --experiment-id 1 --filter-string "metadata.\`user.id\` = 'abc'"
```
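
The backtick rule can be applied automatically when filters are generated in code. The sketch below assumes that any key that is not a plain identifier (letters, digits, underscores) needs backticks; that threshold is an assumption, not a documented MLflow rule, and values containing single quotes are not handled.

```python
import re

# Assumption: keys made only of letters, digits, and underscores are "plain".
_PLAIN_KEY = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def quote_key(key):
    """Wrap a key in backticks unless it is a plain identifier."""
    return key if _PLAIN_KEY.fullmatch(key) else f"`{key}`"


def equals_clause(prefix, key, value):
    """Build `<prefix>.<key> = '<value>'` for tag or metadata filters."""
    return f"{prefix}.{quote_key(key)} = '{value}'"


print(equals_clause("tag", "environment", "production"))
print(equals_clause("tag", "model-name", "gpt-4"))
print(equals_clause("metadata", "user.id", "abc"))
```

The three calls reproduce the tag and metadata filters shown above. Because the strings are built in Python rather than inside a double-quoted shell argument, the backticks are not preceded by a backslash.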

### By Assessment/Feedback

```bash
mlflow traces search --experiment-id 1 --filter-string "feedback.rating = 'positive'"
```

### Full Text Search

```bash
mlflow traces search --experiment-id 1 --filter-string "trace.text LIKE '%error%'"
```

### Pagination

Control result count and iterate through pages:

```bash
# Limit results per page
mlflow traces search --experiment-id 1 --max-results 50

# Output includes "Next page token: <token>" if more results exist
# Use --page-token to fetch next page
mlflow traces search --experiment-id 1 --max-results 50 --page-token "eyJvZmZzZXQiOiA1MH0="
```
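
Iterating through pages follows the usual token loop: request a page, consume it, and repeat with the returned token until none comes back. The sketch below shows that loop against a stand-in data source; `fetch_page` is a hypothetical callable that you would back with the CLI or a client call, and the fake tokens here are plain offsets, not real MLflow tokens.

```python
def iter_all_traces(fetch_page, page_size=50):
    """Yield every trace by following page tokens until none is returned.

    `fetch_page(max_results, page_token)` is a hypothetical callable that
    returns `(traces, next_page_token)`; the token is None on the last page.
    """
    token = None
    while True:
        traces, token = fetch_page(page_size, token)
        yield from traces
        if not token:
            break


def fake_fetch(max_results, page_token):
    """Stand-in for a real search call, serving 5 fake trace IDs in pages."""
    ids = [f"tr-{i}" for i in range(5)]
    start = int(page_token or 0)
    end = start + max_results
    return ids[start:end], (str(end) if end < len(ids) else None)


print(list(iter_all_traces(fake_fetch, page_size=2)))
# prints: ['tr-0', 'tr-1', 'tr-2', 'tr-3', 'tr-4']
```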

### Output Options

```bash
# Output format (table or json)
mlflow traces search --experiment-id 1 --output json

# Include span details in output
mlflow traces search --experiment-id 1 --include-spans

# Order results
mlflow traces search --experiment-id 1 --order-by "timestamp_ms DESC"
```

## Retrieving Single Trace

When you need to retrieve details about a specific trace, use the `mlflow traces get` command.

```bash
mlflow traces get --trace-id <trace_id>
```

## Filter Syntax

For detailed syntax, fetch from documentation:
```
WebFetch(
  url: "https://mlflow.org/docs/latest/genai/tracing/search-traces.md",
  prompt: "Extract the filter syntax table showing supported fields, operators, and examples."
)
```

**Common filters:**
- `trace.status`: OK, ERROR, IN_PROGRESS
- `trace.execution_time_ms`, `trace.timestamp_ms`: numeric comparison
- `` metadata.`mlflow.trace.session` ``, `` metadata.`mlflow.trace.user` ``: session/user filtering
- `tag.<key>`, `metadata.<key>`: exact match or pattern
- `span.name`, `span.type`: exact match or pattern
- `feedback.<name>`, `expectation.<name>`: assessments

**Pattern operators:** `LIKE`, `ILIKE` (case-insensitive), `RLIKE` (regex)
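
As an illustration of how these pieces compose, the sketch below builds a pattern clause and joins several clauses with `AND`, the conjunction used in the session examples earlier. `pattern_clause` and `all_of` are hypothetical helpers; which fields accept which operator should be confirmed against the documentation for your MLflow version.

```python
PATTERN_OPERATORS = {"LIKE", "ILIKE", "RLIKE"}


def pattern_clause(field, operator, pattern):
    """Build a pattern-match clause such as `span.name LIKE '%retriev%'`."""
    if operator not in PATTERN_OPERATORS:
        raise ValueError(f"unsupported pattern operator: {operator}")
    return f"{field} {operator} '{pattern}'"


def all_of(*clauses):
    """Join individual clauses into one filter string with AND."""
    return " AND ".join(clauses)


print(all_of(
    "trace.status = 'ERROR'",
    "trace.execution_time_ms > 1000",
    pattern_clause("span.name", "ILIKE", "%retriev%"),
))
```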

## Python API

For `mlflow.search_traces()`, see: https://mlflow.org/docs/latest/genai/tracing/search-traces.md
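
A minimal sketch of the Python route follows. The keyword names (`experiment_ids`, `filter_string`, `max_results`, `order_by`) are believed to match `mlflow.search_traces`, but they are an assumption here and should be checked against the linked documentation for your installed version. `build_search_kwargs` and `search` are hypothetical wrappers; MLflow is imported lazily so that the argument builder runs without it.

```python
def build_search_kwargs(experiment_id, filter_string=None, max_results=50,
                        order_by=("timestamp_ms ASC",)):
    """Assemble keyword arguments intended for `mlflow.search_traces`."""
    kwargs = {
        "experiment_ids": [str(experiment_id)],
        "max_results": max_results,
        "order_by": list(order_by),
    }
    if filter_string:
        kwargs["filter_string"] = filter_string
    return kwargs


def search(**kwargs):
    """Run the search; requires an installed and configured MLflow."""
    import mlflow  # lazy import: only needed when a search is executed
    return mlflow.search_traces(**kwargs)


kwargs = build_search_kwargs(
    1, "metadata.`mlflow.trace.session` = 'session_abc123'"
)
print(kwargs)
# traces = search(**kwargs)  # uncomment with a reachable tracking server
```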

Related Skills

searching-mlflow-docs

from msbaek/dotfiles

Searches and retrieves MLflow documentation from the official docs site. Use when the user asks about MLflow features, APIs, integrations (LangGraph, LangChain, OpenAI, etc.), tracing, tracking, or requests to look up MLflow documentation. Triggers on "how do I use MLflow with X", "find MLflow docs for Y", "MLflow API for Z".

querying-mlflow-metrics

from msbaek/dotfiles

Fetches aggregated trace metrics (token usage, latency, trace counts, quality evaluations) from MLflow tracking servers. Triggers on requests to show metrics, analyze token usage, view LLM costs, check usage trends, or query trace statistics.

instrumenting-with-mlflow-tracing

from msbaek/dotfiles

Instruments Python and TypeScript code with MLflow Tracing for observability. Must be loaded when setting up tracing as part of any workflow including agent evaluation. Triggers on adding tracing, instrumenting agents/LLM apps, getting started with MLflow tracing, tracing specific frameworks (LangGraph, LangChain, OpenAI, DSPy, CrewAI, AutoGen), or when another skill references tracing setup. Examples - "How do I add tracing?", "Instrument my agent", "Trace my LangChain app", "Set up tracing for evaluation"

databricks-mlflow-evaluation

from msbaek/dotfiles

MLflow 3 GenAI agent evaluation. Use when writing mlflow.genai.evaluate() code, creating @scorer functions, using built-in scorers (Guidelines, Correctness, Safety, RetrievalGroundedness), building eval datasets from traces, setting up trace ingestion and production monitoring, aligning judges with MemAlign from domain expert feedback, or running optimize_prompts() with GEPA for automated prompt improvement.

analyzing-mlflow-trace

from msbaek/dotfiles

Analyzes a single MLflow trace to answer a user query about it. Use when the user provides a trace ID and asks to debug, investigate, find issues, root-cause errors, understand behavior, or analyze quality. Triggers on "analyze this trace", "what went wrong with this trace", "debug trace", "investigate trace", "why did this trace fail", "root cause this trace".

analyzing-mlflow-session

from msbaek/dotfiles

Analyzes an MLflow session — a sequence of traces from a multi-turn chat conversation or interaction. Use when the user asks to debug a chat conversation, review session or chat history, find where a multi-turn chat went wrong, or analyze patterns across turns. Triggers on "analyze this session", "what happened in this conversation", "debug session", "review chat history", "where did this chat go wrong", "session traces", "analyze chat", "debug this chat".

weekly-newsletter

from msbaek/dotfiles

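Generates a newsletter from notes written or modified in the Obsidian vault this week (Saturday through Friday). Uses sub-agent-based parallel processing to conserve the main context. Selects and organizes content worth sharing externally from a technical or leadership perspective. Applied automatically on requests such as "make a newsletter", "summarize this week's notes", or "weekly digest".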

vis

from msbaek/dotfiles

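A skill that supports vault knowledge management as a whole using the Vault Intelligence System (vis) CLI: semantic search of an Obsidian vault, automatic tagging, MOC generation, linking related documents, linking documents by topic, topic collection, tag statistics, knowledge-gap analysis, duplicate detection, and learning review. Applied automatically for tasks involving vault search, document organization, tags, MOCs, related documents, topic collection, duplicate checks, learning review, knowledge gaps, clustering, indexing, linking documents by topic, or tag statistics.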

spark-python-data-source

from msbaek/dotfiles

Build custom Python data sources for Apache Spark using the PySpark DataSource API — batch and streaming readers/writers for external systems. Use this skill whenever someone wants to connect Spark to an external system (database, API, message queue, custom protocol), build a Spark connector or plugin in Python, implement a DataSourceReader or DataSourceWriter, pull data from or push data to a system via Spark, or work with the PySpark DataSource API in any way. Even if they just say "read from X in Spark" or "write DataFrame to Y" and there's no native connector, this skill applies.

session-handoff

from msbaek/dotfiles

Updates the plan, INDEX, memory, and journal at the end of a session and provides a prompt for resuming in the next session.

recall

from msbaek/dotfiles

Load context from vault memory. Temporal queries (yesterday, last week, session history) use agf (history.jsonl) for fast session lookup. Topic queries use vis semantic search. "recall graph" generates interactive temporal graph of sessions and files. Every recall ends with "One Thing" - the single highest-leverage next action synthesized from results. Use when user says "recall", "what did we work on", "load context about", "remember when we", "prime context", "yesterday", "what was I doing", "last week", "session history", "recall graph", "session graph".

vercel-react-best-practices

from msbaek/dotfiles

React and Next.js performance optimization guidelines from Vercel Engineering. This skill should be used when writing, reviewing, or refactoring React/Next.js code to ensure optimal performance patterns. Triggers on tasks involving React components, Next.js pages, data fetching, bundle optimization, or performance improvements.