phoenix-arize-setup

Arize Phoenix observability platform setup for LLM debugging and evaluation

509 stars

Best use case

phoenix-arize-setup is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Teams using phoenix-arize-setup should expect more consistent output, faster repeated execution, and less prompt rewriting.

When to use this skill

You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

You only need a quick one-off answer and do not need a reusable workflow.
You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$ curl -o ~/.claude/skills/phoenix-arize-setup/SKILL.md --create-dirs "https://raw.githubusercontent.com/a5c-ai/babysitter/main/library/specializations/ai-agents-conversational/skills/phoenix-arize-setup/SKILL.md"

Manual Installation

Download SKILL.md from GitHub
Place it in .claude/skills/phoenix-arize-setup/SKILL.md inside your project
Restart your AI agent — it will auto-discover the skill

How phoenix-arize-setup Compares

Feature / Agent	phoenix-arize-setup	Standard Approach
Platform Support	Not specified	Limited / Varies
Context Awareness	High	Baseline
Installation Complexity	Unknown	N/A

Frequently Asked Questions

What does this skill do?

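It guides an AI agent through setting up the Arize Phoenix observability platform for LLM debugging and evaluation: running a local Phoenix server, configuring tracing instrumentation, designing evaluation experiments, and setting up embedding visualizations, retrieval analysis, and LLM-as-judge evaluations.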

Where can I find the source code?

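The source is in the a5c-ai/babysitter repository on GitHub, under library/specializations/ai-agents-conversational/skills/phoenix-arize-setup/ (the same path used by the install command in the Installation section).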

SKILL.md Source

# Phoenix Arize Setup Skill

## Capabilities

- Set up Phoenix local server
- Configure tracing instrumentation
- Design evaluation experiments
- Implement embedding visualizations
- Set up retrieval analysis
- Create custom evaluations with LLM-as-judge
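
The first capability, a local Phoenix server, can be sketched roughly as follows. This is a minimal sketch, assuming the `arize-phoenix` package is installed and that the server uses its default port 6006; the helper names `phoenix_urls` and `launch_local_phoenix` are hypothetical and not part of the skill.

```python
def phoenix_urls(host: str = "localhost", port: int = 6006) -> dict:
    """Build the UI URL and the OTLP/HTTP trace URL for a local Phoenix server.

    Assumption: the server listens on `port` and accepts traces at /v1/traces.
    """
    base = f"http://{host}:{port}"
    return {"ui": base, "traces": f"{base}/v1/traces"}


def launch_local_phoenix():
    """Start an in-process Phoenix server and return its session object.

    Requires `pip install arize-phoenix`. The import is done lazily so that
    `phoenix_urls` stays usable on machines without Phoenix installed.
    """
    import phoenix as px

    return px.launch_app()
```

Calling `launch_local_phoenix()` from a notebook or script starts the UI in the current process. For a long-running server, a separate process (for example the `phoenix serve` command) is likely a better fit.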

## Target Processes

- llm-observability-monitoring
- agent-evaluation-framework

## Implementation Details

### Core Features

1. **Tracing**: OpenTelemetry-based LLM traces
2. **Evals**: LLM-as-judge evaluations
3. **Embeddings**: Visualization and drift detection
4. **Retrieval**: RAG quality analysis
5. **Datasets**: Experiment management
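
To make the "LLM-as-judge" idea concrete, here is a library-free sketch of the pattern: format a prompt, ask a model, and snap the free-form answer onto a fixed set of labels. The template text, the label set, and the function names are illustrative assumptions. Phoenix ships its own evaluation helpers in `phoenix.evals`, which this sketch deliberately does not use.

```python
from typing import Callable, Iterable

# Hypothetical judge prompt; the wording is an assumption, not Phoenix's template.
RELEVANCE_TEMPLATE = (
    "You are judging whether a document helps answer a question.\n"
    "Question: {question}\n"
    "Document: {document}\n"
    "Answer with exactly one word: relevant or unrelated."
)
# Allowed output labels. "unrelated" is used instead of "irrelevant" so that
# neither label is a substring of the other.
RAILS = ("relevant", "unrelated")


def snap_to_rail(raw: str, rails: Iterable[str] = RAILS) -> str:
    """Map free-form judge output onto one allowed label, else 'unparseable'."""
    text = raw.strip().lower()
    matches = [rail for rail in rails if rail in text]
    return matches[0] if len(matches) == 1 else "unparseable"


def judge_relevance(question: str, document: str, llm: Callable[[str], str]) -> str:
    """Ask `llm` (any callable taking a prompt and returning text) for a label."""
    prompt = RELEVANCE_TEMPLATE.format(question=question, document=document)
    return snap_to_rail(llm(prompt))
```

Passing the model in as a plain callable keeps the judge easy to test with a stub and lets the same logic run against any provider.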

### Instrumentation

- OpenAI auto-instrumentation
- LangChain instrumentation
- LlamaIndex instrumentation
- Custom span creation
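
A rough sketch of OpenAI auto-instrumentation plus a custom span is shown below. It assumes both packages from the Dependencies list are installed and that a Phoenix server is reachable at the default local endpoint; the function names, the tracer name, and the default project name are hypothetical.

```python
def instrument_openai(project_name: str = "default",
                      endpoint: str = "http://localhost:6006/v1/traces"):
    """Register a tracer provider with Phoenix and auto-instrument OpenAI calls.

    Imports are lazy so this module loads even where the packages are missing.
    """
    from phoenix.otel import register
    from openinference.instrumentation.openai import OpenAIInstrumentor

    tracer_provider = register(project_name=project_name, endpoint=endpoint)
    OpenAIInstrumentor().instrument(tracer_provider=tracer_provider)
    return tracer_provider


def run_in_span(tracer_provider, name: str, attributes: dict, fn):
    """Run `fn()` inside a custom span and attach `attributes` to that span.

    Works with any OpenTelemetry-style tracer provider, including the one
    returned by `instrument_openai`.
    """
    tracer = tracer_provider.get_tracer("phoenix-arize-setup-sketch")
    with tracer.start_as_current_span(name) as span:
        for key, value in attributes.items():
            span.set_attribute(key, value)
        return fn()
```

A typical call would look like `run_in_span(provider, "retrieve", {"query": q}, lambda: search(q))`, where `search` stands in for whatever application step should appear as its own span.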

### Configuration Options

- Phoenix server setup
- Trace sampling
- Evaluation metrics
- Embedding models
- Export settings
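
Two of these options, server setup and trace sampling, might be expressed as in the sketch below. The environment variable names are the ones Phoenix documents, but the values and the helper names `phoenix_env` and `ratio_sampler` are illustrative assumptions. The sampler is a standard OpenTelemetry SDK object; how it gets wired into a particular tracer provider is left open here.

```python
def phoenix_env(host: str = "localhost", port: int = 6006,
                project: str = "default") -> dict:
    """Return environment variables for a Phoenix server and its clients."""
    return {
        "PHOENIX_PORT": str(port),                                # server port
        "PHOENIX_COLLECTOR_ENDPOINT": f"http://{host}:{port}",    # where clients send traces
        "PHOENIX_PROJECT_NAME": project,                          # project traces are grouped under
    }


def ratio_sampler(ratio: float):
    """Build a sampler that keeps roughly `ratio` of new traces.

    Requires the `opentelemetry-sdk` package. Wrapping in ParentBased makes
    child spans follow the sampling decision already made for their parent.
    """
    from opentelemetry.sdk.trace.sampling import ParentBased, TraceIdRatioBased

    return ParentBased(TraceIdRatioBased(ratio))
```

One way to apply the variables before starting the application is `os.environ.update(phoenix_env(project="my-agent"))`, where `my-agent` is a placeholder project name.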

### Best Practices

- Comprehensive instrumentation
- Regular evaluation runs
- Monitor embedding drift
- Analyze retrieval quality
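
The last two practices can be illustrated with simple, library-free metrics. This is a sketch under stated assumptions: drift is measured as the Euclidean distance between the centroids of a reference set and a current set of embeddings, and retrieval quality as precision@k. Both are common choices, not necessarily the exact metrics a given Phoenix version reports, and the function names are hypothetical.

```python
import math
from typing import Collection, Sequence


def centroid(vectors: Sequence[Sequence[float]]) -> list:
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / n for i in range(dim)]


def centroid_drift(reference: Sequence[Sequence[float]],
                   current: Sequence[Sequence[float]]) -> float:
    """Euclidean distance between the centroids of two embedding sets."""
    return math.dist(centroid(reference), centroid(current))


def precision_at_k(retrieved_ids: Sequence[str],
                   relevant_ids: Collection[str], k: int) -> float:
    """Fraction of the top-k retrieved documents that are relevant."""
    if k <= 0:
        raise ValueError("k must be positive")
    hits = sum(1 for doc_id in retrieved_ids[:k] if doc_id in relevant_ids)
    return hits / k
```

Tracking `centroid_drift` between a fixed baseline and each new batch gives a single number to alert on, while `precision_at_k` can be averaged over a labeled query set after each retriever change.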

### Dependencies

- arize-phoenix
- openinference-instrumentation-openai
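
Before running any setup step, it can help to confirm that both dependencies are installed. The check below is a small sketch using only the standard library; the helper name `missing_packages` is hypothetical.

```python
from importlib import metadata

# Distribution names taken from the Dependencies list above.
REQUIRED = ("arize-phoenix", "openinference-instrumentation-openai")


def missing_packages(required=REQUIRED) -> list:
    """Return the names in `required` that are not installed in this environment."""
    missing = []
    for name in required:
        try:
            metadata.version(name)  # raises if the distribution is absent
        except metadata.PackageNotFoundError:
            missing.append(name)
    return missing
```

If `missing_packages()` returns a non-empty list, installing those names with `pip install` should resolve it.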

Related Skills

All from a5c-ai/babysitter (509 stars):

visual-regression-setup: Configure visual regression testing with Percy, Chromatic, or custom screenshot comparison

tauri-project-setup: Initialize Tauri project with Rust backend and frontend framework integration

spectron-test-setup: Set up Spectron (deprecated) tests for legacy Electron application testing

sentry-desktop-setup: Configure Sentry for comprehensive desktop application crash reporting, error monitoring, performance tracking, and release health for Electron and native desktop apps

file-watcher-setup: Set up cross-platform file system watching with debouncing and efficient change detection

electron-protocol-handler-setup

electron-auto-updater-setup: Configure electron-updater with code signing verification, delta updates, staged rollouts, and multiple update channels for Electron applications

avalonia-ui-setup: Set up Avalonia UI project with cross-platform XAML for Windows, macOS, and Linux

arize-observability: Arize AI skill for production ML monitoring, embedding drift, and performance analysis.

viper-go-setup: Set up Viper for Go configuration management with file, env, and flag binding.

plugin-sandbox-setup: Configure plugin sandboxing with vm2 or isolated-vm for secure plugin execution.

mcp-transport-websocket-setup: Configure WebSocket transport for bidirectional MCP communication with connection management and reconnection handling.