arize-observability
Arize AI skill for production ML monitoring, embedding drift, and performance analysis.
Best use case
arize-observability is best used when you need a repeatable AI agent workflow instead of a one-off prompt.
Teams using arize-observability should expect more consistent output, faster repeated execution, and less prompt rewriting.
When to use this skill
- You want a reusable workflow that can be run more than once with consistent structure.
When not to use this skill
- You only need a quick one-off answer and do not need a reusable workflow.
- You cannot install or maintain the underlying files, dependencies, or repository context.
Installation
Claude Code / Cursor / Codex
Manual Installation
- Download SKILL.md from GitHub
- Place it in .claude/skills/arize-observability/SKILL.md inside your project
- Restart your AI agent; it will auto-discover the skill
How arize-observability Compares
| Feature / Agent | arize-observability | Standard Approach |
|---|---|---|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |
Frequently Asked Questions
What does this skill do?
Arize AI skill for production ML monitoring, embedding drift, and performance analysis.
Where can I find the source code?
You can find the source code on GitHub using the link provided at the top of the page.
SKILL.md Source
# arize-observability
## Overview
Arize AI skill for production ML monitoring, embedding drift detection, and comprehensive performance analysis.
## Capabilities
- Production data logging
- Embedding drift detection for NLP/CV models
- Performance monitoring dashboards
- Root cause analysis
- Slice and dice analysis for segments
- Bias monitoring
- A/B test monitoring
- Custom metrics and monitors
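
To make the embedding drift capability more concrete, the sketch below computes the Euclidean distance between the centroids of a baseline set and a production set of embedding vectors, which is one common way to quantify embedding drift. It is an illustration only, not the skill's or the Arize SDK's implementation; the helper names `centroid` and `centroidDrift` and the sample vectors are hypothetical.

```javascript
// Illustrative only: hypothetical helpers, not part of the Arize SDK or this skill.

// Mean vector (centroid) of a non-empty list of equal-length embeddings.
function centroid(vectors) {
  if (vectors.length === 0) throw new Error('need at least one vector');
  const dim = vectors[0].length;
  const sum = new Array(dim).fill(0);
  for (const v of vectors) {
    if (v.length !== dim) throw new Error('embedding dimensions differ');
    for (let i = 0; i < dim; i++) sum[i] += v[i];
  }
  return sum.map((s) => s / vectors.length);
}

// Euclidean distance between the baseline and production centroids.
function centroidDrift(baseline, production) {
  const a = centroid(baseline);
  const b = centroid(production);
  return Math.sqrt(a.reduce((acc, x, i) => acc + (x - b[i]) ** 2, 0));
}

// Made-up 2-dimensional embeddings for demonstration.
const baseline = [[0, 0], [2, 0]];   // centroid [1, 0]
const production = [[4, 4], [4, 4]]; // centroid [4, 4]
console.log(centroidDrift(baseline, production)); // 5
```

A larger distance suggests the production embeddings have moved away from the baseline; what counts as "too far" depends on the model and would need to be chosen per use case.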
## Target Processes
- Model Performance Monitoring and Drift Detection
- ML System Observability and Incident Response
- Model Evaluation and Validation Framework
## Tools and Libraries
- Arize AI SDK
- pandas
- numpy
## Input Schema
```json
{
  "type": "object",
  "required": ["action"],
  "properties": {
    "action": {
      "type": "string",
      "enum": ["log", "monitor", "analyze", "alert-config", "compare"],
      "description": "Arize action to perform"
    },
    "logConfig": {
      "type": "object",
      "properties": {
        "modelId": { "type": "string" },
        "modelVersion": { "type": "string" },
        "modelType": { "type": "string", "enum": ["score_categorical", "regression", "ranking"] },
        "environment": { "type": "string", "enum": ["training", "validation", "production"] },
        "dataPath": { "type": "string" },
        "predictionIdColumn": { "type": "string" },
        "timestampColumn": { "type": "string" },
        "featureColumns": { "type": "array", "items": { "type": "string" } },
        "embeddingColumns": { "type": "array", "items": { "type": "string" } },
        "predictionColumn": { "type": "string" },
        "actualColumn": { "type": "string" }
      }
    },
    "monitorConfig": {
      "type": "object",
      "properties": {
        "metrics": { "type": "array", "items": { "type": "string" } },
        "thresholds": { "type": "object" },
        "schedule": { "type": "string" }
      }
    },
    "analysisConfig": {
      "type": "object",
      "properties": {
        "analysisType": { "type": "string", "enum": ["drift", "performance", "fairness", "data_quality"] },
        "timeRange": { "type": "object" },
        "segments": { "type": "array", "items": { "type": "string" } }
      }
    }
  }
}
```
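
The constraints in the input schema can be checked before a context is handed to the skill. The following is a minimal sketch, assuming a hand-written check rather than a JSON Schema library; `validateContext` is a hypothetical helper, and it covers only the `required` and `enum` rules listed in the schema above.

```javascript
// Hypothetical helper: checks only the `required` and `enum` rules from the input schema.
const ENUMS = {
  action: ['log', 'monitor', 'analyze', 'alert-config', 'compare'],
  'logConfig.modelType': ['score_categorical', 'regression', 'ranking'],
  'logConfig.environment': ['training', 'validation', 'production'],
  'analysisConfig.analysisType': ['drift', 'performance', 'fairness', 'data_quality'],
};

// Returns a list of human-readable problems; an empty list means the checks passed.
function validateContext(context) {
  const errors = [];
  if (typeof context.action !== 'string') {
    errors.push('action is required and must be a string');
  }
  for (const [path, allowed] of Object.entries(ENUMS)) {
    // Walk a dotted path such as "logConfig.modelType".
    const value = path
      .split('.')
      .reduce((obj, key) => (obj == null ? undefined : obj[key]), context);
    if (value !== undefined && !allowed.includes(value)) {
      errors.push(`${path} must be one of: ${allowed.join(', ')}`);
    }
  }
  return errors;
}

// An `analyze` context using only fields defined in the schema; the values are made up.
const analyzeContext = {
  action: 'analyze',
  analysisConfig: { analysisType: 'drift', segments: ['merchant_category'] },
};
console.log(validateContext(analyzeContext)); // []
console.log(validateContext({ action: 'log', logConfig: { modelType: 'binary' } }));
```

The second call reports a single problem because `binary` is not among the `modelType` values the schema allows.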
## Output Schema
```json
{
  "type": "object",
  "required": ["status", "action"],
  "properties": {
    "status": {
      "type": "string",
      "enum": ["success", "error"]
    },
    "action": {
      "type": "string"
    },
    "logId": {
      "type": "string"
    },
    "dashboardUrl": {
      "type": "string"
    },
    "analysis": {
      "type": "object",
      "properties": {
        "overallScore": { "type": "number" },
        "driftMetrics": { "type": "object" },
        "performanceMetrics": { "type": "object" },
        "topIssues": { "type": "array" },
        "recommendations": { "type": "array", "items": { "type": "string" } }
      }
    },
    "alerts": {
      "type": "array",
      "items": {
        "type": "object",
        "properties": {
          "name": { "type": "string" },
          "severity": { "type": "string" },
          "triggered": { "type": "boolean" }
        }
      }
    }
  }
}
```
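
A caller would typically branch on `status` and then inspect `alerts`. The sketch below assumes a result object shaped like the output schema above; `summarizeResult` and the sample result are hypothetical and were not produced by a real run.

```javascript
// Hypothetical helper: summarizes a result object shaped like the output schema.
function summarizeResult(result) {
  if (result.status !== 'success') {
    return `${result.action} failed`;
  }
  // `alerts` is optional in the schema, so default to an empty list.
  const triggered = (result.alerts || [])
    .filter((alert) => alert.triggered)
    .map((alert) => alert.name);
  return triggered.length === 0
    ? `${result.action} succeeded with no triggered alerts`
    : `${result.action} succeeded; triggered alerts: ${triggered.join(', ')}`;
}

// Made-up sample result for demonstration only.
const sample = {
  status: 'success',
  action: 'monitor',
  alerts: [
    { name: 'embedding-drift', severity: 'high', triggered: true },
    { name: 'accuracy-drop', severity: 'low', triggered: false },
  ],
};
console.log(summarizeResult(sample)); // monitor succeeded; triggered alerts: embedding-drift
```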
## Usage Example
```javascript
{
  kind: 'skill',
  title: 'Log production predictions to Arize',
  skill: {
    name: 'arize-observability',
    context: {
      action: 'log',
      logConfig: {
        modelId: 'fraud-detector',
        modelVersion: '2.0.0',
        modelType: 'score_categorical',
        environment: 'production',
        dataPath: 'data/production_predictions.parquet',
        predictionIdColumn: 'request_id',
        timestampColumn: 'timestamp',
        featureColumns: ['amount', 'merchant_category', 'hour'],
        predictionColumn: 'fraud_probability',
        actualColumn: 'is_fraud'
      }
    }
  }
}
```
Related Skills
phoenix-arize-setup
Arize Phoenix observability platform setup for LLM debugging and evaluation
process-builder
Scaffold new babysitter process definitions following SDK patterns, proper structure, and best practices. Guides the 3-phase workflow from research to implementation.
babysitter
Orchestrate via @babysitter. Use this skill when asked to babysit a run, orchestrate a process or whenever it is called explicitly. (babysit, babysitter, orchestrate, orchestrate a run, workflow, etc.)
yolo
Run Babysitter autonomously with minimal manual interruption.
user-install
Install the user-level Babysitter Codex setup.
team-install
Install the team-pinned Babysitter Codex workspace setup.
retrospect
Summarize or retrospect on a completed Babysitter run.
resume
Resume an existing Babysitter run from Codex.
project-install
Install the Babysitter Codex workspace integration into the current project.
plan
Plan a Babysitter workflow without executing the run.
observe
Observe, inspect, or monitor a Babysitter run.
model
Inspect or change Babysitter model-routing policy by phase.