databricks-metric-views
Unity Catalog metric views: define, create, query, and manage governed business metrics in YAML. Use when building standardized KPIs, revenue metrics, order analytics, or any reusable business metrics that need consistent definitions across teams and tools.
Best use case
databricks-metric-views is best used when you need a repeatable AI agent workflow instead of a one-off prompt.
Unity Catalog metric views: define, create, query, and manage governed business metrics in YAML. Use when building standardized KPIs, revenue metrics, order analytics, or any reusable business metrics that need consistent definitions across teams and tools.
Teams using databricks-metric-views should expect a more consistent output, faster repeated execution, less prompt rewriting.
When to use this skill
- You want a reusable workflow that can be run more than once with consistent structure.
When not to use this skill
- You only need a quick one-off answer and do not need a reusable workflow.
- You cannot install or maintain the underlying files, dependencies, or repository context.
Installation
Claude Code / Cursor / Codex
Manual Installation
- Download SKILL.md from GitHub
- Place it in
.claude/skills/databricks-metric-views/SKILL.mdinside your project - Restart your AI agent — it will auto-discover the skill
How databricks-metric-views Compares
| Feature / Agent | databricks-metric-views | Standard Approach |
|---|---|---|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |
Frequently Asked Questions
What does this skill do?
Unity Catalog metric views: define, create, query, and manage governed business metrics in YAML. Use when building standardized KPIs, revenue metrics, order analytics, or any reusable business metrics that need consistent definitions across teams and tools.
Where can I find the source code?
You can find the source code on GitHub using the link provided at the top of the page.
SKILL.md Source
# Unity Catalog Metric Views
Define reusable, governed business metrics in YAML that separate measure definitions from dimension groupings for flexible querying.
## When to Use
Use this skill when:
- Defining **standardized business metrics** (revenue, order counts, conversion rates)
- Building **KPI layers** shared across dashboards, Genie, and SQL queries
- Creating metrics with **complex aggregations** (ratios, distinct counts, filtered measures)
- Defining **window measures** (moving averages, running totals, period-over-period, YTD)
- Modeling **star or snowflake schemas** with joins in metric definitions
- Enabling **materialization** for pre-computed metric aggregations
## Prerequisites
- **Databricks Runtime 17.2+** (for YAML version 1.1)
- SQL warehouse with `CAN USE` permissions
- `SELECT` on source tables, `CREATE TABLE` + `USE SCHEMA` in the target schema
## Quick Start
### Inspect Source Table Schema
Before creating a metric view, call `get_table_stats_and_schema` to understand available columns for dimensions and measures:
```
get_table_stats_and_schema(
catalog="catalog",
schema="schema",
table_names=["orders"],
table_stat_level="SIMPLE" # Use "DETAILED" for cardinality, min/max, histograms
)
```
### Create a Metric View
```sql
CREATE OR REPLACE VIEW catalog.schema.orders_metrics
WITH METRICS
LANGUAGE YAML
AS $$
version: 1.1
source: catalog.schema.orders
comment: "Orders KPIs for sales analysis"
filter: order_date > '2020-01-01'
dimensions:
- name: Order Month
expr: DATE_TRUNC('MONTH', order_date)
comment: "Month of order"
- name: Order Status
expr: CASE
WHEN status = 'O' THEN 'Open'
WHEN status = 'P' THEN 'Processing'
WHEN status = 'F' THEN 'Fulfilled'
END
comment: "Human-readable order status"
measures:
- name: Order Count
expr: COUNT(1)
- name: Total Revenue
expr: SUM(total_price)
comment: "Sum of total price"
- name: Revenue per Customer
expr: SUM(total_price) / COUNT(DISTINCT customer_id)
comment: "Average revenue per unique customer"
$$
```
### Query a Metric View
All measures must use the `MEASURE()` function. `SELECT *` is NOT supported.
```sql
SELECT
`Order Month`,
`Order Status`,
MEASURE(`Total Revenue`) AS total_revenue,
MEASURE(`Order Count`) AS order_count
FROM catalog.schema.orders_metrics
WHERE extract(year FROM `Order Month`) = 2024
GROUP BY ALL
ORDER BY ALL
```
## Reference Files
| Topic | File | Description |
|-------|------|-------------|
| YAML Syntax | [yaml-reference.md](yaml-reference.md) | Complete YAML spec: dimensions, measures, joins, materialization |
| Patterns & Examples | [patterns.md](patterns.md) | Common patterns: star schema, snowflake, filtered measures, window measures, ratios |
## MCP Tools
Use the `manage_metric_views` tool for all metric view operations:
| Action | Description |
|--------|-------------|
| `create` | Create a metric view with dimensions and measures |
| `alter` | Update a metric view's YAML definition |
| `describe` | Get the full definition and metadata |
| `query` | Query measures grouped by dimensions |
| `drop` | Drop a metric view |
| `grant` | Grant SELECT privileges to users/groups |
### Create via MCP
```python
manage_metric_views(
action="create",
full_name="catalog.schema.orders_metrics",
source="catalog.schema.orders",
or_replace=True,
comment="Orders KPIs for sales analysis",
filter_expr="order_date > '2020-01-01'",
dimensions=[
{"name": "Order Month", "expr": "DATE_TRUNC('MONTH', order_date)", "comment": "Month of order"},
{"name": "Order Status", "expr": "status"},
],
measures=[
{"name": "Order Count", "expr": "COUNT(1)"},
{"name": "Total Revenue", "expr": "SUM(total_price)", "comment": "Sum of total price"},
],
)
```
### Query via MCP
```python
manage_metric_views(
action="query",
full_name="catalog.schema.orders_metrics",
query_measures=["Total Revenue", "Order Count"],
query_dimensions=["Order Month"],
where="extract(year FROM `Order Month`) = 2024",
order_by="ALL",
limit=100,
)
```
### Describe via MCP
```python
manage_metric_views(
action="describe",
full_name="catalog.schema.orders_metrics",
)
```
### Grant Access
```python
manage_metric_views(
action="grant",
full_name="catalog.schema.orders_metrics",
principal="data-consumers",
privileges=["SELECT"],
)
```
## YAML Spec Quick Reference
```yaml
version: 1.1 # Required: "1.1" for DBR 17.2+
source: catalog.schema.table # Required: source table/view
comment: "Description" # Optional: metric view description
filter: column > value # Optional: global WHERE filter
dimensions: # Required: at least one
- name: Display Name # Backtick-quoted in queries
expr: sql_expression # Column ref or SQL transformation
comment: "Description" # Optional (v1.1+)
measures: # Required: at least one
- name: Display Name # Queried via MEASURE(`name`)
expr: AGG_FUNC(column) # Must be an aggregate expression
comment: "Description" # Optional (v1.1+)
joins: # Optional: star/snowflake schema
- name: dim_table
source: catalog.schema.dim_table
on: source.fk = dim_table.pk
materialization: # Optional (experimental)
schedule: every 6 hours
mode: relaxed
```
## Key Concepts
### Dimensions vs Measures
| | Dimensions | Measures |
|---|---|---|
| **Purpose** | Categorize and group data | Aggregate numeric values |
| **Examples** | Region, Date, Status | SUM(revenue), COUNT(orders) |
| **In queries** | Used in SELECT and GROUP BY | Wrapped in `MEASURE()` |
| **SQL expressions** | Any SQL expression | Must use aggregate functions |
### Why Metric Views vs Standard Views?
| Feature | Standard Views | Metric Views |
|---------|---------------|--------------|
| Aggregation locked at creation | Yes | No - flexible at query time |
| Safe re-aggregation of ratios | No | Yes |
| Star/snowflake schema joins | Manual | Declarative in YAML |
| Materialization | Separate MV needed | Built-in |
| AI/BI Genie integration | Limited | Native |
## Common Issues
| Issue | Solution |
|-------|----------|
| **SELECT * not supported** | Must explicitly list dimensions and use MEASURE() for measures |
| **"Cannot resolve column"** | Dimension/measure names with spaces need backtick quoting |
| **JOIN at query time fails** | Joins must be in the YAML definition, not in the SELECT query |
| **MEASURE() required** | All measure references must be wrapped: `MEASURE(\`name\`)` |
| **DBR version error** | Requires Runtime 17.2+ for YAML v1.1, or 16.4+ for v0.1 |
| **Materialization not working** | Requires serverless compute enabled; currently experimental |
## Integrations
Metric views work natively with:
- **AI/BI Dashboards** - Use as datasets for visualizations
- **AI/BI Genie** - Natural language querying of metrics
- **Alerts** - Set threshold-based alerts on measures
- **SQL Editor** - Direct SQL querying with MEASURE()
- **Catalog Explorer UI** - Visual creation and browsing
## Resources
- [Metric Views Documentation](https://docs.databricks.com/en/metric-views/)
- [YAML Syntax Reference](https://docs.databricks.com/en/metric-views/data-modeling/syntax)
- [Joins](https://docs.databricks.com/en/metric-views/data-modeling/joins)
- [Window Measures](https://docs.databricks.com/aws/en/metric-views/data-modeling/window-measures) (Experimental)
- [Materialization](https://docs.databricks.com/en/metric-views/materialization)
- [MEASURE() Function](https://docs.databricks.com/en/sql/language-manual/functions/measure)Related Skills
querying-mlflow-metrics
Fetches aggregated trace metrics (token usage, latency, trace counts, quality evaluations) from MLflow tracking servers. Triggers on requests to show metrics, analyze token usage, view LLM costs, check usage trends, or query trace statistics.
databricks-zerobus-ingest
Build Zerobus Ingest clients for near real-time data ingestion into Databricks Delta tables via gRPC. Use when creating producers that write directly to Unity Catalog tables without a message bus, working with the Zerobus Ingest SDK in Python/Java/Go/TypeScript/Rust, generating Protobuf schemas from UC tables, or implementing stream-based ingestion with ACK handling and retry logic.
databricks-vector-search
Patterns for Databricks Vector Search: create endpoints and indexes, query with filters, manage embeddings. Use when building RAG applications, semantic search, or similarity matching. Covers both storage-optimized and standard endpoints.
databricks-unstructured-pdf-generation
Generate PDF documents from HTML and upload to Unity Catalog volumes. Use for creating test PDFs, demo documents, reports, or evaluation datasets.
databricks-unity-catalog
Unity Catalog system tables and volumes. Use when querying system tables (audit, lineage, billing) or working with volume file operations (upload, download, list files in /Volumes/).
databricks-synthetic-data-gen
Generate realistic synthetic data using Spark + Faker (strongly recommended). Supports serverless execution, multiple output formats (Parquet/JSON/CSV/Delta), and scales from thousands to millions of rows. For small datasets (<10K rows), can optionally generate locally and upload to volumes. Use when user mentions 'synthetic data', 'test data', 'generate data', 'demo dataset', 'Faker', or 'sample data'.
databricks-spark-structured-streaming
Comprehensive guide to Spark Structured Streaming for production workloads. Use when building streaming pipelines, working with Kafka ingestion, implementing Real-Time Mode (RTM), configuring triggers (processingTime, availableNow), handling stateful operations with watermarks, optimizing checkpoints, performing stream-stream or stream-static joins, writing to multiple sinks, or tuning streaming cost and performance.
databricks-spark-declarative-pipelines
Creates, configures, and updates Databricks Lakeflow Spark Declarative Pipelines (SDP/LDP) using serverless compute. Handles data ingestion with streaming tables, materialized views, CDC, SCD Type 2, and Auto Loader ingestion patterns. Use when building data pipelines, working with Delta Live Tables, ingesting streaming data, implementing change data capture, or when the user mentions SDP, LDP, DLT, Lakeflow pipelines, streaming tables, or bronze/silver/gold medallion architectures.
databricks-python-sdk
Databricks development guidance including Python SDK, Databricks Connect, CLI, and REST API. Use when working with databricks-sdk, databricks-connect, or Databricks APIs.
databricks-model-serving
Deploy and query Databricks Model Serving endpoints. Use when (1) deploying MLflow models or AI agents to endpoints, (2) creating ChatAgent/ResponsesAgent agents, (3) integrating UC Functions or Vector Search tools, (4) querying deployed endpoints, (5) checking endpoint status. Covers classical ML models, custom pyfunc, and GenAI agents.
databricks-mlflow-evaluation
MLflow 3 GenAI agent evaluation. Use when writing mlflow.genai.evaluate() code, creating @scorer functions, using built-in scorers (Guidelines, Correctness, Safety, RetrievalGroundedness), building eval datasets from traces, setting up trace ingestion and production monitoring, aligning judges with MemAlign from domain expert feedback, or running optimize_prompts() with GEPA for automated prompt improvement.
databricks-lakebase-provisioned
Patterns and best practices for Lakebase Provisioned (Databricks managed PostgreSQL) for OLTP workloads. Use when creating Lakebase instances, connecting applications or Databricks Apps to PostgreSQL, implementing reverse ETL via synced tables, storing agent or chat memory, or configuring OAuth authentication for Lakebase.