dd-apm

APM - traces, services, dependencies, performance analysis.

by diegosouzapw

Best use case

dd-apm is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Teams using dd-apm should expect more consistent output, faster repeated execution, and less prompt rewriting.

When to use this skill

You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

You only need a quick one-off answer and do not need a reusable workflow.
You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$ curl -o ~/.claude/skills/dd-apm/SKILL.md --create-dirs "https://raw.githubusercontent.com/diegosouzapw/awesome-omni-skill/main/skills/development/dd-apm/SKILL.md"

Manual Installation

1. Download SKILL.md from GitHub.
2. Place it in .claude/skills/dd-apm/SKILL.md inside your project.
3. Restart your AI agent; it will auto-discover the skill.
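
Before restarting the agent, it can help to confirm that the file landed at the project-local path given in the steps above. A minimal sketch using only standard shell tests; `skill_installed` is a hypothetical helper name, not part of any tool:

```bash
# Hypothetical helper: succeeds if SKILL.md exists and is non-empty
# under the given project root (defaults to the current directory).
skill_installed() {
  root="${1:-.}"
  [ -s "$root/.claude/skills/dd-apm/SKILL.md" ]
}

if skill_installed; then
  echo "dd-apm skill found"
else
  echo "dd-apm skill missing: expected .claude/skills/dd-apm/SKILL.md"
fi
```

Pass a project root as the first argument to check a directory other than the current one.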

How dd-apm Compares

Feature / Agent	dd-apm	Standard Approach
Platform Support	Not specified	Limited / Varies
Context Awareness	High	Baseline
Installation Complexity	Unknown	N/A

Frequently Asked Questions

What does this skill do?

It gives an AI agent a set of `pup` CLI commands for Datadog APM: listing services, viewing the service map, searching traces, and checking retention filters.

Where can I find the source code?

The source is on GitHub in the diegosouzapw/awesome-omni-skill repository, under skills/development/dd-apm.

SKILL.md Source

# Datadog APM

Distributed tracing, service maps, and performance analysis.

## Requirements

Install the Datadog Labs `pup` CLI with:

```bash
go install github.com/datadog-labs/pup@latest
```
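
`go install` places the binary in `$GOBIN`, or in `$(go env GOPATH)/bin` when `GOBIN` is unset, so that directory has to be on `PATH` for the `pup` commands in this document to resolve. A small check using only shell built-ins; `have_cmd` is an illustrative helper name:

```bash
# Illustrative helper: succeeds if the named command resolves on PATH.
have_cmd() {
  command -v "$1" >/dev/null 2>&1
}

if have_cmd pup; then
  echo "pup found at $(command -v pup)"
else
  echo "pup not on PATH; try: export PATH=\"\$PATH:\$(go env GOPATH)/bin\""
fi
```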

## Quick Start

```bash
pup auth login
pup apm services list
pup apm traces list --service api-gateway --duration 1h
```

## Services

### List Services

```bash
pup apm services list
pup apm services list --env production
```

### Service Details

```bash
pup apm services get api-gateway --json
```

### Service Map

```bash
# View dependencies
pup apm service-map --service api-gateway --json
```

## Traces

### Search Traces

```bash
# By service
pup apm traces list --service api-gateway --duration 1h

# Errors only
pup apm traces list --service api-gateway --status error

# Slow traces (>1s)
pup apm traces list --service api-gateway --min-duration 1000ms

# With specific tag
pup apm traces list --query "@http.url:/api/users"
```

### Get Trace Detail

```bash
pup apm traces get <trace_id> --json
```

## Key Metrics

| Metric | What It Measures |
|--------|------------------|
| `trace.http.request.hits` | Request count |
| `trace.http.request.duration` | Latency |
| `trace.http.request.errors` | Error count |
| `trace.http.request.apdex` | User satisfaction |
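
Apdex condenses latency into a score between 0 and 1. The general Apdex formula counts requests at or under a threshold T as satisfied, those between T and 4T as tolerating (half weight), and everything slower as frustrated. A sketch of that arithmetic with made-up counts; Datadog computes the metric itself, and `apdex` here is only an illustrative helper:

```bash
# Apdex = (satisfied + tolerating / 2) / total
# Arguments: satisfied, tolerating, frustrated request counts.
apdex() {
  awk -v s="$1" -v t="$2" -v f="$3" 'BEGIN {
    n = s + t + f
    if (n == 0) { print "n/a"; exit }   # no traffic, no score
    printf "%.2f\n", (s + t / 2) / n
  }'
}

apdex 900 80 20   # made-up counts; prints 0.94
```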

## ⚠️ Trace Sampling

**Not all traces are kept.** Understand sampling:

| Mode | What's Kept |
|------|-------------|
| **Head-based** | A random percentage, decided when the trace starts |
| **Error/Slow** | All errors and slow traces |
| **Retention** | What's indexed for search (and billed) |
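
Head-based sampling makes the keep-or-drop decision once, at the start of a trace, so a trace is either kept whole or dropped whole. A toy sketch of that idea, assuming small integer trace IDs and a plain modulo; real tracers use their own hashing and rate logic, and `keep_trace` is a hypothetical name:

```bash
# Illustrative only: decides from the trace ID alone, so every service
# that sees the same ID reaches the same verdict.
# Arguments: trace ID (integer), sampling rate in percent (0-100).
keep_trace() {
  trace_id="$1"
  rate_percent="$2"
  [ $(( trace_id % 100 )) -lt "$rate_percent" ]
}

for id in 1007 1042 1099; do
  if keep_trace "$id" 10; then echo "$id keep"; else echo "$id drop"; fi
done
# prints: 1007 keep, 1042 drop, 1099 drop (one per line)
```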

```bash
# Check retention filters
pup apm retention-filters list
```

### Trace Retention Costs

| Span type | Relative cost |
|-----------|---------------|
| Indexed spans | $$$ per million |
| Ingested spans | $ per million |

**Best practice:** Only index what you need for search.

## Service Level Objectives

Link APM to SLOs:

```bash
pup slos create \
  --name "API Latency p99 < 200ms" \
  --type metric \
  --numerator "sum:trace.http.request.hits{service:api,@duration:<200000000}" \
  --denominator "sum:trace.http.request.hits{service:api}" \
  --target 99.0
```
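
The SLO name says 200 ms while the numerator says `200000000`, which lines up if the threshold is expressed in nanoseconds. The target is then the percentage of requests under that threshold. A sketch of both calculations with made-up request counts; `ms_to_ns` and `sli_percent` are illustrative helpers, not `pup` commands:

```bash
# Convert milliseconds to nanoseconds (1 ms = 1,000,000 ns).
ms_to_ns() {
  echo $(( $1 * 1000000 ))
}

# Percentage of good events out of total, to three decimals.
sli_percent() {
  awk -v good="$1" -v total="$2" 'BEGIN {
    if (total == 0) { print "n/a"; exit }   # no traffic in the window
    printf "%.3f\n", 100 * good / total
  }'
}

ms_to_ns 200              # prints 200000000
sli_percent 99250 100000  # made-up counts; prints 99.250
```

With these counts the SLI is 99.25%, which clears a 99.0 target.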

## Common Queries

| Goal | Query |
|------|-------|
| Slowest endpoints | `avg:trace.http.request.duration{*} by {resource_name}` |
| Error rate | `sum:trace.http.request.errors{*} / sum:trace.http.request.hits{*}` |
| Throughput | `sum:trace.http.request.hits{*}.as_rate()` |
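
The `{*}` scope in these queries covers every service. To narrow a query, replace it with a tag filter such as `service:api`, the same form the SLO example uses. A sketch that builds the scoped error-rate query as a string; `error_rate_query` is a hypothetical helper:

```bash
# Builds the error-rate query from the table, scoped to one service.
error_rate_query() {
  scope="service:$1"
  printf 'sum:trace.http.request.errors{%s} / sum:trace.http.request.hits{%s}\n' \
    "$scope" "$scope"
}

error_rate_query api-gateway
```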

## Troubleshooting

| Problem | Fix |
|---------|-----|
| No traces | Check that the tracing library (e.g. `ddtrace`) is installed and `DD_TRACE_ENABLED=true` |
| Missing service | Verify the `DD_SERVICE` env var is set |
| Traces not linked | Check that trace headers are propagated between services |
| High cardinality | Don't tag with `user_id`/`request_id` |
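
The first two rows come down to environment variables, which can be checked quickly. A sketch that inspects only the two variables named in the table; `check_dd_env` is a hypothetical helper, and it sees only the current shell's environment, so run it where the service is started:

```bash
# Reports which of the variables from the table are unset or empty.
# Returns non-zero if any is missing.
check_dd_env() {
  missing=0
  for name in DD_SERVICE DD_TRACE_ENABLED; do
    eval "value=\${$name:-}"   # indirect lookup; names are fixed above
    if [ -z "$value" ]; then
      echo "missing: $name"
      missing=1
    fi
  done
  return "$missing"
}

check_dd_env || echo "set the variables above in the service's environment"
```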

## References/Docs

- [APM Setup](https://docs.datadoghq.com/tracing/)
- [Trace Search](https://docs.datadoghq.com/tracing/trace_explorer/)
- [Retention Filters](https://docs.datadoghq.com/tracing/trace_pipeline/trace_retention/)
