Codex

cost-history

Show cost trends across multiple workflow sessions, surfacing expensive operations, spending patterns, and outliers

104 stars

by jmagly

Best use case

cost-history is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

It is a strong fit for teams already working in Codex.

Teams using cost-history should expect more consistent output, faster repeated execution, and less prompt rewriting.

When to use this skill

You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

You only need a quick one-off answer and do not need a reusable workflow.
You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$ curl -o ~/.claude/skills/cost-history/SKILL.md --create-dirs "https://raw.githubusercontent.com/jmagly/aiwg/main/.agents/skills/cost-history/SKILL.md"

Manual Installation

1. Download SKILL.md from GitHub.
2. Place it in .claude/skills/cost-history/SKILL.md inside your project.
3. Restart your AI agent; it will auto-discover the skill.

How cost-history Compares

| Feature / Agent | cost-history | Standard Approach |
|-----------------|--------------|-------------------|
| Platform Support | Codex | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |

Frequently Asked Questions

What does this skill do?

It shows cost trends across multiple workflow sessions, surfacing expensive operations, spending patterns, and outliers.

Which AI agents support this skill?

This skill is designed for Codex.

Where can I find the source code?

The source is on GitHub in the jmagly/aiwg repository, at .agents/skills/cost-history/SKILL.md.

SKILL.md Source

# cost-history

You show cost trends across multiple workflow sessions. You read historical cost records from `.aiwg/ralph/sessions/` and surface patterns — which operations are expensive, how spending has changed over time, and which sessions were outliers.

## Triggers

Alternate expressions and non-obvious activations (primary phrases are matched automatically from the skill description):

- "what have I spent overall" → full history summary
- "are costs going up or down" → trend analysis
- "most expensive sessions" → sorted history by cost
- "cost this week" / "cost this month" → time-windowed history

## Trigger Patterns Reference

| Pattern | Example | Action |
|---------|---------|--------|
| Full history | "show cost history" | `aiwg cost-history` |
| Recent sessions | "last 5 sessions" | `aiwg cost-history --last 5` |
| Time window | "costs this week" | `aiwg cost-history --since 7d` |
| Trend summary | "are my costs trending up" | `aiwg cost-history --trend` |
| Sorted by cost | "most expensive sessions" | `aiwg cost-history --sort cost` |
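
The time-window row passes a value such as `7d` to `--since`. As a rough illustration of how such a value could be turned into a cutoff date, here is a minimal Python sketch. It assumes `Nd` means N days before today and that any other value is an ISO calendar date; `resolve_since` is a hypothetical helper and not part of the aiwg CLI.

```python
import re
from datetime import date, timedelta


def resolve_since(value: str, today: date) -> date:
    """Return the earliest date included by a --since value (hypothetical helper)."""
    match = re.fullmatch(r"(\d+)d", value)
    if match:
        # Relative window: "7d" is assumed to mean 7 days before `today`.
        return today - timedelta(days=int(match.group(1)))
    # Anything else is assumed to be an ISO calendar date such as "2026-03-01".
    return date.fromisoformat(value)


today = date(2026, 4, 1)
print(resolve_since("7d", today))          # 2026-03-25
print(resolve_since("2026-03-01", today))  # 2026-03-01
```

Sessions dated on or after the returned cutoff would fall inside the window; whether the real command treats the boundary as inclusive is not stated in the skill.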

## Behavior

When triggered:

1. **Determine scope**:
   - Default: all recorded sessions, newest first
   - `--last N`: most recent N sessions
   - `--since <duration>`: sessions within the time window, given as a relative duration (e.g., `7d`, `30d`) or a start date (e.g., `2026-03-01`)

2. **Read session records**:
   - `.aiwg/ralph/sessions/*/metrics.json` — per-session cost records
   - `.aiwg/ralph/cost-tracking.json` — aggregated history index

3. **Compute trend data**:
   - Session-over-session delta
   - Rolling 7-session average
   - Outlier detection (sessions costing more than 2x the rolling average)

4. **Run the command**:

   ```bash
   # All sessions, newest first
   aiwg cost-history

   # Most recent 10 sessions
   aiwg cost-history --last 10

   # Sessions in the past 30 days
   aiwg cost-history --since 30d

   # Trend analysis
   aiwg cost-history --trend

   # Sorted by cost descending
   aiwg cost-history --sort cost

   # JSON output
   aiwg cost-history --json
   ```
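
The trend figures in step 3 can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the handler's actual code: `trend_stats` is a hypothetical name, the costs are made-up values in cents (integers avoid floating-point rounding), and the list is assumed to be ordered oldest to newest.

```python
def trend_stats(costs_cents, window=7, outlier_factor=2):
    """Compute deltas, rolling average, and outlier indexes (illustrative sketch).

    costs_cents: per-session costs in cents, ordered oldest -> newest (assumed).
    """
    # Session-over-session delta: change from each session to the next one.
    deltas = [later - earlier for earlier, later in zip(costs_cents, costs_cents[1:])]
    # Rolling average over the most recent `window` sessions.
    recent = costs_cents[-window:]
    rolling_avg = sum(recent) / len(recent)
    # Outliers: sessions costing more than `outlier_factor` x the rolling average.
    outliers = [i for i, cost in enumerate(costs_cents)
                if cost > outlier_factor * rolling_avg]
    return deltas, rolling_avg, outliers


# Made-up sample data, not taken from any real session records.
sample = [20, 25, 15, 90, 20, 25, 15]
deltas, rolling_avg, outliers = trend_stats(sample)
print(deltas)       # [5, -10, 75, -70, 5, -10]
print(rolling_avg)  # 30.0
print(outliers)     # [3]
```

With a rolling average of 30 cents, the outlier threshold is 60 cents, so only the 90-cent session is flagged.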

## Report Format

### Standard History Output

```
Cost History (12 sessions)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

Date        Session                     Tokens    Cost    Status
──────────  ──────────────────────────  ────────  ──────  ──────
2026-04-01  sdlc-review-143022          61,250    $0.18   green
2026-03-31  greenfield-092211           94,800    $0.28   green
2026-03-30  security-review-174503     118,400    $0.36   yellow
2026-03-28  api-development-110022      52,100    $0.15   green
2026-03-26  full-stack-iteration-3     201,700    $0.61   red *outlier
...

Totals (12 sessions)
  Total tokens: 842,100
  Total cost:   $2.54
  Avg/session:  $0.21

7-session rolling average: $0.23
Trend: stable (±8% over last 7 sessions)
```

### Trend Output (`--trend`)

```
Cost Trend — Last 7 Sessions
  Session 6 (oldest):  $0.28
  Session 5:           $0.22
  Session 4:           $0.36
  Session 3:           $0.15
  Session 2:           $0.61  ← outlier (full-stack-iteration-3)
  Session 1:           $0.18
  Current avg:         $0.23

Direction: stable
Outliers:  1 (full-stack-iteration-3 — 2.6x average)
```

## Efficiency Thresholds

Sessions are color-coded by their tokens-per-line ratio relative to the MetaGPT benchmark of 124 tokens/line (REF-013):

| Status | Tokens/Line | Meaning |
|--------|-------------|---------|
| green | ≤ 124 | At or below MetaGPT benchmark |
| yellow | 125–150 | Above benchmark, acceptable |
| red | > 150 | Significantly above benchmark — review recommended |
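
The table can be read as a simple classification function. The Python sketch below is one possible reading, under two assumptions that the table does not settle: fractional ratios between 124 and 125 are treated as yellow, and the caller supplies the line count. `efficiency_status` is a hypothetical name.

```python
METAGPT_TOKENS_PER_LINE = 124  # benchmark cited in the table (REF-013)
YELLOW_CEILING = 150           # upper bound of the yellow band in the table


def efficiency_status(tokens: int, lines: int) -> str:
    """Map a tokens/line ratio to the status colors in the table (sketch)."""
    if lines <= 0:
        raise ValueError("lines must be positive")
    ratio = tokens / lines
    if ratio <= METAGPT_TOKENS_PER_LINE:
        return "green"
    # Assumption: anything above 124 and up to 150 counts as yellow.
    if ratio <= YELLOW_CEILING:
        return "yellow"
    return "red"


print(efficiency_status(12_400, 100))  # green
print(efficiency_status(14_000, 100))  # yellow
print(efficiency_status(20_000, 100))  # red
```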

## Examples

### Example 1: Quick history overview

**User**: "Show cost history"

**Action**:
```bash
aiwg cost-history
```

**Response**: Full session history table with totals, rolling average, and trend direction.

### Example 2: Recent session costs

**User**: "What did the last 3 sessions cost?"

**Extraction**: `--last 3`

**Action**:
```bash
aiwg cost-history --last 3
```

**Response**:
```
Cost History (last 3 sessions)

Date        Session                  Tokens   Cost
──────────  ───────────────────────  ───────  ────
2026-04-01  sdlc-review-143022       61,250   $0.18
2026-03-31  greenfield-092211        94,800   $0.28
2026-03-30  security-review-174503  118,400   $0.36

Total: $0.82 over 3 sessions  (avg: $0.27)
```
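
The summary line in this response follows directly from the three rows. As a quick check of the arithmetic, the Python sketch below recomputes it with `Decimal`; rounding the average half-up to two decimal places is an assumption, since the skill does not state a rounding rule.

```python
from decimal import Decimal, ROUND_HALF_UP

# Costs of the three sessions shown in the response above.
costs = [Decimal("0.18"), Decimal("0.28"), Decimal("0.36")]

total = sum(costs)
# Assumed rounding rule: half-up to whole cents.
average = (total / len(costs)).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)

print(f"Total: ${total} over {len(costs)} sessions  (avg: ${average})")
# Total: $0.82 over 3 sessions  (avg: $0.27)
```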

### Example 3: Identifying expensive outliers

**User**: "Which sessions were most expensive?"

**Action**:
```bash
aiwg cost-history --sort cost
```

**Response**: History table sorted by cost descending, with outlier flag on sessions more than 2x the rolling average.
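
As an illustration of that response, the Python sketch below sorts the five sessions listed in the Standard History Output by cost and flags any that exceed twice the $0.23 rolling average shown there. Costs are held in cents, and `sort_by_cost` is a hypothetical helper; the real command's output layout may differ.

```python
# (session name, cost in cents) for the rows visible in the Standard History Output.
sessions = [
    ("sdlc-review-143022", 18),
    ("greenfield-092211", 28),
    ("security-review-174503", 36),
    ("api-development-110022", 15),
    ("full-stack-iteration-3", 61),
]
ROLLING_AVG_CENTS = 23  # 7-session rolling average from the same output


def sort_by_cost(rows, rolling_avg):
    """Return (name, cost, is_outlier) tuples, most expensive first (sketch)."""
    ranked = sorted(rows, key=lambda row: row[1], reverse=True)
    return [(name, cost, cost > 2 * rolling_avg) for name, cost in ranked]


for name, cents, is_outlier in sort_by_cost(sessions, ROLLING_AVG_CENTS):
    flag = "  *outlier" if is_outlier else ""
    print(f"{name:<26} ${cents / 100:.2f}{flag}")
```

Only `full-stack-iteration-3` is flagged, because 61 cents is above the 46-cent threshold.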

## Clarification Prompts

If a time window is ambiguous:

- "Should I show all-time history or a specific window? (e.g., last 7 days, last 30 days)"

## References

- @$AIWG_ROOT/src/cli/handlers/subcommands.ts — Cost history handler
- @$AIWG_ROOT/src/metrics/token-counter.ts — Token counting and MetaGPT baseline (REF-013)
- @$AIWG_ROOT/docs/cli-reference.md — CLI reference

Related Skills

All from jmagly/aiwg:

- cost-report (Codex): Generate a cost and token-spending report for the current or most recent workflow session
- cost-optimizer (Codex): Analyze LLM pipeline costs and generate concrete optimization recommendations with savings estimates
- aiwg-orchestrate: Route structured artifact work to AIWG workflows via MCP with zero parent context cost
- venv-manager: Create, manage, and validate Python virtual environments. Use for project isolation and dependency management.
- pytest-runner: Execute Python tests with pytest, supporting fixtures, markers, coverage, and parallel execution. Use for Python test automation.
- vitest-runner: Execute JavaScript/TypeScript tests with Vitest, supporting coverage, watch mode, and parallel execution. Use for JS/TS test automation.
- eslint-checker: Run ESLint for JavaScript/TypeScript code quality and style enforcement. Use for static analysis and auto-fixing.
- repo-analyzer: Analyze GitHub repositories for structure, documentation, dependencies, and contribution patterns. Use for codebase understanding and health assessment.
- pr-reviewer: Review GitHub pull requests for code quality, security, and best practices. Use for automated PR feedback and approval workflows.
- YouTube Acquisition: yt-dlp patterns for acquiring content from YouTube and video platforms
- Quality Filtering: Accept/reject logic and quality scoring heuristics for media content
- Provenance Tracking: W3C PROV-O patterns for tracking media derivation chains and production history