cost-budget-enforcer
L2 cost-budget enforcer — daily token cap with fail-closed semantics under uncertainty (billing-API primary, internal counter fallback, periodic reconciliation cron)
Best use case
cost-budget-enforcer is best used when you need a repeatable AI agent workflow instead of a one-off prompt.
Teams using cost-budget-enforcer should expect more consistent output, faster repeated execution, and less prompt rewriting.
When to use this skill
- You want a reusable workflow that can be run more than once with consistent structure.
When not to use this skill
- You only need a quick one-off answer and do not need a reusable workflow.
- You cannot install or maintain the underlying files, dependencies, or repository context.
Installation
Claude Code / Cursor / Codex
Manual Installation
- Download SKILL.md from GitHub
- Place it in .claude/skills/cost-budget-enforcer/SKILL.md inside your project
- Restart your AI agent; it will auto-discover the skill
How cost-budget-enforcer Compares
| Feature / Agent | cost-budget-enforcer | Standard Approach |
|---|---|---|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |
Frequently Asked Questions
What does this skill do?
L2 cost-budget enforcer — daily token cap with fail-closed semantics under uncertainty (billing-API primary, internal counter fallback, periodic reconciliation cron)
SKILL.md Source
# cost-budget-enforcer — L2 Daily-Cap Skill (cycle-098 Sprint 2)
## Purpose
Daily token-cap enforcement for autonomous Loa cycles. Replaces the
free-running `make-an-API-call-and-hope-it-doesn't-cost-too-much` pattern
with an explicit pre-call gate that returns one of:
| Verdict | Meaning | Caller behavior |
|---------|---------|-----------------|
| `allow` | <90% of cap, fresh data | proceed |
| `warn-90` | 90-100% projected, fresh data | proceed but operator alerted |
| `halt-100` | ≥100% projected, fresh data | **MUST NOT** proceed |
| `halt-uncertainty` | One of 5 uncertainty modes | **MUST NOT** proceed (fail-closed) |
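
The three fresh-data rows of the table can be illustrated with a minimal sketch. It assumes that "projected" means current usage plus the estimated cost of the next call, measured against the daily cap; the source does not spell that formula out. The function name `classify_projection` is hypothetical and is not part of the library.

```bash
# Sketch only: classify a projected spend against the daily cap.
# Assumption: projected spend = usd_used + estimated_usd (not confirmed by the source).
classify_projection() {
  local used="$1" estimated="$2" cap="$3"
  awk -v u="$used" -v e="$estimated" -v c="$cap" 'BEGIN {
    # A non-positive cap is treated as exhausted in this sketch (assumption).
    if (c <= 0) { print "halt-100"; exit }
    projected = u + e
    # Compare by multiplication to avoid rounding at the 90% and 100% boundaries.
    if (projected * 100 >= 100 * c)      print "halt-100"
    else if (projected * 100 >= 90 * c)  print "warn-90"
    else                                 print "allow"
  }'
}

classify_projection 8.50 1.50 50.00    # 10.00 of 50.00 projected
classify_projection 44.00 1.50 50.00   # 45.50 of 50.00 projected
classify_projection 49.00 1.50 50.00   # 50.50 of 50.00 projected
```

The sketch ignores data freshness entirely, so it covers only the rows marked "fresh data"; the uncertainty modes take priority over any of these results.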
The 5 uncertainty modes (`uncertainty_reason` field):
- `billing_stale` — billing API >15min unreachable AND counter >75% of cap
- `counter_inconsistent` — counter is negative, decreasing, or backwards
- `counter_drift` — reconciliation detected drift >5% from billing API
- `clock_drift` — system clock vs billing_ts diff >60s tolerance
- `provider_lag` — billing API lag ≥5min when counter shows >75% of cap
The verdict order is **severity-first**:
`counter_inconsistent → billing_stale → provider_lag → clock_drift → halt-100 → warn-90 → allow`
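
As a rough illustration of severity-first resolution, the sketch below picks the highest-ranked outcome from a set of triggered conditions. The ranking is copied from the line above; `counter_drift` does not appear in that published order, so the sketch leaves it out rather than guess its position. The function name `most_severe` is hypothetical.

```bash
# Sketch only: return the highest-severity entry among the triggered conditions.
# The order is taken from the text above; the function name is hypothetical.
most_severe() {
  local candidate triggered
  # With nothing triggered, the lowest-severity outcome applies.
  if [ "$#" -eq 0 ]; then
    echo "allow"
    return 0
  fi
  for candidate in counter_inconsistent billing_stale provider_lag clock_drift halt-100 warn-90 allow; do
    for triggered in "$@"; do
      if [ "$triggered" = "$candidate" ]; then
        echo "$candidate"
        return 0
      fi
    done
  done
  # Unrecognized input: report an error so the caller can fail closed.
  echo "most_severe: no recognized condition in: $*" >&2
  return 2
}

most_severe warn-90 clock_drift billing_stale
```

Returning a non-zero status for unrecognized input, instead of defaulting to `allow`, keeps the sketch in line with the fail-closed intent of the skill.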
## Source
- RFC: [#654](https://github.com/0xHoneyJar/loa/issues/654)
- PRD: `grimoires/loa/prd.md` §FR-L2 (10 ACs)
- SDD: `grimoires/loa/sdd.md` §1.4.2 (component spec) + §1.5.3 (state diagram) + §5.4 (full API)
- Decisions: SKP-005 (un-deferred reconciliation cron); SKP-001 (RPO 24h for L1/L2 untracked logs)
## When to use
| Scenario | Use this skill? |
|----------|-----------------|
| Pre-call cost check during a sleep window or autonomous run | YES |
| Post-call counter update after billing-API confirms a charge | YES (`budget_record_call`) |
| Operator force-reconcile after billing-API drift incident | YES (`budget_reconcile --force-reason "..."`) |
| Manual ad-hoc usage query (not gated) | YES (`budget_get_usage`) |
| Mid-cycle daily-cap raise (operator action) | NO — protected-class `budget.cap_increase` short-circuits to operator queue |
## Configuration
`.loa.config.yaml::cost_budget_enforcer.*` (opt-in; disabled by default per `agent_network.primitives.L2.enabled: false`):
```yaml
cost_budget_enforcer:
  daily_cap_usd: 50.00
  freshness_threshold_seconds: 300     # 5 min: billing data is "fresh"
  stale_halt_pct: 75                   # counter % triggering stale_halt + provider_lag
  clock_tolerance_seconds: 60          # ±60s for clock_drift
  provider_lag_halt_seconds: 300       # 5 min provider_lag threshold
  billing_stale_halt_seconds: 900      # 15 min billing_stale threshold
  audit_log: .run/cost-budget-events.jsonl
  billing_observer_cmd: /path/to/observer-shim.sh   # caller-supplied UsageObserver
  per_provider_caps:                   # optional sub-caps per provider
    openai: 5.00
    anthropic: 30.00
  providers:                           # used by reconciliation cron
    - aggregate
    - anthropic
    - openai
  reconciliation:
    interval_hours: 6                  # cron cadence for budget_reconcile
    drift_threshold_pct: 5.0
  audit_snapshot:
    cron_expression: "0 4 * * *"       # daily snapshot of L1/L2 logs
```
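
To make the percentages concrete, here is a small worked calculation of the USD amounts implied by the example values above. The helper name `pct_of_cap` is hypothetical; the 90% figure comes from the `warn-90` verdict, not from a config key.

```bash
# Sketch only: convert a percentage of the daily cap into a USD amount.
# $1 = daily cap in USD, $2 = percentage; prints USD with two decimals.
pct_of_cap() {
  awk -v cap="$1" -v pct="$2" 'BEGIN { printf "%.2f\n", cap * pct / 100 }'
}

pct_of_cap 50.00 75   # stale_halt_pct threshold  -> 37.50
pct_of_cap 50.00 90   # warn-90 threshold         -> 45.00
```

With a 50.00 USD cap, the counter therefore has to exceed 37.50 USD before `billing_stale` or `provider_lag` can trigger.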
Environment variable overrides (highest precedence): see lib header.
## Library API
The skill is implemented as `.claude/scripts/lib/cost-budget-enforcer-lib.sh`.
Source it and call:
```bash
source .claude/scripts/lib/cost-budget-enforcer-lib.sh
# Pre-call verdict
budget_verdict <estimated_usd> [--provider <id>] [--cycle-id <id>]
# Stdout: verdict payload JSON; exit 0=allow/warn-90, 1=halt-100/halt-uncertainty.
# Read-only state query
budget_get_usage [--provider <id>]
# Stdout: {usd_used, usd_remaining, daily_cap_usd, last_billing_ts, counter_ts,
# freshness_seconds, provider, utc_day}
# Post-call accounting
budget_record_call <actual_usd> --provider <id> [--cycle-id <id>] [--model-id <id>] [--verdict-ref <hash>]
# Reconciliation (cron-driven; can be invoked ad-hoc)
budget_reconcile [--provider <id>] [--force-reason <text>]
# Exit codes: 0=OK, 1=BLOCKER (drift>threshold), 2=DEFER (rate-limited/transient)
```
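
A caller can gate work on the documented exit code of `budget_verdict`. The sketch below shows one possible wrapper. So that it runs on its own, it defines a stand-in `budget_verdict`; the stub, its one-field payload, the `STUB_VERDICT` variable, and the wrapper name `run_if_budget_allows` are all hypothetical and are not part of the library.

```bash
# Sketch only. This stub stands in for the real budget_verdict so the example
# runs without the library; remove it after sourcing cost-budget-enforcer-lib.sh.
# STUB_VERDICT is a hypothetical knob used only by this stub.
budget_verdict() {
  printf '{"verdict": "%s"}\n' "${STUB_VERDICT:-allow}"
  case "${STUB_VERDICT:-allow}" in
    allow|warn-90) return 0 ;;   # documented: exit 0 = allow/warn-90
    *)             return 1 ;;   # documented: exit 1 = halt-100/halt-uncertainty
  esac
}

# Run the given command only when the pre-call gate exits 0.
run_if_budget_allows() {
  local estimated_usd="$1" payload
  shift
  if payload="$(budget_verdict "$estimated_usd" --provider anthropic)"; then
    "$@"
  else
    echo "budget gate refused the call: $payload" >&2
    return 1
  fi
}

run_if_budget_allows 1.50 echo "calling the model"
```

Checking the exit status rather than parsing the JSON keeps the caller simple; the payload is still captured so it can be logged when the gate refuses.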
CLI form:
```bash
.claude/scripts/budget/budget-cli.sh verdict 1.50 --provider anthropic
.claude/scripts/budget/budget-cli.sh usage --provider anthropic
.claude/scripts/budget/budget-cli.sh record 1.42 --provider anthropic --model-id claude-opus-4-7
.claude/scripts/budget/budget-cli.sh reconcile --provider anthropic
```
## UsageObserver Interface
The lib invokes `LOA_BUDGET_OBSERVER_CMD <provider>` (or
`cost_budget_enforcer.billing_observer_cmd`). The command is expected to
print one of three JSON shapes on stdout:
| Shape | Meaning |
|-------|---------|
| `{"usd_used": <number>, "billing_ts": "<iso8601>"}` | Success — usage from billing API |
| `{"_unreachable": true, "_reason": "<text>"}` | Billing API unreachable (logs reconcile event with `billing_api_unreachable: true`) |
| `{"_defer": true, "_reason": "rate_limited"}` | Transient — skip without writing audit event; next 6h interval retries |
The lib applies a 30-second timeout to the observer command. The lib itself is
provider-agnostic: keep provider-specific HTTP client logic in the observer shim.
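
An observer shim might look roughly like the sketch below, which maps a provider lookup onto the three JSON shapes. `fetch_provider_usage` is a hypothetical placeholder for a real billing-API client, and the exit-code convention it uses (0 for success, 75 for rate-limited, anything else for unreachable) is an assumption of this sketch, not the library's contract.

```bash
# Sketch only. fetch_provider_usage is a hypothetical placeholder; replace it
# with a real billing-API client. On success it prints "<usd_used> <billing_ts>".
fetch_provider_usage() {
  echo "8.50 2026-05-04T12:00:00Z"
}

# Print exactly one of the three shapes the lib expects on stdout.
observer_shim() {
  local provider="$1" out rc
  if out="$(fetch_provider_usage "$provider")"; then
    # Split "<usd> <timestamp>" on the first space.
    printf '{"usd_used": %s, "billing_ts": "%s"}\n' "${out%% *}" "${out#* }"
  else
    rc=$?
    if [ "$rc" -eq 75 ]; then
      # Assumed convention: 75 means the provider rate-limited the request.
      printf '{"_defer": true, "_reason": "rate_limited"}\n'
    else
      printf '{"_unreachable": true, "_reason": "fetch exited %s"}\n' "$rc"
    fi
  fi
}

observer_shim anthropic
```

The sketch always exits 0 and signals failure through the JSON shape alone; whether the real lib also inspects the shim's exit status is not stated in the source.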
## Reconciliation cron + Daily snapshot
Two separate crontab entries support L2 production operation:
```bash
# 6h cadence reconciliation (Sprint 2B)
.claude/scripts/budget/budget-reconcile-install.sh install
# Daily snapshot for chain-recovery RPO 24h (Sprint 2C)
.claude/scripts/audit/audit-snapshot-install.sh install
```
Operator runbook for recovery: `grimoires/loa/runbooks/audit-log-recovery.md`.
## Composition with other primitives
- **L1 hitl-jury-panel**: when L1 panel decisions involve cost — `panel_invoke` MAY call `budget_verdict` first to short-circuit on halt-uncertainty
- **L3 scheduled-cycle-template** (Sprint 3): scheduled-cycle reader phase invokes `budget_verdict` as part of pre-read budget pre-check
- **Protected-class router**: `budget.cap_increase` (mid-cycle daily-cap raise) is a protected class — use `protected-class-router.sh check budget.cap_increase` and route to operator queue rather than auto-applying
## Observability
Every verdict appends one envelope to `.run/cost-budget-events.jsonl`:
```json
{
  "schema_version": "1.1.0",
  "primitive_id": "L2",
  "event_type": "budget.allow",
  "ts_utc": "2026-05-04T12:00:00.000000Z",
  "prev_hash": "<sha256-hex>",
  "payload": {
    "verdict": "allow",
    "usd_used": 8.50,
    "usd_remaining": 41.50,
    "daily_cap_usd": 50.00,
    "estimated_usd_for_call": 1.50,
    ...
  },
  "redaction_applied": null,
  "signature": "<base64-when-signed>",
  "signing_key_id": "<writer-key>"
}
```
Per-event-type schemas live at
`.claude/data/trajectory-schemas/budget-events/`. The lib validates payloads
against these schemas before sealing the envelope.
## Safety guarantees
- **Fail-closed**: NEVER `allow` under uncertainty (PRD §FR-L2-7); 5
uncertainty modes covered with mode-specific diagnostic context
- **Hash-chained**: every envelope's `prev_hash` chains to prior entry
(Sprint 1A); `audit_verify_chain` walks the chain at every read
- **Ed25519-signed**: when `LOA_AUDIT_SIGNING_KEY_ID` is configured, every
envelope is signed; trust-store strict-after enforcement (Sprint 1B F1)
rejects strip-attack downgrade attempts
- **Recoverable**: chain-critical L2 log is UNTRACKED for privacy; daily
snapshot job (Sprint 2C) ships RPO 24h restore via `audit_recover_chain`
- **flock-serialized**: concurrent verdicts and reconciliation cron firings
serialize via per-log flock (Sprint 1 review remediation F3)
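
To illustrate the hash-chaining idea in general terms, the sketch below walks a JSONL file and checks each `prev_hash` against the previous line. It is not `audit_verify_chain`. It assumes that `prev_hash` is the SHA-256 hex digest of the previous line's exact text and that the first entry is not checked; the source does not document the real canonicalization, and the `GENESIS` marker and function name are hypothetical. It relies on `sha256sum` from GNU coreutils and ignores signatures.

```bash
# Sketch only: verify a prev_hash chain under the assumptions stated above.
verify_chain_sketch() {
  local log="$1" line claimed expected="" n=0
  while IFS= read -r line; do
    n=$((n + 1))
    # Pull the prev_hash value out of the line without a JSON parser.
    claimed="$(printf '%s' "$line" | sed -n 's/.*"prev_hash": *"\([^"]*\)".*/\1/p')"
    if [ "$n" -gt 1 ] && [ "$claimed" != "$expected" ]; then
      echo "chain broken at line $n" >&2
      return 1
    fi
    # Hash this line; the next entry must carry this digest.
    expected="$(printf '%s' "$line" | sha256sum | cut -d' ' -f1)"
  done < "$log"
  echo "chain ok ($n entries)"
}

# Build a two-entry demo log and verify it.
demo_log="$(mktemp)"
first='{"event_type": "budget.allow", "prev_hash": "GENESIS"}'
first_hash="$(printf '%s' "$first" | sha256sum | cut -d' ' -f1)"
printf '%s\n' "$first" > "$demo_log"
printf '{"event_type": "budget.allow", "prev_hash": "%s"}\n' "$first_hash" >> "$demo_log"
verify_chain_sketch "$demo_log"
```

Because each entry commits to the digest of the one before it, editing or deleting any earlier line changes the expected digest and breaks verification from that point on.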
## Testing
Unit tests:
- `tests/unit/cost-budget-enforcer-state-machine.bats` (31 tests — state machine, schemas)
- `tests/unit/cost-budget-enforcer-remediation.bats` (21 tests — review/audit-finding remediations)
Integration tests:
- `tests/integration/cost-budget-enforcer-reconciliation-cron.bats` (11 tests)
- `tests/integration/audit-snapshot.bats` (17 tests, including F3 .sig verification)
- `tests/integration/budget-cli.bats` (12 tests)
Cumulative: **92 / 92 PASS** (Sprint 1 regression: 39 / 39 PASS).