pre-flight-check

[Beta] Session-start scan that surfaces relevant learnings, recent errors, and eval status before work begins. Bridges the outer loop back into the inner loop by making accumulated knowledge visible at task start. Activated via SessionStart hook or manually before major tasks.

6 stars

by pskoett


Best use case

pre-flight-check is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Teams using pre-flight-check should expect more consistent output, faster repeated execution, and less prompt rewriting.

When to use this skill

You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

You only need a quick one-off answer and do not need a reusable workflow.
You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$ curl -o ~/.claude/skills/pre-flight-check/SKILL.md --create-dirs "https://raw.githubusercontent.com/pskoett/measuring-ai-proficiency/main/.claude/skills/pre-flight-check/SKILL.md"

Manual Installation

Download SKILL.md from GitHub
Place it in .claude/skills/pre-flight-check/SKILL.md inside your project
Restart your AI agent — it will auto-discover the skill

How pre-flight-check Compares

Feature / Agent	pre-flight-check	Standard Approach
Platform Support	Not specified	Limited / Varies
Context Awareness	High	Baseline
Installation Complexity	Unknown	N/A

Frequently Asked Questions

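What does this skill do?

It runs a scan at session start that surfaces relevant learnings, recent errors, and eval status before work begins, so knowledge captured in earlier sessions is visible when a new task starts.
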
Where can I find the source code?

The source is on GitHub in the pskoett/measuring-ai-proficiency repository, at .claude/skills/pre-flight-check/SKILL.md.

SKILL.md Source

# Pre-Flight Check

Surfaces relevant accumulated knowledge at the start of a session. This is the bridge that connects the outer loop back into the inner loop — it makes prior learnings visible before the agent starts work.

Without this, entries accumulated in `.learnings/` are invisible to new sessions. The agent repeats mistakes that were already captured because nothing told it to look.

## When It Runs

- **Automatically** via SessionStart hook (lightweight scan, ~100-200 tokens)
- **Manually** before major tasks (deep scan with area filtering)

## Hook Output (Automatic — Lightweight)

The SessionStart hook (`scripts/pre-flight.sh`) does a fast scan and outputs a brief reminder if there are relevant signals:

```xml
<pre-flight-check>
Active learnings: N entries in .learnings/
Recent errors (last 7 days): N
Promotion-ready patterns: N
Failed evals: N

High-priority items:
- [Pattern-Key]: [one-line summary] (seen N times)
- [Pattern-Key]: [one-line summary] (seen N times)

Consider running /learning-aggregator if promotion-ready count > 0.
</pre-flight-check>
```

If there are no signals (empty `.learnings/`, no failed evals), the hook outputs nothing — zero overhead.
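
As a rough illustration of the "print a block or print nothing" behaviour described above, here is a minimal Python sketch. The real hook is the shell script `scripts/pre-flight.sh`; the `Signals` class, its field names, and `render_hook_output` are hypothetical names introduced only for this example, and the counts are assumed to have been computed elsewhere.

```python
from dataclasses import dataclass, field


@dataclass
class Signals:
    """Counts the hook reports. Field names are illustrative, not taken from the script."""
    active_learnings: int = 0
    recent_errors: int = 0
    promotion_ready: int = 0
    failed_evals: int = 0
    # (pattern_key, summary, times_seen) tuples for the high-priority list
    high_priority: list[tuple[str, str, int]] = field(default_factory=list)


def render_hook_output(s: Signals) -> str:
    """Return the reminder block, or an empty string when there is nothing to report."""
    if not (s.active_learnings or s.recent_errors or s.promotion_ready or s.failed_evals):
        return ""  # no signals: the hook stays silent
    lines = [
        "<pre-flight-check>",
        f"Active learnings: {s.active_learnings} entries in .learnings/",
        f"Recent errors (last 7 days): {s.recent_errors}",
        f"Promotion-ready patterns: {s.promotion_ready}",
        f"Failed evals: {s.failed_evals}",
    ]
    if s.high_priority:
        lines += ["", "High-priority items:"]
        lines += [f"- {key}: {summary} (seen {n} times)" for key, summary, n in s.high_priority]
    if s.promotion_ready > 0:
        # Assumption: the hint is only shown when it applies.
        lines += ["", "Consider running /learning-aggregator."]
    lines.append("</pre-flight-check>")
    return "\n".join(lines)
```

Returning an empty string for the no-signal case keeps the zero-overhead property: the caller can print the result unconditionally and nothing reaches the session context when there is nothing to say.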

## Manual Deep Scan

When invoked explicitly, the pre-flight check does a deeper analysis:

### Step 1: Scan .learnings/

Read `.learnings/LEARNINGS.md`, `.learnings/ERRORS.md`, `.learnings/FEATURE_REQUESTS.md`.

For each entry, extract:
- Pattern-Key, Summary, Priority, Status, Area, Related Files, Recurrence-Count, Last-Seen

### Step 2: Scan .evals/ (if exists)

Read `.evals/EVAL_INDEX.md` for any failed or stale evals.

### Step 3: Check Context-Surfing Handoffs

Look for unread files in `.context-surfing/`. This is the same check that `handoff-checker.sh` performs, run here as part of the pre-flight scan.

### Step 4: Relevance Filter

If the user described the task area, filter learnings to:
- Entries whose `Area` matches the task
- Entries whose `Related Files` overlap with likely-touched files
- Entries with `Priority: high/critical` regardless of area
- Entries with `Status: promotion_ready` (need attention)
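
The four criteria above read as a union: an entry is kept if any one of them holds. Under that interpretation, a minimal Python sketch of the filter might look like this. The `Learning` class and both function names are hypothetical, and "matches" is assumed to mean a case-insensitive exact comparison of the area string.

```python
from dataclasses import dataclass, field


@dataclass
class Learning:
    """Hypothetical in-memory form of one .learnings/ entry."""
    pattern_key: str
    area: str = ""
    related_files: set[str] = field(default_factory=set)
    priority: str = "low"
    status: str = "pending"


def is_relevant(entry: Learning, task_area: str, likely_files: set[str]) -> bool:
    """True if the entry satisfies at least one of the four relevance criteria."""
    return (
        entry.area.lower() == task_area.lower()        # Area matches the task
        or bool(entry.related_files & likely_files)    # Related Files overlap
        or entry.priority in {"high", "critical"}      # high/critical, any area
        or entry.status == "promotion_ready"           # needs attention
    )


def filter_learnings(
    entries: list[Learning], task_area: str, likely_files: set[str]
) -> list[Learning]:
    """Keep relevant entries, preserving their original order."""
    return [e for e in entries if is_relevant(e, task_area, likely_files)]
```

For example, with a task area of `auth`, a `docs` entry marked `critical` is still surfaced, while a low-priority `docs` entry with no file overlap is dropped.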

### Step 5: Output

```markdown
## Pre-Flight Check

### Task Area: [inferred or stated]

### Relevant Learnings
| ID | Summary | Recurrence | Priority | Status |
|----|---------|-----------|----------|--------|
| LRN-... | ... | 3 | high | promotion_ready |
| ERR-... | ... | 2 | medium | pending |

### Key Warnings
- [Pattern-Key]: "Concise warning based on learning" — seen N times, last on YYYY-MM-DD
- [Pattern-Key]: "Concise warning based on learning" — seen N times, last on YYYY-MM-DD

### Failed Evals
| Eval ID | Pattern-Key | Last Failed | Recovery Action |
|---------|------------|-------------|-----------------|
| eval-... | ... | YYYY-MM-DD | ... |

### Handoff Files
- [filename] — from session on YYYY-MM-DD

### Recommendations
- [ ] Read handoff files before starting
- [ ] Run learning-aggregator (N promotion-ready patterns)
- [ ] Fix failed evals before starting new work
- [ ] Watch for [specific pattern] in [area]
```

## Integration

### Upstream (feeds from)
- `.learnings/*.md` — accumulated learning entries from self-improvement
- `.evals/EVAL_INDEX.md` — eval results from eval-creator
- `.context-surfing/` — handoff files from context-surfing

### Downstream (feeds into)
- **Inner loop context** — the agent starts work with awareness of known patterns
- **learning-aggregator** — if promotion-ready count is high, recommend running it
- **eval-creator** — if failed evals exist, recommend fixing before new work

### The Compounding Effect

This is where the blog's compounding happens:

```
Outer loop improves harness → pre-flight surfaces improvements → inner loop starts stronger
```

Every learning captured, every rule promoted, every eval created becomes visible at the next session start. The knowledge gaps get smaller with every cycle.

## Incremental Scanning (future enhancement)

The hook script can be extended to use a local cache file (`.pre-flight-cache.json`) storing last-known state — entry counts, scan date, high-priority items — so the next session start only re-scans entries newer than the cached state. This would enable **delta reporting** ("since your last session, 2 new errors were logged and 1 pattern crossed the promotion threshold") and keep the hook near-instant regardless of how large `.learnings/` grows. Not implemented today — the current hook scans directly on every session start.
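
Since this enhancement is not implemented, the following is only a Python sketch of how delta reporting against `.pre-flight-cache.json` might work. The cache keys (`error_count`, `promotion_ready`) and all three function names are assumptions made for the example; the actual hook is a shell script and may store state differently.

```python
import json
from pathlib import Path


def load_cache(path: Path) -> dict:
    """Read the cached state; a missing or corrupt cache counts as 'no prior state'."""
    try:
        return json.loads(path.read_text(encoding="utf-8"))
    except (FileNotFoundError, json.JSONDecodeError):
        return {}


def save_cache(path: Path, state: dict) -> None:
    """Persist the current state for the next session start."""
    path.write_text(json.dumps(state, indent=2), encoding="utf-8")


def delta_report(previous: dict, current: dict) -> str:
    """Describe what changed since the cached state, or return '' if nothing did."""
    # ASSUMPTION: "error_count" is an int, "promotion_ready" a list of pattern keys.
    new_errors = current.get("error_count", 0) - previous.get("error_count", 0)
    newly_ready = set(current.get("promotion_ready", [])) - set(
        previous.get("promotion_ready", [])
    )
    parts = []
    if new_errors > 0:
        parts.append(f"{new_errors} new error(s) logged")
    if newly_ready:
        parts.append(f"{len(newly_ready)} pattern(s) crossed the promotion threshold")
    return ("Since your last session: " + ", ".join(parts)) if parts else ""
```

Treating an unreadable cache as empty means the first run, or a run after the cache is deleted, degrades to a full report instead of failing the session start.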

## What This Skill Does NOT Do

- Does not modify `.learnings/` files (read-only)
- Does not promote patterns (that's harness-updater)
- Does not run evals (that's eval-creator)
- Does not block execution — it surfaces information, the agent decides what to act on

Related Skills

verify-gate

from pskoett/measuring-ai-proficiency

Runs project compile, test, and lint commands between implementation and quality review. Gates simplify-and-harden behind machine verification. If checks fail, routes back to implementation with diagnostics for a fix loop. If checks pass, signals ready for the quality pass. Use after any implementation work completes and before simplify-and-harden. Essential for the inner loop's verify step.

use-agent-factory

from pskoett/measuring-ai-proficiency

How to drive the 14-workflow agent factory in this repo from a Claude session. Covers: when to use the factory vs. direct edits, how to start the chain, where the human gates are, how to pick an implementer, how to recover from stuck PRs, and all the failure modes learned to date. Use this skill when the user asks you to ship a feature, fix, or refactor through the factory; when they reference an existing issue or PR in the factory chain; when a workflow is stuck or misbehaving; or when you need to file issues or plan files that the factory will pick up. Do NOT use this skill for: single-file scratch edits on an untracked branch, research questions, one-shot script runs, or any work that does not produce a PR to main.

simplify-and-harden

from pskoett/measuring-ai-proficiency

Post-completion self-review for coding agents that runs simplify, harden, and micro-documentation passes on non-trivial code changes. Use when: a coding task is complete in a general agent session and you want a bounded quality and security sweep before signaling done. For CI pipeline execution, use simplify-and-harden-ci.

plan-interview

from pskoett/measuring-ai-proficiency

Ensures alignment between user and Claude during feature/spec planning through a structured interview process. Use this skill when the user invokes /plan-interview before implementing a new feature, refactoring, or any non-trivial implementation task. The skill runs an upfront interview to gather requirements across technical constraints, scope boundaries, risk tolerance, and success criteria before any codebase exploration. Do NOT use this skill for: pure research/exploration tasks, simple bug fixes, or when the user just wants standard planning without the interview process.

measure-ai-proficiency

from pskoett/measuring-ai-proficiency

Assess and improve repository AI coding proficiency and context engineering maturity. Use when users ask about: (1) AI readiness or AI maturity assessment, (2) context engineering quality or improvement, (3) CLAUDE.md, .cursorrules, or copilot-instructions files, (4) measuring how well a repo is prepared for AI coding assistants, (5) recommendations for improving AI collaboration, (6) what context files to add, or (7) comparing their repo to AI proficiency best practices.

learning-aggregator

from pskoett/measuring-ai-proficiency

[Beta] Cross-session analysis of accumulated .learnings/ files. Reads all entries, groups by pattern_key, computes recurrence across sessions, and outputs ranked promotion candidates. This is the outer loop's inspect step — it turns raw learning data into actionable gap reports. Use on a regular cadence (weekly, before major tasks, or at session start for critical projects). Can be invoked manually or scheduled.

intent-framed-agent

from pskoett/measuring-ai-proficiency

Frames coding-agent work sessions with explicit intent capture and drift monitoring. Use when a session transitions from planning/Q&A to implementation for coding tasks, refactors, feature builds, bug fixes, or other multi-step execution where scope drift is a risk.

eval-creator

from pskoett/measuring-ai-proficiency

[Beta] Creates permanent eval cases from promoted learnings and runs regression checks against them. Turns failures into test cases that prevent silent regression. This is the outer loop's regress-test step. Use when a learning is promoted and has a clear pass/fail condition, or on cadence to verify promoted rules still hold.

customize-measurement

from pskoett/measuring-ai-proficiency

Customize AI proficiency measurement for your specific repository through a guided interview. Use when: setting up measure-ai-proficiency for a new repo, adjusting thresholds for your team's size, hiding irrelevant recommendations, or mapping custom file names to standard patterns.

context-surfing

from pskoett/measuring-ai-proficiency

Monitors context window health throughout a session and rides peak context quality for maximum output fidelity. Activates automatically after plan-interview and intent-framed-agent. Stays active through execution and hands off cleanly to simplify-and-harden and self-improvement when the wave completes naturally or exits via handoff. Use this skill whenever a multi-step agent task is underway and session continuity or context drift is a concern. Especially important for long-running tasks, complex refactors, or any work where degraded context would silently corrupt the output. Trigger even if the user doesn't say "context surfing" — if an agent task is running across multiple steps with intent and a plan already established, this skill is live.

Agentic Workflow Creator

from pskoett/measuring-ai-proficiency

Create natural language GitHub Actions workflows using the agentic workflows pattern from GitHub Next.

checking-changes

from streamlit/streamlit

Validates all code changes before committing by running format, lint, type, and unit test checks. Use after making backend (Python) or frontend (TypeScript) changes, before committing or finishing a work session.
