checkpoint-mode
Pause for review every N tasks - selective autonomy pattern
Best use case
checkpoint-mode is best used when you need a repeatable AI agent workflow instead of a one-off prompt.
Teams using checkpoint-mode should expect more consistent output, faster repeated execution, and less prompt rewriting.
When to use this skill
- You want a reusable workflow that can be run more than once with consistent structure.
When not to use this skill
- You only need a quick one-off answer and do not need a reusable workflow.
- You cannot install or maintain the underlying files, dependencies, or repository context.
Installation
Claude Code / Cursor / Codex
Manual Installation
- Download SKILL.md from GitHub
- Place it in `.claude/skills/checkpoint-mode/SKILL.md` inside your project
- Restart your AI agent; it will auto-discover the skill
How checkpoint-mode Compares
| Feature / Agent | checkpoint-mode | Standard Approach |
|---|---|---|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |
Frequently Asked Questions
What does this skill do?
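It pauses the agent after N tasks or M minutes, generates a summary of what was accomplished, and waits for explicit approval before continuing (the selective autonomy pattern).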
Where can I find the source code?
You can find the source code on GitHub using the link provided at the top of the page.
SKILL.md Source
# Checkpoint Mode Skill
## Overview
Implements **selective autonomy** - shorter bursts of autonomous work with feedback loops.
**Research Source:** "Use Agents or Be Left Behind" by Tim Dettmers
---
## Philosophy
> "More than 90% of code should be written by agents, but iteratively design systems with shorter bursts of autonomy with feedback loops."
> — Tim Dettmers, 2026
**Problem with Perpetual Autonomy:**
- Can waste resources on wrong approach
- No opportunity for course correction
- User feels disconnected from progress
**Solution:**
- Pause after N tasks or M minutes
- Generate summary of accomplishments
- Wait for explicit approval to continue
---
## When to Use
### Use Checkpoint Mode For:
- **Novel projects** where approach may need adjustment
- **High-cost operations** (expensive API calls, cloud resources)
- **Learning phases** where user wants to guide direction
- **Regulated environments** requiring audit trail
### Use Perpetual Mode For:
- **Well-defined PRDs** with clear requirements
- **Established patterns** with high confidence
- **Overnight builds** where interruption isn't desired
- **CI/CD pipelines** requiring full automation
---
## Configuration
```bash
# Enable checkpoint mode
LOKI_AUTONOMY_MODE=checkpoint
# Pause frequency
LOKI_CHECKPOINT_FREQUENCY=10 # tasks
LOKI_CHECKPOINT_TIME=60 # minutes
# Always pause after these phases
LOKI_CHECKPOINT_PHASES="architecture,deployment"
```
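As a rough illustration of how an orchestrator might read these settings, the sketch below parses the four variables into a small config object. The `CheckpointConfig` class, the `load_checkpoint_config` function, and every fallback value are hypothetical; the source shows example values but does not document defaults.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class CheckpointConfig:
    mode: str          # value of LOKI_AUTONOMY_MODE
    frequency: int     # pause after this many tasks
    time_minutes: int  # pause after this many minutes
    phases: tuple      # always pause after these phases


def load_checkpoint_config(env=None):
    """Build a CheckpointConfig from an environment mapping (os.environ by default)."""
    env = os.environ if env is None else env
    raw_phases = env.get("LOKI_CHECKPOINT_PHASES", "")
    return CheckpointConfig(
        # Fallbacks below are assumptions, not documented defaults.
        mode=env.get("LOKI_AUTONOMY_MODE", "perpetual"),
        frequency=int(env.get("LOKI_CHECKPOINT_FREQUENCY", "10")),
        time_minutes=int(env.get("LOKI_CHECKPOINT_TIME", "60")),
        # Split the comma-separated list and drop empty entries.
        phases=tuple(p.strip() for p in raw_phases.split(",") if p.strip()),
    )
```

Passing a plain dict instead of `os.environ` keeps the parser easy to exercise in isolation, for example `load_checkpoint_config({"LOKI_AUTONOMY_MODE": "checkpoint"})`.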
---
## Checkpoint Workflow
```
[Work on 10 tasks] → [Pause] → [Generate Summary] → [Wait for Approval]
                                                             ↓
                                                [User reviews and approves]
                                                             ↓
                                                       [Resume work]
```
### On Checkpoint:
1. **Generate Summary**
```markdown
# Checkpoint Summary
## Tasks Completed (10)
- Implemented POST /api/todos endpoint
- Added unit tests (95% coverage)
- Set up CI/CD pipeline
- ...
## Next Actions
- Deploy to staging
- Run integration tests
- Security audit
## Resources Used
- 15 minutes elapsed
- 3 Haiku agents, 2 Sonnet agents
- Estimated cost: $0.45
```
2. **Create Approval Signal**
```bash
# System writes:
.loki/signals/CHECKPOINT_SUMMARY_2026-01-14-10-30.md
# User reviews and creates:
.loki/signals/CHECKPOINT_APPROVED
```
3. **Wait for Approval**
- Orchestrator pauses execution
- Monitors for approval signal
- Resumes when signal detected
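The file-based handshake described in the steps above could be sketched roughly as follows. The function names, the timestamp format (inferred from the example filename), and the optional timeout are assumptions for illustration, not part of the skill itself.

```python
import time
from datetime import datetime
from pathlib import Path

SIGNALS_DIR = Path(".loki/signals")


def write_summary(summary, signals_dir=SIGNALS_DIR, now=None):
    """Write the summary to a timestamped CHECKPOINT_SUMMARY_*.md file."""
    now = now or datetime.now()
    signals_dir = Path(signals_dir)
    signals_dir.mkdir(parents=True, exist_ok=True)
    # Timestamp layout is assumed from the example name in the steps above.
    path = signals_dir / f"CHECKPOINT_SUMMARY_{now:%Y-%m-%d-%H-%M}.md"
    path.write_text(summary, encoding="utf-8")
    return path


def wait_for_approval(signals_dir=SIGNALS_DIR, poll_seconds=5.0, timeout_seconds=None):
    """Block until CHECKPOINT_APPROVED exists, then consume it.

    Returns True on approval, False if timeout_seconds elapses first.
    The timeout is an addition of this sketch; the source waits indefinitely.
    """
    approved = Path(signals_dir) / "CHECKPOINT_APPROVED"
    deadline = None if timeout_seconds is None else time.monotonic() + timeout_seconds
    while not approved.exists():
        if deadline is not None and time.monotonic() >= deadline:
            return False
        time.sleep(poll_seconds)
    approved.unlink()  # remove the signal so the next checkpoint waits again
    return True
```

With a layout like this, a user would approve a checkpoint by creating the empty file, for instance with `touch .loki/signals/CHECKPOINT_APPROVED`.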
---
## Agent Instructions (Orchestrator)
When `LOKI_AUTONOMY_MODE=checkpoint`:
```python
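import time

# Helper functions (load_completed_tasks, generate_checkpoint_summary,
# write_signal, signal_exists, remove_signal, log_info) and the
# last_checkpoint_count / CHECKPOINT_FREQUENCY values come from the orchestrator.
completed_tasks = load_completed_tasks()  # assumed to return a task count
tasks_since_checkpoint = completed_tasks - last_checkpoint_count

if tasks_since_checkpoint >= CHECKPOINT_FREQUENCY:
    # Pause and generate summary
    summary = generate_checkpoint_summary()
    write_signal("CHECKPOINT_SUMMARY", summary)

    # Wait for approval
    log_info("Waiting for checkpoint approval...")
    while not signal_exists("CHECKPOINT_APPROVED"):
        time.sleep(5)  # poll every 5 seconds

    # Resume work
    remove_signal("CHECKPOINT_APPROVED")
    log_info("Checkpoint approved. Resuming work...")
    last_checkpoint_count = completed_tasks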
```
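The loop above checks only the task count, while the Configuration section also lists a time limit and a set of phases that always trigger a pause. One possible way to combine the three triggers is sketched below. The function name, the returned reason strings, and the rule that any single trigger is enough are assumptions; the source does not say how the settings interact.

```python
def should_checkpoint(tasks_since, minutes_since, finished_phase=None,
                      frequency=10, time_minutes=60,
                      phases=("architecture", "deployment")):
    """Return a short reason string if a pause is due, otherwise None."""
    # A phase listed in LOKI_CHECKPOINT_PHASES always forces a pause.
    if finished_phase is not None and finished_phase in phases:
        return f"phase:{finished_phase}"
    # Task-count trigger (LOKI_CHECKPOINT_FREQUENCY).
    if tasks_since >= frequency:
        return "task_count"
    # Elapsed-time trigger (LOKI_CHECKPOINT_TIME, in minutes).
    if minutes_since >= time_minutes:
        return "elapsed_time"
    return None
```

Returning the reason rather than a bare boolean lets the checkpoint summary state why the pause happened.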
---
## Comparison with Other Modes
| Mode | Best For | Approval Frequency | Use Case |
|------|----------|-------------------|----------|
| **Perpetual** | Overnight builds | Never | Fully automated CI/CD |
| **Checkpoint** | Novel projects | Every 10 tasks | Learning new domain |
| **Supervised** | Critical systems | Every task | Production deployments |
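The approval frequencies in the table can be read as a simple decision rule, sketched below. Only `checkpoint` appears in the source as a value of `LOKI_AUTONOMY_MODE`; the `perpetual` and `supervised` strings, the `AutonomyMode` enum, and the `needs_approval` function are hypothetical names used for illustration.

```python
from enum import Enum


class AutonomyMode(Enum):
    PERPETUAL = "perpetual"    # assumed value
    CHECKPOINT = "checkpoint"  # value shown in the Configuration section
    SUPERVISED = "supervised"  # assumed value


def needs_approval(mode, tasks_since_approval, frequency=10):
    """Return True when the given mode calls for human approval now."""
    if mode is AutonomyMode.PERPETUAL:
        return False                          # never pauses
    if mode is AutonomyMode.SUPERVISED:
        return tasks_since_approval >= 1      # pauses after every task
    return tasks_since_approval >= frequency  # checkpoint: every N tasks
```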
---
## Metrics
Track checkpoint effectiveness:
```json
{
"checkpoint_id": "cp-2026-01-14-001",
"tasks_completed": 10,
"time_elapsed_minutes": 15,
"approval_time_seconds": 45,
"course_corrections": 0,
"user_satisfaction": "approved_without_changes"
}
```
Storage: `.loki/metrics/checkpoint-mode/`
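A minimal sketch of persisting and summarizing these records is shown below. It assumes one JSON file per checkpoint named after `checkpoint_id`; the source gives only the storage directory, so the file naming and both function names are hypothetical.

```python
import json
from pathlib import Path

METRICS_DIR = Path(".loki/metrics/checkpoint-mode")


def record_checkpoint(record, metrics_dir=METRICS_DIR):
    """Write one checkpoint record to <metrics_dir>/<checkpoint_id>.json."""
    metrics_dir = Path(metrics_dir)
    metrics_dir.mkdir(parents=True, exist_ok=True)
    path = metrics_dir / f"{record['checkpoint_id']}.json"
    path.write_text(json.dumps(record, indent=2) + "\n", encoding="utf-8")
    return path


def correction_rate(metrics_dir=METRICS_DIR):
    """Fraction of recorded checkpoints with at least one course correction."""
    records = [json.loads(p.read_text(encoding="utf-8"))
               for p in sorted(Path(metrics_dir).glob("*.json"))]
    if not records:
        return 0.0
    corrected = sum(1 for r in records if r.get("course_corrections", 0) > 0)
    return corrected / len(records)
```

A consistently low correction rate might suggest the checkpoint frequency can be raised, while a high one might suggest pausing more often.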
---
## References
- `references/production-patterns.md` - HN production insights
- [timdettmers.com/use-agents-or-be-left-behind](https://timdettmers.com/2026/01/13/use-agents-or-be-left-behind/)
---
**Version:** 1.0.0