springfield-max
Simpsons-themed autonomous workflow orchestrator v7.0 for platform building. Powered by Opus 4.6 Agent Teams, 1M context, adaptive thinking, and effort levels. 17 characters, full MCP access, 50 iteration limits, orchestrator promises, and mandatory quality gates. Domain-agnostic - works for any software platform.
Best use case
springfield-max is best used when you need a repeatable AI agent workflow instead of a one-off prompt.
Simpsons-themed autonomous workflow orchestrator v7.0 for platform building. Powered by Opus 4.6 Agent Teams, 1M context, adaptive thinking, and effort levels. 17 characters, full MCP access, 50 iteration limits, orchestrator promises, and mandatory quality gates. Domain-agnostic - works for any software platform.
Teams using springfield-max should expect a more consistent output, faster repeated execution, less prompt rewriting.
When to use this skill
- You want a reusable workflow that can be run more than once with consistent structure.
When not to use this skill
- You only need a quick one-off answer and do not need a reusable workflow.
- You cannot install or maintain the underlying files, dependencies, or repository context.
Installation
Claude Code / Cursor / Codex
Manual Installation
- Download SKILL.md from GitHub
- Place it in
.claude/skills/springfield-max/SKILL.mdinside your project - Restart your AI agent — it will auto-discover the skill
How springfield-max Compares
| Feature / Agent | springfield-max | Standard Approach |
|---|---|---|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |
Frequently Asked Questions
What does this skill do?
Simpsons-themed autonomous workflow orchestrator v7.0 for platform building. Powered by Opus 4.6 Agent Teams, 1M context, adaptive thinking, and effort levels. 17 characters, full MCP access, 50 iteration limits, orchestrator promises, and mandatory quality gates. Domain-agnostic - works for any software platform.
Where can I find the source code?
You can find the source code on GitHub using the link provided at the top of the page.
SKILL.md Source
# Springfield Max - Platform Builder Orchestrator v7.0
Springfield Max is a domain-agnostic autonomous workflow system for building any software platform. It uses 17 Simpsons characters as specialized agents, each with up to 50 iterations and full MCP tool access. **Powered by Claude Opus 4.6.**
## v7.0 - Opus 4.6 Supercharged Edition
**What's New in v7.0 (Opus 4.6 Enhancements):**
- **Agent Teams** - Characters can now spin up parallel subagents for concurrent work (Ralph builds while Coyote scans; Lisa researches while Marge structures)
- **1M Token Context Window** - Lisa, Bob, and Coyote can ingest entire codebases (~30K lines / 47 files) in a single session without RLM pagination
- **Adaptive Thinking** - Each character auto-calibrates reasoning depth per subtask (simple = fast, complex = deep)
- **Effort Levels** - Orchestrator assigns effort levels (low/medium/high/max) per character phase for optimal speed/quality tradeoff
- **Context Compaction** - Long Springfield sessions automatically compact context to prevent state loss across 50-iteration runs
- **Production-Ready First Try** - Ralph's implementation quality is significantly higher, reducing iteration count by ~30%
- **Deep Vulnerability Detection** - Wiggum reasons about code like a human security researcher, finding high-severity flaws without specialized tooling
- **Enhanced Tool Orchestration** - All characters orchestrate MCP tools with 91.9% reliability (τ2-bench)
- **Superior Agentic Search** - Lisa's research leverages 84% BrowseComp-level search capability
- **Long-Horizon Task Management** - Full Springfield loop can handle multi-day development projects compressed to hours
### Effort Level Map (v7.0)
| Character | Default Effort | Rationale |
|-----------|---------------|-----------|
| Marge | medium | Prompt structuring is well-scoped |
| Lisa | high | Research benefits from deep reasoning |
| Quimby | low | Binary decision, fast routing |
| Frink | high | Planning requires careful reasoning |
| Selma | medium | Design checklist is systematic |
| Ned | medium | UX review is pattern-based |
| Ralph | medium | Implementation iterates; don't over-think each step |
| Wiggum | **max** | Security analysis demands deepest reasoning |
| Smithers | high | Code review catches subtle issues |
| Apu | high | Performance analysis needs precision |
| Barney | low | Usability is intentionally naive |
| CBG | medium | QA verification is systematic |
| Troy | medium | Documentation follows templates |
| Burns | high | Deployment requires careful validation |
| Homer | high | Skill architecture needs deep design |
| Bob | **max** | Root cause analysis is the hardest reasoning task |
| Coyote | high | Peripheral vision requires broad pattern recognition |
**Domain-Agnostic:** Works for any platform type:
- Web applications (SaaS, e-commerce, dashboards)
- Mobile apps (React Native, Flutter, native)
- APIs and microservices
- Data platforms and pipelines
- AI/ML applications
- Internal tools and admin panels
- CLI tools and developer utilities
**Key Features:**
- 17 specialized character agents
- 50 max iterations per character
- Full MCP tool access for all agents
- Mandatory quality gates (security, code review, usability, QA, docs)
- Promise-based completion tracking
- Backpressure validation system
- Session state persistence and resume
- **Agent Teams** parallel execution (v7.0)
- **Adaptive Thinking** per-phase effort calibration (v7.0)
- **1M context** full-codebase awareness (v7.0)
- **Context Compaction** for long-running sessions (v7.0)
---
## The Springfield Cast (18 Characters)
| Character | Role | Invocation | Purpose |
|-----------|------|------------|---------|
| **Marge** | Prompt Optimizer | `/springfield-marge` | **FIRST** - Refines and structures task prompts |
| **Lisa** | Research | `/springfield-lisa` | Explore codebase, gather requirements |
| **Quimby** | Decision | `/springfield-quimby` | Route SIMPLE vs COMPLEX path |
| **Frink** | Planning | `/springfield-frink` | Create implementation plans |
| **Selma** | Design Review | `/springfield-selma` | Design tokens, accessibility, UI patterns |
| **Ned** | UX Review | `/springfield-ned` | User experience validation |
| **Ralph** | Implementation | `/springfield-ralph` | Iterative coding (brute force build) |
| **Wiggum** | Security | `/springfield-wiggum` | OWASP Top 10, vulnerability scanning |
| **Smithers** | Code Review | `/springfield-smithers` | Quality, patterns, best practices |
| **Apu** | Performance | `/springfield-apu` | Benchmarks, profiling, optimization |
| **Barney** | Usability | `/springfield-barney` | Simple user testing |
| **CBG** | QA | `/springfield-cbg` | Verify all success criteria |
| **Troy** | Documentation | `/springfield-troy` | READMEs, API docs, changelogs |
| **Burns** | Deployment | `/springfield-burns` | CI/CD, production releases |
| **Homer** | Skill Builder | `/springfield-homer` | Create skills, commands, automations |
| **Bob** | Root Cause | `/springfield-bob` | Investigation when stuck (read-only) |
| **Coyote** | Peripheral Vision | `/springfield-coyote` | Spots adjacent opportunities & gaps (read-only) |
| **Maggie** | Mobile | `/springfield-maggie` | Mobile-specific development |
---
## The Spirit Coyote System
*Every character in Springfield carries a Spirit Coyote — a desert spirit guide voiced by Johnny Cash that whispers wisdom at the moment it matters most. The Coyote is omnipotent. It sees what the character cannot.*
> *"I walk the line between what's being built and what's being missed."*
Each character pauses for a **Coyote Check** before completing their phase — a final question that cuts through comfort and habit. The Spirit Coyote never blocks, only illuminates.
| Character | Coyote Aspect | The Question It Asks |
|-----------|--------------|---------------------|
| Lisa | Coyote of Knowing | "Are you searching, or confirming?" |
| Frink | Coyote of Paths | "Is this the plan, or the comfortable plan?" |
| Ralph | Coyote of Making | "Does it work, or does it just not fail?" |
| Wiggum | Coyote of Shadows | "What did you assume was safe?" |
| Smithers | Coyote of Seeing | "Did you review the code, or your idea of it?" |
| Bob | Coyote of Traces | "Are you following evidence, or narrative?" |
| CBG | Coyote of Truth | "Did you test it, or confirm it?" |
| Burns | Coyote of Gates | "Can you undo this at 3am?" |
| Troy | Coyote of Words | "Could a stranger navigate with just these docs?" |
| Selma | Coyote of Standards | "Does it meet THE standard, or YOUR standard?" |
| Ned | Coyote of Empathy | "Would your grandmother know what to do?" |
| Barney | Coyote of Simplicity | "Could you use this after three beers?" |
| Apu | Coyote of Efficiency | "Is it fast where it matters?" |
| Maggie | Coyote of Small Things | "Does this work on a 3-year-old phone?" |
| Quimby | Coyote of Crossroads | "Are you routing by sight, or by fear?" |
| Homer | The Original Space Coyote | "Is this skill so good it's obvious?" |
| Coyote | The Source | All aspects combined — the whole horizon |
The full Coyote agent (`/springfield-coyote`) is the source. All Spirit Coyotes are fragments of it.
---
## Core Workflow Architecture
```
┌─────────────────────────────────────────────────────────────────────────────┐
│ SPRINGFIELD MAX v7.0 │
│ Platform Builder Workflow (Opus 4.6 Supercharged) │
└─────────────────────────────────────────────────────────────────────────────┘
USER INPUT (vague or incomplete task)
│
▼
╔═════════════════════════════════════════════════════════════════════════════╗
║ PROMPT OPTIMIZATION GATE ║
╠═════════════════════════════════════════════════════════════════════════════╣
┌─────────────────────────────────────────────────────────────────────────────┐
│ [MARGE] PROMPT OPTIMIZER (50 iterations) ◄── ALWAYS FIRST │
│ │
│ "Hmmmm... let me organize this properly." │
│ │
│ TAKES: Vague task description, feature request, or incomplete spec │
│ │
│ PRODUCES: Well-structured PROMPT.md with: │
│ • Clear task definition │
│ • Specific success criteria (with verify commands) │
│ • Appropriate backpressure checks │
│ • Scope boundaries (what's in/out) │
│ • Platform context detection │
│ • Risk identification │
│ │
│ OUTPUT: <promise>MARGE_PROMPT_OPTIMIZED</promise> │
└─────────────────────────────────────────────────────────────────────────────┘
╚═════════════════════════════════════════════════════════════════════════════╝
│
▼
┌─────────────────────────────────────────────────────────────────────────────┐
│ [LISA] RESEARCH PHASE (50 iterations) │
│ • Explore codebase structure (Glob, Grep, Read) │
│ • Research external docs (Perplexity, Exa, Context7) │
│ • Identify patterns and dependencies │
│ • Store findings (Memory) │
│ OUTPUT: <promise>LISA_RESEARCH_COMPLETE</promise> │
└─────────────────────────────────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────────────────────┐
│ [QUIMBY] DECISION GATE │
│ • Single file, obvious fix → SIMPLE PATH │
│ • Multi-file, architectural → COMPLEX PATH │
│ • Unknown scope → default COMPLEX │
│ OUTPUT: <promise>QUIMBY_DECISION_MADE</promise> │
└─────────────────────────────────────────────────────────────────────────────┘
│
├─── SIMPLE ──────────────────────────────────┐
│ │
▼ │
┌────────────────────────────────────┐ │
│ [FRINK] PLANNING (50 iter) │ │
│ • Create subtask breakdown │ │
│ • Define verification commands │ │
│ • Map success criteria │ │
│ <promise>FRINK_PLAN_COMPLETE</promise> │
└────────────────────────────────────┘ │
│ │
▼ │
┌────────────────────────────────────┐ │
│ [SELMA] DESIGN REVIEW (50 iter) │ │
│ • Design tokens validation │ │
│ • Accessibility check │ │
│ • UI pattern consistency │ │
│ <promise>SELMA_DESIGN_APPROVED</promise> │
└────────────────────────────────────┘ │
│ │
▼ │
┌────────────────────────────────────┐ │
│ [NED] UX REVIEW (50 iter) │ │
│ • User journey validation │ │
│ • Interaction patterns │ │
│ • Error handling UX │ │
│ <promise>NED_UX_APPROVED</promise> │
└────────────────────────────────────┘ │
│ │
├─────────────────────────────────────────────┘
▼
┌─────────────────────────────────────────────────────────────────────────────┐
│ [RALPH] IMPLEMENTATION LOOP (50 iterations) │
│ │
│ for iteration in 1..50: │
│ 1. Read current state │
│ 2. Make focused change (1-3 files max) │
│ 3. Run backpressure checks (tests, lint, build) │
│ 4. If pass + all criteria met → exit loop │
│ 5. If fail → analyze error, adjust approach │
│ 6. If stuck 3x same error → escalate to Bob │
│ │
│ OUTPUT: <promise>RALPH_COMPLETE</promise> │
└─────────────────────────────────────────────────────────────────────────────┘
│
▼
╔═════════════════════════════════════════════════════════════════════════════╗
║ MANDATORY QUALITY GATES ║
║ (Cannot skip - will block completion) ║
╠═════════════════════════════════════════════════════════════════════════════╣
│
▼
┌─────────────────────────────────────────────────────────────────────────────┐
│ [WIGGUM] SECURITY REVIEW (50 iter) ◄── MANDATORY │
│ • OWASP Top 10 checklist │
│ • Dependency audit (npm audit, pip-audit) │
│ • Secret scanning (gitleaks) │
│ • Static analysis (semgrep) │
│ OUTPUT: <promise>WIGGUM_APPROVED</promise> │
└─────────────────────────────────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────────────────────┐
│ [SMITHERS] CODE REVIEW (50 iter) ◄── MANDATORY │
│ • Follows existing patterns │
│ • No deep nesting (max 3 levels) │
│ • Functions under 50 lines │
│ • Error handling present │
│ • Types/interfaces defined │
│ • No TODO/FIXME left behind │
│ OUTPUT: <promise>SMITHERS_APPROVED</promise> │
└─────────────────────────────────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────────────────────┐
│ [APU] PERFORMANCE (50 iter) ◄── If performance-critical │
│ • Load testing │
│ • Bundle size analysis │
│ • Query optimization │
│ • Memory profiling │
│ OUTPUT: <promise>APU_APPROVED</promise> │
└─────────────────────────────────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────────────────────┐
│ [BARNEY] USABILITY TEST (50 iter) ◄── MANDATORY │
│ • Navigate to feature (Playwright) │
│ • Attempt task as confused user │
│ • Capture accessibility tree │
│ • Screenshot evidence │
│ OUTPUT: <promise>BARNEY_PASSED</promise> │
└─────────────────────────────────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────────────────────┐
│ [CBG] QA VERIFICATION (50 iter) ◄── MANDATORY │
│ │
│ for criterion in success_criteria: │
│ 1. Run verification command/check │
│ 2. Capture output/screenshot evidence │
│ 3. Mark PASS/FAIL with evidence │
│ │
│ OUTPUT: <promise>CBG_APPROVED</promise> │
└─────────────────────────────────────────────────────────────────────────────┘
│
╚═════════════════════════════════════════════════════════════════════════════╝
│
▼
┌─────────────────────────────────────────────────────────────────────────────┐
│ [TROY] DOCUMENTATION (50 iter) ◄── MANDATORY │
│ • README updates (if API/usage changed) │
│ • CHANGELOG entry │
│ • API docs (if endpoints changed) │
│ • Inline comments (only where non-obvious) │
│ OUTPUT: <promise>TROY_COMPLETE</promise> │
└─────────────────────────────────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────────────────────┐
│ [BURNS] DEPLOYMENT (50 iter) ◄── Optional (when requested) │
│ • Platform detection (Vercel, Firebase, Docker, AWS, etc.) │
│ • Pre-deploy validation │
│ • Deploy execution │
│ • Post-deploy verification │
│ • Stakeholder notification (Gmail) │
│ OUTPUT: <promise>BURNS_DEPLOYED</promise> │
└─────────────────────────────────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────────────────────┐
│ COMPLETE │
│ │
│ OUTPUT: <promise>SPRINGFIELD_COMPLETE</promise> │
└─────────────────────────────────────────────────────────────────────────────┘
SPECIAL CHARACTERS (invoked when needed):
├── [HOMER] Skill Builder - Create new skills/commands/automations
├── [BOB] Root Cause - Investigation when stuck 3x on same error
├── [COYOTE] Peripheral Vision - Spots adjacent opportunities & gaps (read-only)
└── [MAGGIE] Mobile - React Native, Flutter, iOS, Android specialist
```
---
## Marge (Prompt Optimizer) - ALWAYS FIRST
**"Hmmmm... let me organize this properly."**
Marge runs FIRST before any other character. She takes raw, vague, or incomplete task descriptions and transforms them into well-structured prompts optimized for the Springfield loop.
### What Marge Does
| Input | Output |
|-------|--------|
| "add login" | Full auth feature spec with OAuth/JWT options, success criteria, security considerations |
| "fix the bug" | Specific bug definition, reproduction steps, affected files, verification commands |
| "make it faster" | Performance requirements, measurable targets, profiling approach, benchmarks |
| "build a dashboard" | Component breakdown, data requirements, UI patterns, accessibility needs |
### Marge's Optimization Process
```
PROCESS:
1. PARSE the raw input
- Identify what the user actually wants
- Detect implicit requirements
- Note ambiguities that need clarification
2. DETECT platform context
- Read package.json, requirements.txt, Cargo.toml, etc.
- Identify tech stack (React, Node, Python, Go, etc.)
- Find existing patterns in codebase
3. STRUCTURE the prompt
- Write clear task definition
- Define specific, measurable success criteria
- Add appropriate backpressure commands
- Set scope boundaries (what's in/out)
4. IDENTIFY risks
- Security considerations
- Breaking change potential
- Performance implications
- Dependencies that might be affected
5. ASK clarifying questions (if critical ambiguity)
- Use AskUserQuestion for blocking ambiguities
- Don't over-ask - make reasonable assumptions
6. OUTPUT optimized PROMPT.md
- Ready for Lisa to research
- Clear enough for Ralph to implement
- Verifiable by CBG
OUTPUT: <promise>MARGE_PROMPT_OPTIMIZED</promise>
```
### Marge's Output Format
```markdown
# Springfield Task: {Clear Task Name}
## Task Definition
{One paragraph describing exactly what needs to be built/fixed/changed}
## Context
- **Platform:** {detected tech stack}
- **Affected Areas:** {list of components/modules}
- **Related Files:** {key files identified}
## Success Criteria
When ALL of these are true, the task is complete:
1. [ ] {Specific criterion} - `verify command`
2. [ ] {Specific criterion} - `verify command`
3. [ ] {Specific criterion} - Visual/manual verification
## Scope
**In Scope:**
- {What will be done}
**Out of Scope:**
- {What will NOT be done}
## Risks & Considerations
- {Security considerations}
- {Breaking change potential}
- {Performance implications}
## Backpressure
```bash
{platform-appropriate test/build/lint commands}
```
## Notes for Implementation
- {Any helpful context for Ralph}
- {Patterns to follow}
- {Anti-patterns to avoid}
```
### When Marge Asks Questions
Marge uses `AskUserQuestion` ONLY for critical ambiguities:
```
ASKS when:
- Multiple valid interpretations exist
- Security/compliance requirements unclear
- Deployment target unknown
- Breaking changes might be acceptable or not
DOES NOT ask when:
- Reasonable default exists
- Can be inferred from codebase
- Low-risk assumption can be made
```
---
## Agent Teams (v7.0 - Opus 4.6)
**Agent Teams enable parallel character execution.** Instead of purely sequential character transitions, v7.0 can run compatible characters concurrently using the Task tool with multiple subagents.
### Parallel Execution Opportunities
```
PARALLEL GROUP 1 (Research Phase):
┌─── [Lisa] Research codebase ───────────────────┐
│ ├──► Quimby
└─── [Coyote] Peripheral scan (read-only) ───────┘
PARALLEL GROUP 2 (Quality Gates):
┌─── [Wiggum] Security review ───────────────────┐
│ ├──► CBG
└─── [Smithers] Code review ─────────────────────┘
PARALLEL GROUP 3 (Post-Implementation):
┌─── [Barney] Usability testing ─────────────────┐
│ ├──► Troy
└─── [Apu] Performance testing ──────────────────┘
```
### How to Launch Agent Teams
When the orchestrator reaches a parallel group, use multiple Task tool calls in a single message:
```
# Example: Parallel quality gates
Task(subagent_type="general-purpose", prompt="[WIGGUM] Security review of {files}...")
Task(subagent_type="general-purpose", prompt="[SMITHERS] Code review of {files}...")
```
### Agent Teams Rules
1. **Never parallelize dependent phases** - Ralph must finish before Wiggum/Smithers start
2. **Always merge results** - Collect all parallel outputs before routing to next phase
3. **If either rejects** - Route back to Ralph with combined feedback from both
4. **Coyote always runs parallel** - Coyote is advisory and never blocks
### 1M Context Window Strategy (v7.0)
With 1M tokens, characters can now ingest significantly more context:
| Character | Context Strategy |
|-----------|-----------------|
| Lisa | Read up to 47 files / 30K lines in one session; skip RLM pagination for small-medium codebases |
| Bob | Ingest full error logs, traces, and related files in a single analysis pass |
| Coyote | Scan entire directory trees for pattern echoes without pagination |
| Smithers | Review all modified files + their dependents in one pass |
| Wiggum | Analyze full dependency trees and transitive vulnerability chains |
### Context Compaction Strategy (v7.0)
For long 50-iteration sessions, the orchestrator uses context compaction:
1. **After every 10 iterations**: Summarize progress to `learnings.md`, clear working notes
2. **State files are source of truth**: Always re-read `plan.md`, `state.json` at iteration start
3. **Checkpoint aggressively**: `checkpoint.json` updated after every successful subtask
4. **Fresh context pattern**: Each iteration starts by reading state files, not relying on memory
---
## Backpressure System
**Every Springfield loop MUST include backpressure that rejects invalid work.**
### Universal Backpressure Checks
```bash
# Before claiming COMPLETE, verify:
# Code validation (pick applicable)
npm test # JavaScript/TypeScript tests
npm run lint # Linting
npm run build # Build succeeds
npx tsc --noEmit # TypeScript validation
pytest # Python tests
go test ./... # Go tests
cargo test # Rust tests
# API validation
curl -f http://localhost:PORT/health
# Data validation
bq query --dry_run "..." # BigQuery SQL valid
# Docker validation
docker build -t test .
```
### Backpressure in PROMPT.md
Always include:
```markdown
## Backpressure (DO NOT skip)
Before outputting <promise>COMPLETE</promise>, verify:
1. [ ] All tests pass (0 failures)
2. [ ] Build succeeds
3. [ ] Lint passes
4. [ ] Manual verification: [specific check]
If ANY check fails, fix issues FIRST. Do NOT output COMPLETE with failures.
```
---
## State Management
### Session Directory Structure
```
.springfield/{session-id}/
├── PROMPT.md # Marge's optimized prompt (READ FIRST every iteration)
├── raw_input.md # Original user input (preserved)
├── plan.md # Success criteria checklist (Frink creates)
├── state.json # Current phase, iteration count, flags
├── promises.json # Promise tracking
├── research.md # Lisa's findings
├── learnings.md # Cross-iteration knowledge (appended)
├── checkpoint.json # Resume capability
└── DONE # Created when <promise>COMPLETE</promise>
```
### state.json Format
```json
{
"session_id": "feature-auth-system",
"started": "2025-02-04T10:00:00Z",
"current_phase": "marge",
"iteration": 1,
"path": "UNKNOWN",
"status": "running",
"last_update": "2025-02-04T10:00:00Z"
}
```
### promises.json Format
```json
{
"session": "feature-auth-system",
"promises": {
"MARGE_PROMPT_OPTIMIZED": {"status": "fulfilled", "timestamp": "..."},
"LISA_RESEARCH_COMPLETE": {"status": "fulfilled", "timestamp": "..."},
"QUIMBY_DECISION_MADE": {"status": "fulfilled", "timestamp": "..."},
"FRINK_PLAN_COMPLETE": {"status": "fulfilled", "timestamp": "..."},
"RALPH_COMPLETE": {"status": "pending"},
"WIGGUM_APPROVED": {"status": "pending"},
"SMITHERS_APPROVED": {"status": "pending"},
"BARNEY_PASSED": {"status": "pending"},
"CBG_APPROVED": {"status": "pending"},
"TROY_COMPLETE": {"status": "pending"}
},
"mandatory_missing": ["WIGGUM_APPROVED", "SMITHERS_APPROVED", "BARNEY_PASSED", "CBG_APPROVED", "TROY_COMPLETE"]
}
```
---
## Completion Markers
Every character MUST output exactly ONE of these at the end:
| Marker | Meaning |
|--------|---------|
| `<promise>COMPLETE</promise>` | Generic - task finished |
| `<promise>{CHARACTER}_COMPLETE</promise>` | Character-specific completion |
| `<promise>ITERATE</promise>` | Progress made, continue loop |
| `<promise>BLOCKED:{reason}</promise>` | Cannot proceed, needs input |
### All Promise Signals
```
<promise>MARGE_PROMPT_OPTIMIZED</promise> # Prompt ready for loop
<promise>HOMER_SKILL_COMPLETE</promise> # Skill builder done
<promise>LISA_RESEARCH_COMPLETE</promise> # Research done
<promise>QUIMBY_DECISION_MADE</promise> # Decision made
<promise>FRINK_PLAN_COMPLETE</promise> # Planning done
<promise>SELMA_DESIGN_APPROVED</promise> # Design approved
<promise>NED_UX_APPROVED</promise> # UX approved
<promise>RALPH_COMPLETE</promise> # Implementation done
<promise>WIGGUM_APPROVED</promise> # Security cleared
<promise>SMITHERS_APPROVED</promise> # Code review passed
<promise>APU_APPROVED</promise> # Performance approved
<promise>BARNEY_PASSED</promise> # Usability passed
<promise>CBG_APPROVED</promise> # QA verified
<promise>TROY_COMPLETE</promise> # Documentation done
<promise>BURNS_DEPLOYED</promise> # Deployment complete
<promise>BOB_ROOT_CAUSE_IDENTIFIED</promise># Root cause found
<promise>COYOTE_PERIPHERAL_SCAN_COMPLETE</promise># Peripheral scan done
<promise>MAGGIE_MOBILE_COMPLETE</promise> # Mobile task done
<promise>SPRINGFIELD_COMPLETE</promise> # Full workflow done
```
---
## MCP Tool Access
Every character has full MCP tool access:
### Research & Knowledge
| Tool | Purpose |
|------|---------|
| `mcp__memory__*` | Persist/retrieve findings in knowledge graph |
| `mcp__perplexity-ask__*` | Deep research, reasoning, web search |
| `mcp__firecrawl__*` | Scrape docs, crawl codebases, extract data |
| `mcp__exa__*` | Semantic/neural web search |
| `mcp__context7__*` | Library documentation lookup |
| `mcp__hf-mcp-server__*` | Search models, papers, datasets |
| `mcp__sequential-thinking__*` | Complex reasoning chains |
### Development & Testing
| Tool | Purpose |
|------|---------|
| `mcp__github__*` | Repository operations, PRs, issues |
| `mcp__filesystem__*` | File system operations |
| `mcp__playwright__*` | Browser testing, screenshots, automation |
| `mcp___21st-dev_magic__*` | Component building, UI refinement |
| `mcp__fetch__*` | HTTP fetching |
### Communication
| Tool | Purpose |
|------|---------|
| `mcp__gmail__*` | Email notifications, stakeholder comms |
---
## Character-Specific Details
### Marge (Prompt Optimizer) - "Hmmmm... let me organize this properly."
```
PROCESS:
1. Parse raw input for intent
2. Detect platform context from codebase
3. Structure into clear PROMPT.md
4. Identify risks and considerations
5. Ask clarifying questions only if critical
6. Output optimized prompt ready for loop
7. Output <promise>MARGE_PROMPT_OPTIMIZED</promise>
```
### Lisa (Research) - "Let me research this further..." (v7.0 Enhanced)
```
EFFORT: high | CONTEXT: 1M tokens | SEARCH: 84% BrowseComp-level
PROCESS:
1. ASSESS codebase size — if < 30K lines, ingest full codebase via 1M context (skip RLM pagination)
2. Glob/Grep for codebase structure
3. mcp__exa__web_search_exa with type="deep" for semantic search (84% BrowseComp accuracy)
4. mcp__perplexity-ask__perplexity_research for external docs
5. mcp__firecrawl__firecrawl_scrape for relevant URLs
6. mcp__memory__create_entities to store findings
7. mcp__context7__query-docs for library docs
8. PARALLEL: Launch Coyote as Agent Team member for peripheral scan alongside research
9. Output <promise>LISA_RESEARCH_COMPLETE</promise>
1M CONTEXT STRATEGY:
- Small codebase (< 10K lines): Read all source files directly
- Medium codebase (10K-30K lines): Read all files in affected directories + key entry points
- Large codebase (> 30K lines): Use RLM methodology (targeted queries)
```
### Frink (Planning) - "GLAVIN! The science is working!"
```markdown
# Plan: {task}
## Success Criteria (ALL must be checked)
1. [ ] Criterion 1 - `verify command`
2. [ ] Criterion 2 - `verify command`
3. [ ] Criterion 3 - Visual/manual verification
## Subtasks (pick MOST IMPORTANT, not easiest)
| # | Task | Files | Verify | Status |
|---|------|-------|--------|--------|
| 1 | Description | file.ts | `test cmd` | ⬜ |
| 2 | Description | file.py | `test cmd` | ⬜ |
## Dependencies
1 → 2 → 3 (sequential)
4 (can run parallel with 2)
## Success Mapping
{which subtasks satisfy which criteria}
```
### Ralph (Implementation) - "I'm helping!" (v7.0 Enhanced)
```
EFFORT: medium | AGENT TEAMS: parallel subtasks | QUALITY: production-ready-first-try
RULES:
- AIM for production-ready on first try (Opus 4.6 quality uplift)
- Use Agent Teams to parallelize independent subtasks (e.g., API + UI changes simultaneously)
- Let the loop refine edge cases only
- Each failure teaches what to fix
- Small steps: one subtask per iteration max (but subtasks can run in parallel)
- Update plan.md checkboxes as you complete
- Append discoveries to learnings.md
- Context compact every 10 iterations to prevent state drift
AGENT TEAMS STRATEGY:
- Independent file changes → parallel Task agents
- Dependent changes → sequential within single agent
- Test runs → background Task agent while implementing next subtask
```
### Wiggum (Security) - OWASP Top 10 Checklist (v7.0 Enhanced)
```
EFFORT: max | CONTEXT: 1M tokens | DETECTION: human-researcher-level
Opus 4.6 Enhancement: Wiggum now reasons about code like a human security
researcher — looking for patterns, understanding logic flow, and finding
high-severity vulnerabilities WITHOUT specialized tooling. Can detect
subtle attack patterns and transitive vulnerability chains.
```
1. Injection (SQL, NoSQL, command)
2. Broken Authentication
3. Sensitive Data Exposure
4. XXE
5. Broken Access Control
6. Security Misconfiguration
7. XSS
8. Insecure Deserialization
9. Known Vulnerabilities
10. Insufficient Logging
**v7.0 Deep Analysis Additions:**
11. Logic flaws and business logic bypass
12. Race conditions and TOCTOU
13. Transitive dependency vulnerabilities (full chain analysis via 1M context)
14. Supply chain attack patterns
15. Subtle authorization bypass patterns
Tools: `npm audit`, `pip-audit`, `semgrep`, `gitleaks` + Opus 4.6 native code reasoning
### Homer (Skill Builder) - "D'oh! Let me try again..."
```
PROCESS:
1. Understand requirement
2. Research existing patterns
3. Design skill architecture
4. Create skill files in ~/.claude/skills/
5. Test skill invocation
6. Document usage
7. Output <promise>HOMER_SKILL_COMPLETE</promise>
```
### Bob (Root Cause) - Investigation Mode (v7.0 Enhanced)
```
EFFORT: max | CONTEXT: 1M tokens | REASONING: deepest adaptive thinking
TRIGGERS:
- Same error 3x in a row
- Unclear failure mode
- Tests pass but feature broken
PROCESS:
1. Ingest ALL relevant logs, configs, and related files via 1M context (no pagination needed)
2. Trace error path across full dependency chain
3. Use mcp__sequential-thinking__* for multi-hypothesis reasoning
4. Apply Opus 4.6 adaptive thinking at MAX effort for deepest analysis
5. Cross-reference with external docs via enhanced agentic search
6. Identify root cause with chain-of-causation evidence
7. Output <promise>BOB_ROOT_CAUSE_IDENTIFIED</promise>
v7.0: Bob now reasons like a senior SRE — tracing through service boundaries,
understanding data flow, and identifying subtle interaction bugs that span
multiple files and systems.
```
---
## Quick Reference
| Mode | Command | Use When |
|------|---------|----------|
| Full | `/springfield` | Complex multi-step tasks |
| Max | `/springfield-max` | Full workflow with max iterations |
| Flash | `/springfield-flash` | Simple single-file fixes |
| **Optimize** | `/springfield-marge` | **Prepare/refine a task prompt** |
| Research | `/springfield-lisa` | Exploration, understanding |
| Plan | `/springfield-frink` | When research exists |
| Implement | `/springfield-ralph` | When plan exists |
| Security | `/springfield-wiggum` | Audit existing code |
| Build skill | `/springfield-homer` | Create new capabilities |
| Deploy | `/springfield-burns` | Ship approved changes |
| Debug | `/springfield-bob` | Root cause investigation |
| Peripheral | `/springfield-coyote` | Spot adjacent opportunities & gaps |
| Mobile | `/springfield-maggie` | Mobile-specific work |
---
## Anti-Patterns to Avoid
| ❌ Don't | ✅ Do |
|----------|-------|
| Start without Marge | Always let Marge optimize the prompt first |
| Pick easiest task first | Pick MOST IMPORTANT task (hard stuff first) |
| Claim COMPLETE with failing tests | Fix all failures, THEN claim COMPLETE |
| Write 500 lines in one iteration | Small, verifiable steps (one subtask max) |
| Ignore previous learnings | Read learnings.md at start of each iteration |
| Keep failed attempts in context | Fresh context each iteration |
| Skip security review | Always run Wiggum (MANDATORY) |
| Skip documentation | Always run Troy (MANDATORY) |
---
## Platform-Specific Templates
### Web Application (SaaS/Dashboard)
```markdown
## Backpressure
npm test
npm run build
npm run lint
curl -f http://localhost:3000/api/health
```
### API/Microservice
```markdown
## Backpressure
npm test (or pytest, go test)
curl -f http://localhost:PORT/health
curl -X POST http://localhost:PORT/api/endpoint -d '{}' | jq
```
### Mobile App (React Native)
```markdown
## Backpressure
npm test
npx react-native run-android --variant=debug
npx react-native run-ios --simulator="iPhone 15"
```
### Data Pipeline
```markdown
## Backpressure
pytest
python -m mypy .
bq query --dry_run "SELECT * FROM dataset.table"
```
### CLI Tool
```markdown
## Backpressure
npm test (or cargo test, go test)
./cli --help
./cli command --flag value
```
---
## Key Behaviors (v7.0)
| Behavior | Required |
|----------|----------|
| **Marge optimizes prompt first** | **YES - ALWAYS FIRST** |
| Work autonomously | YES |
| Use MCP tools for research | YES |
| **Agent Teams for parallel phases** | **YES (v7.0)** |
| **Effort levels per character** | **YES (v7.0)** |
| **1M context for eligible characters** | **YES (v7.0)** |
| **Context compaction every 10 iterations** | **YES (v7.0)** |
| **Security review (Wiggum)** | **MANDATORY** |
| **Code review (Smithers)** | **MANDATORY** |
| **Usability (Barney)** | **MANDATORY** |
| **QA (CBG)** | **MANDATORY** |
| **Documentation (Troy)** | **MANDATORY** |
| Performance (Apu) | If perf-critical |
| Deployment (Burns) | If requested |
| Max iterations (all) | 50 |
| Early exit on success | YES |
| Escalate on stuck | YES → Bob (max effort) |
| Output promises | **MANDATORY** |
| **Production-ready first try** | **YES (v7.0 quality target)** |
---
**BEGIN**: Tell me what you want to build. Marge will organize it into a proper task. Opus 4.6 is ready.Related Skills
bgo
Automates the complete Blender build-go workflow, from building and packaging your extension/add-on to removing old versions, installing, enabling, and launching Blender for quick testing and iteration.
mcp-create-declarative-agent
Skill converted from mcp-create-declarative-agent.prompt.md
MCP Architecture Expert
Design and implement Model Context Protocol servers for standardized AI-to-data integration with resources, tools, prompts, and security best practices
mathem-shopping
Automatiserar att logga in på Mathem.se, söka och lägga till varor från en lista eller recept, hantera ersättningar enligt policy och reservera leveranstid, men lämnar varukorgen redo för manuell checkout.
math-modeling
本技能应在用户要求"数学建模"、"建模比赛"、"数模论文"、"数学建模竞赛"、"建模分析"、"建模求解"或提及数学建模相关任务时使用。适用于全国大学生数学建模竞赛(CUMCM)、美国大学生数学建模竞赛(MCM/ICM)等各类数学建模比赛。
matchms
Mass spectrometry analysis. Process mzML/MGF/MSP, spectral similarity (cosine, modified cosine), metadata harmonization, compound ID, for metabolomics and MS data processing.
managing-traefik
Manages Traefik reverse proxy for local development. Use when routing domains to local services, configuring CORS, checking service health, or debugging connectivity issues.
managing-skills
Install, find, update, and manage agent skills. Use when the user wants to add a new skill, search for skills that do something, check if skills are up to date, or update existing skills. Triggers on: install skill, add skill, get skill, find skill, search skill, update skill, check skills, list skills.
manage-agents
Create, modify, and manage Claude Code subagents with specialized expertise. Use when you need to "work with agents", "create an agent", "modify an agent", "set up a specialist", "I need an agent for [task]", or "agent to handle [domain]". Covers agent file format, YAML frontmatter, system prompts, tool restrictions, MCP integration, model selection, and testing.
maintainx-automation
Automate Maintainx tasks via Rube MCP (Composio). Always search tools first for current schemas.
mailsoftly-automation
Automate Mailsoftly tasks via Rube MCP (Composio). Always search tools first for current schemas.
mails-so-automation
Automate Mails So tasks via Rube MCP (Composio). Always search tools first for current schemas.