Best use case
solution-comparator is best used when you need a repeatable AI agent workflow instead of a one-off prompt.
It compares multiple solutions for correctness and performance.
Teams using solution-comparator should expect more consistent output, faster repeated execution, and less prompt rewriting.
When to use this skill
- You want a reusable workflow that can be run more than once with consistent structure.
When not to use this skill
- You only need a quick one-off answer and do not need a reusable workflow.
- You cannot install or maintain the underlying files, dependencies, or repository context.
Installation
Claude Code / Cursor / Codex
Manual Installation
- Download SKILL.md from GitHub
- Place it in `.claude/skills/solution-comparator/SKILL.md` inside your project
- Restart your AI agent; it will auto-discover the skill
How solution-comparator Compares
| Feature / Agent | solution-comparator | Standard Approach |
|---|---|---|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |
Frequently Asked Questions
What does this skill do?
It compares multiple algorithm solutions against the same test cases to verify correctness and benchmark performance.
Where can I find the source code?
You can find the source code on GitHub using the link provided at the top of the page.
SKILL.md Source
# Solution Comparator Skill
## Purpose
Compare multiple algorithm solutions against the same test cases to verify correctness and benchmark performance.
## Capabilities
- Run solutions against the same test cases
- Performance benchmarking and comparison
- Output diff analysis
- Find minimal failing test case
- Memory usage comparison
- Time complexity validation
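The benchmarking, memory, and output-diff capabilities above can be illustrated with a minimal standard-library sketch. It is not the skill's implementation: `slow_sum`, `fast_sum`, `benchmark`, and `output_diff` are hypothetical names chosen for this example, and the two solutions are plain Python functions rather than submitted source code.

```python
import difflib
import time
import tracemalloc


def slow_sum(nums):
    """Hypothetical solution A: explicit loop."""
    total = 0
    for n in nums:
        total += n
    return total


def fast_sum(nums):
    """Hypothetical solution B: built-in sum."""
    return sum(nums)


def benchmark(func, data, repeats=5):
    """Return (best wall-clock seconds, peak traced bytes) for func(data).

    tracemalloc adds overhead, so the timings are only meaningful
    relative to each other, not as absolute numbers.
    """
    best = float("inf")
    peak = 0
    for _ in range(repeats):
        tracemalloc.start()
        start = time.perf_counter()
        func(data)
        elapsed = time.perf_counter() - start
        _, run_peak = tracemalloc.get_traced_memory()
        tracemalloc.stop()
        best = min(best, elapsed)
        peak = max(peak, run_peak)
    return best, peak


def output_diff(expected, actual):
    """Return unified-diff lines between two textual outputs (empty if equal)."""
    return list(
        difflib.unified_diff(
            expected.splitlines(),
            actual.splitlines(),
            fromfile="expected",
            tofile="actual",
            lineterm="",
        )
    )


if __name__ == "__main__":
    data = list(range(100_000))
    for solution in (slow_sum, fast_sum):
        seconds, peak_bytes = benchmark(solution, data)
        print(f"{solution.__name__}: {seconds:.6f}s, peak {peak_bytes} bytes")
    print("\n".join(output_diff("1\n2", "1\n3")))
```

Taking the best of several runs reduces noise from other processes; a real comparison would likely also run each solution in its own process so that a timeout can be enforced.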
## Target Processes
- correctness-proof-testing
- complexity-optimization
- upsolving
- algorithm-implementation
## Comparison Modes
1. **Correctness**: Compare outputs against a known-correct solution
2. **Performance**: Benchmark execution time across solutions
3. **Stress Testing**: Run with random large inputs to find discrepancies
4. **Minimal Counter-example**: Binary search to find the smallest failing case
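The stress-testing and minimal counter-example modes can be sketched as follows. This is an illustration under stated assumptions, not the skill's own code: the oracle and the deliberately buggy candidate are hypothetical, and the shrinking step uses greedy element removal rather than binary search, because binary search on input size only works when failures are monotonic in size.

```python
import random


def oracle_max_subarray(nums):
    """Brute-force reference: best sum over all non-empty subarrays."""
    best = nums[0]
    for i in range(len(nums)):
        running = 0
        for j in range(i, len(nums)):
            running += nums[j]
            best = max(best, running)
    return best


def buggy_max_subarray(nums):
    """Kadane's algorithm with a deliberate bug: it allows the empty
    subarray, so all-negative inputs wrongly return 0."""
    best = current = 0
    for n in nums:
        current = max(0, current + n)
        best = max(best, current)
    return best


def stress(candidate, oracle, trials=1000, max_len=8, seed=0):
    """Return the first random case where candidate and oracle disagree."""
    rng = random.Random(seed)
    for _ in range(trials):
        case = [rng.randint(-5, 5) for _ in range(rng.randint(1, max_len))]
        if candidate(case) != oracle(case):
            return case
    return None


def shrink(case, candidate, oracle):
    """Greedily drop single elements while the case still fails."""

    def fails(c):
        return len(c) > 0 and candidate(c) != oracle(c)

    changed = True
    while changed:
        changed = False
        for i in range(len(case)):
            smaller = case[:i] + case[i + 1:]
            if fails(smaller):
                case = smaller
                changed = True
                break
    return case


if __name__ == "__main__":
    failing = stress(buggy_max_subarray, oracle_max_subarray)
    print("first failing case:", failing)
    if failing is not None:
        print("shrunk case:", shrink(failing, buggy_max_subarray, oracle_max_subarray))
```

Small value ranges and short inputs are a deliberate choice here: they keep the brute-force oracle fast and make any failing case easy to read.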
## Input Schema
```json
{
  "type": "object",
  "properties": {
    "solutions": {
      "type": "array",
      "items": {
        "type": "object",
        "properties": {
          "name": { "type": "string" },
          "code": { "type": "string" },
          "language": { "type": "string" }
        }
      }
    },
    "testCases": { "type": "array" },
    "mode": {
      "type": "string",
      "enum": ["correctness", "performance", "stress", "minimal"]
    },
    "oracleSolution": { "type": "string" },
    "timeout": { "type": "integer", "default": 5000 }
  },
  "required": ["solutions", "mode"]
}
```
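As an illustration of the input schema, the sketch below builds one request and runs a few hand-written checks that mirror the schema's `required` and `enum` constraints. Several details are assumptions not stated in the schema: the shape of each `testCases` item, the idea that `oracleSolution` names one of the listed solutions, and the helper `validate_request` itself, which is hypothetical and far less thorough than a full JSON Schema validator.

```python
import json

# Mode values copied from the schema's "enum".
ALLOWED_MODES = {"correctness", "performance", "stress", "minimal"}


def validate_request(request):
    """Return a list of problems; an empty list means the basic checks passed."""
    problems = []
    for key in ("solutions", "mode"):  # the schema's "required" fields
        if key not in request:
            problems.append(f"missing required field: {key}")
    if "mode" in request and request["mode"] not in ALLOWED_MODES:
        problems.append(f"unknown mode: {request['mode']}")
    if not isinstance(request.get("solutions", []), list):
        problems.append("solutions must be an array")
    return problems


request = {
    "solutions": [
        {
            "name": "brute_force",
            "code": "print(sum(map(int, input().split())))",
            "language": "python",
        },
        {
            "name": "optimized",
            "code": "print(sum(int(x) for x in input().split()))",
            "language": "python",
        },
    ],
    # Assumed item shape; the schema only says "array".
    "testCases": [{"input": "1 2 3", "expected": "6"}],
    "mode": "correctness",
    # Assumed to reference a solution by name; the schema only says "string".
    "oracleSolution": "brute_force",
    "timeout": 5000,
}

if __name__ == "__main__":
    print(validate_request(request))
    print(json.dumps(request, indent=2))
```

Only `solutions` and `mode` are required, so a request may omit `testCases`, `oracleSolution`, and `timeout`; the schema gives `timeout` a default of 5000 without stating its unit.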
## Output Schema
```json
{
  "type": "object",
  "properties": {
    "success": { "type": "boolean" },
    "results": { "type": "array" },
    "discrepancies": { "type": "array" },
    "performance": { "type": "object" },
    "minimalFailingCase": { "type": "object" }
  },
  "required": ["success"]
}
```