self-improving-ai

Understanding and using StickerNest's self-improving AI system. Use when the user asks about AI self-improvement, prompt versioning, reflection loops, AI evaluation, auto-tuning prompts, or the AI judge system. Covers AIReflectionService, stores, and the improvement loop.


by diegosouzapw


Best use case

self-improving-ai is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Teams using self-improving-ai should expect more consistent output, faster repeated execution, and less prompt rewriting.

When to use this skill

You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

You only need a quick one-off answer and do not need a reusable workflow.
You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$ curl -o ~/.claude/skills/self-improving-ai-nymfarious/SKILL.md --create-dirs "https://raw.githubusercontent.com/diegosouzapw/awesome-omni-skill/main/skills/data-ai/self-improving-ai-nymfarious/SKILL.md"

Manual Installation

Download SKILL.md from GitHub
Place it in .claude/skills/self-improving-ai-nymfarious/SKILL.md inside your project
Restart your AI agent — it will auto-discover the skill

How self-improving-ai Compares

Feature / Agent	self-improving-ai	Standard Approach
Platform Support	Not specified	Limited / Varies
Context Awareness	High	Baseline
Installation Complexity	Unknown	N/A

Frequently Asked Questions

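What does this skill do?

It explains how to understand and use StickerNest's self-improving AI system, in which the AI evaluates its own generations and improves its prompts over time. It covers AIReflectionService, the supporting stores, and the improvement loop.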

Where can I find the source code?

You can find the source code on GitHub using the link provided at the top of the page.

SKILL.md Source

# Self-Improving AI System for StickerNest

This skill covers StickerNest's self-improving AI system - an AI that evaluates its own generations and automatically improves its prompts over time.

## When to Use This Skill

This skill helps when you need to:
- Understand how the self-improvement loop works
- Configure the reflection system settings
- Add new AI capabilities that should self-improve
- Debug or tune the evaluation rubrics
- Extend the improvement loop to new domains

## Core Concepts

### The Improvement Loop

The self-improving AI follows this cycle:

```
[Generation] → [Track Metrics] → [Evaluate] → [Analyze] → [Improve Prompt] → [Generation]
     ↓              ↓                ↓            ↓              ↓
  Widget/Image   MetricsStore   AIReflection   Suggestions   PromptVersion
                               Service (Judge)               Store
```
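The cycle above can be sketched as a single function. This is a minimal, self-contained illustration, not StickerNest's implementation: `LoopState`, `Judge`, `runCycle`, and the toy `lengthJudge` are hypothetical names, the model call is replaced by string concatenation, and the rule "a failed evaluation with suggestions creates a new prompt version" is an assumption.

```typescript
// Hypothetical types for illustration only; not StickerNest's actual API.
interface GenerationRecord {
  id: number;
  promptVersion: number;
  output: string;
}

interface Evaluation {
  score: number; // 1-5 scale, matching the scoreThreshold range used in this skill
  passed: boolean;
  suggestions: string[];
}

interface LoopState {
  promptVersions: string[]; // index = version number; the last entry is active
  records: GenerationRecord[];
}

type Judge = (record: GenerationRecord) => Evaluation;

function runCycle(state: LoopState, userPrompt: string, judge: Judge): Evaluation {
  const activeVersion = state.promptVersions.length - 1;
  const activePrompt = state.promptVersions[activeVersion];

  // [Generation] and [Track Metrics]: produce an output and record it.
  const record: GenerationRecord = {
    id: state.records.length + 1,
    promptVersion: activeVersion,
    output: `${activePrompt} :: ${userPrompt}`, // stand-in for a real model call
  };
  state.records.push(record);

  // [Evaluate] and [Analyze]: the judge scores the record and proposes changes.
  const evaluation = judge(record);

  // [Improve Prompt]: assumed rule, a failed evaluation appends a new version.
  if (!evaluation.passed && evaluation.suggestions.length > 0) {
    state.promptVersions.push(`${activePrompt}\n- ${evaluation.suggestions.join('\n- ')}`);
  }
  return evaluation;
}

// Toy judge: penalizes long outputs. A real judge would be another model call.
const lengthJudge: Judge = (record) => {
  const score = record.output.length <= 40 ? 5 : 2;
  const passed = score >= 3.5;
  return { score, passed, suggestions: passed ? [] : ['Keep outputs concise'] };
};

const state: LoopState = { promptVersions: ['Generate a widget'], records: [] };
const first = runCycle(
  state,
  'a very long and detailed request for a countdown timer',
  lengthJudge,
);
```

Here the first output exceeds the toy length limit, so the evaluation fails and a second prompt version is appended. The next call to `runCycle` would generate with that new version.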

### Key Components

| Component | Purpose | Location |
|-----------|---------|----------|
| `AIReflectionStore` | Stores evaluations, runs, suggestions | `src/state/useAIReflectionStore.ts` |
| `PromptVersionStore` | Version control for AI prompts | `src/state/usePromptVersionStore.ts` |
| `GenerationMetricsStore` | Tracks generation quality | `src/state/useGenerationMetricsStore.ts` |
| `AIReflectionService` | The "judge" AI that evaluates | `src/ai/AIReflectionService.ts` |
| `SkillRecommendationService` | Suggests new skills | `src/ai/SkillRecommendationService.ts` |
| `ReflectionDashboard` | Admin UI panel | `src/components/ai-reflection/ReflectionDashboard.tsx` |

### Evaluation Rubrics

The system evaluates generations against rubrics with weighted criteria:

**Widget Generation Rubric:**
- Protocol Compliance (25%) - Follows Widget Protocol v3.0
- Code Quality (20%) - Clean, readable code
- Functionality (25%) - Works correctly
- Port Design (15%) - Good input/output definitions
- User Experience (15%) - Visual design and interaction

**Image Generation Rubric:**
- Prompt Accuracy (30%) - Matches user intent
- Visual Quality (25%) - Clear, well-composed
- Style Consistency (20%) - Matches requested style
- Usability (25%) - Suitable for design use
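
As a worked example of weighted criteria, the sketch below combines per-criterion scores into one overall score using the widget rubric weights. The weighted-average formula, the 1-5 score scale, and the names `Criterion` and `weightedScore` are assumptions for illustration. The source does not state how `AIReflectionService` aggregates scores.

```typescript
// Hypothetical shape: weight is a fraction of 1 (25% -> 0.25).
interface Criterion {
  name: string;
  weight: number;
}

const widgetRubric: Criterion[] = [
  { name: 'Protocol Compliance', weight: 0.25 },
  { name: 'Code Quality', weight: 0.2 },
  { name: 'Functionality', weight: 0.25 },
  { name: 'Port Design', weight: 0.15 },
  { name: 'User Experience', weight: 0.15 },
];

// Weighted average; dividing by the total weight keeps the result on the
// same scale as the inputs even if the weights do not sum exactly to 1.
function weightedScore(rubric: Criterion[], scores: Record<string, number>): number {
  const totalWeight = rubric.reduce((sum, c) => sum + c.weight, 0);
  const weighted = rubric.reduce((sum, c) => {
    const score = scores[c.name];
    if (score === undefined) {
      throw new Error(`Missing score for "${c.name}"`);
    }
    return sum + c.weight * score;
  }, 0);
  return weighted / totalWeight;
}

// Example scores on an assumed 1-5 scale.
const exampleScores: Record<string, number> = {
  'Protocol Compliance': 4,
  'Code Quality': 3,
  'Functionality': 5,
  'Port Design': 2,
  'User Experience': 4,
};

// 0.25*4 + 0.20*3 + 0.25*5 + 0.15*2 + 0.15*4 = 3.75
const overall = weightedScore(widgetRubric, exampleScores);
const passesThreshold = overall >= 3.5;
```

With these scores, a weak Port Design result (2) is offset by strong Functionality (5), and the overall 3.75 clears a 3.5 threshold.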

## Step-by-Step Guide

### Step 1: Recording a Generation

When AI generates something, record it in the metrics store:

```typescript
import { useGenerationMetricsStore } from '../state/useGenerationMetricsStore';

// After generation completes
const metricsStore = useGenerationMetricsStore.getState();
const recordId = metricsStore.addRecord({
  type: 'widget', // or 'image', 'pipeline', 'skill'
  promptVersionId: currentPromptVersionId,
  userPrompt: userInput,
  result: success ? 'success' : 'failure',
  errorMessage: error?.message,
  qualityScore: validationScore, // 0-100 if available
  metadata: {
    model: 'claude-3-5-sonnet',
    provider: 'anthropic',
    durationMs: elapsed,
  },
});
```

### Step 2: Adding User Feedback

Capture user feedback on generations:

```typescript
// `metricsStore` and `recordId` come from Step 1

// Thumbs up/down
metricsStore.addFeedback(recordId, 'thumbs_up');

// Star rating
metricsStore.addFeedback(recordId, 'rating', 4);

// With comment and tags
metricsStore.addFeedback(recordId, 'rating', 2, 'Output was too verbose', ['too_long', 'verbose']);
```
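
To show how this feedback could feed later analysis, here is a minimal sketch that summarizes a list of feedback items. The `Feedback` union, the `'thumbs_down'` value, and `summarizeFeedback` are hypothetical. The store's real data model is not shown in this document.

```typescript
// Hypothetical feedback shapes, loosely mirroring the addFeedback calls above.
type RatingFeedback = {
  kind: 'rating';
  value: number; // star rating
  comment?: string;
  tags?: string[];
};

type Feedback = { kind: 'thumbs_up' } | { kind: 'thumbs_down' } | RatingFeedback;

interface FeedbackSummary {
  thumbsUp: number;
  thumbsDown: number;
  averageRating: number | null; // null when there are no ratings
  tagCounts: Record<string, number>;
}

function summarizeFeedback(items: Feedback[]): FeedbackSummary {
  const ratings = items.filter((f): f is RatingFeedback => f.kind === 'rating');

  // Count how often each tag appears across all rating feedback.
  const tagCounts: Record<string, number> = {};
  for (const rating of ratings) {
    for (const tag of rating.tags ?? []) {
      tagCounts[tag] = (tagCounts[tag] ?? 0) + 1;
    }
  }

  return {
    thumbsUp: items.filter((f) => f.kind === 'thumbs_up').length,
    thumbsDown: items.filter((f) => f.kind === 'thumbs_down').length,
    averageRating:
      ratings.length > 0
        ? ratings.reduce((sum, r) => sum + r.value, 0) / ratings.length
        : null,
    tagCounts,
  };
}

const feedbackItems: Feedback[] = [
  { kind: 'thumbs_up' },
  { kind: 'rating', value: 4 },
  { kind: 'rating', value: 2, comment: 'Output was too verbose', tags: ['too_long', 'verbose'] },
];

const summary = summarizeFeedback(feedbackItems);
```

Recurring tags such as `too_long` are the kind of signal a reflection run could turn into a prompt suggestion.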

### Step 3: Running a Reflection

Trigger a reflection manually or let it run on schedule:

```typescript
import { reflectOnWidgetGeneration } from '../ai/AIReflectionService';

// Manual reflection
const result = await reflectOnWidgetGeneration({ forceRun: true });

console.log('Evaluation passed:', result.evaluation?.passed);
console.log('Prompt changed:', result.promptChanged);
console.log('New suggestions:', result.suggestions.length);
```

### Step 4: Managing Prompt Versions

Handle prompt version control:

```typescript
import { usePromptVersionStore } from '../state/usePromptVersionStore';

const promptStore = usePromptVersionStore.getState();

// Get current prompt for a domain
const currentPrompt = promptStore.getActivePrompt('widget_generation');

// Create a new version
const versionId = promptStore.createVersion(
  'widget_generation',
  newPromptContent,
  'Improved based on reflection',
  'ai', // created by AI
  evaluationId
);

// Revert to previous version
promptStore.revertToVersion(previousVersionId);

// Handle pending proposals
const proposals = promptStore.getPendingProposals('widget_generation');
proposals.forEach(p => {
  // Review and approve/reject
  promptStore.approveProposal(p.id);
  // or promptStore.rejectProposal(p.id);
});
```
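
The version-control idea behind these calls can be illustrated with a small in-memory model. This is a sketch under assumptions, not the real `PromptVersionStore`: the class `InMemoryPromptHistory`, its fields, and the choice to make a revert move an "active" pointer without deleting history are all hypothetical.

```typescript
// Hypothetical record shape for illustration.
interface PromptVersion {
  id: number;
  domain: string;
  content: string;
  reason: string;
  createdBy: 'ai' | 'user';
}

class InMemoryPromptHistory {
  private versions: PromptVersion[] = [];
  private activeIdByDomain: Record<string, number> = {};

  // Appends a version and makes it the active one for its domain.
  createVersion(
    domain: string,
    content: string,
    reason: string,
    createdBy: 'ai' | 'user',
  ): number {
    const id = this.versions.length + 1;
    this.versions.push({ id, domain, content, reason, createdBy });
    this.activeIdByDomain[domain] = id;
    return id;
  }

  // Reverting only moves the active pointer; later versions stay in history.
  revertToVersion(id: number): void {
    const version = this.versions.find((v) => v.id === id);
    if (!version) {
      throw new Error(`Unknown version ${id}`);
    }
    this.activeIdByDomain[version.domain] = id;
  }

  getActivePrompt(domain: string): string | undefined {
    const activeId = this.activeIdByDomain[domain];
    return this.versions.find((v) => v.id === activeId)?.content;
  }

  historyLength(): number {
    return this.versions.length;
  }
}

const history = new InMemoryPromptHistory();
const v1 = history.createVersion('widget_generation', 'Generate a widget.', 'Initial prompt', 'user');
const v2 = history.createVersion(
  'widget_generation',
  'Generate a concise widget.',
  'Improved based on reflection',
  'ai',
);

history.revertToVersion(v1);
const activeAfterRevert = history.getActivePrompt('widget_generation');
```

Keeping every version and moving a pointer means an AI-authored change that performs worse can be undone without losing the record of what was tried.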

### Step 5: Configuring the Reflection Loop

Adjust reflection settings:

```typescript
import { useAIReflectionStore } from '../state/useAIReflectionStore';

const reflectionStore = useAIReflectionStore.getState();

reflectionStore.updateConfig({
  enabled: true,
  intervalMinutes: 60,        // How often to reflect
  messagesToEvaluate: 20,     // How many records to evaluate
  scoreThreshold: 3.5,        // Pass/fail threshold (1-5)
  cooldownMinutes: 30,        // Pause after prompt update
  autoApplyChanges: false,    // Require approval for changes
  evaluateUnevaluatedOnly: true,
});
```
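
To make the interaction between `enabled`, `intervalMinutes`, and `cooldownMinutes` concrete, the sketch below decides whether a reflection run is due. The function `shouldReflect`, the `ReflectionTiming` shape, and the order of the checks are assumptions. In particular, whether `forceRun` also bypasses the interval or the "no records" check is not stated in the source.

```typescript
// Subset of the config fields shown above.
interface ReflectionConfig {
  enabled: boolean;
  intervalMinutes: number;
  cooldownMinutes: number;
}

// Hypothetical timing state; timestamps are milliseconds since the epoch.
interface ReflectionTiming {
  lastReflectionAt: number | null;
  lastPromptUpdateAt: number | null;
  unevaluatedCount: number;
}

interface Decision {
  run: boolean;
  reason: 'disabled' | 'no_records' | 'forced' | 'cooldown' | 'interval' | 'due';
}

const MINUTE_MS = 60 * 1000;

function shouldReflect(
  config: ReflectionConfig,
  timing: ReflectionTiming,
  now: number,
  forceRun = false,
): Decision {
  if (!config.enabled) return { run: false, reason: 'disabled' };
  if (timing.unevaluatedCount === 0) return { run: false, reason: 'no_records' };

  // Assumption: forceRun skips both the cooldown and the interval checks.
  if (forceRun) return { run: true, reason: 'forced' };

  // Pause after a prompt update so the new prompt can gather data.
  if (
    timing.lastPromptUpdateAt !== null &&
    now - timing.lastPromptUpdateAt < config.cooldownMinutes * MINUTE_MS
  ) {
    return { run: false, reason: 'cooldown' };
  }

  // Do not reflect more often than the configured interval.
  if (
    timing.lastReflectionAt !== null &&
    now - timing.lastReflectionAt < config.intervalMinutes * MINUTE_MS
  ) {
    return { run: false, reason: 'interval' };
  }

  return { run: true, reason: 'due' };
}

const exampleConfig: ReflectionConfig = { enabled: true, intervalMinutes: 60, cooldownMinutes: 30 };
const now = 100 * MINUTE_MS;
const timing: ReflectionTiming = {
  lastReflectionAt: 20 * MINUTE_MS, // 80 minutes ago: interval has elapsed
  lastPromptUpdateAt: 90 * MINUTE_MS, // 10 minutes ago: still in cooldown
  unevaluatedCount: 5,
};

const scheduled = shouldReflect(exampleConfig, timing, now);
const forced = shouldReflect(exampleConfig, timing, now, true);
```

In this example the scheduled run is blocked by the cooldown, while the forced run proceeds. That mirrors the troubleshooting advice to use `forceRun: true` when a cooldown is active.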

## Code Examples

### Example: Custom Rubric for New Domain

```typescript
import { useAIReflectionStore, type RubricCriteria } from '../state/useAIReflectionStore';

const customRubric: RubricCriteria[] = [
  {
    name: 'Accuracy',
    description: 'Output matches expected format and content',
    weight: 0.4,
    minScore: 1,
    maxScore: 5,
  },
  {
    name: 'Efficiency',
    description: 'Uses optimal approach without waste',
    weight: 0.3,
    minScore: 1,
    maxScore: 5,
  },
  {
    name: 'Maintainability',
    description: 'Easy to understand and modify',
    weight: 0.3,
    minScore: 1,
    maxScore: 5,
  },
];

const reflectionStore = useAIReflectionStore.getState();
reflectionStore.setWidgetRubric(customRubric);
```

### Example: Tracking Skill Gaps

```typescript
import { analyzeSkillGaps, generateSkillFromGap } from '../ai/SkillRecommendationService';

// Analyze patterns for potential new skills
const gaps = analyzeSkillGaps();

// Find high-priority gaps
const criticalGaps = gaps.filter(g => g.priority === 'critical' || g.priority === 'high');

// Generate a skill template for a gap
if (criticalGaps.length > 0) {
  const template = generateSkillFromGap(criticalGaps[0].id);
  console.log('Suggested skill:', template?.name);
  console.log('Content:', template?.content);
}
```

### Example: Using the Reflection Dashboard

```tsx
import { ReflectionDashboard } from '../components/ai-reflection';
import { useState } from 'react';

function MyComponent() {
  const [showDashboard, setShowDashboard] = useState(false);

  return (
    <>
      <button onClick={() => setShowDashboard(true)}>
        Open AI Dashboard
      </button>
      <ReflectionDashboard
        isOpen={showDashboard}
        onClose={() => setShowDashboard(false)}
      />
    </>
  );
}
```

## Common Patterns

### Pattern: Adding Self-Improvement to a New AI Feature

1. Add a prompt domain to `PromptVersionStore`
2. Track generations in `GenerationMetricsStore`
3. Create a rubric for evaluation
4. Add reflection trigger to `AIReflectionService`

### Pattern: Manual Prompt Improvement

When you want to update a prompt based on observations:

```typescript
import { usePromptVersionStore } from '../state/usePromptVersionStore';

const promptStore = usePromptVersionStore.getState();

// Create proposal for review
promptStore.createProposal(
  'widget_generation',
  improvedPromptContent,
  'User requested more concise outputs',
  ['User feedback: too verbose', 'Multiple complaints about length'],
  'manual-review'
);
```

### Pattern: Exporting Data for Analysis

```typescript
import { useGenerationMetricsStore } from '../state/useGenerationMetricsStore';

const metricsStore = useGenerationMetricsStore.getState();

const analysisData = metricsStore.exportForReflection('widget', {
  limit: 100,
  includeFailuresOnly: true,
});

console.log('Failure rate:', 100 - analysisData.metrics.successRate);
console.log('Common issues:', analysisData.metrics.commonIssues);
```
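
For intuition about what fields like `successRate` and `commonIssues` might contain, here is a self-contained sketch that derives both from raw records. The `ExportedRecord` shape, the percentage scale for `successRate`, and grouping issues by `errorMessage` are assumptions. The store's actual export format is not documented here.

```typescript
// Hypothetical record shape, based on the fields passed to addRecord in Step 1.
interface ExportedRecord {
  result: 'success' | 'failure';
  errorMessage?: string;
}

interface IssueCount {
  message: string;
  count: number;
}

interface ExportMetrics {
  successRate: number; // percentage, 0-100
  commonIssues: IssueCount[]; // most frequent first
}

function computeMetrics(records: ExportedRecord[]): ExportMetrics {
  if (records.length === 0) {
    return { successRate: 0, commonIssues: [] };
  }

  const successes = records.filter((r) => r.result === 'success').length;

  // Group failures by error message and count occurrences.
  const counts: Record<string, number> = {};
  for (const record of records) {
    if (record.result === 'failure' && record.errorMessage) {
      counts[record.errorMessage] = (counts[record.errorMessage] ?? 0) + 1;
    }
  }

  const commonIssues = Object.keys(counts)
    .map((message) => ({ message, count: counts[message] }))
    .sort((a, b) => b.count - a.count);

  return { successRate: (successes / records.length) * 100, commonIssues };
}

const sampleRecords: ExportedRecord[] = [
  { result: 'success' },
  { result: 'failure', errorMessage: 'Timeout' },
  { result: 'failure', errorMessage: 'Invalid manifest' },
  { result: 'failure', errorMessage: 'Timeout' },
];

const metrics = computeMetrics(sampleRecords);
const failureRate = 100 - metrics.successRate;
```

Note that a success rate computed only over exported failures would always be zero, so a meaningful failure rate needs the rate to be computed over all records in the window.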

## Reference Files

| Category | File |
|----------|------|
| **Reflection Store** | `src/state/useAIReflectionStore.ts` |
| **Prompt Versions** | `src/state/usePromptVersionStore.ts` |
| **Generation Metrics** | `src/state/useGenerationMetricsStore.ts` |
| **Reflection Service** | `src/ai/AIReflectionService.ts` |
| **Skill Recommendations** | `src/ai/SkillRecommendationService.ts` |
| **Dashboard UI** | `src/components/ai-reflection/ReflectionDashboard.tsx` |

## Troubleshooting

### Issue: Reflection loop not running
**Cause**: Cooldown period active or no unevaluated records
**Fix**: Check `isInCooldown()` and `getUnevaluatedRecords()`. Use `forceRun: true` to bypass cooldown.

### Issue: Prompts changing too frequently
**Cause**: `scoreThreshold` is set too high, so most evaluations fail and trigger prompt updates, or `autoApplyChanges` is enabled
**Fix**: Lower `scoreThreshold`, disable `autoApplyChanges`, increase `cooldownMinutes`

### Issue: AI judge too lenient
**Cause**: Rubric weights favor passing, or system prompt too forgiving
**Fix**: Modify rubric weights, update the `reflection_judge` prompt in `PromptVersionStore`

### Issue: Missing evaluations
**Cause**: Generations not being recorded to metrics store
**Fix**: Ensure `addRecord()` is called after every generation with proper metadata
