curiosity-driven

Schmidhuber's curiosity-driven learning: intrinsic motivation via compression progress. The agent seeks states that improve its world model.

16 stars

by plurigrid

Best use case

curiosity-driven is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Teams using curiosity-driven should expect more consistent output, faster repeated execution, and less prompt rewriting.

When to use this skill

You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

You only need a quick one-off answer and do not need a reusable workflow.
You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$ curl -o ~/.claude/skills/curiosity-driven/SKILL.md --create-dirs "https://raw.githubusercontent.com/plurigrid/asi/main/ies/music-topos/.codex/skills/curiosity-driven/SKILL.md"

Manual Installation

Download SKILL.md from GitHub
Place it in .claude/skills/curiosity-driven/SKILL.md inside your project
Restart your AI agent — it will auto-discover the skill

How curiosity-driven Compares

Feature / Agent	curiosity-driven	Standard Approach
Platform Support	Not specified	Limited / Varies
Context Awareness	High	Baseline
Installation Complexity	Unknown	N/A

Frequently Asked Questions

What does this skill do?

It implements Schmidhuber's curiosity-driven learning: the agent is rewarded for compression progress, so it seeks states that improve its world model.

Where can I find the source code?

The source code is in the plurigrid/asi repository on GitHub, at the path given in the installation command above.

SKILL.md Source

# Curiosity-Driven Learning Skill

> *"Curiosity is the desire to observe data that improves the observer's world model."*
> — Jürgen Schmidhuber

## Overview

**Curiosity-driven learning** adds an intrinsic reward to the usual extrinsic one:
- **Extrinsic**: Rewards from the environment (sparse, delayed)
- **Intrinsic**: Rewards from learning itself (dense, immediate)

**Compression Progress** = how much shorter the description of the observed data becomes after the model learns from it.

## Core Concept

```latex
Curiosity Reward = L(t-1) - L(t)

Where:
  L(t-1) = Description length of the history under the model before the update
  L(t)   = Description length of the same history under the model after the update
  
Positive reward = "I learned something compressible!"
Negative/zero = "This is noise or already known"
```
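
As a worked illustration of the formula above, here is a minimal sketch that assumes the "world model" is nothing more than Laplace-smoothed symbol counts over a known alphabet. The helper names `description_length` and `curiosity_reward` are hypothetical and are not part of the skill; a real compressor would be far richer.

```python
import math
from collections import Counter


def description_length(symbols, counts, alphabet):
    """Bits needed to encode `symbols` under Laplace-smoothed symbol counts."""
    total = sum(counts.values()) + len(alphabet)
    return sum(-math.log2((counts[s] + 1) / total) for s in symbols)


def curiosity_reward(history, counts, new_symbols, alphabet):
    """Return (L(t-1) - L(t), updated counts) after learning from new_symbols."""
    extended = history + new_symbols
    len_before = description_length(extended, counts, alphabet)  # old model
    updated = counts + Counter(new_symbols)                      # learning step
    len_after = description_length(extended, updated, alphabet)  # new model
    return len_before - len_after, updated


alphabet = "ab"
first, counts = curiosity_reward([], Counter(), list("aaaaaaaa"), alphabet)
second, counts = curiosity_reward(list("aaaaaaaa"), counts, list("aaaaaaaa"), alphabet)
flat, _ = curiosity_reward([], Counter(), list("abababab"), alphabet)
print(first > second > 0)   # the regular stream pays less each time it repeats
print(abs(flat) < 1e-9)     # a balanced stream gives this model nothing to learn
```

Both description lengths are measured on the same extended history, so the difference isolates what the update itself bought. Note that `abababab` is only unrewarding here because a count-based model cannot represent the alternation; a model that conditions on the previous symbol would find it highly compressible.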

## Implementation

```python
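from typing import Any, Sequence

import torch
from torch import Tensor, nn

Action = Any  # placeholder action type; the original leaves it unspecified


class CuriosityDrivenAgent:
    """
    Agent that seeks compression progress.

    Assumed interfaces (not defined here): world_model.update/predict,
    compressor.description_length, meta_model.predict_learnability.
    """

    def __init__(self, world_model: nn.Module, compressor: nn.Module,
                 meta_model: nn.Module, action_space: Sequence[Action],
                 initial_state: Tensor):
        self.world_model = world_model
        self.compressor = compressor
        self.meta_model = meta_model      # used by estimate_learnability
        self.action_space = action_space  # iterated over in explore
        self.state = initial_state        # current state, used in explore
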
    def compression_progress(self, observation: Tensor) -> float:
        """
        Curiosity = improvement in compression ability.
        """
        # Compress before learning
        with torch.no_grad():
            len_before = float(self.compressor.description_length(observation))
        
        # Update world model with observation. The compressor must encode with
        # this same model (shared parameters); otherwise len_after == len_before.
        self.world_model.update(observation)
        
        # Compress after learning
        with torch.no_grad():
            len_after = float(self.compressor.description_length(observation))
        
        # Progress = reduction in description length
        return len_before - len_after
    
    def intrinsic_reward(self, obs: Tensor) -> float:
        """
        Intrinsic reward for RL agent.
        """
        return self.compression_progress(obs)
    
    def explore(self) -> Action:
        """
        Seek states that maximize expected compression progress.
        
        This is NOT the same as seeking novel states!
        - Novel but random → no compression progress
        - Learnable patterns → high compression progress
        """
        best_action = None
        best_expected_progress = -float('inf')
        
        for action in self.action_space:
            # Predict resulting state
            predicted_obs = self.world_model.predict(self.state, action)
            
            # Estimate learnability (how much would we learn?)
            expected_progress = self.estimate_learnability(predicted_obs)
            
            if expected_progress > best_expected_progress:
                best_action = action
                best_expected_progress = expected_progress
        
        return best_action
    
    def estimate_learnability(self, obs: Tensor) -> float:
        """
        Predict how much we'd learn from this observation.
        
        High for: novel patterns, surprising regularities
        Low for: random noise, already-known patterns
        """
        # Use meta-learning: "how learnable is this?"
        return self.meta_model.predict_learnability(obs)
```
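
The `explore` method above relies on a learnability predictor that the skill does not define. The sketch below swaps in a much simpler stand-in, namely the compression progress most recently measured for each action, and runs it on two fixed symbol streams. Everything here (`Source`, `bits`, the stream contents, the greedy rule) is a hypothetical illustration under those assumptions, not the skill's own method.

```python
import math
from collections import Counter
from itertools import cycle, islice

ALPHABET = "ab"


def bits(symbols, counts):
    """Code length of `symbols` in bits under Laplace-smoothed counts."""
    total = sum(counts.values()) + len(ALPHABET)
    return sum(-math.log2((counts[s] + 1) / total) for s in symbols)


class Source:
    """One action: read the next chunk from a fixed, repeating symbol stream."""

    def __init__(self, pattern, chunk=4):
        self.stream = cycle(pattern)
        self.chunk = chunk
        self.counts = Counter()  # per-source world model
        self.seen = []           # history observed from this source

    def observe_and_learn(self):
        batch = list(islice(self.stream, self.chunk))
        self.seen += batch
        before = bits(self.seen, self.counts)
        self.counts = self.counts + Counter(batch)       # learning step
        return before - bits(self.seen, self.counts)     # compression progress


def explore(sources, steps):
    """Greedily pick the action whose last measured progress was highest."""
    estimate = {name: float("inf") for name in sources}  # try each action once
    visits = []
    for _ in range(steps):
        name = max(estimate, key=estimate.get)
        estimate[name] = sources[name].observe_and_learn()
        visits.append(name)
    return visits


# "noisy" is a balanced sequence that this count-based model cannot compress.
sources = {"learnable": Source("a"), "noisy": Source("abbabaab")}
visits = explore(sources, steps=10)
print(visits.count("learnable"), visits.count("noisy"))
```

After sampling each source once, the agent keeps returning to the stream it can still compress better and ignores the one that yields no progress. The estimate for the learnable stream shrinks on every visit, which is the boredom effect: in a richer environment the agent would eventually move on to something else.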

## Distinction from Other Curiosity Methods

| Method | Reward Signal | Schmidhuber's View |
|--------|---------------|-------------------|
| **ICM** (Pathak) | Forward-model prediction error in a learned feature space | Noise-sensitive |
| **RND** | Novelty (error in predicting a fixed random network) | Doesn't distinguish learnable from random |
| **Compression Progress** | Learning improvement | True curiosity |
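
The contrast in the table can be made concrete with a toy comparison. The sketch below assumes a count-based model and uses per-symbol surprise as a stand-in for prediction-error style rewards; it is not an implementation of ICM or RND. The function names and the two streams are hypothetical.

```python
import math
from collections import Counter

ALPHABET = "ab"


def bits(symbols, counts):
    """Code length of `symbols` in bits under Laplace-smoothed counts."""
    total = sum(counts.values()) + len(ALPHABET)
    return sum(-math.log2((counts[s] + 1) / total) for s in symbols)


def rewards(stream, chunk=4):
    """Return (total surprise, total compression progress) over the stream."""
    counts, seen = Counter(), []
    surprise = progress = 0.0
    for i in range(0, len(stream), chunk):
        batch = list(stream[i:i + chunk])
        surprise += bits(batch, counts)          # prediction-error style signal
        seen += batch
        before = bits(seen, counts)
        counts = counts + Counter(batch)         # learning step
        progress += before - bits(seen, counts)  # compression-progress signal
    return surprise, progress


learnable = "a" * 32
noisy = "abbabaabbaababba" * 2  # balanced; incompressible for this model

s_learn, p_learn = rewards(learnable)
s_noisy, p_noisy = rewards(noisy)
print(s_noisy > s_learn)   # surprise favours the stream that cannot be learned
print(p_learn > p_noisy)   # compression progress favours the learnable one
```

Surprise stays at one bit per symbol on the balanced stream because the model never gets better at predicting it, so a surprise-seeking agent would keep watching it. Compression progress is zero there and positive only where the model actually improved.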

## GF(3) Triads

```
yoneda-directed (-1) ⊗ cognitive-superposition (0) ⊗ curiosity-driven (+1) = 0 ✓
persistent-homology (-1) ⊗ self-evolving-agent (0) ⊗ curiosity-driven (+1) = 0 ✓
three-match (-1) ⊗ unworld (0) ⊗ curiosity-driven (+1) = 0 ✓
```

## Integration with Interaction Entropy

```ruby
module CuriosityDriven
  def self.compute_curiosity(content, world_model_before, world_model_after)
    # Description length of the content under each model.
    # description_length is assumed to be defined elsewhere as a module method.
    len_before = description_length(content, world_model_before)
    len_after = description_length(content, world_model_after)
    
    progress = len_before - len_after
    
    {
      content: content,
      compression_progress: progress,
      curious: progress > 0,
      trit: 1  # Generator (creates new understanding)
    }
  end
end
```

## Key Insights

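1. **Boredom**: The agent loses interest in predictable environments, because fully learned data yields no further compression progress
2. **Interestingness**: The agent is attracted to patterns it can still learn, not to raw novelty
3. **Creativity**: Interesting outputs emerge as a byproduct of seeking compression progress
4. **Developmental**: The resulting behavior resembles infant exploration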

## References

1. Schmidhuber, J. (1991). "A Possibility for Implementing Curiosity and Boredom in Model-Building Neural Controllers."
2. Schmidhuber, J. (2010). "Formal Theory of Creativity, Fun, and Intrinsic Motivation (1990–2010)."
3. Oudeyer, P.-Y. & Kaplan, F. (2007). "What is Intrinsic Motivation? A Typology of Computational Approaches."
