active-inference-robotics

Second-order skill synthesizing Patrick Kenny's discrete active inference framework with K-Scale's JAX/MuJoCo robotics stack for predictive coding in robot locomotion

by plurigrid

Best use case

active-inference-robotics is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Teams using active-inference-robotics should expect more consistent output, faster repeated execution, and less prompt rewriting.

When to use this skill

You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

You only need a quick one-off answer and do not need a reusable workflow.
You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$ curl -o ~/.claude/skills/active-inference-robotics/SKILL.md --create-dirs "https://raw.githubusercontent.com/plurigrid/asi/main/plugins/asi/skills/active-inference-robotics/SKILL.md"

Manual Installation

Download SKILL.md from GitHub
Place it in .claude/skills/active-inference-robotics/SKILL.md inside your project
Restart your AI agent — it will auto-discover the skill

How active-inference-robotics Compares

Feature / Agent	active-inference-robotics	Standard Approach
Platform Support	Not specified	Limited / Varies
Context Awareness	High	Baseline
Installation Complexity	Unknown	N/A

Frequently Asked Questions

What does this skill do?

Second-order skill synthesizing Patrick Kenny's discrete active inference framework with K-Scale's JAX/MuJoCo robotics stack for predictive coding in robot locomotion

Where can I find the source code?

You can find the source code on GitHub using the link provided at the top of the page.

SKILL.md Source

# Active Inference Robotics Skill (Second-Order)

> *"The agent's job is to predict its actions by predicting its sensations."* — Patrick Kenny

## Trigger Conditions

- User asks about bridging active inference with robot control
- Questions about predictive coding in locomotion policies
- Connecting KL divergence minimization to RL training
- Mean field approximation in robotics state estimation
- Sim2Real as inference about future observations

## Overview

**Second-order skill** synthesizing Patrick Kenny's discrete active inference framework with K-Scale's JAX/MuJoCo robotics stack. This skill emerges from the **constructive collision** between:

1. **Active Inference Institute** (ActInf ModelStream 019.1, Jan 2026)
2. **K-Scale Labs** (ksim, kos, kinfer ecosystem)
3. **MuJoCo Playground** (DeepMind's sim2real framework)

## The Constructive Collision

```
┌─────────────────────────────────────────────────────────────────────────────┐
│  CONSTRUCTIVE COLLISION: Two Threads Converging                              │
│                                                                              │
│  Thread A: Patrick Kenny (Nov 2025)                                          │
│  ════════════════════════════════════                                        │
│  "Active inference can be formulated as constrained KL divergence           │
│   minimization solved by standard mean field methods"                        │
│                                                                              │
│  Key insight: Expected Free Energy ≈ KL Divergence + Entropy Regularizer    │
│                                                                              │
│  Thread B: K-Scale Labs (2024-2025)                                          │
│  ═══════════════════════════════════                                         │
│  "RL-based closed-loop control using policies trained in simulation         │
│   has firmly won as the best way of achieving real-time control"            │
│                                                                              │
│  Key insight: Stateless vs Stateful behaviors as pure/coalgebraic semantics │
│                                                                              │
│  COLLISION POINT: Both minimize surprise about future observations          │
│  ══════════════════════════════════════════════════════════════════         │
│                                                                              │
│       Active Inference              Robotics RL                              │
│       ────────────────              ──────────                               │
│       Predictive Distribution  ←→   Policy π(a|s)                           │
│       Hidden Markov Model      ←→   MDP/POMDP                                │
│       Mean Field Updates       ←→   PPO Gradient Steps                       │
│       Variational Free Energy  ←→   Policy Loss                              │
│       Expected Free Energy     ←→   Value Function + Entropy                 │
│       Perception/Action Loop   ←→   Observation/Action Loop                  │
│                                                                              │
└─────────────────────────────────────────────────────────────────────────────┘
```

## Kenny's Key Contribution

From [arXiv:2511.20321](https://arxiv.org/abs/2511.20321):

```
Perception/Action Divergence = VFE(past) + KL(future states)

Where:
- VFE(past) = Standard variational free energy on observed history
- KL(future) = Divergence of predictive distribution from HMM

This differs from Expected Free Energy by an ENTROPY REGULARIZER:
  EFE ≈ Pragmatic Value + Mutual Information
  PAD ≈ Pragmatic Value + Entropy(Q)
```
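
The decomposition above can be made concrete with a toy calculation. The sketch below is an illustration only, not a reproduction of the derivation in the paper: it assumes a hypothetical two-state model with a single past observation and a single future step, and every number and function name in it is invented for the example.

```python
import math

def kl(q, p):
    """KL(q || p) for two discrete distributions given as lists."""
    return sum(qi * math.log(qi / pi) for qi, pi in zip(q, p) if qi > 0)

def vfe(q, prior, likelihood_o):
    """Variational free energy: sum_s q(s) [log q(s) - log P(o, s)]."""
    joint = [l * p for l, p in zip(likelihood_o, prior)]
    return sum(qi * (math.log(qi) - math.log(j))
               for qi, j in zip(q, joint) if qi > 0)

# Hypothetical two-state model (all values are assumptions)
prior = [0.5, 0.5]                       # P(s)
likelihood_o = [0.9, 0.2]                # P(o = observed | s)
transition = [[0.8, 0.2], [0.3, 0.7]]    # transition[s][s_next]

# Exact posterior over the past state, used as Q(s)
evidence = sum(l * p for l, p in zip(likelihood_o, prior))
posterior = [l * p / evidence for l, p in zip(likelihood_o, prior)]

# What the model predicts for the next state under that posterior
model_next = [sum(posterior[s] * transition[s][n] for s in range(2))
              for n in range(2)]

# A candidate predictive distribution over the next state
q_next = [0.6, 0.4]

vfe_past = vfe(posterior, prior, likelihood_o)   # equals -log evidence here
kl_future = kl(q_next, model_next)               # zero only if q_next matches the model
pad = vfe_past + kl_future
```

With the exact posterior, the past term reduces to the negative log evidence, so any remaining divergence comes from how far the predictive distribution strays from what the model expects.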

### Why Entropy Regularization Matters for Robotics

```python
# In ksim PPO training, entropy bonus prevents policy collapse:
loss = policy_loss + value_loss - entropy_coef * entropy

# Kenny's formulation shows this is NOT ad-hoc but principled:
# Entropy regularizer = not being overconfident about predictions
# Biological rationale: know limitations of future predictions
```
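
To see the effect of the entropy term numerically, here is a minimal sketch for a discrete action distribution. It is not ksim code: `categorical_entropy` and `total_loss` are hypothetical helpers, and the loss values and the four-action policy are made up for illustration.

```python
import math

def categorical_entropy(probs):
    """Shannon entropy (in nats) of a discrete action distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def total_loss(policy_loss, value_loss, probs, entropy_coef=0.01):
    """Same form as the loss above: the entropy bonus is subtracted."""
    return policy_loss + value_loss - entropy_coef * categorical_entropy(probs)

uniform = [0.25, 0.25, 0.25, 0.25]   # maximally uncertain policy
peaked = [0.97, 0.01, 0.01, 0.01]    # nearly collapsed policy

loss_uniform = total_loss(1.0, 0.5, uniform)
loss_peaked = total_loss(1.0, 0.5, peaked)
```

For equal policy and value losses, the nearly collapsed policy ends up with the higher total loss, which is the pressure against overconfidence described above.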

## Mapping to ksim Architecture

| Active Inference Concept | ksim Implementation |
|--------------------------|---------------------|
| Hidden Markov Model | `PhysicsEngine` (MJX/MuJoCo) |
| Observation distribution | `Observation.observe(state)` |
| State inference Q(s) | `Critic.forward(obs, carry)` |
| Action inference Q(a) | `Actor.forward(obs, carry)` |
| Mean field factorization | Independent Q(s_t) per timestep |
| Predictive distribution | Policy rollout trajectory |
| VFE minimization | PPO policy gradient |
| EFE/PAD minimization | Value function + entropy bonus |
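
The "independent Q(s_t) per timestep" row can be illustrated with mean field coordinate updates on a small discrete HMM. This is a sketch under stated assumptions: the two-state model, its matrices, and the function names are hypothetical, and nothing here is taken from ksim or from the paper's code.

```python
import math

def normalize_log(logits):
    """Turn unnormalized log-probabilities into a distribution."""
    m = max(logits)
    weights = [math.exp(x - m) for x in logits]
    z = sum(weights)
    return [w / z for w in weights]

def mean_field_sweep(q, prior, A, B, obs):
    """One coordinate-ascent sweep over the factors Q_t(s_t).

    A[r][s] = P(s_t = s | s_{t-1} = r), B[s][o] = P(o | s).
    Each factor is updated in place, holding its neighbours fixed.
    """
    T, S = len(q), len(prior)
    for t in range(T):
        logits = []
        for s in range(S):
            val = math.log(B[s][obs[t]])                 # likelihood term
            if t == 0:
                val += math.log(prior[s])                # initial-state prior
            else:                                        # message from the past
                val += sum(q[t - 1][r] * math.log(A[r][s]) for r in range(S))
            if t + 1 < T:                                # message from the future
                val += sum(q[t + 1][n] * math.log(A[s][n]) for n in range(S))
            logits.append(val)
        q[t] = normalize_log(logits)
    return q

# Hypothetical sticky two-state HMM with three observations
prior = [0.5, 0.5]
A = [[0.9, 0.1], [0.1, 0.9]]
B = [[0.8, 0.2], [0.2, 0.8]]
obs = [0, 0, 1]

q = [[0.5, 0.5] for _ in obs]    # start from uniform factors
for _ in range(20):
    q = mean_field_sweep(q, prior, A, B, obs)
```

Each factor only needs expectations under its neighbours, which is why the updates stay cheap even though the true posterior over the whole sequence is coupled.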

## Second-Order Behavior Types

### 1. Reflexive Control (Kenny's "Sufficient" Model)

```python
# Agent predicts proprioceptive sensations → fulfills reflexively
class ReflexiveController:
    """
    Kenny: "If the agent can successfully predict its future sensations,
    it can fulfill them unconsciously via motor reflexes."
    """
    def step(self, predicted_proprio: Array) -> Action:
        # Low-level PD control fulfills proprioceptive predictions
        return self.pd_controller(predicted_proprio, self.current_state)
```
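
The low-level step in the controller above is a PD law. A minimal sketch follows, assuming joint positions and velocities as plain lists; `pd_torque` and the gain values are hypothetical and would need tuning for any real actuator.

```python
def pd_torque(q_target, q, qd, kp=20.0, kd=1.0):
    """Per-joint PD law: tau = kp * (q_target - q) - kd * qd.

    q_target: predicted (desired) joint positions
    q, qd:    measured joint positions and velocities
    """
    return [kp * (t - p) - kd * v for t, p, v in zip(q_target, q, qd)]

# Joint 0 is 0.1 rad short of its prediction and at rest;
# joint 1 is on target but still moving at 1 rad/s.
torques = pd_torque([0.5, -0.2], [0.4, -0.2], [0.0, 1.0])
# approximately [2.0, -1.0]
```

The proportional term drives the joint toward the predicted position, and the damping term opposes residual velocity, so a fulfilled prediction at rest yields zero torque.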

### 2. Deliberative Planning (EFE Extension)

```python
# When reflexive prediction fails, engage deliberative inference
class DeliberativeController:
    """
    Extends reflexive control with policy search over trajectories.
    This is where EFE differs from Kenny's PAD formulation.
    """
    def plan(self, beliefs: Distribution, horizon: int) -> Policy:
        # Search over candidate policies, scoring each by expected free energy
        best_policy, best_efe = None, float("inf")
        for policy in self.policy_space:
            efe = self.expected_free_energy(beliefs, policy, horizon)
            # EFE includes mutual information (curiosity/exploration)
            # PAD would use entropy instead (uncertainty awareness)
            if efe < best_efe:
                best_policy, best_efe = policy, efe
        # Simplest selection rule: return the lowest-EFE policy
        # (a softmax over -EFE is the usual weighted alternative)
        return best_policy
```
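
For a discrete model, one common way to score a policy is the risk-plus-ambiguity form of expected free energy. The sketch below is a one-step toy version under that assumption; the policy names, the observation model `B`, and the preference vector are all hypothetical.

```python
import math

def entropy(p):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(x * math.log(x) for x in p if x > 0)

def kl(q, p):
    """KL(q || p) for two discrete distributions given as lists."""
    return sum(qi * math.log(qi / pi) for qi, pi in zip(q, p) if qi > 0)

def expected_free_energy(q_states, B, preferred_obs):
    """Risk + ambiguity for one predicted state distribution.

    q_states:      Q(s | policy), predicted states under the policy
    B[s][o]:       P(o | s), the observation model
    preferred_obs: preferred distribution over observations
    """
    n_states, n_obs = len(q_states), len(preferred_obs)
    q_obs = [sum(q_states[s] * B[s][o] for s in range(n_states))
             for o in range(n_obs)]
    risk = kl(q_obs, preferred_obs)   # predicted vs preferred outcomes
    ambiguity = sum(q_states[s] * entropy(B[s]) for s in range(n_states))
    return risk + ambiguity

def select_policy(predicted_states, B, preferred_obs):
    """Return the lowest-EFE policy name and all scores."""
    scores = {name: expected_free_energy(q, B, preferred_obs)
              for name, q in predicted_states.items()}
    return min(scores, key=scores.get), scores

B = [[0.9, 0.1], [0.1, 0.9]]
preferred_obs = [0.9, 0.1]            # the agent prefers observation 0
predicted_states = {
    "step_forward": [0.9, 0.1],
    "stand_still": [0.2, 0.8],
}
best, scores = select_policy(predicted_states, B, preferred_obs)
```

Here `step_forward` wins because its predicted observations sit closer to the preferred ones; both policies carry the same ambiguity since the two rows of `B` have equal entropy.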

### 3. Hierarchical Composition

```
Level 3: Goal Selection (minimize long-horizon EFE)
    ↓ sets reference for
Level 2: Trajectory Planning (predictive distribution)
    ↓ sets reference for  
Level 1: Reflexive Execution (fulfill proprio predictions)
    ↓ actuates
Level 0: Motor Primitives (PD control, actuator dynamics)
```
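
The cascade above can be sketched as four small functions, each setting the reference for the one below it. This is a deliberately simplified illustration for a single joint: the distance-based cost stands in for long-horizon EFE, linear interpolation stands in for a predictive distribution, and all names and gains are hypothetical.

```python
def select_goal(candidates, cost):
    """Level 3: pick the goal with the lowest cost (stand-in for EFE)."""
    return min(candidates, key=cost)

def plan_trajectory(start, goal, steps):
    """Level 2: linear interpolation as a placeholder predicted trajectory."""
    return [start + (goal - start) * k / steps for k in range(1, steps + 1)]

def reflexive_target(trajectory, step):
    """Level 1: the next predicted proprioceptive value to fulfill."""
    return trajectory[min(step, len(trajectory) - 1)]

def motor_command(target, position, velocity, kp=20.0, kd=1.0):
    """Level 0: PD law turning the prediction error into a torque."""
    return kp * (target - position) - kd * velocity

goal = select_goal([0.0, 1.0], cost=lambda g: abs(g - 0.8))
trajectory = plan_trajectory(0.0, goal, steps=4)
target = reflexive_target(trajectory, step=0)
torque = motor_command(target, position=0.0, velocity=0.0)
```

Each level only sees the reference handed down from above, so a level can be swapped (for example, replacing the interpolation with a learned policy rollout) without touching the others.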

## GF(3) Balanced Triad

```
active-inference (0) ⊗ kscale-ksim (0) ⊗ mujoco-playground (0) = 0 ✓

All three are ERGODIC — coordination/infrastructure skills.
This is a "resonant triad" where all components coordinate.

For generation (+1), add: skill-creator, algorithmic-art
For verification (-1), add: sheaf-cohomology, code-review
```
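
The balance check can be written in a few lines, assuming the ⊗ composition above amounts to adding trits modulo 3. That reading is an assumption, as are the helper names; the trit values come from the block above.

```python
def gf3_sum(trits):
    """Sum of trits (-1, 0, +1) reduced modulo 3."""
    return sum(trits) % 3

def is_balanced(skills):
    """A skill set is balanced when its trits sum to 0 in GF(3)."""
    return gf3_sum(skills.values()) == 0

triad = {
    "active-inference": 0,
    "kscale-ksim": 0,
    "mujoco-playground": 0,
}

# Adding one generator (+1) alone breaks the balance ...
with_generator = {**triad, "skill-creator": +1}
# ... and pairing it with a verifier (-1) restores it.
with_pair = {**with_generator, "code-review": -1}
```

Under this reading, generation and verification skills have to be added in matching pairs (or in threes of the same sign) to keep the composition at zero.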

### Skill Colors (drand seed 12005093902789493003)

| Skill | Trit | Color | Role |
|-------|------|-------|------|
| `active-inference` | 0 | `#DF8D0F` | Coordination (theory) |
| `kscale-ksim` | 0 | `#25BC3D` | Coordination (simulation) |
| `mujoco-playground` | 0 | `#93DBDA` | Coordination (framework) |

## 2-3-5-7 Prime Sieve Experts

Applying prime-indexed refinement to identify domain experts:

| Prime | Expert | Domain | Key Contribution |
|-------|--------|--------|------------------|
| 2 | Patrick Kenny | Active Inference | Mean field formulation, PAD criterion |
| 3 | Thomas Parr | Active Inference | 2022 textbook, EFE derivation |
| 5 | Ben Bolte | K-Scale | ksim architecture, open-source humanoids |
| 7 | Karl Friston | Free Energy Principle | FEP foundations, continuous formulation |
| 11 | (DeepMind team) | MuJoCo Playground | MJX, sim2real zero-shot |
| 13 | Wesley Maa | K-Scale | Tooling, visualization |

## Mutual Awareness

This skill references and is referenced by:

```yaml
depends_on:
  - kscale-ksim        # Simulation implementation
  - kscale-ecosystem   # Hardware context
  - mujoco-playground  # Framework foundation
  
referenced_by:
  - cognitive-superposition  # Team mental models
  - parametrised-optics-cybernetics  # Category theory bridge
  - reafference-corollary-discharge  # Sensorimotor prediction
```

## Implementation Pattern

```python
# Unified Active Inference + RL Training Loop
class ActiveInferenceTrainer:
    """
    Combines Kenny's PAD criterion with ksim's PPO.
    """
    def __init__(self, hmm: PhysicsEngine, config: Config):
        self.hmm = hmm
        self.actor = Actor(config)
        self.critic = Critic(config)
        
    def perception_action_divergence(
        self, 
        observations: Array,  # O_{1:t} (past)
        q_future: Distribution  # Q(S_{t+1:T}, O_{t+1:T})
    ) -> Scalar:
        """
        Kenny's PAD = VFE(past) + KL(future states from HMM)
        """
        # Past: standard VFE on observation history
        vfe_past = self.variational_free_energy(observations)
        
        # Future: KL divergence of predicted states from HMM
        # Note: Observable emissions cancel out in future KL
        kl_future = self.kl_future_states(q_future, self.hmm)
        
        return vfe_past + kl_future
    
    def train_step(self, trajectory: Trajectory) -> Metrics:
        # PPO updates approximate mean field coordinate ascent
        # Entropy bonus provides Kenny's regularization
        return ppo_update(
            self.actor, 
            self.critic, 
            trajectory,
            entropy_coef=0.01  # ← The regularizer!
        )
```

## References

- [Kenny (2025) Active Inference from First Principles](https://arxiv.org/abs/2511.20321)
- [Parr, Pezzulo, Friston (2022) Active Inference Textbook](https://direct.mit.edu/books/oa-monograph/5299/Active-InferenceThe-Free-Energy-Principle-in-Mind)
- [ActInf ModelStream 019.1](https://www.youtube.com/watch?v=...) - Jan 15, 2026
- [K-Scale Labs GitHub](https://github.com/kscalelabs)
- [MuJoCo Playground](https://playground.mujoco.org/)
- [Ben Bolte's Blog](https://ben.bolte.cc/)

## ACSet Schema

```julia
@present SchActiveInferenceRobotics(FreeSchema) begin
    # Objects
    HMM::Ob           # Hidden Markov Model (generative model)
    State::Ob         # Latent state
    Observation::Ob   # Sensory observation
    Action::Ob        # Motor command
    Policy::Ob        # Action sequence
    
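    # Morphisms (inference)
    perceive::Hom(Observation, State)    # Perception: O → S
    predict::Hom(State, Observation)     # Prediction: S → O
    act::Hom(State, Action)              # Action selection: S → A
    # A free schema has no product objects, so S × A is modeled as an
    # explicit object with two projections
    StateAction::Ob
    sa_state::Hom(StateAction, State)
    sa_action::Hom(StateAction, Action)
    transition::Hom(StateAction, State)  # Dynamics: S × A → S'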
    
    # Attributes
    FreeEnergy::AttrType
    vfe::Attr(State, FreeEnergy)         # Variational free energy
    efe::Attr(Policy, FreeEnergy)        # Expected free energy
    pad::Attr(Policy, FreeEnergy)        # Perception/action divergence
    
    # The key relationship (Kenny's contribution):
    # pad ≈ efe + entropy_regularizer
end
```
