causal-inference
Bengio's causal inference for AI: Interventional reasoning, counterfactuals, and System 2 deep learning. World models with causal structure.
Best use case
causal-inference is best used when you need a repeatable AI agent workflow instead of a one-off prompt.
Teams using causal-inference should expect more consistent output, faster repeated execution, and less prompt rewriting.
When to use this skill
- You want a reusable workflow that can be run more than once with consistent structure.
When not to use this skill
- You only need a quick one-off answer and do not need a reusable workflow.
- You cannot install or maintain the underlying files, dependencies, or repository context.
Installation
Claude Code / Cursor / Codex
Manual Installation
- Download SKILL.md from GitHub
- Place it in .claude/skills/causal-inference/SKILL.md inside your project
- Restart your AI agent; it will auto-discover the skill
How causal-inference Compares
| Feature / Agent | causal-inference | Standard Approach |
|---|---|---|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |
Frequently Asked Questions
What does this skill do?
Bengio's causal inference for AI: Interventional reasoning, counterfactuals, and System 2 deep learning. World models with causal structure.
Where can I find the source code?
You can find the source code on GitHub using the link provided at the top of the page.
SKILL.md Source
# Causal Inference Skill
> *"Current deep learning is System 1: fast, intuitive, but easily fooled. We need System 2: slow, deliberate, causal."*
> — Yoshua Bengio
## Overview
**Causal inference** enables:
1. **Interventional reasoning**: What happens if I *do* X?
2. **Counterfactual reasoning**: What *would have* happened if...?
3. **Transfer**: Causal structure generalizes across domains
4. **Robustness**: Causal models resist distribution shift
## Pearl's Causal Hierarchy
```
Level 3: Counterfactual (Imagining)
    "What would have happened if I had done X?"
    P(y_x | x', y')
        ▲
Level 2: Intervention (Doing)
    "What happens if I do X?"
    P(y | do(x))
        ▲
Level 1: Association (Seeing)
    "What does X tell me about Y?"
    P(y | x)
```
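The gap between Level 1 and Level 2 can be made concrete with a small simulation. The sketch below assumes a toy linear model that is not part of the original skill: a hidden confounder `Z` drives both `X` and `Y`, and the true causal effect of `X` on `Y` is 1.0. The names `sample`, `assoc_slope`, and `causal_effect` are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000


def sample(do_x=None):
    """Sample (x, y) from a toy SCM with graph Z -> X, Z -> Y, X -> Y.

    Passing do_x replaces the equation for X with a constant, i.e. do(X = do_x).
    """
    z = rng.normal(size=n)  # hidden confounder
    x = z + rng.normal(size=n) if do_x is None else np.full(n, float(do_x))
    y = 1.0 * x + 2.0 * z + rng.normal(size=n)  # true effect of X on Y is 1.0
    return x, y


# Level 1 (seeing): regression slope of Y on X in observational data.
x_obs, y_obs = sample()
assoc_slope = np.cov(x_obs, y_obs)[0, 1] / np.var(x_obs, ddof=1)

# Level 2 (doing): change in mean outcome between do(X=1) and do(X=0).
_, y_do1 = sample(do_x=1.0)
_, y_do0 = sample(do_x=0.0)
causal_effect = y_do1.mean() - y_do0.mean()

print(f"association slope: {assoc_slope:.2f}, interventional effect: {causal_effect:.2f}")
```

For this particular model the observational slope is analytically 2.0 while the interventional effect is 1.0, so conditioning on `X` overstates what setting `X` would achieve.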
## Structural Causal Models (SCM)
```python
from __future__ import annotations  # lets the undefined DAG type appear in hints

from typing import Callable, Dict, List

import pandas as pd


class StructuralCausalModel:
    """
    SCM: Variables, causal graph, structural equations.

    Sketch only: the DAG type and the abduct_noise/forward methods are
    assumed to be provided elsewhere.
    """

    def __init__(self, variables: List[str], graph: DAG, equations: Dict[str, Callable]):
        self.variables = variables
        self.graph = graph          # Directed Acyclic Graph
        self.equations = equations  # X_i = f_i(parents(X_i), U_i)

    def intervene(self, intervention: Dict[str, float]) -> "StructuralCausalModel":
        """
        do(X = x): Replace equation for X with constant.
        This breaks incoming edges to X.
        """
        new_equations = self.equations.copy()
        for var, value in intervention.items():
            # Bind value as a default argument; a bare closure would make every
            # lambda return the last value seen in the loop.
            new_equations[var] = lambda *_, value=value: value
        new_graph = self.graph.remove_edges_to(intervention.keys())
        return StructuralCausalModel(self.variables, new_graph, new_equations)

    def counterfactual(self, evidence: Dict, intervention: Dict) -> Dict:
        """
        Counterfactual: What would Y be if X had been x, given we observed evidence?

        Three steps:
        1. Abduction: Infer noise terms from evidence
        2. Action: Apply intervention
        3. Prediction: Compute counterfactual outcome
        """
        # Step 1: Abduction - infer noise terms U
        noise_terms = self.abduct_noise(evidence)
        # Step 2: Action - apply intervention
        intervened_scm = self.intervene(intervention)
        # Step 3: Prediction - forward propagate with inferred noise
        counterfactual_world = intervened_scm.forward(noise_terms)
        return counterfactual_world


class CausalDiscovery:
    """
    Learn causal structure from data.
    """

    def __init__(self, data: pd.DataFrame):
        self.data = data

    def pc_algorithm(self) -> DAG:
        """
        PC Algorithm: Constraint-based causal discovery.

        1. Start with complete undirected graph
        2. Remove edges based on conditional independence tests
        3. Orient edges using v-structures and rules
        """
        from causallearn.search.ConstraintBased.PC import pc

        result = pc(self.data.values)
        # PC identifies a Markov equivalence class, so some edges in the
        # returned graph may remain unoriented.
        return result.G

    def gflownet_discovery(self) -> List[DAG]:
        """
        Use GFlowNet to sample DAGs proportional to likelihood.

        This gives a DISTRIBUTION over causal graphs (represented here by
        samples), properly accounting for uncertainty.
        """
        # Illustrative import: CausalDAGGFlowNet and self.bayesian_score are
        # placeholders, not a confirmed library API.
        from gflownet import CausalDAGGFlowNet

        gfn = CausalDAGGFlowNet(n_variables=len(self.data.columns))
        gfn.train(reward=lambda g: self.bayesian_score(g))
        # Sample multiple DAGs
        dag_samples = [gfn.sample() for _ in range(1000)]
        return dag_samples
```
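The three counterfactual steps (abduction, action, prediction) can be run end to end on a very small model. The following is a minimal sketch, assuming a two-variable linear SCM with additive noise (`X = U_x`, `Y = 2·X + U_y`), which makes abduction an exact inversion. `EQUATIONS`, `ORDER`, and the helper functions are hypothetical names chosen for illustration, not part of the skill's API.

```python
from typing import Callable, Dict

Values = Dict[str, float]

# Toy SCM with graph X -> Y and additive noise:
#   X = U_x
#   Y = 2 * X + U_y
EQUATIONS: Dict[str, Callable[[Values, Values], float]] = {
    "X": lambda v, u: u["X"],
    "Y": lambda v, u: 2.0 * v["X"] + u["Y"],
}
ORDER = ["X", "Y"]  # topological order of the graph


def forward(equations, noise: Values) -> Values:
    """Evaluate every structural equation in topological order."""
    values: Values = {}
    for name in ORDER:
        values[name] = equations[name](values, noise)
    return values


def intervene(equations, intervention: Values):
    """do(X = x): replace the chosen equations with constants."""
    new_equations = dict(equations)
    for name, value in intervention.items():
        # value=value binds the constant now rather than at call time
        new_equations[name] = lambda v, u, value=value: value
    return new_equations


def abduct_noise(evidence: Values) -> Values:
    """Invert the additive-noise equations of this toy model."""
    return {"X": evidence["X"], "Y": evidence["Y"] - 2.0 * evidence["X"]}


def counterfactual(evidence: Values, intervention: Values) -> Values:
    noise = abduct_noise(evidence)                  # 1. abduction
    intervened = intervene(EQUATIONS, intervention)  # 2. action
    return forward(intervened, noise)               # 3. prediction


# Observed X=1, Y=2.5. What would Y have been had X been 3?
result = counterfactual({"X": 1.0, "Y": 2.5}, {"X": 3.0})
print(result)  # {'X': 3.0, 'Y': 6.5}
```

The observed unit has `U_y = 0.5`, and that same noise value is carried into the counterfactual world, which is what separates a counterfactual from a plain intervention on a fresh sample.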
## System 2 Deep Learning
```python
class System2Network:
    """
    Bengio's vision: Combine System 1 (fast) with System 2 (slow).

    System 1: Neural pattern matching (current DL)
    System 2: Deliberate causal reasoning (compositional, symbolic)

    Conceptual sketch: NeuralNetwork, CausalReasoner, DynamicAttention, Tensor,
    compute_attention and apply_attention are placeholders not defined here.
    """

    def __init__(self):
        self.system1 = NeuralNetwork()       # Fast intuition
        self.system2 = CausalReasoner()      # Slow reasoning
        self.attention = DynamicAttention()  # Which to use when

    def forward(self, x: Tensor, requires_reasoning: bool = False) -> Tensor:
        """
        Hybrid forward pass.
        """
        # System 1: quick answer
        fast_answer = self.system1(x)
        if not requires_reasoning:
            return fast_answer
        # System 2: verify/refine via causal reasoning
        slow_answer = self.system2.reason(x, fast_answer)
        # Combine based on confidence
        confidence = self.attention(x, fast_answer, slow_answer)
        return confidence * fast_answer + (1 - confidence) * slow_answer

    def causal_attention(self, query: Tensor) -> Tensor:
        """
        Attention guided by causal relevance, not just correlation.

        Standard attention: A(Q, K, V) = softmax(QK^T/√d) V
        Causal attention: Weight by causal effect, not correlation
        """
        correlational_weights = self.compute_attention(query)
        causal_effects = self.system2.estimate_effects(query)
        # Reweight by causal importance
        causal_weights = correlational_weights * causal_effects
        return self.apply_attention(causal_weights)
```
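The reweighting step in `causal_attention` can be tried out with plain NumPy. This is a sketch under stated assumptions: the `effects` vector stands in for whatever `estimate_effects` would return (the source does not specify it), and renormalizing the weights after reweighting is a design choice added here so each row still sums to 1.

```python
import numpy as np


def softmax(scores: np.ndarray) -> np.ndarray:
    """Numerically stable softmax over the last axis."""
    shifted = scores - scores.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)


def attention_weights(q: np.ndarray, k: np.ndarray) -> np.ndarray:
    """Standard scaled dot-product weights: softmax(QK^T / sqrt(d))."""
    d = q.shape[-1]
    return softmax(q @ k.T / np.sqrt(d))


def causal_reweight(weights: np.ndarray, effects: np.ndarray, eps: float = 1e-12) -> np.ndarray:
    """Scale each key's weight by its (assumed) causal effect, then renormalize."""
    scaled = weights * effects
    return scaled / (scaled.sum(axis=-1, keepdims=True) + eps)


rng = np.random.default_rng(0)
q = rng.normal(size=(2, 4))  # 2 queries
k = rng.normal(size=(3, 4))  # 3 keys
v = rng.normal(size=(3, 4))  # 3 values

# Hypothetical non-negative effect sizes, one per key; key 1 has no causal effect.
effects = np.array([1.0, 0.0, 0.5])

w = attention_weights(q, k)
cw = causal_reweight(w, effects)
out = cw @ v  # attention output using the reweighted distribution
```

A key with zero estimated effect receives zero weight no matter how strongly it correlates with the query, which is the behavior the docstring describes.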
## GF(3) Triads
```
# Causal-Categorical Triad
sheaf-cohomology (-1) ⊗ causal-inference (0) ⊗ gflownet (+1) = 0 ✓
# System 2 Triad
proofgeneral-narya (-1) ⊗ causal-inference (0) ⊗ forward-forward-learning (+1) = 0 ✓
# World Model Triad
persistent-homology (-1) ⊗ causal-inference (0) ⊗ self-evolving-agent (+1) = 0 ✓
```
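The balance condition on these triads can be checked mechanically. The sketch below assumes that "= 0" means the three trit values sum to zero modulo 3; the source does not define `⊗`, so this reading is an interpretation, and `TRIADS` and `is_balanced` are illustrative names.

```python
# Trit assignments copied from the triads listed above.
TRIADS = {
    "Causal-Categorical": {
        "sheaf-cohomology": -1, "causal-inference": 0, "gflownet": +1,
    },
    "System 2": {
        "proofgeneral-narya": -1, "causal-inference": 0, "forward-forward-learning": +1,
    },
    "World Model": {
        "persistent-homology": -1, "causal-inference": 0, "self-evolving-agent": +1,
    },
}


def is_balanced(trits) -> bool:
    """True when the trits sum to 0 in GF(3), i.e. modulo 3 (assumed reading)."""
    return sum(trits) % 3 == 0


balanced = {name: is_balanced(skills.values()) for name, skills in TRIADS.items()}
print(balanced)
```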
## Integration with Interaction Entropy
```ruby
module CausalInference
  # Helper methods (build_causal_graph, abduct_noise, apply_intervention,
  # propagate_with_noise) are assumed to be defined elsewhere.
  def self.intervene(interaction_sequence, intervention)
    # Build causal graph from interaction sequence
    graph = build_causal_graph(interaction_sequence)
    # Apply intervention
    intervened = graph.do(intervention)
    # Predict outcome
    outcome = intervened.forward_propagate
    {
      original_graph: graph,
      intervention: intervention,
      predicted_outcome: outcome,
      trit: 0 # Coordinator (bridges observational and interventional)
    }
  end

  def self.counterfactual(interaction, alternative_action)
    # What would have happened if we'd done alternative_action?
    noise = abduct_noise(interaction)
    intervened = apply_intervention(alternative_action)
    counterfactual_outcome = propagate_with_noise(intervened, noise)
    {
      actual_outcome: interaction[:outcome],
      counterfactual_outcome: counterfactual_outcome,
      difference: interaction[:outcome] - counterfactual_outcome
    }
  end
end
```
## Key Properties
1. **Invariance**: Causal mechanisms are stable across environments
2. **Modularity**: Can change one mechanism without affecting others
3. **Compositionality**: Complex models from simple causal primitives
4. **Identifiability**: Can (sometimes) learn causal structure from data
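The invariance and modularity properties can be illustrated numerically. The sketch below assumes a toy mechanism `Y = 2·X + noise` and two environments that differ only in the spread of `X`; all names and numbers are illustrative. The causal-direction regression (Y on X) should stay the same across environments, while the anti-causal one (X on Y) should shift.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000


def sample_environment(x_scale: float):
    """Same mechanism for Y in every environment; only P(X) changes."""
    x = rng.normal(scale=x_scale, size=n)
    y = 2.0 * x + rng.normal(size=n)
    return x, y


def slope(a: np.ndarray, b: np.ndarray) -> float:
    """Least-squares slope of b regressed on a."""
    return np.cov(a, b)[0, 1] / np.var(a, ddof=1)


x1, y1 = sample_environment(1.0)
x2, y2 = sample_environment(3.0)

causal_slopes = (slope(x1, y1), slope(x2, y2))      # Y on X: the mechanism
anticausal_slopes = (slope(y1, x1), slope(y2, x2))  # X on Y: depends on P(X)

print("Y on X:", causal_slopes)
print("X on Y:", anticausal_slopes)
```

For this model the Y-on-X slope is 2 in both environments, while the X-on-Y slope is analytically 2/5 in the first and 18/37 in the second, so only the causal direction yields a predictor that transfers unchanged.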
## References
1. Bengio, Y. et al. (2019). "A Meta-Transfer Objective for Learning to Disentangle Causal Mechanisms."
2. Schölkopf, B. et al. (2021). "Toward Causal Representation Learning."
3. Pearl, J. (2009). *Causality: Models, Reasoning, and Inference*.
4. Bengio, Y. (2017). "The Consciousness Prior."