yann-lecun-tecnico
Technical sub-skill for Yann LeCun. Covers CNNs, LeNet, backpropagation, JEPA (I-JEPA, V-JEPA, MC-JEPA), AMI (Advanced Machinery of Intelligence), Self-Supervised Learning (SimCLR, MAE, BYOL), Energy-Based Models (EBMs), and complete PyTorch code.
Best use case
yann-lecun-tecnico is best used when you need a repeatable AI agent workflow instead of a one-off prompt.
Users should expect a more consistent workflow output, faster repeated execution, and less time spent rewriting prompts from scratch.
Practical example
Example input
Use the "yann-lecun-tecnico" skill to help with this workflow task. Context: Technical sub-skill for Yann LeCun. Covers CNNs, LeNet, backpropagation, JEPA (I-JEPA, V-JEPA, MC-JEPA), AMI (Advanced Machinery of Intelligence), Self-Supervised Learning (SimCLR, MAE, BYOL), Energy-Based Models (EBMs), and complete PyTorch code.
Example output
A structured workflow result with clearer steps, more consistent formatting, and an output that is easier to reuse in the next run.
When to use this skill
- Use this skill when you want a reusable workflow rather than writing the same prompt again and again.
When not to use this skill
- Do not use this when you only need a one-off answer and do not need a reusable workflow.
- Do not use it if you cannot install or maintain the related files, repository context, or supporting tools.
Installation
Claude Code / Cursor / Codex
Manual Installation
- Download SKILL.md from GitHub
- Place it in `.claude/skills/yann-lecun-tecnico/SKILL.md` inside your project
- Restart your AI agent; it will auto-discover the skill
How yann-lecun-tecnico Compares
| Feature / Agent | yann-lecun-tecnico | Standard Approach |
|---|---|---|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |
Frequently Asked Questions
What does this skill do?
It is a technical sub-skill for Yann LeCun. It covers CNNs, LeNet, backpropagation, JEPA (I-JEPA, V-JEPA, MC-JEPA), AMI (Advanced Machinery of Intelligence), Self-Supervised Learning (SimCLR, MAE, BYOL), Energy-Based Models (EBMs), and complete PyTorch code.
Where can I find the source code?
You can find the source code on GitHub using the link provided at the top of the page.
SKILL.md Source
# YANN LECUN: TECHNICAL MODULE v3.0
## Overview
Technical sub-skill for Yann LeCun. Covers CNNs, LeNet, backpropagation, JEPA (I-JEPA, V-JEPA, MC-JEPA), AMI (Advanced Machinery of Intelligence), Self-Supervised Learning (SimCLR, MAE, BYOL), Energy-Based Models (EBMs), and complete PyTorch code.
## When to Use This Skill
- When you need specialized assistance with this domain
## Do Not Use This Skill When
- The task is unrelated to yann lecun tecnico
- A simpler, more specific tool can handle the request
- The user needs general-purpose assistance without domain expertise
## How It Works
> This module is loaded by the main yann-lecun agent when the conversation
> calls for technical depth. You are still LeCun, only with access
> to the full technical arsenal.
---
## Convolutional Neural Networks: From First Principles
The discrete 2D convolution operation (in the cross-correlation form that deep learning libraries implement):
```
Output[i][j] = sum_{m} sum_{n} Input[i+m][j+n] * Kernel[m][n]
```
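The double sum above can be spelled out directly. The following is a minimal pure-Python sketch, assuming no padding and stride 1; the function name `conv2d_valid` and the toy image and kernel are illustrative, not part of the original module.

```python
def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` (lists of lists), no padding, stride 1."""
    k_h, k_w = len(kernel), len(kernel[0])
    out_h = len(image) - k_h + 1
    out_w = len(image[0]) - k_w + 1
    output = [[0.0] * out_w for _ in range(out_h)]
    for i in range(out_h):
        for j in range(out_w):
            total = 0.0
            for m in range(k_h):
                for n in range(k_w):
                    # Output[i][j] = sum_m sum_n Input[i+m][j+n] * Kernel[m][n]
                    total += image[i + m][j + n] * kernel[m][n]
            output[i][j] = total
    return output


# Toy 4x4 image with a vertical edge between columns 1 and 2.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
# Horizontal-difference kernel: responds where intensity rises left to right.
kernel = [[-1, 1]]

edges = conv2d_valid(image, kernel)
print(edges)  # every row is [0.0, 1.0, 0.0]: the edge is detected in each row
```

The same two weights are applied at every position, which is what lets a single small kernel detect the edge wherever it occurs.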
The **threefold** architectural insight of CNNs:
**1. Local Connectivity**
```
# Before (fully connected): neuron i -> all pixels
params = input_size * hidden_size  # huge
# CNNs: neuron -> local region [k x k]
params = kernel_h * kernel_w * in_channels * out_channels
# Physically motivated: visual features are local
```
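To put numbers on the two formulas, here is a small arithmetic sketch. The layer sizes are an assumption chosen for illustration (a 32x32 grayscale input and six 28x28 output maps, similar to the first layer of a LeNet-style network), and biases are left out, as in the formulas above.

```python
def fully_connected_params(input_size, hidden_size):
    """Weights when every output unit sees every input pixel."""
    return input_size * hidden_size


def conv_params(kernel_h, kernel_w, in_channels, out_channels):
    """Weights when each output map reuses one small local kernel."""
    return kernel_h * kernel_w * in_channels * out_channels


input_size = 32 * 32          # assumed 32x32 grayscale image
hidden_size = 6 * 28 * 28     # assumed output: six 28x28 feature maps

fc = fully_connected_params(input_size, hidden_size)
conv = conv_params(5, 5, in_channels=1, out_channels=6)

print(fc)    # 4816896
print(conv)  # 150
```

Under these assumptions the convolutional layer produces an output of the same size with roughly 32,000 times fewer weights.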
**2. Weight Sharing**
```
# Result: translation equivariance
for i in range(output_height):
    for j in range(output_width):
        output[i][j] = conv2d(input[i:i+k, j:j+k], shared_kernel)
```
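Translation equivariance can be checked numerically: shifting the input and then convolving should give the same result as convolving and then shifting. This is a small sketch using `torch.nn.functional.conv2d`; the helper `shift_right` and the toy sizes are assumptions made for the illustration, and the bright pixel is kept away from the border so that boundary effects do not interfere.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
image = torch.zeros(1, 1, 8, 8)
image[0, 0, 2, 2] = 1.0            # a single bright pixel
kernel = torch.randn(1, 1, 3, 3)   # one shared 3x3 kernel


def shift_right(x, pixels):
    """Shift along the width axis, filling the vacated columns with zeros."""
    shifted = torch.zeros_like(x)
    shifted[..., pixels:] = x[..., :-pixels]
    return shifted


# Path 1: convolve, then shift the feature map.
conv_then_shift = shift_right(F.conv2d(image, kernel, padding=1), 2)
# Path 2: shift the image, then convolve.
shift_then_conv = F.conv2d(shift_right(image, 2), kernel, padding=1)

equivariant = torch.allclose(conv_then_shift, shift_then_conv)
print(equivariant)  # True
```

The feature moves with the input rather than disappearing, which is equivariance; invariance only appears later, through pooling and the classifier.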
**3. Hierarchy of Representations**
```
# Total: ~60,000 parameters
```
The central insight: **features do not need to be handcrafted**. They are learned by gradient descent.
In 2012, AlexNet proved it. I had been saying so since 1989.
## Backpropagation: The Central Equation
```
delta_L = dL/da_L ⊙ f'(z_L)                     (gradient at the output layer)
delta_l = (W_{l+1}^T * delta_{l+1}) ⊙ f'(z_l)   (⊙: elementwise product)
dL/dW_l = delta_l * a_{l-1}^T
dL/db_l = delta_l
```
Backprop is no miracle algorithm. It is the chain rule applied to composite functions.
It runs efficiently on GPUs because it is a sequence of matrix multiplications.
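The recurrences can be checked against automatic differentiation. The sketch below assumes a two-layer network with a tanh hidden layer, an identity output (so f'(z_L) = 1), and a squared-error loss; all sizes and variable names are illustrative.

```python
import torch

torch.manual_seed(0)
x = torch.randn(4, 1)                        # input a_0 (column vector)
y = torch.randn(2, 1)                        # target
W1 = torch.randn(3, 4, requires_grad=True)
b1 = torch.randn(3, 1, requires_grad=True)
W2 = torch.randn(2, 3, requires_grad=True)
b2 = torch.randn(2, 1, requires_grad=True)

# Forward pass, keeping pre-activations z_l and activations a_l.
z1 = W1 @ x + b1
a1 = torch.tanh(z1)
z2 = W2 @ a1 + b2
a2 = z2                                      # identity output layer
loss = 0.5 * ((a2 - y) ** 2).sum()
loss.backward()                              # reference gradients from autograd

with torch.no_grad():
    delta2 = a2 - y                              # dL/da_L, times f'(z_L) = 1
    delta1 = (W2.t() @ delta2) * (1 - a1 ** 2)   # tanh'(z) = 1 - tanh(z)^2
    grad_W2 = delta2 @ a1.t()                    # dL/dW_l = delta_l a_{l-1}^T
    grad_W1 = delta1 @ x.t()
    grad_b2, grad_b1 = delta2, delta1            # dL/db_l = delta_l

matches = all(
    torch.allclose(manual, param.grad, atol=1e-5)
    for manual, param in [(grad_W1, W1), (grad_b1, b1), (grad_W2, W2), (grad_b2, b2)]
)
print(matches)  # True
```

Every manual gradient is a matrix product or an elementwise product, which is why the whole backward pass maps so well onto GPU hardware.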
## Self-Supervised Learning: Objectives and Formalization
**Generative variant (MAE, BERT)**:
```
L_gen = E[||f_theta(x_masked) - x_target||^2]
# For images: every pixel. A waste of capacity.
```
**Contrastive variant (SimCLR, MoCo)**:
```
L_contrastive = -log( exp(sim(z_i, z_j) / tau) /
                      sum_k exp(sim(z_i, z_k) / tau) )
# tau: temperature hyperparameter
```
The problem with contrastive methods: they need "negatives", which in practice means large batches. This motivated BYOL and JEPA.
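The role of the negatives can be seen by evaluating the contrastive loss for a single anchor. This is a minimal pure-Python sketch; the function name `info_nce` and the similarity values are made up for illustration, and the denominator is assumed to sum over the positive plus all negatives.

```python
import math


def info_nce(sim_positive, sim_negatives, tau):
    """-log( exp(s_pos / tau) / sum_k exp(s_k / tau) ) for one anchor."""
    logits = [sim_positive / tau] + [s / tau for s in sim_negatives]
    max_logit = max(logits)  # subtract the max for numerical stability
    log_denominator = max_logit + math.log(
        sum(math.exp(value - max_logit) for value in logits)
    )
    return log_denominator - logits[0]


# Negatives that are far from the anchor: low loss.
easy = info_nce(0.9, [0.1, 0.0, -0.2], tau=0.5)
# Negatives that are almost as similar as the positive: higher loss.
hard = info_nce(0.9, [0.8, 0.85, 0.7], tau=0.5)
# No negatives at all: the loss is zero whatever the encoder does.
only_positive = info_nce(0.9, [], tau=0.5)

print(easy < hard)     # True
print(only_positive)   # 0.0
```

With no negatives the objective carries no training signal, so nothing stops the encoder from mapping every input to the same point. The quality of the signal therefore depends on having many informative negatives in each batch.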
---
## Core Formulation
JEPA: **predict in representation space, not in input space**.
```
# Two encoders (or one with stop-gradient):
s_x = f_theta(x)          # context encoder
s_y = f_theta_bar(y)      # target encoder (momentum copy of theta)
# Predictor:
s_hat_y = g_phi(s_x)      # predicts the representation of y given x
# Objective:
L_JEPA = ||s_y - s_hat_y||^2    # MSE in representation space
# Collapse prevention: the target encoder uses momentum (EMA)
theta_bar <- m * theta_bar + (1-m) * theta   # m ~ 0.996
```
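One training step of this formulation can be sketched with toy linear encoders. Everything concrete here is an assumption for illustration: the layer sizes, the SGD optimizer, and the way the target view `y` is built as a noisy copy of `x`. Note also that `F.mse_loss` averages over elements, so it matches the squared norm above only up to a constant factor.

```python
import copy

import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
context_encoder = nn.Linear(8, 4)                  # f_theta
target_encoder = copy.deepcopy(context_encoder)    # f_theta_bar, starts equal to theta
for p in target_encoder.parameters():
    p.requires_grad_(False)                        # updated by EMA only
predictor = nn.Linear(4, 4)                        # g_phi

optimizer = torch.optim.SGD(
    list(context_encoder.parameters()) + list(predictor.parameters()), lr=0.1
)
m = 0.996

x = torch.randn(16, 8)                   # context view
y = x + 0.1 * torch.randn(16, 8)         # target view (toy: noisy copy of x)

s_x = context_encoder(x)
with torch.no_grad():
    s_y = target_encoder(y)              # no gradient flows into the target
s_hat_y = predictor(s_x)
loss = F.mse_loss(s_hat_y, s_y)          # error measured in representation space

optimizer.zero_grad()
loss.backward()
optimizer.step()

target_before = target_encoder.weight.clone()
with torch.no_grad():
    for p_bar, p in zip(target_encoder.parameters(), context_encoder.parameters()):
        # theta_bar <- m * theta_bar + (1 - m) * theta
        p_bar.mul_(m).add_(p, alpha=1 - m)
```

With m close to 1 the target encoder moves slowly, which gives the predictor a stable target to regress onto.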
**Why JEPA beats pixel/token generation**:
| Approach | Predicts | Capacity spent on | Semantics |
|----------|----------|-------------------|-----------|
| MAE | Exact pixels | Textures, noise, irrelevant detail | At high cost |
| BERT | Exact tokens | Lexical details | At high cost |
| Contrastive | Invariances | Negatives (large batch) | Yes |
| **JEPA** | **Abstract representation** | **Semantic relations** | **Efficiently** |
## I-JEPA: Complete PyTorch Pseudocode
```python
import copy

import torch
import torch.nn as nn
import torch.nn.functional as F


class IJEPA(nn.Module):
    """
    I-JEPA: Image Joint Embedding Predictive Architecture
    Assran et al. 2023, CVPR
    """

    def __init__(self, encoder, predictor, momentum=0.996):
        super().__init__()
        self.context_encoder = encoder
        self.target_encoder = copy.deepcopy(encoder)
        self.predictor = predictor
        self.momentum = momentum
        # The target encoder is never trained by gradient, only by EMA.
        for param in self.target_encoder.parameters():
            param.requires_grad = False

    @torch.no_grad()
    def update_target_encoder(self):
        """EMA update"""
        for param_ctx, param_tgt in zip(
            self.context_encoder.parameters(),
            self.target_encoder.parameters()
        ):
            param_tgt.data = (
                self.momentum * param_tgt.data +
                (1 - self.momentum) * param_ctx.data
            )

    def forward(self, images):
        context_patches, target_patches, masks = self.create_masks(images)
        context_embeds = self.context_encoder(context_patches, masks)
        with torch.no_grad():
            target_embeds = self.target_encoder(target_patches)
        # Assumption: the target block positions are taken from `masks`.
        target_positions = masks
        predicted_embeds = self.predictor(context_embeds, target_positions)
        loss = F.mse_loss(predicted_embeds, target_embeds.detach())
        return loss

    def create_masks(self, images, num_target_blocks=4, context_scale=0.85):
        """
        I-JEPA strategy:
        - Multiple random target blocks (high aspect ratio)
        - Context: image with the target blocks masked out
        """
        B, C, H, W = images.shape
        patch_size = 16
        n_patches_h = H // patch_size
        n_patches_w = W // patch_size
        # `generate_random_blocks` is a helper this pseudocode does not define.
        target_masks = generate_random_blocks(
            n_patches_h, n_patches_w,
            num_blocks=num_target_blocks,
            scale_range=(0.15, 0.2),
            aspect_ratio_range=(0.75, 1.5)
        )
        context_mask = ~targe
```
## V-JEPA: Temporal Extension
```python
# Predict the representation of future frames at masked positions
L_V_JEPA = E[||f_target(video_masked) - g(f_ctx(video_ctx), positions)||^2]
# No labels at all.
```
## Encoder Hierarchy
```
Level 0: pixels -> patches -> local representations (edges, textures)
Level 1: patches -> regions -> object representations
Level 2: regions -> scene -> representations of spatial relations
Level 3: scene -> temporal -> event representations
# Each level has its own JEPA:
L_total = sum_l lambda_l * L_JEPA_l
# Result: hierarchical multi-scale world model
```
---
## AMI Section: Advanced Machinery of Intelligence
Paper: "A Path Towards Autonomous Machine Intelligence" (2022)
## The 6 AMI Modules
```
+----------------------------------------------------------+
|                   COMPLETE AMI SYSTEM                    |
|                                                          |
|  +-----------+      +---------------------+              |
|  | Perceptor |      | World Model         |              |
|  | (encoders)|      | (hierarchical JEPA) |              |
|  +-----------+      +---------------------+              |
|        |                       |                         |
|        v                       v                         |
|  +-----------+      +---------------------+              |
|  | Memory    |<---->| Cost Module         |              |
|  | (epis,    |      | (intrinsic +        |              |
|  |  semant)  |      |  configurable)      |              |
|  +-----------+      +---------------------+              |
|                                |                         |
|                     +---------------------+              |
|                     | Actor (planner      |              |
|                     |  + executor)        |              |
|                     +---------------------+              |
+----------------------------------------------------------+
```
**Module 1, Configurator**: Configures the other modules for the current task.
**Module 2, Perception**: Sensorimotor encoders that feed the world model.
**Module 3, World Model** (the heart of the system):
```
# Internal simulation: "What happens if I do X?"
predicted_next_state = world_model(current_state, action_X)
cost_predicted = cost_module(predicted_next_state)
# Choose the action that minimizes the cost
```
**Module 4, Cost Module**:
```
# Two kinds of cost:
E(s) = alpha * intrinsic_cost(s) + beta * task_cost(s)
# task_cost: objective configurable per task/human
```
**Module 5, Short-term Memory**: Buffer of states, simulations, and immediate context.
**Module 6, Actor**:
- Reactive mode: direct actions from the current state
- Deliberative mode: simulates multiple futures and picks the lowest-cost one
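The difference between the two actor modes can be made concrete with a toy problem. The sketch below is purely illustrative and is not taken from the AMI paper: the world is a line of integer positions, the dynamics, the hazard at position 1, the goal at position 3, and the weights `ALPHA` and `BETA` are all assumptions. The reactive actor looks one step ahead; the deliberative actor rolls the world model forward over every action sequence up to a horizon and sums the predicted costs.

```python
from itertools import product

ACTIONS = (-1, 0, 1)
GOAL = 3
ALPHA, BETA = 1.0, 1.0


def world_model(state, action):
    """Toy dynamics: the state is a position on a line."""
    return state + action


def intrinsic_cost(state):
    """Hard-wired discomfort: position 1 is a hazard."""
    return 5.0 if state == 1 else 0.0


def task_cost(state):
    """Configurable objective: distance to the goal."""
    return abs(state - GOAL)


def energy(state):
    return ALPHA * intrinsic_cost(state) + BETA * task_cost(state)


def reactive_action(state):
    """One-step lookahead: pick the action with the cheapest next state."""
    return min(ACTIONS, key=lambda a: energy(world_model(state, a)))


def deliberative_plan(state, horizon):
    """Simulate every action sequence and keep the lowest total cost."""
    best_plan, best_cost = None, float("inf")
    for plan in product(ACTIONS, repeat=horizon):
        s, total = state, 0.0
        for a in plan:
            s = world_model(s, a)
            total += energy(s)
        if total < best_cost:
            best_plan, best_cost = plan, total
    return best_plan, best_cost


print(reactive_action(0))        # 0: staying put looks cheapest one step ahead
print(deliberative_plan(0, 4))   # ((1, 1, 1, 0), 8.0): cross the hazard, reach the goal
```

The reactive actor never leaves position 0 because the very next step through the hazard looks expensive. The deliberative actor accepts that short-term cost because the simulated future is cheaper overall. Exhaustive enumeration only works for tiny action spaces; it stands in here for whatever search or optimization a real planner would use.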
## AMI vs LLMs
| Feature | LLM | AMI |
|---------|-----|-----|
| Objective | Predict the next token | Minimize error in representation space |
| World model | None | Dedicated central module |
| Planning | Text about planning | Real planning with simulation |
| Memory | Context window (fixed) | Updatable episodic memory |
| Goals | Set at training time only | Configurable cost module |
| Input | Text | Multimodal (video, audio, proprioception) |
| Causality | Correlational | Causal (world dynamics) |
---
## EBM Section: Energy-Based Models
An underrated contribution that will prove more influential in the long run.
**The problem with probabilistic models**:
```
P(x) = exp(-E(x)) / Z
Z = integral exp(-E(x)) dx    # intractable in high dimensions!
```
**The EBM solution**: forget Z. Define E(x) such that:
- Low energy = configuration compatible with the observed data
- High energy = configuration incompatible with it
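A one-dimensional toy case shows both halves of the argument: Z can be estimated on a grid when x is a scalar, but comparing two configurations never needs it. The double-well energy, the integration range, and the grid resolution below are assumptions chosen for illustration.

```python
import math


def energy(x):
    """Toy double-well energy: low near x = -1 and x = +1, higher at 0."""
    return (x * x - 1.0) ** 2


def partition_function_1d(lo=-3.0, hi=3.0, steps=6000):
    """Midpoint Riemann-sum estimate of Z = integral exp(-E(x)) dx on [lo, hi]."""
    dx = (hi - lo) / steps
    return sum(math.exp(-energy(lo + (i + 0.5) * dx)) for i in range(steps)) * dx


# Comparing two configurations needs only their energies: Z cancels in the ratio.
x_a, x_b = 1.0, 0.0
ratio_from_energies = math.exp(-(energy(x_a) - energy(x_b)))

Z = partition_function_1d()
ratio_from_probabilities = (math.exp(-energy(x_a)) / Z) / (math.exp(-energy(x_b)) / Z)

# The same grid resolution per axis in d dimensions needs steps**d evaluations.
evaluations_needed = {d: 6000 ** d for d in (1, 2, 10)}

print(abs(ratio_from_energies - ratio_from_probabilities) < 1e-9)  # True
```

The grid cost grows exponentially with the dimension, which is the intractability noted above. The energy difference between two candidates costs two function evaluations in any dimension.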
```python
import torch.nn as nn


class EnergyBasedModel(nn.Module):
    """
    EBM: F(x) = energy of x
    P(x) ~ exp(-F(x)) / Z, but we never compute Z!
    Advantage: no intractable partition function.
    """

    def __init__(self, latent_dim=512):
        super().__init__()
        self.energy_net = nn.Sequential(
            nn.Linear(latent_dim, 256),
            nn.SiLU(),
            nn.Linear(256, 128),
            nn.SiLU(),
            nn.Linear(128, 1)  # scalar: energy
        )

    def energy(self, x):
        return self.energy_net(x).squeeze(-1)

    def contrastive_loss(self, x_pos, x_neg):
        """
        L = E[F(x_pos)] - E[F(x_neg)] + regularization
        We want: E_pos < E_neg
        """
        E_pos = self.energy(x_pos)
        E_neg = self.energy(x_neg)
        loss = E_pos.mean() - E_neg.mean()
        # Penalize large energy magnitudes so the gap cannot grow without bound.
        reg = 0.1 * (E_pos.pow(2).mean() + E_neg.pow(2).mean())
        return loss + reg

# EBMs capture this naturally: they are about compatibility, not probability.
```
**JEPA as an EBM in representation space**:
```
E(x, y) = ||f_theta_bar(y) - g_phi(f_theta(x))||^2
```
## Simplified SimCLR
```python
import torch
import torch.nn as nn
import torch.nn.functional as F
import torchvision.transforms as T


class ProjectionHead(nn.Module):
    """MLP that projects representations into the contrastive space"""

    def __init__(self, in_dim=512, hidden_dim=256, out_dim=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, hidden_dim),
            nn.BatchNorm1d(hidden_dim),
            nn.ReLU(inplace=True),
            nn.Linear(hidden_dim, out_dim)
        )

    def forward(self, x):
        # L2-normalize so the dot product below is a cosine similarity.
        return F.normalize(self.net(x), dim=-1)


class SimCLRLoss(nn.Module):
    """NT-Xent Loss (Chen et al. 2020)"""

    def __init__(self, temperature=0.5):
        super().__init__()
        self.temp = temperature

    def forward(self, z1, z2):
        """
        z1, z2: [B, D], two views of the same batch
        z1[i] and z2[i]: positive pair
        All other pairs: negatives
        """
        B = z1.size(0)
        z = torch.cat([z1, z2], dim=0)
        sim = torch.mm(z, z.t()) / self.temp
        # Exclude each sample's similarity with itself.
        mask = torch.eye(2*B, device=z.device).bool()
        sim.masked_fill_(mask, float('-inf'))
        # Row i (from z1) has its positive at i + B; row i + B has it at i.
        labels = torch.arange(B, device=z.device)
        labels = torch.cat([labels + B, labels])
        return F.cross_entropy(sim, labels)


def get_ssl_augmentations(size=224):
    """
    The augmentations DEFINE what the model learns to be invariant to.
    Rotation -> invariance to rotation.
    Crop -> invariance to position.
    """
    return T.Compose([
        T.RandomResizedCrop(size, scale=(0.2, 1.0)),
        T.RandomHorizontalFlip(),
        T.ColorJitter(brightness=0.4, contrast=0.4, saturation=0.4, hue=0.1),
        T.RandomGrayscale(p=0.2),
        T.GaussianBlur(kernel_size=size//10*2+1, sigma=(0.1, 2.0)),
        T.ToTensor(),
        T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
    ])
```
## Original LeNet-5 in Modern PyTorch
```python
import torch.nn as nn


class LeNet5Modern(nn.Module):
    """
    LeNet-5 (LeCun et al. 1998) reimplemented in modern PyTorch.
    This architecture ran in production at Bank of America in 1993.
    ~60,000 parameters. Same principles as modern models with billions.
    """

    def __init__(self, num_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 6, kernel_size=5, padding=2),
            nn.Tanh(),
            nn.AvgPool2d(kernel_size=2, stride=2),
            nn.Conv2d(6, 16, kernel_size=5),
            nn.Tanh(),
            nn.AvgPool2d(kernel_size=2, stride=2),
            nn.Conv2d(16, 120, kernel_size=5),
            nn.Tanh(),
        )
        self.classifier = nn.Sequential(
            nn.Linear(120, 84),
            nn.Tanh(),
            nn.Linear(84, num_classes),
        )

    def forward(self, x):
        x = self.features(x)  # [B, 120, 1, 1] for a 28x28 input
        x = x.view(x.size(0), -1)
        return self.classifier(x)
```
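The "~60,000 parameters" figure can be checked by counting. The sketch below rebuilds the same layer stack as a flat `nn.Sequential` so that it runs on its own; the 28x28 input size is an assumption that follows from `padding=2` in the first convolution.

```python
import torch
import torch.nn as nn

# Same layers as LeNet5Modern, flattened into one Sequential for counting.
lenet = nn.Sequential(
    nn.Conv2d(1, 6, kernel_size=5, padding=2), nn.Tanh(),
    nn.AvgPool2d(kernel_size=2, stride=2),
    nn.Conv2d(6, 16, kernel_size=5), nn.Tanh(),
    nn.AvgPool2d(kernel_size=2, stride=2),
    nn.Conv2d(16, 120, kernel_size=5), nn.Tanh(),
    nn.Flatten(),
    nn.Linear(120, 84), nn.Tanh(),
    nn.Linear(84, 10),
)

# Parameter count per trainable layer, keyed by its index in the Sequential.
per_layer = {}
for name, module in lenet.named_children():
    count = sum(p.numel() for p in module.parameters())
    if count:
        per_layer[name] = count

total_params = sum(p.numel() for p in lenet.parameters())
logits = lenet(torch.zeros(2, 1, 28, 28))

print(per_layer)      # {'0': 156, '3': 2416, '6': 48120, '9': 10164, '11': 850}
print(total_params)   # 61706
print(logits.shape)   # torch.Size([2, 10])
```

Most of the weights sit in the third convolution and the first linear layer. The two early convolutions, which do the feature extraction over the full image, account for under 3,000 parameters thanks to weight sharing.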
---
## Foundational Papers (LeCun)
- LeCun et al. (1998). "Gradient-Based Learning Applied to Document Recognition", Proceedings of the IEEE 86(11)
- LeCun et al. (2015). "Deep Learning", Nature 521:436-444
- LeCun (2022). "A Path Towards Autonomous Machine Intelligence", OpenReview preprint
## JEPA Papers
- Assran et al. (2023). "Self-Supervised Learning from Images with a Joint-Embedding Predictive Architecture", CVPR 2023 (I-JEPA)
- Bardes et al. (2024). "Revisiting Feature Prediction for Learning Visual Representations from Video" (V-JEPA), arXiv preprint
- LeCun (2016). "Predictive Learning", NIPS Keynote (The Cake Analogy)
## Relevant SSL Papers
- He et al. (2022). "Masked Autoencoders Are Scalable Vision Learners", CVPR 2022
- Chen et al. (2020). "A Simple Framework for Contrastive Learning of Visual Representations" (SimCLR), ICML 2020
- Grill et al. (2020). "Bootstrap Your Own Latent" (BYOL), NeurIPS 2020
## Energy-Based Models
- LeCun et al. (2006). "A Tutorial on Energy-Based Learning", in Predicting Structured Data, MIT Press
- LeCun (2021). "Energy-Based Models for Autonomous and Predictive Learning", ICLR Keynote
## Best Practices
- Provide clear, specific context about your project and requirements
- Review all suggestions before applying them to production code
- Combine with other complementary skills for comprehensive analysis
## Common Pitfalls
- Using this skill for tasks outside its domain expertise
- Applying recommendations without understanding your specific context
- Not providing enough project context for accurate analysis
## Related Skills
- `yann-lecun` - Complementary skill for enhanced analysis
- `yann-lecun-debate` - Complementary skill for enhanced analysis
- `yann-lecun-filosofia` - Complementary skill for enhanced analysis
## Limitations
- Use this skill only when the task clearly matches the scope described above.
- Do not treat the output as a substitute for environment-specific validation, testing, or expert review.
- Stop and ask for clarification if required inputs, permissions, safety boundaries, or success criteria are missing.
Related Skills
yann-lecun
Agent that simulates Yann LeCun: inventor of Convolutional Neural Networks, Chief AI Scientist at Meta, 2018 Turing Award winner.
yann-lecun-filosofia
Philosophical and pedagogical sub-skill for Yann LeCun.
yann-lecun-debate
Debate and positions sub-skill for Yann LeCun. Covers detailed technical critiques of LLMs, intellectual rivalries (LeCun vs Hinton, Sutskever, Russell, Yudkowsky, Bostrom), a complete list of rejections of mainstream claims, his position on existential risk from AI, and live debate techniques.