hot3d
HOT3D (Hand and Object Tracking in 3D) by Meta (Facebook Research): a multi-view egocentric dataset for 3D hand and object tracking, recorded with Project Aria glasses and Quest 3 headsets. Covers multi-view 3D hand pose, object pose, and hand-object interaction tracking. Supports visualization with 3D joint projections, meshes, and skeletal overlays on video frames.
Best use case
hot3d is best used when you need a repeatable AI agent workflow instead of a one-off prompt.
Teams using hot3d should expect more consistent output, faster repeated execution, and less prompt rewriting.
When to use this skill
- You want a reusable workflow that can be run more than once with consistent structure.
When not to use this skill
- You only need a quick one-off answer and do not need a reusable workflow.
- You cannot install or maintain the underlying files, dependencies, or repository context.
Installation
Claude Code / Cursor / Codex
Manual Installation
- Download SKILL.md from GitHub
- Place it in .claude/skills/hot3d/SKILL.md inside your project
- Restart your AI agent; it will auto-discover the skill
How hot3d Compares
| Feature / Agent | hot3d | Standard Approach |
|---|---|---|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |
Frequently Asked Questions
What does this skill do?
HOT3D (Hand and Object Tracking in 3D) by Meta (Facebook Research): a multi-view egocentric dataset for 3D hand and object tracking, recorded with Project Aria glasses and Quest 3 headsets. Covers multi-view 3D hand pose, object pose, and hand-object interaction tracking. Supports visualization with 3D joint projections, meshes, and skeletal overlays on video frames.
Where can I find the source code?
The source code is on GitHub at https://github.com/facebookresearch/hot3d.
SKILL.md Source
# HOT3D - Multi-View 3D Hand & Object Tracking
## Overview
HOT3D is an egocentric, multi-view dataset and toolkit from Meta (Facebook Research) for tracking hands and objects in 3D. Recorded with Project Aria smart glasses and Quest 3 headsets, it provides 3D world coordinates for hand joints and manipulated objects, so their interactions can be analyzed. It also includes visualization tools for rendering 3D overlays on video frames with joint projections, hand meshes, and object models.
**Project page**: https://facebookresearch.github.io/hot3d
**Best for**: Research-grade 3D tracking with multi-camera setups, high-precision applications, and XR device integration.
## When to Use This Skill
Use when you need:
- **Multi-view 3D tracking** with world coordinates
- **High-precision hand pose** in 3D space (millimeter accuracy)
- **Object tracking** during manipulation
- **Aria/Quest integration** for wearable devices
- **Research-grade tracking** benchmarks
- **Hand-object interaction** analysis in 3D
**vs alternatives**:
- More advanced than single-view methods (hands-3d-pose)
- Higher precision than bounding box detection (handtracking)
- Full 3D world coordinates vs 2D projections
## Core Capabilities
### 1. Multi-View 3D Hand Tracking
**21-keypoint 3D hand pose** from multiple synchronized cameras:
- 3D world coordinates (x, y, z) for each joint
- Joint confidence scores
- Left/right hand identification
- Temporal consistency across frames
- Hand mesh reconstruction
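As a rough illustration of working with a 21x3 keypoint array, the sketch below computes per-bone lengths, which can serve as a simple consistency check across frames. The joint ordering in `PARENTS` (wrist at index 0, then four joints per finger) is an assumption made for this example and should be checked against the ordering the dataset actually uses; `bone_lengths` is a hypothetical helper, not part of any HOT3D API.

```python
import numpy as np

# Assumed layout (hypothetical): index 0 is the wrist, followed by four
# joints per finger in the order thumb, index, middle, ring, pinky.
PARENTS = [-1,
           0, 1, 2, 3,
           0, 5, 6, 7,
           0, 9, 10, 11,
           0, 13, 14, 15,
           0, 17, 18, 19]


def bone_lengths(joints_mm):
    """Return the 20 bone lengths (same unit as the input) of a (21, 3) pose."""
    joints_mm = np.asarray(joints_mm, dtype=float)
    if joints_mm.shape != (21, 3):
        raise ValueError(f"expected shape (21, 3), got {joints_mm.shape}")
    child = np.arange(1, 21)
    parent = np.array(PARENTS[1:])
    # Length of each bone is the distance between a joint and its parent.
    return np.linalg.norm(joints_mm[child] - joints_mm[parent], axis=1)
```

Because bones are rigid, these lengths should stay nearly constant over a sequence; large frame-to-frame changes usually point to a tracking or indexing problem.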
### 2. Object Pose Estimation
**6DOF object pose** tracking:
- 3D position and orientation (quaternion/rotation matrix)
- Object mesh alignment
- Tracking during manipulation
- Multiple object support
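A 6DOF pose is a rotation plus a translation. The sketch below converts a unit quaternion to a rotation matrix and uses the pose to move object-frame mesh vertices into the world frame. The `(w, x, y, z)` quaternion order and the helper names are assumptions for illustration; check the convention used by whatever loader supplies the poses.

```python
import numpy as np


def quat_to_matrix(q_wxyz):
    """Convert a quaternion in (w, x, y, z) order to a 3x3 rotation matrix."""
    q = np.asarray(q_wxyz, dtype=float)
    q = q / np.linalg.norm(q)  # guard against slightly non-unit input
    w, x, y, z = q
    return np.array([
        [1 - 2 * (y * y + z * z), 2 * (x * y - w * z), 2 * (x * z + w * y)],
        [2 * (x * y + w * z), 1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
        [2 * (x * z - w * y), 2 * (y * z + w * x), 1 - 2 * (x * x + y * y)],
    ])


def transform_points(points, rotation, translation):
    """Map (N, 3) object-frame points to the world frame: p_world = R p + t."""
    points = np.asarray(points, dtype=float)
    rotation = np.asarray(rotation, dtype=float)
    return points @ rotation.T + np.asarray(translation, dtype=float)
```

For example, `transform_points(vertices, quat_to_matrix(q), position)` places a mesh at its tracked pose.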
### 3. Hand-Object Interaction
**Interaction analysis**:
- Contact point detection
- Grasp type classification
- Manipulation phase detection
- Force estimation (with sensor data)
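One simple way to approximate contact detection from 3D annotations is a distance threshold between hand joints and object mesh vertices. The sketch below is a proximity heuristic under that assumption, not the method used by HOT3D; the function name and the 5 mm default threshold are illustrative choices.

```python
import numpy as np


def contact_joints(hand_joints_mm, object_vertices_mm, threshold_mm=5.0):
    """Return indices of hand joints within threshold_mm of any object vertex,
    plus each joint's distance to its nearest vertex."""
    hand = np.asarray(hand_joints_mm, dtype=float)     # (J, 3)
    obj = np.asarray(object_vertices_mm, dtype=float)  # (V, 3)
    # Pairwise joint-to-vertex distances, shape (J, V).
    dists = np.linalg.norm(hand[:, None, :] - obj[None, :, :], axis=2)
    nearest = dists.min(axis=1)
    return np.flatnonzero(nearest <= threshold_mm), nearest
```

Both inputs must be in the same world frame and unit. For dense meshes, a KD-tree would scale better than the full pairwise distance matrix used here.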
### 4. Visualization Tools
**Rich visualization options**:
- 3D skeleton projected to each camera view
- Hand mesh rendering
- Object model overlay
- Trajectory visualization
- Multi-view synchronized display
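Projecting a 3D skeleton into a camera view amounts to a world-to-camera transform followed by a camera projection. The sketch below uses an ideal pinhole model as a simplifying assumption; headset cameras typically have wide-angle lenses with distortion that this ignores, so treat it as an illustration of the geometry rather than the toolkit's projection code. All names and parameters are hypothetical.

```python
import numpy as np


def project_points(points_world, R_cam_world, t_cam_world, fx, fy, cx, cy):
    """Project (N, 3) world points to pixel coordinates with a pinhole model.

    R_cam_world and t_cam_world map world coordinates into the camera frame.
    Returns (uv, in_front); uv rows are NaN for points behind the camera.
    """
    pts = np.asarray(points_world, dtype=float)
    R = np.asarray(R_cam_world, dtype=float)
    t = np.asarray(t_cam_world, dtype=float)
    cam = pts @ R.T + t                 # world frame -> camera frame
    in_front = cam[:, 2] > 0            # only points with positive depth
    uv = np.full((len(pts), 2), np.nan)
    z = cam[in_front, 2]
    uv[in_front, 0] = fx * cam[in_front, 0] / z + cx
    uv[in_front, 1] = fy * cam[in_front, 1] / z + cy
    return uv, in_front
```

Running this once per camera with that camera's extrinsics and intrinsics gives the 2D joints to draw in each view.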
## Quick Start
```bash
# Clone repository
git clone https://github.com/facebookresearch/hot3d.git
cd hot3d
# Install dependencies
pip install -r requirements.txt
# Key: PyTorch3D, Open3D, vispy
# Download dataset (requires registration)
# https://facebookresearch.github.io/hot3d/dataset.html
# Run demo
python demo/visualize_tracking.py \
    --sequence demo_sequence \
    --output_dir ./visualizations
```
## Usage Example
```python
from hot3d import HOT3DTracker
import numpy as np
# Initialize tracker
tracker = HOT3DTracker()
tracker.load_sequence('path/to/sequence')
# Get frame data
frame_data = tracker.get_frame(frame_id=100)
# Access 3D hand pose
hand_pose_3d = frame_data['left_hand'] # 21x3 array
print(f"Wrist position: {hand_pose_3d[0]}") # [x, y, z]
# Access object pose
object_pose = frame_data['object_001']
position = object_pose['position'] # [x, y, z]
rotation = object_pose['rotation'] # 3x3 matrix
# Visualize
tracker.visualize_frame(
    frame_id=100,
    show_hands=True,
    show_objects=True,
    show_meshes=True,
    save_path='output.png'
)
```
## Model Specs
- **Input**: Multi-view RGB/monochrome image streams (one RGB and two monochrome cameras on Aria; two monochrome cameras on Quest 3)
- **Output**: 3D coordinates in world frame (millimeters)
- **Accuracy**: ~5-10mm hand joint error
- **Frame rate**: 30-60 Hz (depends on hardware)
- **Latency**: <100ms for real-time applications
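A hand joint error such as the figure above is commonly reported as mean per-joint position error (MPJPE): the average Euclidean distance between predicted and ground-truth joints. The sketch below shows that calculation under the assumption that both poses share the same frame and unit; it is not taken from the HOT3D evaluation code.

```python
import numpy as np


def mpjpe(pred, gt):
    """Mean per-joint position error between arrays of shape (..., J, 3)."""
    pred = np.asarray(pred, dtype=float)
    gt = np.asarray(gt, dtype=float)
    if pred.shape != gt.shape:
        raise ValueError(f"shape mismatch: {pred.shape} vs {gt.shape}")
    # Euclidean distance per joint, averaged over joints (and frames, if any).
    return float(np.linalg.norm(pred - gt, axis=-1).mean())
```

With inputs in millimeters the result is in millimeters, so it can be compared directly against the accuracy range listed above.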
## Requirements
- **Hardware**: Multi-camera setup or Aria/Quest device
- **Computation**: GPU recommended (NVIDIA RTX 3080 or better)
- **Storage**: Large dataset (several TB for full HOT3D)
- **Software**: PyTorch, PyTorch3D, Open3D
## Dataset
**HOT3D dataset** includes:
- 100+ sequences of daily activities
- Multi-view RGB/monochrome video
- 3D hand and object annotations
- Aria/Quest recordings
- Smart glasses data
Access: https://facebookresearch.github.io/hot3d
## Integration
Works with:
- **hand-tracking-toolkit**: Evaluation and metrics
- **Aria SDK**: Device integration
- **PyTorch3D**: 3D processing
- **OpenXR**: XR platform integration
## Limitations
- Requires multi-view setup or specialized hardware
- Computationally intensive
- Dataset access requires registration
- Complex setup compared to single-view methods
## Best For
- **XR applications** with smart glasses
- **Research** in 3D hand tracking
- **High-precision** manipulation analysis
- **Benchmarking** new algorithms
## References
- Project: https://facebookresearch.github.io/hot3d
- GitHub: https://github.com/facebookresearch/hot3d
- Paper: "HOT3D: Hand and Object Tracking in 3D from Egocentric Multi-View Videos"
- Citation: See project page
## License
CC-BY-NC 4.0 (non-commercial only)
Related Skills
handtracking
Real-time hand detection in egocentric videos using victordibia/handtracking. Outputs bounding boxes for hands, specifically trained on EgoHands dataset. Supports video input/output with labeled hand boxes. Lightweight and fast for egocentric view applications.
hands-3d-pose
High-quality 3D hand pose estimation for egocentric videos from ECCV 2024 (ap229997/hands). Provides 3D joint keypoints and skeleton visualization projected to 2D. Optimized for daily egocentric activities with state-of-the-art accuracy. Outputs hand skeleton overlays on video frames.
hand-tracking-toolkit
Facebook Research Hand Tracking Challenge Toolkit - evaluation and visualization tools for 3D hand tracking. Supports loading HOT3D data, computing metrics (PA-MPJPE, AUC, etc.), visualizing 3D pose projections, and generating tracking evaluation reports. Essential for benchmarking hand tracking algorithms.
egohos-segmentation
Egocentric Hand-Object Segmentation (EgoHOS) - pixel-level hand and object segmentation in egocentric videos. Outputs fine-grained segmentation masks with hand regions highlighted. Specialized for hand-object interaction scenarios with pixel-accurate masks. Ideal for detailed interaction analysis.