drone-cv-expert

Expert in drone systems, computer vision, and autonomous navigation. Specializes in flight control, SLAM, object detection, sensor fusion, and path planning. Activate on "drone", "UAV", "SLAM", "visual odometry", "PID control", "MAVLink", "Pixhawk", "path planning", "A*", "RRT", "EKF", "sensor fusion", "optical flow", "ByteTrack". NOT for domain-specific inspection tasks like fire detection, roof damage assessment, or thermal analysis (use drone-inspection-specialist), GPU shader optimization (use metal-shader-expert), or general image classification without drone context (use clip-aware-embeddings).

85 stars

by curiositech


Best use case

drone-cv-expert is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Teams using drone-cv-expert can expect more consistent output, faster repeated execution, and less prompt rewriting.

When to use this skill

You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

You only need a quick one-off answer and do not need a reusable workflow.
You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$ curl -o ~/.claude/skills/drone-cv-expert/SKILL.md --create-dirs "https://raw.githubusercontent.com/curiositech/some_claude_skills/main/.claude/skills/drone-cv-expert/SKILL.md"

Manual Installation

1. Download SKILL.md from GitHub.
2. Place it in .claude/skills/drone-cv-expert/SKILL.md inside your project.
3. Restart your AI agent; it will auto-discover the skill.

How drone-cv-expert Compares

| Feature / Agent | drone-cv-expert | Standard Approach |
|-----------------|-----------------|-------------------|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |

Frequently Asked Questions

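What does this skill do?

It provides expert guidance on drone systems, computer vision, and autonomous navigation, covering flight control, SLAM, object detection, sensor fusion, and path planning.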

Where can I find the source code?

You can find the source code on GitHub using the link provided at the top of the page.

SKILL.md Source

# Drone CV Expert

Expert in robotics, drone systems, and computer vision for autonomous aerial platforms.

## Decision Tree: When to Use This Skill

```
User mentions drones or UAVs?
├─ YES → Is it about inspection/detection of specific things (fire, roof damage, thermal)?
│        ├─ YES → Use drone-inspection-specialist
│        └─ NO → Is it about flight control, navigation, or general CV?
│                ├─ YES → Use THIS SKILL (drone-cv-expert)
│                └─ NO → Is it about GPU rendering/shaders?
│                        ├─ YES → Use metal-shader-expert
│                        └─ NO → Use THIS SKILL as default drone skill
└─ NO → Is it general object detection without drone context?
        ├─ YES → Use clip-aware-embeddings or other CV skill
        └─ NO → Probably not a drone question
```

## Core Competencies

### Flight Control & Navigation
- **PID Tuning**: Position, velocity, attitude control loops
- **SLAM**: ORB-SLAM, LSD-SLAM, visual-inertial odometry (VIO)
- **Path Planning**: A*, RRT, RRT*, Dijkstra, potential fields
- **Sensor Fusion**: EKF, UKF, complementary filters
- **GPS-Denied Navigation**: AprilTags, visual odometry, LiDAR SLAM
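
As an illustration of the PID control loops listed above, here is a minimal sketch of a single-axis controller. The class name `PID`, the `out_limit` parameter, and the conditional-integration anti-windup are assumptions chosen for illustration; they do not come from any particular autopilot.

```python
class PID:
    """Textbook PID controller with output clamping and simple anti-windup."""

    def __init__(self, kp, ki, kd, out_limit):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.out_limit = out_limit      # symmetric actuator limit (assumed)
        self.integral = 0.0
        self.prev_error = None

    def update(self, setpoint, measurement, dt):
        error = setpoint - measurement
        self.integral += error * dt
        # No derivative on the first call, when there is no previous error yet.
        derivative = 0.0 if self.prev_error is None else (error - self.prev_error) / dt
        self.prev_error = error

        raw = self.kp * error + self.ki * self.integral + self.kd * derivative
        out = max(-self.out_limit, min(self.out_limit, raw))
        if out != raw:
            # Saturated: undo this step's integration so the integral does not wind up.
            self.integral -= error * dt
        return out
```

In a cascaded design, one such loop per level (position, velocity, attitude) would typically feed its output to the next loop as a setpoint, with the inner loops running at higher rates.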

### Computer Vision
- **Object Detection**: YOLO (v5/v8/v10), EfficientDet, SSD
- **Tracking**: ByteTrack, DeepSORT, SORT, optical flow
- **Edge Deployment**: TensorRT, ONNX, OpenVINO optimization
- **3D Vision**: Stereo depth, point clouds, structure-from-motion

### Hardware Integration
- **Flight Controllers**: Pixhawk (hardware), ArduPilot and PX4 (firmware), DJI
- **Protocols & SDKs**: MAVLink (protocol), DroneKit and MAVSDK (SDKs)
- **Edge Compute**: Jetson (Nano/Xavier/Orin), Coral TPU
- **Sensors**: IMU, GPS, barometer, LiDAR, depth cameras

## Anti-Patterns to Avoid

### 1. "Simulation-Only Syndrome"
**Wrong**: Testing only in Gazebo/AirSim, then deploying directly to a real drone.
**Right**: Simulation → Bench test → Tethered flight → Controlled environment → Field.

### 2. "EKF Overkill"
**Wrong**: Using an Extended Kalman Filter when a complementary filter suffices.
**Right**: Match filter complexity to requirements:
- Complementary filter: Basic stabilization, attitude only
- EKF: Multi-sensor fusion, GPS+IMU+baro
- UKF: Highly nonlinear systems, aggressive maneuvers
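
To show how little code the simplest option needs, here is a minimal sketch of a single-axis complementary filter. The function name, the `alpha` default of 0.98, and the assumption that an accelerometer-derived angle is already available are all illustrative choices.

```python
def complementary_filter(angle_prev, gyro_rate, accel_angle, dt, alpha=0.98):
    """Blend the integrated gyro rate (trusted short-term) with the
    accelerometer-derived angle (trusted long-term). Angles in radians."""
    gyro_angle = angle_prev + gyro_rate * dt
    return alpha * gyro_angle + (1.0 - alpha) * accel_angle


def run_filter(gyro_rates, accel_angles, dt, angle0=0.0):
    """Apply the filter over paired gyro/accelerometer samples."""
    angle = angle0
    for rate, acc in zip(gyro_rates, accel_angles):
        angle = complementary_filter(angle, rate, acc, dt)
    return angle
```

A larger `alpha` leans on the gyro and reacts faster but lets gyro bias leak into the estimate; a smaller `alpha` leans on the accelerometer and is more sensitive to vibration.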

### 3. "Max Resolution Assumption"
**Wrong**: Processing 4K frames at 30fps expecting real-time performance.
**Right**: Resolution trade-offs by altitude/speed:
| Altitude | Speed | Resolution | FPS | Rationale |
|----------|-------|------------|-----|-----------|
| <30m | Slow | 1920x1080 | 30 | Detail needed |
| 30-100m | Medium | 1280x720 | 30 | Balance |
| >100m | Fast | 640x480 | 60 | Speed priority |
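
The table can be encoded as a small lookup. This is a sketch only: the function name is hypothetical, and keying on altitude alone (ignoring speed) is a simplifying assumption.

```python
def select_capture_mode(altitude_m):
    """Return (width, height, fps) for an altitude above ground in metres,
    following the trade-off table above."""
    if altitude_m < 30:
        return (1920, 1080, 30)   # low altitude: detail needed
    if altitude_m <= 100:
        return (1280, 720, 30)    # mid altitude: balance
    return (640, 480, 60)         # high altitude: speed priority
```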

### 4. "Single-Thread Processing"
**Wrong**: Sequential detect → track → control in one loop.
**Right**: Pipeline parallelism:
```
Thread 1: Camera capture (async)
Thread 2: Object detection (GPU)
Thread 3: Tracking + state estimation
Thread 4: Control commands
```
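
One way to sketch this pipeline in Python is with one thread per stage, connected by size-1 queues that always hold the freshest item. The names `put_latest`, `stage`, and `run_pipeline` are hypothetical, and here the calling thread stands in for the camera capture thread.

```python
import queue
import threading

STOP = object()  # sentinel that shuts the pipeline down stage by stage


def put_latest(q, item):
    """Put item into a size-1 queue, discarding any stale item first.
    Safe only with a single producer per queue, as is the case here."""
    try:
        q.get_nowait()
    except queue.Empty:
        pass
    q.put(item)


def stage(fn, in_q, out_q):
    """Worker loop: apply fn to each input and forward the result downstream."""
    while True:
        item = in_q.get()
        if item is STOP:
            if out_q is not None:
                out_q.put(STOP)  # blocking put, so STOP never displaces real data
            return
        result = fn(item)
        if out_q is not None:
            put_latest(out_q, result)


def run_pipeline(frames, detect, track, control):
    q_frames, q_dets, q_tracks = (queue.Queue(maxsize=1) for _ in range(3))
    workers = [
        threading.Thread(target=stage, args=(detect, q_frames, q_dets)),
        threading.Thread(target=stage, args=(track, q_dets, q_tracks)),
        threading.Thread(target=stage, args=(control, q_tracks, None)),
    ]
    for w in workers:
        w.start()
    for frame in frames:         # capture loop
        put_latest(q_frames, frame)
    q_frames.put(STOP)           # waits until the last frame has been taken
    for w in workers:
        w.join()
```

Dropping stale frames keeps control latency bounded when detection is slower than capture. Python threads only run in parallel while a stage releases the GIL (camera I/O, GPU inference); CPU-bound stages may need separate processes instead.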

### 5. "GPS Trust"
**Wrong**: Assuming GPS is always accurate and available.
**Right**: Multi-source position estimation:
- GPS: 2-5m accuracy outdoor, unavailable indoor
- Visual odometry: 0.1-1% drift of distance traveled, lighting dependent
- AprilTags: cm-level accuracy where deployed
- IMU: Short-term only, drift accumulates
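
A simple way to combine whichever sources are currently available is inverse-variance weighting. The sketch below works on a single axis, assumes independent Gaussian errors, and uses a hypothetical function name; a full system would more likely do this inside an EKF.

```python
def fuse_estimates(estimates):
    """Fuse 1-D position estimates given as (position_m, sigma_m) tuples.
    Unavailable sources are passed as None and skipped."""
    valid = [e for e in estimates if e is not None]
    if not valid:
        raise ValueError("no position source available")
    weights = [1.0 / (sigma ** 2) for _, sigma in valid]
    total = sum(weights)
    fused = sum(w * pos for w, (pos, _) in zip(weights, valid)) / total
    fused_sigma = (1.0 / total) ** 0.5
    return fused, fused_sigma
```

With a 3 m GPS fix and a 5 cm AprilTag fix, the result sits almost entirely on the AprilTag value; when the tag is out of view, the same call falls back to GPS alone.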

### 6. "One Model Fits All"
**Wrong**: Using the same YOLO model for all scenarios.
**Right**: Model selection by constraint:
| Constraint | Model | Notes (indicative latency; varies with hardware and input size) |
|------------|-------|-------|
| Latency critical | YOLOv8n | 6ms inference |
| Balanced | YOLOv8s | 15ms, better accuracy |
| Accuracy first | YOLOv8x | 50ms, highest mAP |
| Edge device | YOLOv8n + TensorRT | 3ms on Jetson |

## Problem-Solving Framework

### 1. Constraint Analysis
- **Compute**: What hardware? (Jetson Nano ≈ 0.5 TFLOPS FP16, AGX Xavier ≈ 32 TOPS INT8)
- **Power**: Battery capacity? Flight time impact?
- **Latency**: Control loop rate? Detection response time?
- **Weight**: Payload capacity? Center of gravity?
- **Environment**: Indoor/outdoor? GPS available? Lighting conditions?

### 2. Algorithm Selection Matrix

| Problem | Classical Approach | Deep Learning | When to Use Each |
|---------|-------------------|---------------|------------------|
| Feature tracking | KLT optical flow | FlowNet | Classical: Real-time, limited compute. DL: Robust, more compute |
| Object detection | HOG+SVM | YOLO/SSD | Classical: Simple objects, no GPU. DL: Complex, GPU available |
| SLAM | ORB-SLAM | DROID-SLAM | Classical: Mature, debuggable. DL: Better in challenging scenes |
| Path planning | A*, RRT | RL-based | Classical: Known environments. DL: Complex, dynamic |
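
For the classical path-planning row, here is a minimal A* sketch on a 4-connected occupancy grid. The grid encoding (0 = free, 1 = obstacle), unit step cost, and Manhattan heuristic are assumptions made for illustration.

```python
import heapq


def astar(grid, start, goal):
    """Shortest path on a 4-connected grid. Cells are (row, col) tuples.
    Returns the path as a list of cells, or None if the goal is unreachable."""
    rows, cols = len(grid), len(grid[0])

    def h(cell):
        # Manhattan distance: admissible for unit-cost 4-connected moves.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    open_heap = [(h(start), 0, start)]   # entries are (f, g, cell)
    came_from = {}
    g_cost = {start: 0}

    while open_heap:
        _, g, cell = heapq.heappop(open_heap)
        if cell == goal:
            path = [cell]
            while cell in came_from:
                cell = came_from[cell]
                path.append(cell)
            return path[::-1]
        if g > g_cost[cell]:
            continue                     # stale heap entry, already improved
        r, c = cell
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0:
                ng = g + 1
                if ng < g_cost.get(nxt, float("inf")):
                    g_cost[nxt] = ng
                    came_from[nxt] = cell
                    heapq.heappush(open_heap, (ng + h(nxt), ng, nxt))
    return None
```

This suits the "known environments" case in the table: the map must be available up front, and the grid resolution bounds how fine a path can be.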

### 3. Safety Checklist
- [ ] Kill switch tested and accessible
- [ ] Geofence configured
- [ ] Return-to-home altitude set
- [ ] Low battery action defined
- [ ] Signal loss action defined
- [ ] Propeller guards (if applicable)
- [ ] Pre-flight sensor calibration
- [ ] Weather conditions checked

## Quick Reference Tables

### MAVLink Message Types
| Message | Purpose | Frequency |
|---------|---------|-----------|
| HEARTBEAT | Connection alive | 1 Hz |
| ATTITUDE | Roll/pitch/yaw | 10-100 Hz |
| LOCAL_POSITION_NED | Position | 10-50 Hz |
| GPS_RAW_INT | Raw GPS | 1-10 Hz |
| SET_POSITION_TARGET_LOCAL_NED | Position/velocity setpoint commands | As needed |

### Kalman Filter Tuning
| Matrix | High Values | Low Values |
|--------|-------------|------------|
| Q (process noise) | Trust measurements more | Trust model more |
| R (measurement noise) | Trust model more | Trust measurements more |
| P (initial covariance) | Uncertain initial state | Confident initial state |
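
The effect of Q and R can be seen in a scalar Kalman filter. This is a minimal sketch assuming a constant-state model with a direct measurement; the function name and defaults are illustrative.

```python
def kalman_1d(measurements, q, r, x0=0.0, p0=1.0):
    """Scalar Kalman filter: state assumed constant, measured directly.
    q = process noise variance, r = measurement noise variance."""
    x, p = x0, p0
    estimates = []
    for z in measurements:
        p = p + q                # predict: uncertainty grows by Q
        k = p / (p + r)          # Kalman gain in [0, 1]
        x = x + k * (z - x)      # update: move toward the measurement by k
        p = (1.0 - k) * p
        estimates.append(x)
    return estimates
```

With a high Q and low R the gain stays near 1 and the estimate follows each measurement closely; with a low Q and high R the gain is small and a sudden jump in the measurements is mostly smoothed away.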

### Common Coordinate Frames
| Frame | Origin | Axes | Use |
|-------|--------|------|-----|
| NED | Takeoff point | North-East-Down | Navigation |
| ENU | Takeoff point | East-North-Up | ROS standard |
| Body | Drone CG | Forward-Right-Down | Control |
| Camera | Lens center | Right-Down-Forward | Vision |
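
Converting between the NED and ENU world frames swaps the first two axes and negates the third. A minimal sketch, assuming both frames share the same origin (function names are illustrative):

```python
def ned_to_enu(north, east, down):
    """Convert a NED vector to ENU: swap the horizontal axes, flip the vertical."""
    return (east, north, -down)


def enu_to_ned(east, north, up):
    """Convert an ENU vector to NED (the same swap-and-flip, reversed)."""
    return (north, east, -up)
```

Mixing these up is a common source of bugs when bridging MAVLink data (NED) into ROS (ENU): a drone 3 m above the takeoff point has `down = -3` in NED but `up = 3` in ENU.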

## Reference Files

Detailed implementations in `references/`:
- `navigation-algorithms.md` - SLAM, path planning, localization
- `sensor-fusion-ekf.md` - Kalman filters, multi-sensor fusion
- `object-detection-tracking.md` - YOLO, ByteTrack, optical flow

## Simulation Tools

| Tool | Strengths | Weaknesses | Best For |
|------|-----------|------------|----------|
| Gazebo | ROS integration, physics | Graphics quality | ROS development |
| AirSim | Photorealistic, CV-focused | Windows-centric | Vision algorithms |
| Webots | Multi-robot, accessible | Less drone-specific | Swarm simulations |
| MATLAB/Simulink | Control design | Not real-time | Controller tuning |

## Emerging Technologies (2024-2025)

- **Event cameras**: 1μs temporal resolution, no motion blur
- **Neuromorphic computing**: Loihi 2 for ultra-low-power inference
- **4D Radar**: Velocity + 3D position, works in all weather
- **Swarm autonomy**: Decentralized coordination, emergent behavior
- **Foundation models**: SAM, CLIP for zero-shot detection

## Integration Points

- **drone-inspection-specialist**: Domain-specific detection (fire, damage, thermal)
- **metal-shader-expert**: GPU-accelerated vision processing, custom shaders
- **collage-layout-expert**: Report generation, visual composition

---

**Key Principle**: In drone systems, reliability trumps performance. A 95% accurate system that never crashes is better than a 99% accurate one that fails unpredictably. Always have fallbacks.
