vaex

Vaex provides out-of-core DataFrames. Use it for big data exploration.

Best use case

vaex is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Teams using vaex should expect more consistent output, faster repeated execution, and less prompt rewriting.

When to use this skill

  • You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

  • You only need a quick one-off answer and do not need a reusable workflow.
  • You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$ curl -o ~/.claude/skills/vaex/SKILL.md --create-dirs "https://raw.githubusercontent.com/G1Joshi/Agent-Skills/main/skills/ai-ml/vaex/SKILL.md"

Manual Installation

  1. Download SKILL.md from GitHub
  2. Place it in .claude/skills/vaex/SKILL.md inside your project
  3. Restart your AI agent — it will auto-discover the skill

How vaex Compares

Feature / Agent            vaex             Standard Approach
Platform Support           Not specified    Limited / Varies
Context Awareness          High             Baseline
Installation Complexity    Unknown          N/A

Frequently Asked Questions

What does this skill do?

It guides an agent in using Vaex, a library of out-of-core DataFrames, for big data exploration.

Where can I find the source code?

You can find the source code on GitHub in the G1Joshi/Agent-Skills repository, under skills/ai-ml/vaex.

SKILL.md Source

# Vaex

Vaex is a hidden gem. It uses **memory mapping** to open files of around 100GB almost instantly on a laptop and to visualize them without first loading the data into RAM.

## When to Use

- **Visualization**: Plotting 1 billion points via heatmaps.
- **Instant Opening**: No "loading" time for HDF5/Arrow files.
- **String Processing**: Very fast string operations.

## Core Concepts

### Out-of-Core

Data stays on disk. Virtual columns are computed on the fly.

### Binning

Aggregating data into a grid for visualization (heatmap) rather than plotting individual points.

## Best Practices (2025)

**Do**:

- **Convert to HDF5/Arrow**: Vaex shines with these formats.
- **Use it for EDA**: Rapidly exploring massive datasets locally.

**Don't**:

- **Don't use it for complex logic unrelated to aggregation**: It's specialized for columnar aggregation.

## References

- [Vaex Documentation](https://vaex.io/docs/index.html)