sandbox-execution-guide

Secure sandboxed code execution environments for reproducible research computing

Best use case

sandbox-execution-guide is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Teams using sandbox-execution-guide should expect more consistent output, faster repeated execution, and less prompt rewriting.

When to use this skill

You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

You only need a quick one-off answer and do not need a reusable workflow.
You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$ curl -o ~/.claude/skills/sandbox-execution-guide/SKILL.md --create-dirs "https://raw.githubusercontent.com/wentorai/research-plugins/main/skills/tools/code-exec/sandbox-execution-guide/SKILL.md"

Manual Installation

Download SKILL.md from GitHub
Place it in .claude/skills/sandbox-execution-guide/SKILL.md inside your project
Restart your AI agent — it will auto-discover the skill

How sandbox-execution-guide Compares

Feature / Agent	sandbox-execution-guide	Standard Approach
Platform Support	Not specified	Limited / Varies
Context Awareness	High	Baseline
Installation Complexity	Unknown	N/A

Frequently Asked Questions

What does this skill do?

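It explains how to set up and use secure, sandboxed code execution environments for reproducible research computing, covering containerized execution, security considerations, resource management, and integration with research workflows.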

Where can I find the source code?

The source code is on GitHub in the wentorai/research-plugins repository, under skills/tools/code-exec/sandbox-execution-guide.

SKILL.md Source

# Sandbox Execution Guide

A skill for setting up and using sandboxed code execution environments for research computing. Covers containerized execution, security considerations, resource management, and integration with research workflows.

## Why Sandboxed Execution?

Research code often requires:
- Isolation from the host system for security
- Reproducible environments across machines
- Resource limits to prevent runaway computations
- Multi-language support (Python, R, Julia, MATLAB)

## Docker-Based Sandboxes

### Creating a Research Container

```dockerfile
# Dockerfile for a reproducible research environment
FROM python:3.11-slim

# System dependencies
RUN apt-get update && apt-get install -y --no-install-recommends \
    build-essential \
    gfortran \
    libopenblas-dev \
    && rm -rf /var/lib/apt/lists/*

# Create non-root user for security
RUN useradd -m -s /bin/bash researcher
USER researcher
WORKDIR /home/researcher

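# Install dependencies (pin exact versions in requirements.txt)
COPY requirements.txt .
RUN pip install --user --no-cache-dir -r requirements.txt
# Put user-installed console scripts on PATH
ENV PATH="/home/researcher/.local/bin:${PATH}"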

# Copy project files
COPY --chown=researcher:researcher . /home/researcher/project
WORKDIR /home/researcher/project

# Resource limits set at runtime, not build time
CMD ["python", "main.py"]
```

### Running with Resource Limits

```bash
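# Run with CPU, memory, and time constraints.
# docker run has no --timeout flag, so the wall-clock limit is enforced by
# coreutils `timeout` inside the container (exit status 124 on expiry).
docker run --rm \
  --cpus="2.0" \
  --memory="4g" \
  --memory-swap="4g" \
  --pids-limit=100 \
  --network=none \
  --read-only \
  --tmpfs /tmp:size=512m \
  research-sandbox:latest timeout 3600 python analysis.py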

# Mount data as read-only, output directory as writable
docker run \
  -v /data/raw:/data:ro \
  -v /data/results:/output:rw \
  --cpus="4.0" \
  --memory="16g" \
  research-sandbox:latest python pipeline.py
```
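
When the same limits are applied from a script or an agent, it can help to build the `docker run` argument list in one place. The following is a minimal sketch, not part of the original skill: `SandboxLimits` and `docker_argv` are hypothetical names, and the wall-clock limit is assumed to come from coreutils `timeout` inside the image, since `docker run` itself has no time-limit flag.

```python
import shlex
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple


@dataclass(frozen=True)
class SandboxLimits:
    """Hypothetical container for the limits shown in the commands above."""
    cpus: float = 2.0
    memory: str = "4g"
    pids: int = 100
    tmpfs_size: str = "512m"
    timeout_s: int = 3600
    network: bool = False  # False maps to --network=none


def docker_argv(image: str,
                command: Sequence[str],
                limits: Optional[SandboxLimits] = None,
                mounts: Sequence[Tuple[str, str, str]] = ()) -> List[str]:
    """Build a `docker run` argument list with resource limits applied."""
    limits = limits or SandboxLimits()
    argv = [
        "docker", "run", "--rm",
        f"--cpus={limits.cpus}",
        f"--memory={limits.memory}",
        # Same value as --memory, so the container gets no extra swap
        f"--memory-swap={limits.memory}",
        f"--pids-limit={limits.pids}",
        "--read-only",
        "--tmpfs", f"/tmp:size={limits.tmpfs_size}",
    ]
    if not limits.network:
        argv.append("--network=none")
    for host_path, container_path, mode in mounts:
        if mode not in ("ro", "rw"):
            raise ValueError(f"mount mode must be 'ro' or 'rw', got {mode!r}")
        argv += ["-v", f"{host_path}:{container_path}:{mode}"]
    argv.append(image)
    # Assumes the image ships coreutils `timeout` (true for Debian-based images)
    argv += ["timeout", str(limits.timeout_s), *command]
    return argv


if __name__ == "__main__":
    argv = docker_argv(
        "research-sandbox:latest",
        ["python", "pipeline.py"],
        mounts=[("/data/raw", "/data", "ro"), ("/data/results", "/output", "rw")],
    )
    # Print a copy-pasteable shell command instead of running it
    print(shlex.join(argv))
```

Returning a list rather than a shell string lets the caller pass it directly to `subprocess.run` without shell quoting issues.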

## Python Sandbox with Resource Limits

### Process-Level Isolation

```python
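import os
import resource
import subprocess
import sys
import tempfile


def run_sandboxed(code: str, timeout: int = 60,
                  max_memory_mb: int = 512) -> dict:
    """
    Execute Python code in a subprocess with resource limits (POSIX only).

    The limits cap memory and time; they do not restrict filesystem or
    network access, so combine this with a container for untrusted code.

    Args:
        code: Python code string to execute
        timeout: Maximum execution time in seconds (wall clock and CPU)
        max_memory_mb: Maximum address space in megabytes
    """
    with tempfile.NamedTemporaryFile(mode='w', suffix='.py', delete=False) as f:
        f.write(code)
        script_path = f.name

    def apply_limits():
        # Runs in the child just before exec: cap address space and CPU seconds
        max_bytes = max_memory_mb * 1024 * 1024
        resource.setrlimit(resource.RLIMIT_AS, (max_bytes, max_bytes))
        resource.setrlimit(resource.RLIMIT_CPU, (timeout, timeout))

    try:
        result = subprocess.run(
            [sys.executable, '-u', script_path],
            capture_output=True,
            text=True,
            timeout=timeout,
            preexec_fn=apply_limits,
            # Minimal environment so host variables do not leak into the child
            env={
                'PATH': '/usr/bin:/usr/local/bin',
                'HOME': '/tmp',
                'PYTHONDONTWRITEBYTECODE': '1'
            }
        )
        return {
            'stdout': result.stdout,
            'stderr': result.stderr,
            'returncode': result.returncode,
            'timed_out': False
        }
    except subprocess.TimeoutExpired:
        return {
            'stdout': '',
            'stderr': f'Execution timed out after {timeout}s',
            'returncode': -1,
            'timed_out': True
        }
    finally:
        os.unlink(script_path)

# Example usage (requires numpy in the interpreter's environment).
# numpy's BLAS backend can reserve a lot of virtual memory, so the
# RLIMIT_AS cap is given some headroom here.
result = run_sandboxed("""
import numpy as np
data = np.random.randn(1000)
print(f"Mean: {data.mean():.4f}")
print(f"Std:  {data.std():.4f}")
""", timeout=30, max_memory_mb=1024)
print(result['stdout'])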
```

## Nix-Based Reproducible Environments

For maximum reproducibility, use Nix to pin every dependency including system libraries:

```nix
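# shell.nix for a research project
# Note: a branch tarball such as nixos-23.11 moves over time. For a strict
# pin, point the URL at a specific nixpkgs commit and add its sha256.
{ pkgs ? import (fetchTarball {
    url = "https://github.com/NixOS/nixpkgs/archive/nixos-23.11.tar.gz";
  }) {} }: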

pkgs.mkShell {
  buildInputs = with pkgs; [
    python311
    python311Packages.numpy
    python311Packages.scipy
    python311Packages.pandas
    python311Packages.matplotlib
    python311Packages.scikit-learn
    R
    rPackages.ggplot2
    rPackages.dplyr
  ];

  shellHook = ''
    echo "Research sandbox activated"
    echo "Python: $(python --version)"
    echo "R: $(R --version | head -1)"
  '';
}
```

```bash
# Enter the reproducible environment
nix-shell shell.nix

# Or use flakes (requires a flake.nix; flake.lock pins the inputs)
nix develop
```

## Security Best Practices

When running untrusted or third-party code:

1. **Network isolation**: Use `--network=none` in Docker to prevent data exfiltration
2. **Filesystem restrictions**: Mount data as read-only, limit writable paths
3. **Resource caps**: Always set CPU, memory, and time limits
4. **User isolation**: Run as non-root user inside the container
5. **Syscall filtering**: Use seccomp profiles to restrict system calls
6. **Output sanitization**: Validate and sanitize all output before processing
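
As an illustration of the output sanitization item, the sketch below cleans captured stdout before it is logged or passed to another tool. It is one possible approach under stated assumptions, not the skill's prescribed method: `sanitize_output` is a hypothetical helper, and the 10,000-character cap is an arbitrary default.

```python
import re

# CSI sequences (colors, cursor movement) and OSC sequences (titles, hyperlinks)
ANSI_ESCAPE = re.compile(
    r"\x1b\[[0-?]*[ -/]*[@-~]"
    r"|\x1b\][^\x07\x1b]*(?:\x07|\x1b\\)"
)
# Control characters other than tab, newline, and carriage return
CONTROL_CHARS = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f\x7f]")


def sanitize_output(raw: bytes, max_chars: int = 10_000) -> str:
    """Decode sandbox output, strip terminal escapes, and cap its length."""
    # Invalid UTF-8 becomes U+FFFD instead of raising
    text = raw.decode("utf-8", errors="replace")
    text = ANSI_ESCAPE.sub("", text)
    text = CONTROL_CHARS.sub("", text)
    if len(text) > max_chars:
        text = text[:max_chars] + "\n[output truncated]"
    return text


if __name__ == "__main__":
    captured = b"\x1b[31mMean: 0.0123\x1b[0m\x00\n"
    print(sanitize_output(captured), end="")
```

Escape sequences are removed before the generic control-character pass so that a whole sequence disappears rather than leaving its printable tail behind.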

## Integration with CI/CD

Automate research pipeline execution with GitHub Actions:

```yaml
name: Research Pipeline
on:
  push:
    paths: ['src/**', 'data/**']

jobs:
  run-analysis:
    runs-on: ubuntu-latest
    container:
      # Assumes this image is published to a registry the runner can pull from
      image: research-sandbox:latest
      options: --cpus 4 --memory 8g
    steps:
      - uses: actions/checkout@v4
      - run: python src/01_preprocess.py
      - run: python src/02_analyze.py
      - run: python src/03_visualize.py
      - uses: actions/upload-artifact@v4
        with:
          name: results
          path: output/
```

With this workflow, every push that changes files under `src/` or `data/` triggers a fresh, containerized run of the full pipeline, which helps catch environment-dependent bugs and supports reproducibility.
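
One way to check that two pipeline runs actually produced the same results is to compare checksums of the output files. The sketch below is an assumption layered on top of the workflow, not part of it: `output_manifest` and `diff_manifests` are hypothetical helpers, and the `output/` directory name is taken from the artifact path above.

```python
import hashlib
from pathlib import Path
from typing import Dict, List


def output_manifest(output_dir: str) -> Dict[str, str]:
    """Map each file under output_dir (relative path) to its SHA-256 digest."""
    root = Path(output_dir)
    manifest = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            manifest[path.relative_to(root).as_posix()] = digest
    return manifest


def diff_manifests(a: Dict[str, str], b: Dict[str, str]) -> List[str]:
    """Return the files that are missing from one run or differ between runs."""
    return sorted(name for name in a.keys() | b.keys() if a.get(name) != b.get(name))


if __name__ == "__main__":
    # Print one "digest  path" line per output file, similar to sha256sum
    for name, digest in output_manifest("output").items():
        print(f"{digest}  {name}")
```

Byte-identical outputs are a strict criterion; files that embed timestamps or depend on unseeded random numbers will differ even when the environment is identical.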