evolve

Evolve SDK development for TypeScript and Python. Use when building applications with Evolve to run AI agents (Claude, Codex, Gemini, Qwen, Kimi, OpenCode) in secure sandboxes. Triggers: (1) Creating Evolve applications, (2) Configuring agents with skills, Composio, MCP servers, (3) Using Swarm abstractions (map, filter, reduce, bestOf/best_of, verify), (4) Building Pipelines, (5) Structured output with schemas, (6) Session management, streaming, observability, (7) Checkpointing, storage & StorageClient, (8) Cost tracking (per-run and per-session spend), (9) Historical sessions & trace download via sessions() client.

56 stars

by evolving-machines-lab

Complexity: medium

About this skill

The Evolve SDK lets applications deploy and orchestrate AI agents (Claude, Codex, Gemini, Qwen, Kimi, or OpenCode) inside secure, isolated cloud sandboxes. Agents can be configured with skills, connected to external services through Composio, and attached to MCP servers. This skill tells an AI agent how to use Evolve from both TypeScript and Python.

The SDK covers Swarm abstractions (map, filter, reduce, bestOf/best_of, verify) for multi-agent workflows, Pipelines, structured output with schemas, and session management with streaming and observability. It also supports checkpointing, storage management via `StorageClient`, historical session access through the `sessions()` client, and cost tracking for both individual runs and entire sessions. Together these give the calling agent control over agent lifecycles, distributed tasks, observability, and spend.

Best use case

The primary use case for the Evolve skill is to enable AI agents to develop, deploy, and manage other AI agents within secure, sandboxed environments. It's ideal for developers and organizations aiming to build complex, scalable, and auditable multi-agent applications, especially those requiring structured interactions, distributed processing, or robust session management and cost tracking.

The user should expect a functional Evolve-based application that efficiently runs and manages AI agents in secure sandboxes, adhering to specified configurations and achieving desired computational or data processing goals.

Practical example

Example input

Develop a Python Evolve application that uses the `best_of` swarm abstraction to pick the best of three agent responses, each produced in its own secure sandbox. Ensure the output is structured with a JSON schema and session costs are tracked.

Example output

Successfully developed `my_evolve_app.py`. The application defines a `best_of` swarm strategy with three candidate agents, applies the specified JSON schema to the output, and includes cost tracking. It is ready for deployment and execution; run cost details are available after execution.

When to use this skill

When an AI agent needs to build applications that orchestrate other AI agents in secure cloud sandboxes.
When configuring AI agents with specific skills, Composio integrations, or MCP servers is required.
When developing multi-agent systems using Swarm abstractions like map, filter, reduce, or best-of patterns.
When building AI pipelines, managing sessions, tracking costs, or requiring structured output with schemas.

When not to use this skill

When simply calling a single, self-contained AI agent without the need for orchestration or sandboxing.
When the task does not involve programming or developing new applications using an SDK.
When working with a different agent orchestration platform not compatible with Evolve SDK.
For trivial tasks that don't require advanced features like checkpointing, storage, or detailed cost tracking.

Installation

Claude Code / Cursor / Codex

$ curl -o ~/.claude/skills/docs/SKILL.md --create-dirs "https://raw.githubusercontent.com/evolving-machines-lab/evolve/main/docs/SKILL.md"

Manual Installation

Download SKILL.md from GitHub
Place it in .claude/skills/docs/SKILL.md inside your project
Restart your AI agent — it will auto-discover the skill

How evolve Compares

Feature / Agent	evolve	Standard Approach
Platform Support	Not specified	Limited / Varies
Context Awareness	High	Baseline
Installation Complexity	medium	N/A

Frequently Asked Questions

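What does this skill do?

It guides an AI agent in building TypeScript or Python applications with the Evolve SDK, which runs CLI agents in secure cloud sandboxes.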

How difficult is it to install?

The installation complexity is rated as medium. You can find the installation instructions above.

Where can I find the source code?

The source code is on GitHub at https://github.com/evolving-machines-lab/evolve.

SKILL.md Source

# Evolve SDK

Build applications that run CLI agents in secure cloud sandboxes.

**Repo:** https://github.com/evolving-machines-lab/evolve

## Language Detection

Determine the language from (in priority order):

1. **User specification** — if the user states a language, use it
2. **Project signals** — imports, file extensions, package.json vs pyproject.toml
3. **Ask** — if ambiguous, ask the user

- **TypeScript** (`@evolvingmachines/sdk`) — read from [references/typescript/](references/typescript/)
- **Python** (`evolve-sdk`) — read from [references/python/](references/python/)
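
The priority order above can be sketched as a small helper. This is only an illustration, not part of the Evolve SDK: the function name `detect_language`, its parameters, and the choice to inspect only manifest files and top-level source files are assumptions made for the example. Import scanning is left out.

```python
from pathlib import Path
from typing import Optional


def detect_language(project_dir: str, user_choice: Optional[str] = None) -> Optional[str]:
    """Return "typescript", "python", or None when the user should be asked.

    Hypothetical helper: mirrors the three-step priority order, nothing more.
    """
    # 1. An explicit user specification always wins.
    if user_choice:
        return user_choice.strip().lower()

    root = Path(project_dir)
    # 2. Project signals: manifest files, then top-level source file extensions.
    has_ts = (root / "package.json").exists() or any(root.glob("*.ts"))
    has_py = (root / "pyproject.toml").exists() or any(root.glob("*.py"))

    if has_ts and not has_py:
        return "typescript"
    if has_py and not has_ts:
        return "python"
    # 3. Ambiguous (both or neither signal present): the caller should ask.
    return None
```

Returning `None` rather than guessing keeps the "ask the user" step explicit for mixed repositories that contain both manifests.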

## Required Reading

Always read these three references **for the detected language** before writing any Evolve code:

**TypeScript:**
- [01-getting-started.md](references/typescript/01-getting-started.md) — Installation, authentication (Gateway/BYOK), core lifecycle, streaming basics, agent reference table
- [02-configuration.md](references/typescript/02-configuration.md) — Sandbox providers, full builder API, agent skills catalog, Composio (1000+ integrations), MCP servers
- [03-runtime.md](references/typescript/03-runtime.md) — run(), executeCommand(), upload/download files, session controls, workspace layout, structured output, session management, storage & checkpointing, StorageClient, sessions() client, cost tracking, observability, error handling

**Python:**
- [01-getting-started.md](references/python/01-getting-started.md) — Installation, authentication (Gateway/BYOK), core lifecycle, streaming basics, agent reference table
- [02-configuration.md](references/python/02-configuration.md) — Sandbox providers, full constructor API, agent skills catalog, Composio (1000+ integrations), MCP servers
- [03-runtime.md](references/python/03-runtime.md) — run(), execute_command(), upload/download files, session controls, workspace layout, structured output, session management, storage & checkpointing, StorageClient, sessions() client, cost tracking, observability, error handling

## Critical Constraints

- **Model names** — Only use exact names from the Agent Reference table. Do not invent or guess model identifiers.
  - [TS](references/typescript/01-getting-started.md#agent-reference) | [PY](references/python/01-getting-started.md#agent-reference)
- **Cleanup** — Always call `kill()` when done. Sandboxes bill until destroyed.
  - [TS](references/typescript/01-getting-started.md#core-lifecycle) | [PY](references/python/01-getting-started.md#core-lifecycle)
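
One way to honour the cleanup constraint is a `try`/`finally` block, so that `kill()` runs even when the agent call raises. The sketch below uses a stand-in class rather than the real SDK: `StandInSandbox` and `run_once` are hypothetical names, and the real `run()` and `kill()` signatures (including whether they are async) should be taken from the Core lifecycle reference.

```python
class StandInSandbox:
    """Hypothetical stand-in for an Evolve instance; NOT the real SDK class."""

    def __init__(self) -> None:
        self.alive = True

    def run(self, prompt: str) -> str:
        # Stand-in behaviour: fail on empty prompts, otherwise echo.
        if not prompt:
            raise ValueError("prompt must not be empty")
        return f"ran: {prompt}"

    def kill(self) -> None:
        # Stand-in for destroying the sandbox; here it only flips a flag.
        self.alive = False


def run_once(sandbox: StandInSandbox, prompt: str) -> str:
    """Run a single prompt and always release the sandbox afterwards."""
    try:
        return sandbox.run(prompt)
    finally:
        # Executes on success and on any exception, so the sandbox never leaks.
        sandbox.kill()
```

The same shape applies to multi-step work: put every call that uses the sandbox inside the `try` body and keep `kill()` as the only statement in `finally`.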

## Additional References

Read on demand when the user's task requires them:

| When to read | TypeScript | Python |
|-------------|-----------|--------|
| Building a UI, handling real-time events, parsing tool calls, browser-use | [04-streaming.md](references/typescript/04-streaming.md) | [04-streaming.md](references/python/04-streaming.md) |
| Parallel agents (map/filter/reduce/bestOf/verify), Pipeline chaining | [05-swarm-pipeline.md](references/typescript/05-swarm-pipeline.md) | [05-swarm-pipeline.md](references/python/05-swarm-pipeline.md) |

## Topic Index

### Getting Started

| Topic | TypeScript | Python |
|-------|-----------|--------|
| Installation & requirements | [TS](references/typescript/01-getting-started.md#installation) | [PY](references/python/01-getting-started.md#installation) |
| Quick start (3 steps) | [TS](references/typescript/01-getting-started.md#quick-start) | [PY](references/python/01-getting-started.md#quick-start) |
| Core lifecycle (run, output, kill) | [TS](references/typescript/01-getting-started.md#core-lifecycle) | [PY](references/python/01-getting-started.md#core-lifecycle) |
| Streaming basics | [TS](references/typescript/01-getting-started.md#streaming) | [PY](references/python/01-getting-started.md#streaming) |
| Gateway vs BYOK mode | [TS](references/typescript/01-getting-started.md#authentication) | [PY](references/python/01-getting-started.md#authentication) |
| BYO subscriptions (Claude Max, Codex, Gemini) | [TS](references/typescript/01-getting-started.md#byo-claude-max-subscription) | [PY](references/python/01-getting-started.md#byo-claude-max-subscription) |
| Supported agents, models & defaults | [TS](references/typescript/01-getting-started.md#agent-reference) | [PY](references/python/01-getting-started.md#agent-reference) |

### Configuration

| Topic | TypeScript | Python |
|-------|-----------|--------|
| Sandbox providers (E2B, Modal, Daytona) | [TS](references/typescript/02-configuration.md#sandbox-providers) | [PY](references/python/02-configuration.md#sandbox-providers) |
| Provider auto-resolution from env | [TS](references/typescript/02-configuration.md#auto-resolution) | [PY](references/python/02-configuration.md#auto-resolution) |
| Full builder/constructor API | [TS](references/typescript/02-configuration.md#evolve-instance) | [PY](references/python/02-configuration.md#evolve-instance) |
| Agent skills catalog | [TS](references/typescript/02-configuration.md#agent-skills) | [PY](references/python/02-configuration.md#agent-skills) |
| Composio (auth paths, tool filtering, types) | [TS](references/typescript/02-configuration.md#composio-tool-router) | [PY](references/python/02-configuration.md#composio-tool-router) |
| MCP server config (STDIO / HTTP / SSE) | [TS](references/typescript/02-configuration.md#evolve-instance) | [PY](references/python/02-configuration.md#evolve-instance) |

### Runtime

| Topic | TypeScript | Python |
|-------|-----------|--------|
| run() options (timeout, background, checkpoint) | [TS](references/typescript/03-runtime.md#run) | [PY](references/python/03-runtime.md#run) |
| executeCommand() / execute_command() | [TS](references/typescript/03-runtime.md#executecommand) | [PY](references/python/03-runtime.md#execute_command) |
| Upload files to sandbox | [TS](references/typescript/03-runtime.md) | [PY](references/python/03-runtime.md) |
| Download output files | [TS](references/typescript/03-runtime.md) | [PY](references/python/03-runtime.md) |
| Session controls (interrupt, pause, resume, kill) | [TS](references/typescript/03-runtime.md#session-controls) | [PY](references/python/03-runtime.md#session-controls) |
| Port forwarding | [TS](references/typescript/03-runtime.md#gethost) | [PY](references/python/03-runtime.md#get_host) |
| Workspace filesystem layout | [TS](references/typescript/03-runtime.md) | [PY](references/python/03-runtime.md) |
| Structured output (Zod / Pydantic / JSON Schema) | [TS](references/typescript/03-runtime.md#structured-output) | [PY](references/python/03-runtime.md#structured-output) |
| Multi-turn conversations | [TS](references/typescript/03-runtime.md#session-management) | [PY](references/python/03-runtime.md#session-management) |
| Pause, resume, reconnect, switch sandboxes | [TS](references/typescript/03-runtime.md#session-management) | [PY](references/python/03-runtime.md#session-management) |
| Storage & checkpointing (gateway mode) | [TS](references/typescript/03-runtime.md#storage--checkpointing) | [PY](references/python/03-runtime.md#storage--checkpointing) |
| StorageClient (list, get, download checkpoints) | [TS](references/typescript/03-runtime.md#listing--browsing-checkpoints) | [PY](references/python/03-runtime.md#listing--browsing-checkpoints) |
| Checkpoint lineage & restore | [TS](references/typescript/03-runtime.md#checkpoint-lineage) | [PY](references/python/03-runtime.md#checkpoint-lineage) |
| Historical sessions & trace download | [TS](references/typescript/03-runtime.md#historical-sessions--trace-download) | [PY](references/python/03-runtime.md#historical-sessions--trace-download) |
| Cost tracking (per-run & per-session spend) | [TS](references/typescript/03-runtime.md#cost-tracking) | [PY](references/python/03-runtime.md#cost-tracking) |
| Observability (dashboard + local logs) | [TS](references/typescript/03-runtime.md#observability) | [PY](references/python/03-runtime.md#observability) |
| Error handling | [TS](references/typescript/03-runtime.md#error-handling) | [PY](references/python/03-runtime.md#error-handling) |

### Streaming

| Topic | TypeScript | Python |
|-------|-----------|--------|
| Event listeners (content, lifecycle, stdout, stderr) | [TS](references/typescript/04-streaming.md#event-listeners) | [PY](references/python/04-streaming.md#event-listeners) |
| LifecycleEvent & LifecycleReason | [TS](references/typescript/04-streaming.md#lifecycleevent) | [PY](references/python/04-streaming.md#lifecycleevent-typeddict-shape) |
| OutputEvent & SessionUpdate types | [TS](references/typescript/04-streaming.md#sessionupdate-types) | [PY](references/python/04-streaming.md#event-types-summary) |
| Tool events (ToolCall, ToolCallUpdate, ToolKind) | [TS](references/typescript/04-streaming.md#tool-events) | [PY](references/python/04-streaming.md#toolkind-reference) |
| Browser-use detection & URL extraction | [TS](references/typescript/04-streaming.md#browseruseresponse) | [PY](references/python/04-streaming.md#browseruseresponse-extraction) |
| UI integration example | [TS](references/typescript/04-streaming.md#ui-integration-example) | [PY](references/python/04-streaming.md#ui-integration-example) |

### Swarm & Pipeline

| Topic | TypeScript | Python |
|-------|-----------|--------|
| Swarm setup (config, concurrency, retry) | [TS](references/typescript/05-swarm-pipeline.md) | [PY](references/python/05-swarm-pipeline.md) |
| Input types (FileMap, folders) | [TS](references/typescript/05-swarm-pipeline.md#input-types) | [PY](references/python/05-swarm-pipeline.md#input-types) |
| bestOf / best_of (N candidates + judge) | [TS](references/typescript/05-swarm-pipeline.md#bestof) | [PY](references/python/05-swarm-pipeline.md#best_of) |
| map (parallel processing) | [TS](references/typescript/05-swarm-pipeline.md#map) | [PY](references/python/05-swarm-pipeline.md#map) |
| filter (evaluate + threshold) | [TS](references/typescript/05-swarm-pipeline.md#filter) | [PY](references/python/05-swarm-pipeline.md#filter) |
| reduce (synthesize many to one) | [TS](references/typescript/05-swarm-pipeline.md#reduce) | [PY](references/python/05-swarm-pipeline.md#reduce) |
| verify (quality gate with feedback loop) | [TS](references/typescript/05-swarm-pipeline.md#verify-quality-gate) | [PY](references/python/05-swarm-pipeline.md#verify-quality-gate) |
| Result types (SwarmResult, ReduceResult, BestOfResult) | [TS](references/typescript/05-swarm-pipeline.md#result-types) | [PY](references/python/05-swarm-pipeline.md#result-types) |
| Chaining operations | [TS](references/typescript/05-swarm-pipeline.md#chaining-operations) | [PY](references/python/05-swarm-pipeline.md#chaining-operations) |
| Pipeline (fluent chaining, events, terminal) | [TS](references/typescript/05-swarm-pipeline.md#pipeline) | [PY](references/python/05-swarm-pipeline.md#pipeline) |
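
To make the "N candidates + judge" idea behind bestOf / best_of concrete, here is a plain-Python sketch of the pattern. It does not use or imitate the Evolve Swarm API: the `best_of` function, its parameters, and the toy workers and judge are all assumptions for illustration, and the real signatures live in 05-swarm-pipeline.md.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Tuple


def best_of(
    task: str,
    candidates: List[Callable[[str], str]],
    judge: Callable[[str, List[str]], int],
) -> Tuple[str, List[str]]:
    """Run every candidate on the same task in parallel, then let a judge pick one."""
    with ThreadPoolExecutor(max_workers=len(candidates)) as pool:
        # pool.map returns results in the same order as `candidates`.
        answers = list(pool.map(lambda worker: worker(task), candidates))
    winner_index = judge(task, answers)
    return answers[winner_index], answers


# Toy stand-ins: three "agents" and a judge that prefers the longest answer.
workers = [
    lambda t: f"{t}!",
    lambda t: f"{t} in detail",
    lambda t: t.upper(),
]


def longest(task: str, answers: List[str]) -> int:
    return max(range(len(answers)), key=lambda i: len(answers[i]))


winner, all_answers = best_of("summarize", workers, longest)
```

In a real application each worker would be an agent run in its own sandbox and the judge would itself be an agent; the control flow (fan out, collect, judge) stays the same.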

## Self-Update

Pull the latest skill from the official repo:

```bash
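# Sparse-clone only the skill directory, copy it over the installed skill, then clean up.
rm -rf /tmp/evolve-update \
  && git clone --depth 1 --filter=blob:none --sparse https://github.com/evolving-machines-lab/evolve.git /tmp/evolve-update \
  && git -C /tmp/evolve-update sparse-checkout set skills/evolve \
  && mkdir -p <SKILL_INSTALL_DIR>/evolve \
  && cp -r /tmp/evolve-update/skills/evolve/* <SKILL_INSTALL_DIR>/evolve/ \
  && rm -rf /tmp/evolve-update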
```

Replace `<SKILL_INSTALL_DIR>` with the skill installation path (e.g. `~/.claude/skills/`, `~/.codex/skills/`, `~/.gemini/skills/`).
