evolve
Evolve SDK development for TypeScript and Python. Use when building applications with Evolve to run AI agents (Claude, Codex, Gemini, Qwen, Kimi, OpenCode) in secure sandboxes. Triggers: (1) Creating Evolve applications, (2) Configuring agents with skills, Composio, MCP servers, (3) Using Swarm abstractions (map, filter, reduce, bestOf/best_of, verify), (4) Building Pipelines, (5) Structured output with schemas, (6) Session management, streaming, observability, (7) Checkpointing, storage & StorageClient, (8) Cost tracking (per-run and per-session spend), (9) Historical sessions & trace download via sessions() client.
About this skill
The Evolve SDK lets AI agents build applications that deploy and orchestrate other AI agents (Claude, Codex, Gemini, Qwen, Kimi, or OpenCode) inside secure, isolated cloud sandboxes. It covers configuring agents with skills, integrating external services via Composio, and connecting to MCP servers. The skill directs AI agents on how to use Evolve from both TypeScript and Python. Key functionality includes Swarm abstractions (map, filter, reduce, bestOf/best_of, verify) for multi-agent workflows, Pipelines, structured output with schemas, and session management with streaming and observability. The SDK also supports checkpointing, storage management via `StorageClient`, historical session access through the `sessions()` client, and cost tracking for both individual runs and entire sessions. Together these give an agent control over agent lifecycles, distributed tasks, observability, and spend when building production AI systems.
Best use case
The primary use case for the Evolve skill is to enable AI agents to develop, deploy, and manage other AI agents within secure, sandboxed environments. It's ideal for developers and organizations aiming to build complex, scalable, and auditable multi-agent applications, especially those requiring structured interactions, distributed processing, or robust session management and cost tracking.
The user should expect a functional Evolve-based application that efficiently runs and manages AI agents in secure sandboxes, adhering to specified configurations and achieving desired computational or data processing goals.
Practical example
Example input
Develop a Python Evolve application that uses the `best_of` swarm abstraction (`bestOf` in TypeScript) to pick the best of three agent responses, each produced in its own secure sandbox. Ensure the output is structured with a JSON schema and session costs are tracked.
Example output
Successfully developed `my_evolve_app.py`. The application defines a `best_of` swarm strategy over three agents, applies the specified JSON schema to the output, and includes cost tracking. It is ready for deployment and execution; run cost details are available after execution.
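The best-of pattern requested above (N candidates plus a judge) can be sketched without the SDK. The following is a minimal illustration of the idea only: `best_of`, the three candidate functions, and `longest_answer_judge` are hypothetical stand-ins, not the Evolve API. In a real application each candidate would be an agent run in its own sandbox, and the actual call shape should be taken from the `05-swarm-pipeline.md` reference.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Sequence


def best_of(
    candidates: Sequence[Callable[[str], str]],
    judge: Callable[[str, list[str]], int],
    task: str,
) -> tuple[int, str]:
    """Run every candidate on the same task in parallel, then let a judge pick one.

    Returns (winner_index, winner_answer).
    """
    with ThreadPoolExecutor(max_workers=len(candidates)) as pool:
        # pool.map preserves input order, so answers[i] belongs to candidates[i].
        answers = list(pool.map(lambda run: run(task), candidates))
    winner = judge(task, answers)
    return winner, answers[winner]


# Hypothetical stand-ins for three agent runs (assumption: not SDK calls).
def terse(task: str) -> str:
    return "42"


def verbose(task: str) -> str:
    return "The answer is 42 because 6 * 7 = 42."


def wrong(task: str) -> str:
    return "41"


def longest_answer_judge(task: str, answers: list[str]) -> int:
    # Toy criterion: prefer the most detailed answer. A real judge would be
    # another agent scoring the candidates against the task.
    return max(range(len(answers)), key=lambda i: len(answers[i]))


index, answer = best_of([terse, verbose, wrong], longest_answer_judge, "What is 6 * 7?")
print(index, answer)
```

The judge is kept separate from the candidates so that the selection criterion can change without touching how candidates are produced.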
When to use this skill
- When an AI agent needs to build applications that orchestrate other AI agents in secure cloud sandboxes.
- When configuring AI agents with specific skills, Composio integrations, or MCP servers is required.
- When developing multi-agent systems using Swarm abstractions like map, filter, reduce, or best-of patterns.
- When building AI pipelines, managing sessions, tracking costs, or requiring structured output with schemas.
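Session management includes teardown: SKILL.md states that sandboxes bill until destroyed and that `kill()` must always be called when done. A minimal sketch of that cleanup discipline follows. `FakeSession` and `run_once` are hypothetical stand-ins written for this illustration; only the method name `kill()` comes from the skill, and the real client (which may be asynchronous) is documented in the `01-getting-started.md` reference.

```python
class FakeSession:
    """Hypothetical stand-in for an SDK session; not the Evolve client."""

    def __init__(self) -> None:
        self.alive = True

    def run(self, prompt: str) -> str:
        if not prompt:
            raise ValueError("prompt must not be empty")
        return f"echo: {prompt}"

    def kill(self) -> None:
        # In the real SDK, kill() is the call that destroys the sandbox.
        self.alive = False


def run_once(session: FakeSession, prompt: str) -> str:
    try:
        return session.run(prompt)
    finally:
        # Runs on success and on failure, so the sandbox never outlives the task.
        session.kill()


ok = FakeSession()
print(run_once(ok, "hello"), ok.alive)

failed = FakeSession()
try:
    run_once(failed, "")
except ValueError:
    pass
print(failed.alive)
```

Putting `kill()` in a `finally` block means a failed run still releases the sandbox, which is the case most likely to be forgotten.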
When not to use this skill
- When simply calling a single, self-contained AI agent without the need for orchestration or sandboxing.
- When the task does not involve programming or developing new applications using an SDK.
- When working with a different agent orchestration platform not compatible with Evolve SDK.
- For trivial tasks that don't require advanced features like checkpointing, storage, or detailed cost tracking.
Installation
Claude Code / Cursor / Codex
Manual Installation
- Download SKILL.md from GitHub
- Place it in `.claude/skills/docs/SKILL.md` inside your project
- Restart your AI agent; it will auto-discover the skill
How evolve Compares
| Feature / Agent | evolve | Standard Approach |
|---|---|---|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | medium | N/A |
Frequently Asked Questions
How difficult is it to install?
The installation complexity is rated as medium. You can find the installation instructions above.
Where can I find the source code?
The source code is on GitHub at https://github.com/evolving-machines-lab/evolve.
SKILL.md Source
# Evolve SDK

Build applications that run CLI agents in secure cloud sandboxes.

**Repo:** https://github.com/evolving-machines-lab/evolve

## Language Detection

Determine the language from (in priority order):

1. **User specification**: if the user states a language, use it
2. **Project signals**: imports, file extensions, package.json vs pyproject.toml
3. **Ask**: if ambiguous, ask the user

- **TypeScript** (`@evolvingmachines/sdk`): read from [references/typescript/](references/typescript/)
- **Python** (`evolve-sdk`): read from [references/python/](references/python/)

## Required Reading

Always read these three references **for the detected language** before writing any Evolve code:

**TypeScript:**

- [01-getting-started.md](references/typescript/01-getting-started.md): Installation, authentication (Gateway/BYOK), core lifecycle, streaming basics, agent reference table
- [02-configuration.md](references/typescript/02-configuration.md): Sandbox providers, full builder API, agent skills catalog, Composio (1000+ integrations), MCP servers
- [03-runtime.md](references/typescript/03-runtime.md): run(), executeCommand(), upload/download files, session controls, workspace layout, structured output, session management, storage & checkpointing, StorageClient, sessions() client, cost tracking, observability, error handling

**Python:**

- [01-getting-started.md](references/python/01-getting-started.md): Installation, authentication (Gateway/BYOK), core lifecycle, streaming basics, agent reference table
- [02-configuration.md](references/python/02-configuration.md): Sandbox providers, full constructor API, agent skills catalog, Composio (1000+ integrations), MCP servers
- [03-runtime.md](references/python/03-runtime.md): run(), execute_command(), upload/download files, session controls, workspace layout, structured output, session management, storage & checkpointing, StorageClient, sessions() client, cost tracking, observability, error handling

## Critical Constraints

- **Model names**: Only use exact names from the Agent Reference table. Do not invent or guess model identifiers.
  - [TS](references/typescript/01-getting-started.md#agent-reference) | [PY](references/python/01-getting-started.md#agent-reference)
- **Cleanup**: Always call `kill()` when done. Sandboxes bill until destroyed.
  - [TS](references/typescript/01-getting-started.md#core-lifecycle) | [PY](references/python/01-getting-started.md#core-lifecycle)

## Additional References

Read on demand when the user's task requires them:

| When to read | TypeScript | Python |
|-------------|-----------|--------|
| Building a UI, handling real-time events, parsing tool calls, browser-use | [04-streaming.md](references/typescript/04-streaming.md) | [04-streaming.md](references/python/04-streaming.md) |
| Parallel agents (map/filter/reduce/bestOf/verify), Pipeline chaining | [05-swarm-pipeline.md](references/typescript/05-swarm-pipeline.md) | [05-swarm-pipeline.md](references/python/05-swarm-pipeline.md) |

## Topic Index

### Getting Started

| Topic | TypeScript | Python |
|-------|-----------|--------|
| Installation & requirements | [TS](references/typescript/01-getting-started.md#installation) | [PY](references/python/01-getting-started.md#installation) |
| Quick start (3 steps) | [TS](references/typescript/01-getting-started.md#quick-start) | [PY](references/python/01-getting-started.md#quick-start) |
| Core lifecycle (run, output, kill) | [TS](references/typescript/01-getting-started.md#core-lifecycle) | [PY](references/python/01-getting-started.md#core-lifecycle) |
| Streaming basics | [TS](references/typescript/01-getting-started.md#streaming) | [PY](references/python/01-getting-started.md#streaming) |
| Gateway vs BYOK mode | [TS](references/typescript/01-getting-started.md#authentication) | [PY](references/python/01-getting-started.md#authentication) |
| BYO subscriptions (Claude Max, Codex, Gemini) | [TS](references/typescript/01-getting-started.md#byo-claude-max-subscription) | [PY](references/python/01-getting-started.md#byo-claude-max-subscription) |
| Supported agents, models & defaults | [TS](references/typescript/01-getting-started.md#agent-reference) | [PY](references/python/01-getting-started.md#agent-reference) |

### Configuration

| Topic | TypeScript | Python |
|-------|-----------|--------|
| Sandbox providers (E2B, Modal, Daytona) | [TS](references/typescript/02-configuration.md#sandbox-providers) | [PY](references/python/02-configuration.md#sandbox-providers) |
| Provider auto-resolution from env | [TS](references/typescript/02-configuration.md#auto-resolution) | [PY](references/python/02-configuration.md#auto-resolution) |
| Full builder/constructor API | [TS](references/typescript/02-configuration.md#evolve-instance) | [PY](references/python/02-configuration.md#evolve-instance) |
| Agent skills catalog | [TS](references/typescript/02-configuration.md#agent-skills) | [PY](references/python/02-configuration.md#agent-skills) |
| Composio (auth paths, tool filtering, types) | [TS](references/typescript/02-configuration.md#composio-tool-router) | [PY](references/python/02-configuration.md#composio-tool-router) |
| MCP server config (STDIO / HTTP / SSE) | [TS](references/typescript/02-configuration.md#evolve-instance) | [PY](references/python/02-configuration.md#evolve-instance) |

### Runtime

| Topic | TypeScript | Python |
|-------|-----------|--------|
| run() options (timeout, background, checkpoint) | [TS](references/typescript/03-runtime.md#run) | [PY](references/python/03-runtime.md#run) |
| executeCommand() / execute_command() | [TS](references/typescript/03-runtime.md#executecommand) | [PY](references/python/03-runtime.md#execute_command) |
| Upload files to sandbox | [TS](references/typescript/03-runtime.md) | [PY](references/python/03-runtime.md) |
| Download output files | [TS](references/typescript/03-runtime.md) | [PY](references/python/03-runtime.md) |
| Session controls (interrupt, pause, resume, kill) | [TS](references/typescript/03-runtime.md#session-controls) | [PY](references/python/03-runtime.md#session-controls) |
| Port forwarding | [TS](references/typescript/03-runtime.md#gethost) | [PY](references/python/03-runtime.md#get_host) |
| Workspace filesystem layout | [TS](references/typescript/03-runtime.md) | [PY](references/python/03-runtime.md) |
| Structured output (Zod / Pydantic / JSON Schema) | [TS](references/typescript/03-runtime.md#structured-output) | [PY](references/python/03-runtime.md#structured-output) |
| Multi-turn conversations | [TS](references/typescript/03-runtime.md#session-management) | [PY](references/python/03-runtime.md#session-management) |
| Pause, resume, reconnect, switch sandboxes | [TS](references/typescript/03-runtime.md#session-management) | [PY](references/python/03-runtime.md#session-management) |
| Storage & checkpointing (gateway mode) | [TS](references/typescript/03-runtime.md#storage--checkpointing) | [PY](references/python/03-runtime.md#storage--checkpointing) |
| StorageClient (list, get, download checkpoints) | [TS](references/typescript/03-runtime.md#listing--browsing-checkpoints) | [PY](references/python/03-runtime.md#listing--browsing-checkpoints) |
| Checkpoint lineage & restore | [TS](references/typescript/03-runtime.md#checkpoint-lineage) | [PY](references/python/03-runtime.md#checkpoint-lineage) |
| Historical sessions & trace download | [TS](references/typescript/03-runtime.md#historical-sessions--trace-download) | [PY](references/python/03-runtime.md#historical-sessions--trace-download) |
| Cost tracking (per-run & per-session spend) | [TS](references/typescript/03-runtime.md#cost-tracking) | [PY](references/python/03-runtime.md#cost-tracking) |
| Observability (dashboard + local logs) | [TS](references/typescript/03-runtime.md#observability) | [PY](references/python/03-runtime.md#observability) |
| Error handling | [TS](references/typescript/03-runtime.md#error-handling) | [PY](references/python/03-runtime.md#error-handling) |

### Streaming

| Topic | TypeScript | Python |
|-------|-----------|--------|
| Event listeners (content, lifecycle, stdout, stderr) | [TS](references/typescript/04-streaming.md#event-listeners) | [PY](references/python/04-streaming.md#event-listeners) |
| LifecycleEvent & LifecycleReason | [TS](references/typescript/04-streaming.md#lifecycleevent) | [PY](references/python/04-streaming.md#lifecycleevent-typeddict-shape) |
| OutputEvent & SessionUpdate types | [TS](references/typescript/04-streaming.md#sessionupdate-types) | [PY](references/python/04-streaming.md#event-types-summary) |
| Tool events (ToolCall, ToolCallUpdate, ToolKind) | [TS](references/typescript/04-streaming.md#tool-events) | [PY](references/python/04-streaming.md#toolkind-reference) |
| Browser-use detection & URL extraction | [TS](references/typescript/04-streaming.md#browseruseresponse) | [PY](references/python/04-streaming.md#browseruseresponse-extraction) |
| UI integration example | [TS](references/typescript/04-streaming.md#ui-integration-example) | [PY](references/python/04-streaming.md#ui-integration-example) |

### Swarm & Pipeline

| Topic | TypeScript | Python |
|-------|-----------|--------|
| Swarm setup (config, concurrency, retry) | [TS](references/typescript/05-swarm-pipeline.md) | [PY](references/python/05-swarm-pipeline.md) |
| Input types (FileMap, folders) | [TS](references/typescript/05-swarm-pipeline.md#input-types) | [PY](references/python/05-swarm-pipeline.md#input-types) |
| bestOf / best_of (N candidates + judge) | [TS](references/typescript/05-swarm-pipeline.md#bestof) | [PY](references/python/05-swarm-pipeline.md#best_of) |
| map (parallel processing) | [TS](references/typescript/05-swarm-pipeline.md#map) | [PY](references/python/05-swarm-pipeline.md#map) |
| filter (evaluate + threshold) | [TS](references/typescript/05-swarm-pipeline.md#filter) | [PY](references/python/05-swarm-pipeline.md#filter) |
| reduce (synthesize many to one) | [TS](references/typescript/05-swarm-pipeline.md#reduce) | [PY](references/python/05-swarm-pipeline.md#reduce) |
| verify (quality gate with feedback loop) | [TS](references/typescript/05-swarm-pipeline.md#verify-quality-gate) | [PY](references/python/05-swarm-pipeline.md#verify-quality-gate) |
| Result types (SwarmResult, ReduceResult, BestOfResult) | [TS](references/typescript/05-swarm-pipeline.md#result-types) | [PY](references/python/05-swarm-pipeline.md#result-types) |
| Chaining operations | [TS](references/typescript/05-swarm-pipeline.md#chaining-operations) | [PY](references/python/05-swarm-pipeline.md#chaining-operations) |
| Pipeline (fluent chaining, events, terminal) | [TS](references/typescript/05-swarm-pipeline.md#pipeline) | [PY](references/python/05-swarm-pipeline.md#pipeline) |

## Self-Update

Pull the latest skill from the official repo:

```bash
git clone --depth 1 --filter=blob:none --sparse https://github.com/evolving-machines-lab/evolve.git /tmp/evolve-update \
  && cd /tmp/evolve-update \
  && git sparse-checkout set skills/evolve \
  && cp -r skills/evolve/* <SKILL_INSTALL_DIR>/evolve/ \
  && rm -rf /tmp/evolve-update
```

Replace `<SKILL_INSTALL_DIR>` with the skill installation path (e.g. `~/.claude/skills/`, `~/.codex/skills/`, `~/.gemini/skills/`).
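The Language Detection priority order in SKILL.md (user specification, then project signals, then ask) can be read as a small decision function. The sketch below is an interpretation rather than part of the skill: `detect_language`, the accepted spellings, and the specific file signals in `TS_SIGNALS` and `PY_SIGNALS` are assumptions chosen for illustration. A return value of `None` stands for "ask the user".

```python
from typing import Optional, Sequence

# Assumed project signals; the skill names package.json vs pyproject.toml
# and "file extensions" without listing them.
TS_SIGNALS = ("package.json", ".ts", ".tsx")
PY_SIGNALS = ("pyproject.toml", ".py")


def detect_language(user_spec: Optional[str], project_files: Sequence[str]) -> Optional[str]:
    """Return "typescript", "python", or None when the user must be asked."""
    # 1. An explicit user choice always wins.
    if user_spec:
        spec = user_spec.strip().lower()
        if spec in ("typescript", "ts"):
            return "typescript"
        if spec in ("python", "py"):
            return "python"

    # 2. Otherwise look for project signals.
    has_ts = any(name.endswith(TS_SIGNALS) for name in project_files)
    has_py = any(name.endswith(PY_SIGNALS) for name in project_files)
    if has_ts != has_py:
        return "typescript" if has_ts else "python"

    # 3. No signal, or conflicting signals: ask the user.
    return None


print(detect_language("Python", ["package.json"]))
print(detect_language(None, ["src/index.ts", "package.json"]))
print(detect_language(None, ["package.json", "pyproject.toml"]))
```

Treating conflicting signals the same as missing signals keeps the function from guessing in mixed-language repositories, which matches the instruction to ask when the situation is ambiguous.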