apify-sdk-patterns
Production-ready patterns for Apify SDK and apify-client in TypeScript. Use when building Actors with Crawlee, managing datasets/KV stores, or implementing robust client wrappers with retry and validation. Trigger: "apify SDK patterns", "apify best practices", "apify client wrapper", "crawlee patterns", "idiomatic apify".
Best use case
apify-sdk-patterns is best used when you need a repeatable AI agent workflow rather than a one-off prompt.
Teams using apify-sdk-patterns can expect more consistent output, faster repeated execution, and less prompt rewriting.
When to use this skill
- You want a reusable workflow that can be run more than once with consistent structure.
When not to use this skill
- You only need a quick one-off answer and do not need a reusable workflow.
- You cannot install or maintain the underlying files, dependencies, or repository context.
Installation
Claude Code / Cursor / Codex
Manual Installation
- Download SKILL.md from GitHub
- Place it in `.claude/skills/apify-sdk-patterns/SKILL.md` inside your project
- Restart your AI agent — it will auto-discover the skill
Frequently Asked Questions
What does this skill do?
It provides production-ready TypeScript patterns for the Apify SDK (`apify`) and `apify-client`: Crawlee crawler selection, dataset and key-value store operations, proxy configuration, and typed client wrappers with retry and validation.
SKILL.md Source
# Apify SDK Patterns
## Overview
Production patterns for both the `apify` SDK (building Actors) and `apify-client` (calling Actors remotely). Covers Crawlee crawler selection, data storage, proxy configuration, and typed client wrappers.
## Prerequisites
- `apify-client` and/or `apify` + `crawlee` installed
- `APIFY_TOKEN` configured
- TypeScript recommended
## Pattern 1: Typed Client Singleton
```typescript
// src/apify/client.ts
import { ApifyClient } from 'apify-client';
let instance: ApifyClient | null = null;
export function getApifyClient(): ApifyClient {
if (!instance) {
const token = process.env.APIFY_TOKEN;
if (!token) throw new Error('APIFY_TOKEN is required');
instance = new ApifyClient({ token });
}
return instance;
}
// Reset for testing
export function resetClient(): void {
instance = null;
}
```
## Pattern 2: Crawlee Crawler Selection
Choose the right crawler for the job:
```typescript
import { Actor } from 'apify';
import { CheerioCrawler, PlaywrightCrawler, PuppeteerCrawler } from 'crawlee';
// CHEERIO — Fast, lightweight, no JavaScript rendering
// Use for: static HTML, server-rendered pages, APIs
const cheerioCrawler = new CheerioCrawler({
async requestHandler({ request, $, enqueueLinks }) {
const title = $('title').text();
await Actor.pushData({ url: request.url, title });
await enqueueLinks({ strategy: 'same-domain' });
},
});
// PLAYWRIGHT — Full browser, all engines, modern API
// Use for: SPAs, JavaScript-heavy pages, complex interactions
const playwrightCrawler = new PlaywrightCrawler({
launchContext: { launchOptions: { headless: true } },
async requestHandler({ page, request, enqueueLinks }) {
await page.waitForSelector('h1');
const title = await page.title();
const content = await page.$eval('main', el => el.textContent);
await Actor.pushData({ url: request.url, title, content });
await enqueueLinks({ strategy: 'same-domain' });
},
});
// PUPPETEER — Chromium-only browser automation
// Use for: when you need Chromium specifically or legacy Puppeteer code
const puppeteerCrawler = new PuppeteerCrawler({
async requestHandler({ page, request }) {
const title = await page.title();
await Actor.pushData({ url: request.url, title });
},
});
```
## Pattern 3: Actor Lifecycle with Error Handling
```typescript
import { Actor } from 'apify';
import { CheerioCrawler, log } from 'crawlee';
// Actor.main() wraps init + exit + error handling
await Actor.main(async () => {
const input = await Actor.getInput<{
startUrls: { url: string }[];
maxPages?: number;
proxyConfig?: { useApifyProxy: boolean; groups?: string[] };
}>();
if (!input?.startUrls?.length) {
throw new Error('Input must include at least one startUrl');
}
// Configure proxy if requested
const proxyConfiguration = input.proxyConfig?.useApifyProxy
? await Actor.createProxyConfiguration({
groups: input.proxyConfig.groups,
})
: undefined;
const crawler = new CheerioCrawler({
proxyConfiguration,
maxRequestsPerCrawl: input.maxPages ?? 50,
maxConcurrency: 10,
async requestHandler({ request, $, enqueueLinks }) {
log.info(`Processing ${request.url}`);
await Actor.pushData({
url: request.url,
title: $('title').text().trim(),
h1: $('h1').first().text().trim(),
paragraphs: $('p').map((_, el) => $(el).text().trim()).get(),
});
await enqueueLinks({ strategy: 'same-domain' });
},
async failedRequestHandler({ request }, error) {
log.error(`Request failed: ${request.url}`, { error: error.message });
await Actor.pushData({
url: request.url,
error: error.message,
'#isFailed': true,
});
},
});
await crawler.run(input.startUrls.map(s => s.url));
log.info(`Crawler finished. ${crawler.stats.state.requestsFinished} pages processed.`);
});
```
## Pattern 4: Dataset Operations
```typescript
import { Actor } from 'apify';
import { ApifyClient } from 'apify-client';
// --- Inside an Actor (apify SDK) ---
// Push single item
await Actor.pushData({ url: 'https://example.com', title: 'Example' });
// Push batch
await Actor.pushData([
{ url: 'https://a.com', price: 10 },
{ url: 'https://b.com', price: 20 },
]);
// Store named output in key-value store
await Actor.setValue('SUMMARY', {
totalItems: 100,
avgPrice: 15.50,
crawledAt: new Date().toISOString(),
});
// Get value back
const summary = await Actor.getValue('SUMMARY');
// --- From external app (apify-client) ---
const client = new ApifyClient({ token: process.env.APIFY_TOKEN });
// List dataset items with pagination
const { items, total } = await client
.dataset('DATASET_ID')
.listItems({ limit: 1000, offset: 0 });
// Push items to a named dataset
const dataset = await client.datasets().getOrCreate('my-results');
await client.dataset(dataset.id).pushItems([
{ url: 'https://example.com', data: 'scraped content' },
]);
// Download entire dataset
const csv = await client.dataset(dataset.id).downloadItems('csv');
const json = await client.dataset(dataset.id).downloadItems('json');
```
## Pattern 5: Key-Value Store Operations
```typescript
import { ApifyClient } from 'apify-client';
const client = new ApifyClient({ token: process.env.APIFY_TOKEN });
// Create or get a named store
const store = await client.keyValueStores().getOrCreate('my-config');
const storeClient = client.keyValueStore(store.id);
// Set a record (any content type)
await storeClient.setRecord({
key: 'CONFIG',
value: { retries: 3, timeout: 30000 },
contentType: 'application/json',
});
// Get a record
const record = await storeClient.getRecord('CONFIG');
console.log(record?.value); // { retries: 3, timeout: 30000 }
// Store binary data (screenshots, PDFs). screenshotBuffer is a Buffer obtained
// elsewhere, e.g. from Playwright's page.screenshot()
await storeClient.setRecord({
key: 'screenshot.png',
value: screenshotBuffer,
contentType: 'image/png',
});
// List all keys
const { items: keys } = await storeClient.listKeys();
```
## Pattern 6: Proxy Configuration
```typescript
import { Actor } from 'apify';
import { CheerioCrawler } from 'crawlee';
// Datacenter proxy (included in subscription, fast)
const dcProxy = await Actor.createProxyConfiguration({
groups: ['BUYPROXIES94952'],
});
// Residential proxy (pay per GB, high success rate)
const resProxy = await Actor.createProxyConfiguration({
groups: ['RESIDENTIAL'],
countryCode: 'US',
});
// Google SERP proxy (specialized for Google)
const serpProxy = await Actor.createProxyConfiguration({
groups: ['GOOGLE_SERP'],
});
// Use with any crawler
const crawler = new CheerioCrawler({
proxyConfiguration: dcProxy,
// ...
});
```
## Pattern 7: Router for Multi-Page Actors
```typescript
import { Actor } from 'apify';
import { CheerioCrawler, createCheerioRouter } from 'crawlee';
const router = createCheerioRouter();
// Default route — listing pages
router.addDefaultHandler(async ({ request, $, enqueueLinks }) => {
// Extract links to detail pages
const detailLinks = $('a.product-link')
.map((_, el) => $(el).attr('href'))
.get();
await enqueueLinks({
urls: detailLinks,
label: 'DETAIL',
});
});
// Detail route — individual item pages
router.addHandler('DETAIL', async ({ request, $ }) => {
await Actor.pushData({
url: request.url,
name: $('h1.product-name').text().trim(),
price: parseFloat($('.price').text().replace('$', '')),
description: $('div.description').text().trim(),
});
});
await Actor.main(async () => {
const crawler = new CheerioCrawler({
requestHandler: router,
});
await crawler.run(['https://example-store.com/products']);
});
```
## Pattern 8: Safe Result Wrapper
```typescript
import { ApifyClient } from 'apify-client';
type Result<T> = { data: T; error: null } | { data: null; error: Error };
async function safeActorCall<T>(
client: ApifyClient,
actorId: string,
input: Record<string, unknown>,
): Promise<Result<T[]>> {
try {
const run = await client.actor(actorId).call(input, { timeout: 300 });
if (run.status !== 'SUCCEEDED') {
return { data: null, error: new Error(`Run ${run.status}: ${run.statusMessage}`) };
}
const { items } = await client.dataset(run.defaultDatasetId).listItems();
return { data: items as T[], error: null };
} catch (err) {
return { data: null, error: err as Error };
}
}
// Usage
const client = new ApifyClient({ token: process.env.APIFY_TOKEN });
const result = await safeActorCall<{ url: string; title: string }>(
  client, 'apify/web-scraper', { startUrls: [{ url: 'https://example.com' }] }
);
if (result.error) {
console.error('Actor call failed:', result.error.message);
} else {
console.log(`Got ${result.data.length} items`);
}
```
## Error Handling
| Pattern | Use Case | Benefit |
|---------|----------|---------|
| `Actor.main()` | Actor entry point | Auto init/exit + error reporting |
| `failedRequestHandler` | Per-request failures | Log failures without stopping crawl |
| Safe wrapper | External calls | Prevents uncaught exceptions |
| Router | Multi-page scrapes | Clean separation of page types |
| Proxy rotation | Anti-bot sites | Higher success rate |
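The table above covers per-request and per-call failures; for transient network or platform errors, an exponential-backoff retry helper can wrap any client call. This is a minimal sketch, not part of either SDK; `withRetry` and its options object are hypothetical names.

```typescript
// Hypothetical retry helper with exponential backoff (not part of the Apify SDKs)
type RetryOptions = { retries?: number; baseDelayMs?: number };

async function withRetry<T>(
  fn: () => Promise<T>,
  { retries = 3, baseDelayMs = 500 }: RetryOptions = {},
): Promise<T> {
  let lastError: Error | undefined;
  for (let attempt = 0; attempt <= retries; attempt++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err as Error;
      if (attempt < retries) {
        // Wait 500ms, 1s, 2s, ... before the next attempt
        await new Promise((resolve) => setTimeout(resolve, baseDelayMs * 2 ** attempt));
      }
    }
  }
  throw lastError;
}
```

For example, `withRetry(() => client.actor('apify/web-scraper').call(input))` retries the full run-and-wait call; it composes with the safe result wrapper from Pattern 8 if you prefer errors as values.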
## Resources
- [Apify SDK Reference](https://docs.apify.com/sdk/js/reference)
- [Crawlee Documentation](https://crawlee.dev/js/docs/quick-start)
- [Apify JS Client Reference](https://docs.apify.com/api/client/js/reference)
- [Proxy Management Guide](https://docs.apify.com/sdk/js/docs/guides/proxy-management)
## Next Steps
Apply patterns in `apify-core-workflow-a` for a complete web scraping workflow.
Related Skills
exa-sdk-patterns
Apply production-ready exa-js SDK patterns with type safety, singletons, and wrappers. Use when implementing Exa integrations, refactoring SDK usage, or establishing team coding standards for Exa. Trigger with phrases like "exa SDK patterns", "exa best practices", "exa code patterns", "idiomatic exa", "exa wrapper".
exa-reliability-patterns
Implement Exa reliability patterns: query fallback chains, circuit breakers, and graceful degradation. Use when building fault-tolerant Exa integrations, implementing fallback strategies, or adding resilience to production search services. Trigger with phrases like "exa reliability", "exa circuit breaker", "exa fallback", "exa resilience", "exa graceful degradation".
evernote-sdk-patterns
Advanced Evernote SDK patterns and best practices. Use when implementing complex note operations, batch processing, search queries, or optimizing SDK usage. Trigger with phrases like "evernote sdk patterns", "evernote best practices", "evernote advanced", "evernote batch operations".
elevenlabs-sdk-patterns
Apply production-ready ElevenLabs SDK patterns for TypeScript and Python. Use when implementing ElevenLabs integrations, refactoring SDK usage, or establishing team coding standards for audio AI applications. Trigger: "elevenlabs SDK patterns", "elevenlabs best practices", "elevenlabs code patterns", "idiomatic elevenlabs", "elevenlabs typescript".
documenso-sdk-patterns
Apply production-ready Documenso SDK patterns for TypeScript and Python. Use when implementing Documenso integrations, refactoring SDK usage, or establishing team coding standards for Documenso. Trigger with phrases like "documenso SDK patterns", "documenso best practices", "documenso code patterns", "idiomatic documenso".
deepgram-sdk-patterns
Apply production-ready Deepgram SDK patterns for TypeScript and Python. Use when implementing Deepgram integrations, refactoring SDK usage, or establishing team coding standards for Deepgram. Trigger: "deepgram SDK patterns", "deepgram best practices", "deepgram code patterns", "idiomatic deepgram", "deepgram typescript".
databricks-sdk-patterns
Apply production-ready Databricks SDK patterns for Python and REST API. Use when implementing Databricks integrations, refactoring SDK usage, or establishing team coding standards for Databricks. Trigger with phrases like "databricks SDK patterns", "databricks best practices", "databricks code patterns", "idiomatic databricks".
customerio-sdk-patterns
Apply production-ready Customer.io SDK patterns. Use when implementing typed clients, retry logic, event batching, or singleton management for customerio-node. Trigger: "customer.io best practices", "customer.io patterns", "production customer.io", "customer.io architecture", "customer.io singleton".
customerio-reliability-patterns
Implement Customer.io reliability and fault-tolerance patterns. Use when building circuit breakers, fallback queues, idempotency, or graceful degradation for Customer.io integrations. Trigger: "customer.io reliability", "customer.io resilience", "customer.io circuit breaker", "customer.io fault tolerance".
coreweave-sdk-patterns
Production-ready patterns for CoreWeave GPU workload management with kubectl and Python. Use when building inference clients, managing GPU deployments programmatically, or creating reusable CoreWeave deployment templates. Trigger with phrases like "coreweave patterns", "coreweave client", "coreweave Python", "coreweave deployment template".
cohere-sdk-patterns
Apply production-ready Cohere SDK patterns for TypeScript and Python. Use when implementing Cohere integrations, refactoring SDK usage, or establishing team coding standards for Cohere API v2. Trigger with phrases like "cohere SDK patterns", "cohere best practices", "cohere code patterns", "idiomatic cohere", "cohere wrapper".
coderabbit-sdk-patterns
Apply production-ready CodeRabbit automation patterns using GitHub API and PR comments. Use when building automation around CodeRabbit reviews, processing review feedback programmatically, or integrating CodeRabbit into custom workflows. Trigger with phrases like "coderabbit automation", "coderabbit API patterns", "automate coderabbit", "coderabbit github api", "process coderabbit reviews".