firecrawl-migration-deep-dive

Migrate to Firecrawl from Puppeteer, Playwright, Cheerio, or other scraping tools. Use when replacing custom scraping code with Firecrawl, migrating between scraping APIs, or re-platforming content ingestion pipelines. Trigger with phrases like "migrate to firecrawl", "replace puppeteer with firecrawl", "switch to firecrawl", "firecrawl vs puppeteer", "firecrawl migration".

Best use case

firecrawl-migration-deep-dive is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Teams using firecrawl-migration-deep-dive can expect more consistent output, faster repeated execution, and less prompt rewriting.

When to use this skill

  • You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

  • You only need a quick one-off answer and do not need a reusable workflow.
  • You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$ curl -o ~/.claude/skills/firecrawl-migration-deep-dive/SKILL.md --create-dirs "https://raw.githubusercontent.com/jeremylongshore/claude-code-plugins-plus-skills/main/plugins/saas-packs/firecrawl-pack/skills/firecrawl-migration-deep-dive/SKILL.md"

Manual Installation

  1. Download SKILL.md from GitHub
  2. Place it in .claude/skills/firecrawl-migration-deep-dive/SKILL.md inside your project
  3. Restart your AI agent — it will auto-discover the skill

How firecrawl-migration-deep-dive Compares

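| Feature / Agent | firecrawl-migration-deep-dive | Standard Approach |
|-----------------|-------------------------------|-------------------|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |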

SKILL.md Source

# Firecrawl Migration Deep Dive

## Current State
!`npm list puppeteer playwright cheerio 2>/dev/null | grep -E "puppeteer|playwright|cheerio" || echo 'No scraping libs found'`

## Overview
Migrate from custom scraping (Puppeteer, Playwright, Cheerio) or competing APIs to Firecrawl. Firecrawl eliminates browser management, anti-bot handling, and JS rendering infrastructure. This skill shows equivalent code for common scraping patterns.

## Migration Comparison

| Feature | Puppeteer/Playwright | Cheerio | Firecrawl |
|---------|---------------------|---------|-----------|
| JS rendering | Manual browser | No | Automatic |
| Anti-bot bypass | DIY (stealth plugin) | No | Built-in |
| Output format | Raw HTML | Parsed HTML | Markdown/JSON/HTML |
| Infrastructure | Browser instances | None | API call |
| Concurrent scraping | Manage browser pool | Simple | Managed by Firecrawl |
| Cost model | Compute (CPU/RAM) | Free | Credits per page |

## Instructions

### Step 1: Replace Puppeteer Single-Page Scrape

```typescript
// BEFORE: Puppeteer (manual browser management)
import puppeteer from "puppeteer";

async function scrapePuppeteer(url: string) {
  const browser = await puppeteer.launch();
  try {
    const page = await browser.newPage();
    await page.goto(url, { waitUntil: "networkidle2" });
    const html = await page.content();
    const title = await page.title();
    return { html, title };
  } finally {
    // Close the browser even if navigation throws, so no process is leaked
    await browser.close();
  }
}

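// AFTER: Firecrawl (no browser needed)
import FirecrawlApp from "@mendable/firecrawl-js";

const firecrawl = new FirecrawlApp({ apiKey: process.env.FIRECRAWL_API_KEY! });

async function scrapeFirecrawl(url: string) {
  const result = await firecrawl.scrapeUrl(url, {
    formats: ["markdown"],
    onlyMainContent: true,
    waitFor: 2000, // ms to wait for client-side rendering before capture
  });
  // scrapeUrl resolves to a success or an error object; check before reading fields
  if (!result.success) {
    throw new Error(`Firecrawl scrape failed: ${result.error}`);
  }
  return { markdown: result.markdown, title: result.metadata?.title };
}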
```

### Step 2: Replace Cheerio HTML Parsing

```typescript
// BEFORE: fetch + cheerio (manual parsing)
import * as cheerio from "cheerio";

async function scrapeCheerio(url: string) {
  const html = await fetch(url).then(r => r.text());
  const $ = cheerio.load(html);
  return {
    title: $("h1").first().text(),
    content: $("main").text(),
    links: $("a").map((_, el) => $(el).attr("href")).get(),
  };
}

// AFTER: Firecrawl with extract (LLM-powered, no CSS selectors)
// Reuses the `firecrawl` client created in Step 1.
async function extractFirecrawl(url: string) {
  const result = await firecrawl.scrapeUrl(url, {
    formats: ["extract", "links"],
    extract: {
      schema: {
        type: "object",
        properties: {
          title: { type: "string" },
          content: { type: "string" },
        },
      },
    },
  });
  // Check for the error variant before reading extracted fields
  if (!result.success) {
    throw new Error(`Firecrawl extract failed: ${result.error}`);
  }
  return {
    title: result.extract?.title,
    content: result.extract?.content,
    links: result.links,
  };
}
```
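
Because the `extract` result is produced by an LLM rather than by deterministic selectors, fields may come back missing or empty. The following is a minimal sketch of a runtime check, assuming the `title`/`content` schema from the example above. `ExtractedPage`, `isExtractedPage`, and `acceptOrReject` are hypothetical names for illustration, not part of the Firecrawl SDK.

```typescript
// Hypothetical shape matching the schema passed to `extract` in Step 2.
interface ExtractedPage {
  title: string;
  content: string;
}

// Type guard: true only when both fields are present and are non-blank strings.
function isExtractedPage(value: unknown): value is ExtractedPage {
  if (typeof value !== "object" || value === null) return false;
  const record = value as Record<string, unknown>;
  return (
    typeof record.title === "string" &&
    record.title.trim().length > 0 &&
    typeof record.content === "string" &&
    record.content.trim().length > 0
  );
}

// Returns the validated page, or null so the caller can retry or use the legacy parser.
function acceptOrReject(extracted: unknown): ExtractedPage | null {
  return isExtractedPage(extracted) ? extracted : null;
}
```

A caller could pass `result.extract` to `acceptOrReject` and route `null` results to the old Cheerio path while the migration is still in progress.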

### Step 3: Replace Crawl Pipeline

```typescript
// BEFORE: Playwright crawler (100+ lines, queue, browser pool)
// - launch browser pool
// - manage visited URLs set
// - extract links, enqueue
// - handle errors per page
// - close browsers on exit

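// AFTER: Firecrawl crawl (single SDK call)
// Reuses the `firecrawl` client created in Step 1.
async function crawlSite(baseUrl: string) {
  const result = await firecrawl.crawlUrl(baseUrl, {
    limit: 100,
    maxDepth: 3,
    includePaths: ["/docs/*", "/api/*"],
    excludePaths: ["/blog/*"],
    scrapeOptions: {
      formats: ["markdown"],
      onlyMainContent: true,
    },
  });
  // Check for the error variant before reading crawled pages
  if (!result.success) {
    throw new Error(`Firecrawl crawl failed: ${result.error}`);
  }

  return (result.data ?? []).map(page => ({
    url: page.metadata?.sourceURL,
    title: page.metadata?.title,
    content: page.markdown,
  }));
}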
```

### Step 4: Gradual Migration with Adapter Pattern

```typescript
import FirecrawlApp from "@mendable/firecrawl-js";

// Adapter interface for gradual migration
interface ScrapeAdapter {
  scrape(url: string): Promise<{ title: string; content: string }>;
  crawl(url: string, maxPages: number): Promise<Array<{ url: string; content: string }>>;
}

class FirecrawlAdapter implements ScrapeAdapter {
  private client: FirecrawlApp;

  constructor() {
    this.client = new FirecrawlApp({ apiKey: process.env.FIRECRAWL_API_KEY! });
  }

  async scrape(url: string) {
    const result = await this.client.scrapeUrl(url, {
      formats: ["markdown"],
      onlyMainContent: true,
    });
    // Surface API failures instead of silently returning empty content
    if (!result.success) {
      throw new Error(`Firecrawl scrape failed for ${url}: ${result.error}`);
    }
    return {
      title: result.metadata?.title || "",
      content: result.markdown || "",
    };
  }

  async crawl(url: string, maxPages: number) {
    const result = await this.client.crawlUrl(url, {
      limit: maxPages,
      scrapeOptions: { formats: ["markdown"], onlyMainContent: true },
    });
    if (!result.success) {
      throw new Error(`Firecrawl crawl failed for ${url}: ${result.error}`);
    }
    return (result.data || []).map(page => ({
      url: page.metadata?.sourceURL || url,
      content: page.markdown || "",
    }));
  }
}

// Feature flag controlled migration.
// LegacyPuppeteerAdapter is your existing Puppeteer code wrapped in the
// same ScrapeAdapter interface (not shown here).
function getScrapeAdapter(): ScrapeAdapter {
  if (process.env.USE_FIRECRAWL === "true") {
    return new FirecrawlAdapter();
  }
  return new LegacyPuppeteerAdapter();
}
```
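
One possible extension of the adapter pattern is to keep the legacy scraper as a safety net while the new path is being trusted. The sketch below wraps two `ScrapeAdapter` implementations and falls back to the second when the first throws. `FallbackAdapter` and its `onFallback` callback are hypothetical names introduced here, and the interface is repeated only so the block stands alone.

```typescript
// Same interface as in Step 4, repeated so this sketch is self-contained.
interface ScrapeAdapter {
  scrape(url: string): Promise<{ title: string; content: string }>;
  crawl(url: string, maxPages: number): Promise<Array<{ url: string; content: string }>>;
}

// Hypothetical wrapper: try the primary adapter, use the fallback on any error.
class FallbackAdapter implements ScrapeAdapter {
  private readonly primary: ScrapeAdapter;
  private readonly fallback: ScrapeAdapter;
  private readonly onFallback: (url: string, error: unknown) => void;

  constructor(
    primary: ScrapeAdapter,
    fallback: ScrapeAdapter,
    onFallback: (url: string, error: unknown) => void = () => {},
  ) {
    this.primary = primary;
    this.fallback = fallback;
    this.onFallback = onFallback;
  }

  async scrape(url: string) {
    try {
      return await this.primary.scrape(url);
    } catch (error) {
      this.onFallback(url, error); // e.g. log or count fallbacks per URL
      return this.fallback.scrape(url);
    }
  }

  async crawl(url: string, maxPages: number) {
    try {
      return await this.primary.crawl(url, maxPages);
    } catch (error) {
      this.onFallback(url, error);
      return this.fallback.crawl(url, maxPages);
    }
  }
}
```

With the classes from Step 4, this could be wired up as `new FallbackAdapter(new FirecrawlAdapter(), new LegacyPuppeteerAdapter(), (url, err) => console.warn(url, err))`. Counting fallbacks gives a rough signal for when the legacy path can be removed.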

### Step 5: Remove Old Dependencies
```bash
set -euo pipefail
# Run only after the migration is complete and verified

# Remove downloaded browsers first, while the Playwright CLI is still installed
npx playwright uninstall --all 2>/dev/null || true

npm uninstall puppeteer puppeteer-core
npm uninstall playwright @playwright/test
npm uninstall cheerio

# Verify no lingering references
grep -rE "puppeteer|playwright|cheerio" src/ --include="*.ts" || echo "Clean!"
```

## Migration Checklist
- [ ] Install `@mendable/firecrawl-js`
- [ ] Create adapter layer wrapping Firecrawl
- [ ] Replace single-page scrapes with `scrapeUrl`
- [ ] Replace crawl loops with `crawlUrl`
- [ ] Replace HTML parsing with `extract` or markdown
- [ ] Add a feature flag to switch between old and new
- [ ] Run both in parallel, compare outputs
- [ ] Remove old scraping dependencies
- [ ] Delete browser management code
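
The "run both in parallel, compare outputs" item can be sketched as a small harness. This is an illustration under stated assumptions, not a prescribed method: it uses Jaccard similarity over word tokens as a rough agreement score, and `contentSimilarity`, `compareScrapers`, and the `0.8` threshold are hypothetical choices. If the legacy scraper returns raw HTML, reduce it to text first, since tag names would otherwise count as tokens.

```typescript
// Minimal result shape shared by the old and new scrapers (assumed, matches Step 4).
interface ScrapeResult {
  title: string;
  content: string;
}

// Lowercase word tokens; punctuation and Markdown symbols are dropped.
function tokenize(text: string): Set<string> {
  return new Set(text.toLowerCase().match(/[a-z0-9]+/g) ?? []);
}

// Jaccard similarity of the two token sets: 1 = same vocabulary, 0 = disjoint.
function contentSimilarity(a: string, b: string): number {
  const tokensA = tokenize(a);
  const tokensB = tokenize(b);
  if (tokensA.size === 0 && tokensB.size === 0) return 1;
  let shared = 0;
  tokensA.forEach(token => {
    if (tokensB.has(token)) shared++;
  });
  return shared / (tokensA.size + tokensB.size - shared);
}

// Run both scrapers concurrently for one URL and report how closely they agree.
async function compareScrapers(
  url: string,
  legacy: (url: string) => Promise<ScrapeResult>,
  candidate: (url: string) => Promise<ScrapeResult>,
  threshold = 0.8, // hypothetical cut-off; tune against your own pages
) {
  const [oldResult, newResult] = await Promise.all([legacy(url), candidate(url)]);
  const similarity = contentSimilarity(oldResult.content, newResult.content);
  return {
    url,
    similarity,
    titleMatches: oldResult.title.trim() === newResult.title.trim(),
    passed: similarity >= threshold,
  };
}
```

Running this over a sample of production URLs and reviewing the entries where `passed` is false is one way to decide when the feature flag can be switched on by default.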

## Error Handling
| Issue | Cause | Solution |
|-------|-------|----------|
| Different output format | Puppeteer returns HTML, Firecrawl markdown | Adjust downstream consumers |
| Missing CSS selector data | Firecrawl doesn't use selectors | Use `extract` with JSON schema |
| Higher latency for single pages | API call vs local browser | Acceptable trade-off for zero infra |
| Content differences | Different JS wait timing | Tune `waitFor` parameter |
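
For the `waitFor` row above, one way to pick a value is to probe a few waits and stop once the returned content no longer grows. The sketch below is a heuristic under assumptions: `findStableWait`, the candidate waits, and the 5% tolerance are all hypothetical, and the `scrape` argument stands in for a wrapper around the Firecrawl call from Step 1 that passes `waitMs` as `waitFor`. It stops at the first plateau, so content that appears only after a long delay can be missed.

```typescript
// Hypothetical scrape function: returns page markdown for a given wait in milliseconds.
type ScrapeWithWait = (url: string, waitMs: number) => Promise<string>;

// Try increasing waits; return the last wait that still produced meaningful growth.
async function findStableWait(
  url: string,
  scrape: ScrapeWithWait,
  candidates: number[] = [0, 1000, 2000, 5000], // hypothetical values, ascending
  tolerance = 0.05, // growth of 5% or less counts as "stable"
): Promise<{ waitMs: number; length: number }> {
  if (candidates.length === 0) throw new Error("candidates must not be empty");

  const firstWait = candidates[0];
  let best = { waitMs: firstWait, length: (await scrape(url, firstWait)).length };

  for (const waitMs of candidates.slice(1)) {
    const length = (await scrape(url, waitMs)).length;
    // Relative growth compared with the best result so far
    const growth =
      best.length === 0
        ? (length > 0 ? Infinity : 0)
        : (length - best.length) / best.length;
    if (growth <= tolerance) break; // content no longer grows meaningfully
    best = { waitMs, length };
  }
  return best;
}
```

Each probe is a real scrape, so under the credits-per-page cost model it makes sense to run this on a handful of representative URLs rather than on every page.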

## Resources
- [Firecrawl Introduction](https://docs.firecrawl.dev/introduction)
- [Firecrawl Scrape Options](https://docs.firecrawl.dev/features/scrape)
- [Advanced Scraping Guide](https://docs.firecrawl.dev/advanced-scraping-guide)

## Next Steps
For advanced troubleshooting, see `firecrawl-advanced-troubleshooting`.

Related Skills

workhuman-upgrade-migration

1868
from jeremylongshore/claude-code-plugins-plus-skills

Workhuman upgrade migration for employee recognition and rewards API. Use when integrating Workhuman Social Recognition, or building recognition workflows with HRIS systems. Trigger: "workhuman upgrade migration".

wispr-upgrade-migration

1868
from jeremylongshore/claude-code-plugins-plus-skills

Wispr Flow upgrade migration for voice-to-text API integration. Use when integrating Wispr Flow dictation, WebSocket streaming, or building voice-powered applications. Trigger: "wispr upgrade migration".

windsurf-upgrade-migration

1868
from jeremylongshore/claude-code-plugins-plus-skills

Upgrade Windsurf IDE, migrate settings from VS Code or Cursor, and handle breaking changes. Use when upgrading Windsurf versions, migrating from another editor, or handling configuration changes after updates. Trigger with phrases like "upgrade windsurf", "windsurf update", "migrate to windsurf", "windsurf from cursor", "windsurf from vscode".

windsurf-migration-deep-dive

1868
from jeremylongshore/claude-code-plugins-plus-skills

Migrate to Windsurf from VS Code, Cursor, or other AI IDEs with full configuration transfer. Use when migrating a team to Windsurf, transferring Cursor rules, or evaluating Windsurf against other AI editors. Trigger with phrases like "migrate to windsurf", "switch to windsurf", "windsurf from cursor", "windsurf from copilot", "windsurf evaluation".

webflow-upgrade-migration

1868
from jeremylongshore/claude-code-plugins-plus-skills

Analyze, plan, and execute Webflow SDK upgrades (webflow-api v1 to v3) with breaking change detection, API v1-to-v2 migration, and deprecation handling. Trigger with phrases like "upgrade webflow", "webflow migration", "webflow breaking changes", "update webflow SDK", "webflow v1 to v2".

webflow-migration-deep-dive

1868
from jeremylongshore/claude-code-plugins-plus-skills

Execute major Webflow migrations — from other CMS platforms to Webflow CMS, between Webflow sites, or large-scale content re-architecture using the Data API v2 bulk endpoints, strangler fig pattern, and data validation. Trigger with phrases like "migrate to webflow", "webflow migration", "import into webflow", "webflow replatform", "move content to webflow", "webflow bulk import", "wordpress to webflow".

vercel-upgrade-migration

1868
from jeremylongshore/claude-code-plugins-plus-skills

Upgrade Vercel CLI, Node.js runtime, and Next.js framework versions with breaking change detection. Use when upgrading Vercel CLI versions, migrating Node.js runtimes, or updating Next.js between major versions on Vercel. Trigger with phrases like "upgrade vercel", "vercel migration", "vercel breaking changes", "update vercel CLI", "next.js upgrade on vercel".

vercel-migration-deep-dive

1868
from jeremylongshore/claude-code-plugins-plus-skills

Migrate to Vercel from other platforms or re-architecture existing Vercel deployments. Use when migrating from Netlify, AWS, or Cloudflare to Vercel, or when re-platforming an existing Vercel application. Trigger with phrases like "migrate to vercel", "vercel migration", "switch to vercel", "netlify to vercel", "aws to vercel", "vercel replatform".

veeva-upgrade-migration

1868
from jeremylongshore/claude-code-plugins-plus-skills

Veeva Vault upgrade migration for REST API and clinical operations. Use when working with Veeva Vault document management and CRM. Trigger: "veeva upgrade migration".

veeva-migration-deep-dive

1868
from jeremylongshore/claude-code-plugins-plus-skills

Veeva Vault migration deep dive for enterprise operations. Use when implementing advanced Veeva Vault patterns. Trigger: "veeva migration deep dive".

vastai-upgrade-migration

1868
from jeremylongshore/claude-code-plugins-plus-skills

Upgrade Vast.ai CLI, migrate API versions, and handle breaking changes. Use when upgrading vastai CLI, detecting deprecations, or migrating between API versions. Trigger with phrases like "upgrade vastai", "vastai migration", "vastai breaking changes", "update vastai CLI".

vastai-migration-deep-dive

1868
from jeremylongshore/claude-code-plugins-plus-skills

Migrate GPU workloads to or from Vast.ai, or between GPU providers. Use when switching from AWS/GCP/Azure GPU instances to Vast.ai, migrating between GPU types, or re-platforming ML infrastructure. Trigger with phrases like "migrate to vastai", "vastai migration", "switch to vastai", "vastai from aws", "vastai from lambda".