customer-discovery
Discover all customers of a given company by scanning websites, case studies, review sites, press, social media, job postings, and more. Use when you need competitive intelligence on who a company sells to.
Best use case
customer-discovery is best used when you need a repeatable AI agent workflow instead of a one-off prompt.
Teams using customer-discovery can expect more consistent output, faster repeated execution, and less prompt rewriting.
When to use this skill
- You want a reusable workflow that can be run more than once with consistent structure.
When not to use this skill
- You only need a quick one-off answer and do not need a reusable workflow.
- You cannot install or maintain the underlying files, dependencies, or repository context.
Installation
Claude Code / Cursor / Codex
Manual Installation
- Download SKILL.md from GitHub
- Place it in `.claude/skills/customer-discovery/SKILL.md` inside your project
- Restart your AI agent; it will auto-discover the skill
How customer-discovery Compares
| Feature / Agent | customer-discovery | Standard Approach |
|---|---|---|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |
Frequently Asked Questions
What does this skill do?
Discover all customers of a given company by scanning websites, case studies, review sites, press, social media, job postings, and more. Use when you need competitive intelligence on who a company sells to.
Where can I find the source code?
You can find the source code on GitHub using the link provided at the top of the page.
SKILL.md Source
# Customer Discovery
Find all customers of a company by scanning multiple public data sources. Produces a deduplicated report with confidence scoring.
## Quick Start
```
Find all customers of Datadog
```
```
Who are Notion's customers? Use deep mode.
```
## Inputs
| Input | Required | Default | Description |
|-------|----------|---------|-------------|
| Company name | Yes | — | The company to research |
| Website URL | No | Auto-detected | The company's website URL |
| Depth | No | standard | `quick`, `standard`, or `deep` |
## Procedure
### Step 1: Gather Inputs
Ask the user for:
1. **Company name** (required)
2. **Company website URL** (optional — if not provided, WebSearch for it)
3. **Depth tier** — present these options, default to Standard:
- **Quick** (~2-3 min): Website logos, case studies, G2 reviews, press search
- **Standard** (~5-8 min): Quick + blog posts, Wayback Machine, LinkedIn, Twitter, Reddit, HN, job postings, YouTube
- **Deep** (~10-15 min): Standard + SEC filings, podcasts, GitHub, integration directories, BuiltWith, Crunchbase
### Step 2: Create Output Directory
```bash
mkdir -p customer-discovery-[company-slug]
```
### Step 3: Run Sources for Selected Tier
Collect all results into a running list. For each customer found, record:
- **name**: Company name
- **confidence**: high / medium / low
- **source_type**: e.g., "logo_wall", "case_study", "g2_review", "press", "job_posting"
- **evidence_url**: URL where the evidence was found
- **notes**: Brief description of the evidence
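The per-customer record above can be sketched as a small Python dataclass. This is an illustrative shape only (the class name and example values are not part of the skill's scripts):

```python
from dataclasses import dataclass

@dataclass
class CustomerRecord:
    """One piece of evidence that a company is a customer."""
    name: str           # company name as found
    confidence: str     # "high" | "medium" | "low"
    source_type: str    # e.g. "logo_wall", "case_study", "g2_review"
    evidence_url: str   # URL where the evidence was found
    notes: str = ""     # brief description of the evidence

# Hypothetical example record
record = CustomerRecord(
    name="Shopify",
    confidence="high",
    source_type="case_study",
    evidence_url="https://example.com/customers/shopify",
    notes="Published case study on vendor site",
)
```

Keeping one record per piece of evidence (rather than one per company) makes the later merge step in Step 4 straightforward.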
#### Quick Sources
**1. Website logo wall**
Run the scrape_website_logos.py script:
```bash
python3 skills/capabilities/customer-discovery/scripts/scrape_website_logos.py \
--url "[company-url]" --output json
```
Parse the JSON output and add each result to the customer list.
**2. Case studies page**
Use WebFetch on the company's case studies page (try `/case-studies`, `/customers`, `/resources/case-studies`). Extract customer names from page headings and content.
**3. G2/Capterra reviews**
If the `review-scraper` skill is available, use it to find reviewer companies:
```bash
python3 skills/capabilities/review-scraper/scripts/scrape_reviews.py \
--platform g2 --url "[g2-product-url]" --max-reviews 50 --output json
```
(To get the G2 product URL, first WebSearch: `site:g2.com "[company]"`.) Extract reviewer company names from the review author info.
**4. Web search for press**
WebSearch these queries and extract customer mentions from results:
- `"[company]" customer OR "case study" OR partnership`
- `"[company]" "we use" OR "switched to" OR "chose"`
#### Standard Sources (in addition to Quick)
**5. Company blog posts**
WebSearch: `site:[company-domain] customer OR "case study" OR partnership OR "customer story"`
**6. Wayback Machine logos**
Run the scrape_wayback_logos.py script:
```bash
python3 skills/capabilities/customer-discovery/scripts/scrape_wayback_logos.py \
--url "[company-url]" --output json
```
Logos marked `still_present: false` are especially interesting — they indicate former customers.
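Filtering for removed logos can be sketched as below. The JSON shape shown (a list of objects with `name` and `still_present` keys) is an assumption based on the description above; verify it against the actual script output:

```python
import json

# Hypothetical sample of scrape_wayback_logos.py JSON output;
# real field names may differ -- check the script before relying on this.
raw = '[{"name": "Acme", "still_present": false}, {"name": "Globex", "still_present": true}]'
logos = json.loads(raw)

# Logos no longer on the live site suggest former customers.
former_customers = [logo["name"] for logo in logos if not logo["still_present"]]
```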
**7. Founder/exec LinkedIn posts**
WebSearch: `site:linkedin.com "[company]" customer OR "excited to announce" OR "welcome"`
**8. Twitter/X mentions**
WebSearch: `site:twitter.com "[company]" "we use" OR "just switched to" OR "loving"`
**9. Reddit/HN mentions**
WebSearch these queries:
- `site:reddit.com "we use [company]" OR "[company] customer"`
- `site:news.ycombinator.com "[company]" customer OR user`
**10. Job postings**
WebSearch: `"experience with [company]" site:linkedin.com/jobs OR site:greenhouse.io OR site:lever.co`
Companies requiring experience with the product are likely customers.
**11. YouTube testimonials**
WebSearch: `site:youtube.com "[company]" customer OR testimonial OR review`
#### Deep Sources (in addition to Standard)
**12. SEC filings**
WebSearch: `site:sec.gov "[company]"` — Look for mentions in 10-K and 10-Q filings.
**13. Podcast transcripts**
WebSearch: `"[company]" podcast customer OR transcript OR interview`
**14. GitHub usage signals**
WebSearch: `site:github.com "[company-package-name]"`. Look for hits in dependency files such as `package.json` and `requirements.txt`.
**15. Integration directories**
WebFetch marketplace pages where the company lists integrations:
- Salesforce AppExchange
- Zapier integrations page
- Slack App Directory
- Any marketplace relevant to the company
**16. BuiltWith detection**
```bash
python3 skills/capabilities/customer-discovery/scripts/search_builtwith.py \
--technology "[company-slug]" --max-results 50 --output json
```
**17. Crunchbase**
WebSearch: `site:crunchbase.com "[company]" customers OR partners`
### Step 4: Deduplicate Results
Merge results by company name using fuzzy matching:
- Normalize: lowercase, strip suffixes (Inc, Corp, LLC, Ltd, Co., GmbH)
- Treat "Acme Inc" = "Acme" = "ACME Corp" = "acme.com" as the same company
- When merging, keep the highest confidence level and all evidence URLs
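The normalization rule above can be sketched in Python. The suffix list comes from the rules here; a production version might add true fuzzy matching (e.g. a library such as rapidfuzz), which this sketch omits:

```python
import re

# Legal suffixes to strip, per the normalization rule above
SUFFIXES = {"inc", "corp", "llc", "ltd", "co", "gmbh"}

def normalize(name: str) -> str:
    """Reduce a company name to a deduplication key."""
    key = name.lower().strip()
    key = re.sub(r"\.(com|io|ai|net|org)$", "", key)  # strip common domain TLDs
    words = [w.strip(".,") for w in key.split()]
    words = [w for w in words if w not in SUFFIXES]   # drop legal suffixes
    return " ".join(words)

# All of these collapse to the same key, "acme":
assert normalize("Acme Inc") == normalize("ACME Corp") == normalize("acme.com")
```

Records whose names normalize to the same key are merged, keeping the highest confidence level and the union of evidence URLs.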
### Step 5: Assign Confidence
Apply these rules:
**High confidence:**
- Logo on current website (from scrape_website_logos.py with confidence "high")
- Published case study or customer story
- Direct quote or testimonial on the company's site
- Official partnership page listing
**Medium confidence:**
- G2/Capterra review (reviewer's company)
- Press article mentioning customer relationship
- Job posting requiring experience with the product
- YouTube testimonial or video review
- Logo found only in Wayback Machine (was on site, now removed)
**Low confidence:**
- Single social media mention (tweet, Reddit post)
- Indirect reference ("heard good things about X")
- BuiltWith detection only (technology on site doesn't mean they're a paying customer)
- HN discussion mention
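The rules above amount to a lookup from source type to a default confidence level. A minimal sketch (the `source_type` strings are the ones suggested in Step 3; extend the table as needed):

```python
# Default confidence by source type, following the rules above.
CONFIDENCE_BY_SOURCE = {
    "logo_wall": "high",
    "case_study": "high",
    "testimonial": "high",
    "partnership_page": "high",
    "g2_review": "medium",
    "press": "medium",
    "job_posting": "medium",
    "youtube": "medium",
    "wayback_logo": "medium",
    "social_mention": "low",
    "builtwith": "low",
    "hn_mention": "low",
}

def assign_confidence(source_type: str) -> str:
    """Unknown or weak sources default to low confidence."""
    return CONFIDENCE_BY_SOURCE.get(source_type, "low")
```

Note that scrape_website_logos.py emits its own per-logo confidence; where the script provides one, prefer it over this table.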
### Step 6: Generate Report
Create two output files:
**`customer-discovery-[company]/report.md`:**
```markdown
# Customer Discovery: [Company Name]
**Date:** YYYY-MM-DD
**Depth:** quick | standard | deep
**Total customers found:** N
## High Confidence (N)
| Customer | Source | Evidence |
|----------|--------|----------|
| Shopify | Case study | [link] |
| ... | ... | ... |
## Medium Confidence (N)
| Customer | Source | Evidence |
|----------|--------|----------|
| ... | ... | ... |
## Low Confidence (N)
| Customer | Source | Evidence |
|----------|--------|----------|
| ... | ... | ... |
## Sources Scanned
- Website logo wall: [url] — N customers found
- G2 reviews: N reviews analyzed — N companies identified
- Wayback Machine: N snapshots checked — N logos found (N removed)
- Web search: N queries — N mentions
- ...
## Methodology
This report was generated using the customer-discovery skill, which scans
public data sources to identify companies that use [Company Name]. Confidence
levels reflect the strength and directness of the evidence found.
```
**`customer-discovery-[company]/customers.csv`:**
CSV with columns: `company_name,confidence,source_type,evidence_url,notes`
Write the CSV using a code block or Python script.
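A minimal sketch of producing the CSV with the stdlib `csv` module (column names from above; the row data is illustrative, and a `StringIO` buffer stands in for the output file):

```python
import csv
import io

FIELDS = ["company_name", "confidence", "source_type", "evidence_url", "notes"]

# Illustrative rows; in the skill these come from the deduplicated customer list
rows = [
    {"company_name": "Shopify", "confidence": "high",
     "source_type": "case_study",
     "evidence_url": "https://example.com/customers/shopify",
     "notes": "Published case study"},
]

# In practice, open customer-discovery-[company]/customers.csv with newline=""
buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=FIELDS)
writer.writeheader()
writer.writerows(rows)
csv_text = buf.getvalue()
```

Using `csv.DictWriter` rather than hand-joining strings keeps quoting correct when notes contain commas.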
## Scripts Reference
| Script | Purpose | Key flags |
|--------|---------|-----------|
| `scrape_website_logos.py` | Extract logos from current website | `--url`, `--output json\|summary` |
| `scrape_wayback_logos.py` | Find historical logos via Wayback Machine | `--url`, `--paths`, `--output json\|summary` |
| `search_builtwith.py` | BuiltWith technology detection (deep mode) | `--technology`, `--max-results`, `--output json\|summary` |
All scripts require `requests`: `pip3 install requests`
External skill scripts (use if available):
- `skills/capabilities/review-scraper/scripts/scrape_reviews.py` — G2/Capterra/Trustpilot reviews (requires Apify token)
- `skills/capabilities/linkedin-post-research/scripts/search_posts.py` — LinkedIn post search (requires Crustdata API key)
## Cost
- **Quick / Standard:** Free (uses WebSearch + free APIs like Wayback Machine CDX)
- **Deep:** Mostly free. BuiltWith paid API is optional (`--api-key` flag); free scraping is used by default.
- External skills (review-scraper, linkedin-post-research) may require paid API tokens.
Related Skills
lead-discovery
Orchestrator that runs first for lead generation requests. Gathers business context via website analysis or questions, identifies competitors, builds ICP, and routes to signal skills with pre-filled inputs.
voice-of-customer-synthesizer
Aggregate customer feedback from multiple sources — support tickets, NPS comments, Slack messages, G2 reviews, call transcripts, survey responses — into a unified VoC report with theme clustering, sentiment analysis, trend detection, and actionable recommendations for product, marketing, and CS teams. Chains review-scraper for public review data.
customer-win-back-sequencer
For churned accounts, research what has changed since they left — new funding, team growth, competitor dissatisfaction, product updates that address their pain — then assess re-engagement potential and generate a personalized win-back email sequence with timing recommendations. Chains web research and LinkedIn monitoring with email sequence generation.
customer-story-builder
Take raw customer inputs — interview transcripts, survey responses, Slack quotes, support tickets, review excerpts — and generate a structured case study draft with problem/solution/result narrative, pull-quotes, metric callouts, and multi-format outputs (full case study, one-pager, social proof snippet, sales deck slide). Pure reasoning skill. Use when a product marketing team has customer signal but no time to write the story.
linkedin-influencer-discovery
Discover top LinkedIn influencers and voices by topic, industry, follower count, and country. Use when you need to find the top 100 voices in a space, build influencer lists for outreach, or identify thought leaders on LinkedIn.
kol-discovery
Find Key Opinion Leaders (KOLs) in a given domain by combining web research with LinkedIn post search. Given a company/idea and target domain, generates authority keywords, searches LinkedIn posts to find prolific authors with high engagement, and merges with web-researched influencers. Use when someone wants to "find influencers in X space" or "who are the KOLs for Y industry."
signal-detection-pipeline
Detect buying signals from multiple sources, qualify leads, and generate outreach context
seo-content-engine
Build and run an SEO content engine: audit current state, identify gaps, build keyword architecture, generate content calendar, draft content.
outbound-prospecting-engine
End-to-end outbound prospecting: detect intent signals, research companies, find decision-maker contacts, personalize messaging, launch campaign.
event-prospecting-pipeline
Find attendees at conferences/events, research their companies, qualify against ICP, and launch outreach
competitor-monitoring-system
Set up and run ongoing competitive intelligence monitoring for a client. Tracks competitor content, ads, reviews, social, and product moves.
client-packet-engine
Batch client packet generator. Takes company names/URLs, runs intelligence + strategy generation, presents strategies for human selection, executes selected strategies in pitch-packet mode (no live campaigns or paid enrichment), and packages into local delivery packets.