orthogonal-exa
Neural web search - find similar content, extract pages, and run deep research
Best use case
orthogonal-exa is best used when you need a repeatable AI agent workflow instead of a one-off prompt.
Neural web search - find similar content, extract pages, and run deep research
Teams using orthogonal-exa should expect a more consistent output, faster repeated execution, less prompt rewriting.
When to use this skill
- You want a reusable workflow that can be run more than once with consistent structure.
When not to use this skill
- You only need a quick one-off answer and do not need a reusable workflow.
- You cannot install or maintain the underlying files, dependencies, or repository context.
Installation
Claude Code / Cursor / Codex
Manual Installation
- Download SKILL.md from GitHub
- Place it in
.claude/skills/orthogonal-exa/SKILL.mdinside your project - Restart your AI agent — it will auto-discover the skill
How orthogonal-exa Compares
| Feature / Agent | orthogonal-exa | Standard Approach |
|---|---|---|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |
Frequently Asked Questions
What does this skill do?
Neural web search - find similar content, extract pages, and run deep research
Where can I find the source code?
You can find the source code on GitHub using the link provided at the top of the page.
SKILL.md Source
# Exa - Neural Web Search & Research
## Setup
Read your credentials from ~/.gooseworks/credentials.json:
```bash
export GOOSEWORKS_API_KEY=$(python3 -c "import json;print(json.load(open('$HOME/.gooseworks/credentials.json'))['api_key'])")
export GOOSEWORKS_API_BASE=$(python3 -c "import json;print(json.load(open('$HOME/.gooseworks/credentials.json')).get('api_base','https://api.gooseworks.ai'))")
```
If ~/.gooseworks/credentials.json does not exist, tell the user to run: `npx gooseworks login`
All endpoints use Bearer auth: `-H "Authorization: Bearer $GOOSEWORKS_API_KEY"`
Neural search engine for finding similar content, extracting pages, and deep research.
## Capabilities
- **Exa Research**: Retrieve a paginated list of your research tasks
- **Answer**: Get an LLM answer to a question informed by Exa search results
- **Search**: The search endpoint lets you intelligently search the web and extract contents from the results
- **Get a task**: Retrieve the status and results of a previously created research task
- **Find similar links**: Find similar links to the link provided and optionally return the contents of the pages
- **Create a task**: Create an asynchronous research task that explores the web, gathers sources, synthesizes findings, and returns results with citations
- **Get contents**: Get the full page contents, summaries, and metadata for a list of URLs
## Usage
### Exa Research
Retrieve a paginated list of your research tasks. The response follows a cursor-based pagination pattern. Pass the `limit` parameter to control page size (max 50) and use the `cursor` token returned in the response to fetch subsequent pages.
Parameters:
- cursor (string) - The cursor to paginate through the results Minimum string length: `1`
- limit (number) - Number of results per page (1-50) Required range: `1 <= x <= 50`
```bash
curl -s -X POST $GOOSEWORKS_API_BASE/v1/proxy/orthogonal/run \
-H "Authorization: Bearer $GOOSEWORKS_API_KEY" \
-H "Content-Type: application/json" \
-d '{"api":"exa","path":"/research/v1"}'
```
### Answer
Get an LLM answer to a question informed by Exa search results. /answer performs an Exa search and uses an LLM to generate either:
A direct answer for specific queries. (i.e.
Parameters:
- query* (string) - The question or query to answer.
- stream (boolean) - If true, the response is returned as a server-sent events (SSS) stream.
- text (boolean) - If true, the response includes full text content in the search results
```bash
curl -s -X POST $GOOSEWORKS_API_BASE/v1/proxy/orthogonal/run \
-H "Authorization: Bearer $GOOSEWORKS_API_KEY" \
-H "Content-Type: application/json" \
-d '{"api":"exa","path":"/answer","body":{"query":"What are the best practices for prompt engineering?"}}'
```
### Search
The search endpoint lets you intelligently search the web and extract contents from the results.By default, it automatically chooses the best search method using Exa’s embeddings-based model and other techniques to find the most relevant results for your query.
Parameters:
- query* (string) - The query string for the search.
- additionalQueries (string[]) - Additional query variations for deep search. Only works with type="deep". When provided, these queries are used alongside the main query for comprehensive results.
- type (enum<string>) - The type of search. Neural uses an embeddings-based model, auto (default) intelligently combines neural and other search methods, fast uses streamlined versions of the search models, and deep provides comprehensive search with query expansion and detailed context.
- category (enum<string>) - A data category to focus on. The people and company categories have improved quality for finding LinkedIn profiles and company pages. Note: The company and people categories only support a limited set of filters. The following parameters are NOT supported for these categories: startPublishedDate, endPublishedDate, startCrawlDate, endCrawlDate, includeText, excludeText, excludeDomains. For people category, includeDomains only accepts LinkedIn domains. Using unsupported parameters will result in a 400 error.
- userLocation (string) - The two-letter ISO country code of the user, e.g. US.
- numResults (integer) - Number of results to return. Limits vary by search type: With "neural": max 100 results With "deep": max 100 results If you want to increase the num results beyond these limits, contact sales (hello@exa.ai)
- includeDomains (string[]) - List of domains to include in the search. If specified, results will only come from these domains.
- excludeDomains (string[]) - List of domains to exclude from search results. If specified, no results will be returned from these domains.
- startCrawlDate (string<date-time>) - Crawl date refers to the date that Exa discovered a link. Results will include links that were crawled after this date. Must be specified in ISO 8601 format.
- endCrawlDate (string<date-time>) - Crawl date refers to the date that Exa discovered a link. Results will include links that were crawled before this date. Must be specified in ISO 8601 format.
- startPublishedDate (string<date-time>) - Only links with a published date after this will be returned. Must be specified in ISO 8601 format.
- endPublishedDate (string<date-time>) - Only links with a published date before this will be returned. Must be specified in ISO 8601 format.
- includeText (string[]) - List of strings that must be present in webpage text of results. Currently, only 1 string is supported, of up to 5 words.
- excludeText (string[]) - List of strings that must not be present in webpage text of results. Currently, only 1 string is supported, of up to 5 words. Checks from the first 1000 words of the webpage text.
- context (string) - Return page contents as a context string for LLM. When true, combines all result contents into one string. We recommend using 10000+ characters for best results, though no limit works best. Context strings often perform better than highlights for RAG applications.
- moderation (boolean) - Enable content moderation to filter unsafe content from search results.
- contents (object)
```bash
curl -s -X POST $GOOSEWORKS_API_BASE/v1/proxy/orthogonal/run \
-H "Authorization: Bearer $GOOSEWORKS_API_KEY" \
-H "Content-Type: application/json" \
-d '{"api":"exa","path":"/search"}'
"query": "startups building AI coding assistants",
"num_results": 10,
"contents": {"text": true}
}'
```
### Get a task
Retrieve the status and results of a previously created research task.Use the unique researchId returned from POST /research/v1 to poll until the task is finished.
Parameters:
- stream (string) - Set to "true" to receive real-time updates via Server-Sent Events (SSE)
- events (string) - Set to "true" to include the detailed event log of all operations performed
```bash
curl -s -X POST $GOOSEWORKS_API_BASE/v1/proxy/orthogonal/run \
-H "Authorization: Bearer $GOOSEWORKS_API_KEY" \
-H "Content-Type: application/json" \
-d '{"api":"exa","path":"/research/v1/{researchId}"}'
```
### Find similar links
Find similar links to the link provided and optionally return the contents of the pages.
Parameters:
- url* (string) - The url for which you would like to find similar links.
- numResults (integer) - Number of results to return. Limits vary by search type: With "neural": max 100 results With "deep": max 100 results If you want to increase the num results beyond these limits, contact sales (hello@exa.ai)
- includeDomains (string[]) - List of domains to include in the search. If specified, results will only come from these domains.
- excludeDomains (string[]) - List of domains to exclude from search results. If specified, no results will be returned from these domains.
- startCrawlDate (string<date-time>) - Crawl date refers to the date that Exa discovered a link. Results will include links that were crawled after this date. Must be specified in ISO 8601 format.
- endCrawlDate (string<date-time>) - Crawl date refers to the date that Exa discovered a link. Results will include links that were crawled before this date. Must be specified in ISO 8601 format.
- startPublishedDate (string<date-time>) - Only links with a published date after this will be returned. Must be specified in ISO 8601 format.
- endPublishedDate (string<date-time>) - Only links with a published date before this will be returned. Must be specified in ISO 8601 format.
- includeText (string[]) - List of strings that must be present in webpage text of results. Currently, only 1 string is supported, of up to 5 words.
- excludeText (string[]) - List of strings that must not be present in webpage text of results. Currently, only 1 string is supported, of up to 5 words. Checks from the first 1000 words of the webpage text.
- context (string) - Return page contents as a context string for LLM. When true, combines all result contents into one string. We recommend using 10000+ characters for best results, though no limit works best. Context strings often perform better than highlights for RAG applications.
- moderation (boolean) - Enable content moderation to filter unsafe content from search results.
- contents (object)
```bash
curl -s -X POST $GOOSEWORKS_API_BASE/v1/proxy/orthogonal/run \
-H "Authorization: Bearer $GOOSEWORKS_API_KEY" \
-H "Content-Type: application/json" \
-d '{"api":"exa","path":"/findSimilar"}'
"url": "https://example.com/article",
"num_results": 10
}'
```
### Create a task
Create an asynchronous research task that explores the web, gathers sources, synthesizes findings, and returns results with citations.
Parameters:
- instructions* (string) - Instructions for what you would like research on. A good prompt clearly defines what information you want to find, how research should be conducted, and what the output should look like.
- model (enum<string>) - Research model to use. exa-research is faster and cheaper, while exa-research-pro provides more thorough analysis and stronger reasoning.
- outputSchema (object) - JSON Schema to enforce structured output. When provided, the research output will be validated against this schema and returned as parsed JSON.
```bash
curl -s -X POST $GOOSEWORKS_API_BASE/v1/proxy/orthogonal/run \
-H "Authorization: Bearer $GOOSEWORKS_API_KEY" \
-H "Content-Type: application/json" \
-d '{"api":"exa","path":"/research/v1","body":{"instructions":"Research the current state of AI coding assistants"}}'
```
### Get contents
Get the full page contents, summaries, and metadata for a list of URLs.Returns instant results from our cache, with automatic live crawling as fallback for uncached pages.
Parameters:
- urls* (string[]) - Array of URLs to crawl (backwards compatible with 'ids' parameter).
- ids (string[]) - Deprecated - use 'urls' instead. Array of document IDs obtained from searches.
- text (string) - If true, returns full page text with default settings. If false, disables text return.
- highlights (object) - Text snippets the LLM identifies as most relevant from each page.
- summary (object) - Summary of the webpage
- livecrawl (enum<string>) - Options for livecrawling pages.'never': Disable livecrawling (default for neural search).'fallback': Livecrawl when cache is empty.'preferred': Always try to livecrawl, but fall back to cache if crawling fails.'always': Always live-crawl, never use cache. Only use if you cannot tolerate any cached content. This option is not recommended unless consulted with the Exa team.
- livecrawlTimeout (integer) - The timeout for livecrawling in milliseconds.
- subpages (integer) - The number of subpages to crawl. The actual number crawled may be limited by system constraints.
- subpageTarget (string) - Term to find specific subpages of search results. Can be a single string or an array of strings, comma delimited.
- extras (object) - Extra parameters to pass.
- context (string) - Return page contents as a context string for LLM. When true, combines all result contents into one string. We recommend using 10000+ characters for best results, though no limit works best. Context strings often perform better than highlights for RAG applications.
```bash
curl -s -X POST $GOOSEWORKS_API_BASE/v1/proxy/orthogonal/run \
-H "Authorization: Bearer $GOOSEWORKS_API_KEY" \
-H "Content-Type: application/json" \
-d '{"api":"exa","path":"/contents"}'
"ids": ["https://example.com"],
"text": true,
"summary": true
}'
```
## Use Cases
1. **Competitive Research**: Find companies similar to competitors
2. **Content Discovery**: Find related articles and resources
3. **Market Research**: Discover companies in specific niches
4. **Fact-Finding**: Get sourced answers to questions
5. **Deep Research**: Comprehensive research on complex topics
## Discover More
For full endpoint details and parameters:
```bash
curl -s -X POST $GOOSEWORKS_API_BASE/v1/proxy/orthogonal/search \
-H "Authorization: Bearer $GOOSEWORKS_API_KEY" \
-H "Content-Type: application/json" \
-d '{"prompt":"exa API endpoints"}' List all endpoints
curl -s -X POST $GOOSEWORKS_API_BASE/v1/proxy/orthogonal/details \
-H "Authorization: Bearer $GOOSEWORKS_API_KEY" \
-H "Content-Type: application/json" \
-d '{"api":"exa","path":"/research"}' # Get endpoint details
```Related Skills
orthogonal-yc-batch-evaluator
Evaluate YC batch companies for investment — scrapes the YC directory, researches each company and its founders (work history, LinkedIn, website), assesses founder-company fit, and exports to Google Sheets with priority rankings. Use when asked to evaluate YC companies, research a YC batch, screen startups, or do due diligence on YC companies.
orthogonal-website-screenshot
Take screenshots of websites and web pages
orthogonal-weather
Get current weather and forecasts using free APIs (no API key required). Use when asked about weather, temperature, forecasts, or climate conditions for any location.
orthogonal-weather-forecast
Get weather forecasts - temperature, precipitation, wind, and conditions
orthogonal-vhs-terminal-recordings
Create polished terminal GIF recordings using VHS (Video Hardware Software) by Charmbracelet. Use when asked to create terminal demos, CLI gifs, command-line recordings, or animated terminal screenshots for documentation, READMEs, or marketing.
orthogonal-verify-email
Verify if an email address is valid and deliverable
orthogonal-valyu
Web search, AI answers, content extraction, and async deep research
orthogonal-uptime-monitor
Monitor website uptime - check availability, response times, and status
orthogonal-twitter-profile-lookup
Look up Twitter/X profiles - get bio, followers, tweets, and engagement
orthogonal-tomba
Email finder and verifier - find emails from domains, LinkedIn, or company search
orthogonal-tiktok-search
Search TikTok - find profiles, videos, hashtags, and trending content
orthogonal-textbelt
Send SMS messages programmatically - simple HTTP API for text messaging