browser-automation
Browser automation for accessing scientific databases that lack REST APIs. Uses the browser-use Python framework (81k+ GitHub stars) to control a real browser via LLM vision. Enables data extraction from web-only databases like GEPIA2, GeneCards advanced features, COSMIC public data, and journal full-text access. Use as a fallback when curl-based API access fails or when the target database has no programmatic API. Requires pip install browser-use and a Chromium browser.
Best use case
browser-automation is best used when you need a repeatable AI agent workflow instead of a one-off prompt.
Teams using browser-automation should expect more consistent output, faster repeated execution, and less prompt rewriting.
When to use this skill
- You want a reusable workflow that can be run more than once with consistent structure.
When not to use this skill
- You only need a quick one-off answer and do not need a reusable workflow.
- You cannot install or maintain the underlying files, dependencies, or repository context.
Installation
Claude Code / Cursor / Codex
Manual Installation
- Download SKILL.md from GitHub
- Place it in `.claude/skills/browser-automation/SKILL.md` inside your project
- Restart your AI agent; it will auto-discover the skill
How browser-automation Compares
| Feature / Agent | browser-automation | Standard Approach |
|---|---|---|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |
Frequently Asked Questions
Where can I find the source code?
You can find the source code on GitHub using the link provided at the top of the page.
SKILL.md Source
# Browser Automation for Scientific Data Collection
Access scientific databases that have no REST API by controlling a real browser programmatically. Uses the browser-use framework (vision-based LLM browser automation).
## When to Use
- Target database has **no REST API** (e.g., GEPIA2, some COSMIC pages)
- curl returns **403/captcha/login required** and the data is publicly viewable in a browser
- Need to **navigate multi-step web forms** (e.g., TIMER2.0 correlation analysis)
- Need to **download files** from web interfaces (e.g., GEO supplementary data)
- API exists but is **severely rate-limited** and web access is faster
**When NOT to use**:
- REST API is available and working → use `curl`
- Data requires paid subscription → do not circumvent paywalls
- Data can be obtained from an alternative open API → prefer the API
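The rules above can be read as a small decision procedure. The following is a minimal sketch of that reading, not part of the skill itself: the `AccessContext` fields and the `choose_access_method` function are hypothetical names introduced here for illustration.

```python
from dataclasses import dataclass


@dataclass
class AccessContext:
    """Hypothetical summary of what is known about a target database."""
    has_working_api: bool        # a REST API exists and currently responds
    open_alternative_api: bool   # the same data is available from another open API
    requires_paid_access: bool   # the data sits behind a paywall or subscription
    publicly_viewable: bool      # the data can be seen in a normal browser for free


def choose_access_method(ctx: AccessContext) -> str:
    """Return 'skip', 'api', or 'browser' following the rules listed above."""
    if ctx.requires_paid_access:
        return "skip"      # never circumvent paywalls
    if ctx.has_working_api or ctx.open_alternative_api:
        return "api"       # prefer curl or an open API when one works
    if ctx.publicly_viewable:
        return "browser"   # web-only public data: the browser fallback applies
    return "skip"
```

Checking the paywall condition first means a paid resource is never routed to the browser, even when no API exists.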
---
## Installation Check
Before using browser automation, verify the environment:
```bash
python3 -c "
import shutil

# Check for the browser-use package
try:
    import browser_use
    print('✅ browser-use installed')
except ImportError:
    print('❌ browser-use not installed')
    print('   Install: pip install browser-use')

# Check for a Chromium/Chrome binary on PATH
if shutil.which('chromium') or shutil.which('chromium-browser') or shutil.which('google-chrome'):
    print('✅ Chromium/Chrome found')
else:
    print('⚠️ No Chromium/Chrome found')
    print('   Install: apt-get install chromium-browser (Linux)')
    print('   Or: brew install --cask chromium (macOS)')

# Check for the Playwright package
try:
    import playwright
    print('✅ Playwright installed')
except ImportError:
    print('❌ Playwright not installed')
    print('   Install: pip install playwright && python -m playwright install chromium')
"
```
If not installed:
```bash
pip install -q browser-use playwright && python -m playwright install chromium
```
---
## Usage Pattern
### Basic: Extract data from a web page
```python
import asyncio
import os

# Import names follow the browser-use release this skill was written against;
# other releases may expose a different browser configuration API.
from browser_use import Agent, Browser, BrowserConfig
from langchain_openai import ChatOpenAI


async def extract_gepia2_data(gene: str, cancer: str):
    """Extract gene expression data from GEPIA2 (no API available)."""
    browser = Browser(config=BrowserConfig(headless=True))
    llm = ChatOpenAI(model="gpt-4o", api_key=os.environ["OPENAI_API_KEY"])

    agent = Agent(
        task=f"""Go to http://gepia2.cancer-pku.cn/#analysis
        1. Click on 'Expression DIY' in the left menu
        2. In the gene input box, type '{gene}'
        3. Select '{cancer}' from the cancer type dropdown
        4. Click 'Plot' button
        5. Wait for the plot to load
        6. Extract the median expression values for Tumor and Normal from the plot
        7. Return the values as JSON: {{"gene": "{gene}", "cancer": "{cancer}", "tumor_median": X, "normal_median": Y}}
        """,
        llm=llm,
        browser=browser,
    )
    try:
        result = await agent.run()
    finally:
        # Always release the browser, even if the agent raises
        await browser.close()
    return result


result = asyncio.run(extract_gepia2_data("THBS2", "PAAD"))
print(result)
```
### Batch: Collect data across multiple databases
```python
import asyncio

from browser_use import Agent, Browser, BrowserConfig
from langchain_openai import ChatOpenAI


async def collect_multi_source(gene: str):
    """Collect gene info from multiple web-only sources."""
    browser = Browser(config=BrowserConfig(headless=True))
    # ChatOpenAI reads OPENAI_API_KEY from the environment when api_key is omitted
    llm = ChatOpenAI(model="gpt-4o")

    tasks = [
        {
            "source": "GeneCards",
            "url": f"https://www.genecards.org/cgi-bin/carddisp.pl?gene={gene}",
            "extract": "Gene summary, aliases, protein class, pathways, diseases",
        },
        {
            "source": "GEPIA2",
            "url": "http://gepia2.cancer-pku.cn/#analysis",
            "extract": f"Expression of {gene} across TCGA cancer types",
        },
    ]

    results = {}
    try:
        # Reuse one browser for all sources; run the agents one after another
        for task in tasks:
            agent = Agent(
                task=f"Navigate to {task['url']} and extract: {task['extract']}. Return as structured JSON.",
                llm=llm,
                browser=browser,
            )
            results[task["source"]] = await agent.run()
    finally:
        # Close the browser once, after every source has been processed
        await browser.close()
    return results
```
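Both examples ask the agent to answer in JSON, but the answer comes from an LLM and may arrive wrapped in extra prose or a Markdown fence. The helper below is a minimal sketch for pulling the first JSON object out of such text. It assumes you have already obtained the agent's final answer as a plain string (how to do that depends on the browser-use release), and `extract_json` is a hypothetical name, not part of browser-use.

```python
import json


def extract_json(text: str):
    """Return the first JSON object found in text, or None if there is none."""
    decoder = json.JSONDecoder()
    for start, char in enumerate(text):
        if char != "{":
            continue
        try:
            # raw_decode parses one JSON value starting at `start`
            # and ignores whatever follows it
            obj, _end = decoder.raw_decode(text, start)
        except json.JSONDecodeError:
            continue  # this brace did not open valid JSON; keep scanning
        return obj
    return None
```

A `None` return is a useful signal that the extraction should be retried or logged as failed rather than stored.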
---
## Target Database Recipes
### GEPIA2 (no API)
```
Task: Go to http://gepia2.cancer-pku.cn/#analysis
1. Select 'Expression DIY' → 'Box Plot'
2. Enter gene symbol: {GENE}
3. Select cancer types or 'All'
4. Click Plot
5. Extract expression values from the resulting visualization
```
### GeneCards (enhanced data)
```
Task: Navigate to https://www.genecards.org/cgi-bin/carddisp.pl?gene={GENE}
1. Extract: Gene summary paragraph
2. Extract: Protein expression table (tissues)
3. Extract: Pathways & interactions section
4. Extract: Disorders associated section
5. Return all as structured JSON
```
### TIMER2.0 (immune analysis, web-only)
```
Task: Go to http://timer.cistrome.org/
1. Select 'Gene' module
2. Enter gene symbol: {GENE}
3. Select cancer type: {CANCER}
4. Select immune cell types: all
5. Click Submit
6. Extract correlation coefficients and p-values from the result table
```
### HPA (Human Protein Atlas)
```
Task: Navigate to https://www.proteinatlas.org/{ENSEMBL_ID}-{GENE}/pathology
1. Extract cancer expression data table
2. Extract prognostic significance across cancer types
3. Extract immunohistochemistry images metadata
```
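The recipes above use `{GENE}`, `{CANCER}`, and `{ENSEMBL_ID}` placeholders. One way to turn a recipe into an agent task is plain string formatting with a check that every placeholder was supplied. This is a sketch under stated assumptions: `fill_recipe` is a hypothetical helper, and the `GENE_SYMBOL` pattern is a deliberately loose sanity check introduced here, not an official gene-symbol grammar.

```python
import re
import string

# Loose sanity check (an assumption): letters, digits, dots, and hyphens only
GENE_SYMBOL = re.compile(r"^[A-Za-z0-9][A-Za-z0-9.\-]{0,29}$")

TIMER_RECIPE = """Go to http://timer.cistrome.org/
1. Select 'Gene' module
2. Enter gene symbol: {GENE}
3. Select cancer type: {CANCER}
4. Select immune cell types: all
5. Click Submit
6. Extract correlation coefficients and p-values from the result table"""


def fill_recipe(template: str, **values: str) -> str:
    """Substitute placeholders, failing loudly if any are missing or malformed."""
    # Collect every {NAME} field that appears in the template
    fields = {name for _, name, _, _ in string.Formatter().parse(template) if name}
    missing = fields - set(values)
    if missing:
        raise ValueError(f"missing placeholder values: {sorted(missing)}")
    gene = values.get("GENE")
    if gene is not None and not GENE_SYMBOL.match(gene):
        # Reject free text so arbitrary instructions cannot leak into the task
        raise ValueError(f"suspicious gene symbol: {gene!r}")
    return template.format(**values)


task = fill_recipe(TIMER_RECIPE, GENE="THBS2", CANCER="PAAD")
```

The resulting `task` string can then be passed as the `task` argument of an agent.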
---
## Safety and Ethics
1. **Respect robots.txt**: Check before scraping any site
2. **Rate limiting**: Wait 2-5 seconds between page navigations
3. **No credential storage**: Never save login credentials to disk
4. **Public data only**: Do not circumvent paywalls or access restrictions
5. **Attribution**: Record the source URL and access date for every data extraction
6. **Minimize requests**: Cache extracted data in the project `data/` directory
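
Several of these rules can be enforced in code rather than left to convention. The sketch below uses only the Python standard library and covers the robots.txt check, the delay between navigations, and cached records that carry their source URL and access date. The names `allowed_by_robots`, `Throttle`, and `save_extraction` are hypothetical, and the record layout is an assumption.

```python
import json
import time
from datetime import datetime, timezone
from pathlib import Path
from urllib.robotparser import RobotFileParser


def allowed_by_robots(robots_txt: str, url: str, user_agent: str = "*") -> bool:
    """Check a URL against robots.txt content that was already downloaded."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


class Throttle:
    """Enforce a minimum delay between successive page navigations."""

    def __init__(self, min_delay: float = 2.0):
        self.min_delay = min_delay
        self._last = None  # monotonic time of the previous navigation

    def wait(self) -> float:
        """Sleep if the last navigation was too recent; return seconds slept."""
        slept = 0.0
        if self._last is not None:
            remaining = self.min_delay - (time.monotonic() - self._last)
            if remaining > 0:
                time.sleep(remaining)
                slept = remaining
        self._last = time.monotonic()
        return slept


def save_extraction(data: dict, source_url: str, name: str, data_dir: str = "data") -> Path:
    """Cache extracted data together with its source URL and access date."""
    record = {
        "source_url": source_url,
        "accessed": datetime.now(timezone.utc).date().isoformat(),
        "data": data,
    }
    out = Path(data_dir)
    out.mkdir(parents=True, exist_ok=True)
    path = out / f"{name}.json"
    path.write_text(json.dumps(record, indent=2), encoding="utf-8")
    return path
```

Call `Throttle.wait()` before each navigation, and check for an existing cache file before launching the browser so repeated runs do not hit the site again.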
---
## Integration with Research Recipes
When a recipe step fails due to API unavailability:
```
curl API call for [DATABASE] failed (404/no API).
Attempting browser-based extraction via browser-use...
```
The browser fallback should:
1. Try the browser approach
2. If browser-use is not installed, suggest installation
3. If the browser approach also fails, document what was attempted and move on
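
The fallback order above can be sketched as a small wrapper. This is an illustration under assumptions, not the skill's own implementation: `fetch_with_fallback` and `browser_use_available` are hypothetical names, and the API and browser steps are passed in as plain callables so the control flow stays independent of any particular database.

```python
import importlib.util
from typing import Any, Callable


def browser_use_available() -> bool:
    """Return True if the browser_use package can be imported."""
    return importlib.util.find_spec("browser_use") is not None


def fetch_with_fallback(
    database: str,
    api_fetch: Callable[[], Any],
    browser_fetch: Callable[[], Any],
    is_installed: Callable[[], bool] = browser_use_available,
) -> dict:
    """Try the API first, then the browser, recording every failed attempt."""
    attempts = []
    try:
        return {"data": api_fetch(), "method": "api", "attempts": attempts}
    except Exception as exc:
        attempts.append(f"API call for {database} failed: {exc}")

    if not is_installed():
        # Step 2: suggest installation instead of failing silently
        attempts.append("browser-use not installed; install with: pip install browser-use")
        return {"data": None, "method": None, "attempts": attempts}

    try:
        return {"data": browser_fetch(), "method": "browser", "attempts": attempts}
    except Exception as exc:
        # Step 3: document what was attempted and let the caller move on
        attempts.append(f"Browser extraction for {database} failed: {exc}")
        return {"data": None, "method": None, "attempts": attempts}
```

The returned `attempts` list is the record of what was tried, so a failed step can be documented in the recipe output without raising.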
---
## Limitations
- Requires a display server or headless Chromium (may not work in minimal Docker containers)
- Slower than API calls (5-30 seconds per page vs <1 second for curl)
- Vision-based extraction may misread complex layouts
- Some sites actively block automation (detect and skip gracefully)
- Requires an LLM API key for the browser agent (uses GPT-4o by default)

Related Skills
zinc-database
Access ZINC (230M+ purchasable compounds). Search by ZINC ID/SMILES, similarity searches, 3D-ready structures for docking, analog discovery, for virtual screening and drug discovery.
zarr-python
Chunked N-D arrays for cloud storage. Compressed arrays, parallel I/O, S3/GCS integration, NumPy/Dask/Xarray compatible, for large-scale scientific computing pipelines.
Academic Writing
scientific-visualization
venue-templates
Access comprehensive LaTeX templates, formatting requirements, and submission guidelines for major scientific publication venues (Nature, Science, PLOS, IEEE, ACM), academic conferences (NeurIPS, ICML, CVPR, CHI), research posters, and grant proposals (NSF, NIH, DOE, DARPA). This skill should be used when preparing manuscripts for journal submission, conference papers, research posters, or grant proposals and need venue-specific formatting requirements and templates.
vaex
Use this skill for processing and analyzing large tabular datasets (billions of rows) that exceed available RAM. Vaex excels at out-of-core DataFrame operations, lazy evaluation, fast aggregations, efficient visualization of big data, and machine learning on large datasets. Apply when users need to work with large CSV/HDF5/Arrow/Parquet files, perform fast statistics on massive datasets, create visualizations of big data, or build ML pipelines that do not fit in memory.
uspto-database
Access USPTO APIs for patent/trademark searches, examination history (PEDS), assignments, citations, office actions, TSDR, for IP analysis and prior art searches.
uniprot-database
Direct REST API access to UniProt. Protein searches, FASTA retrieval, ID mapping, Swiss-Prot/TrEMBL. For Python workflows with multiple databases, prefer bioservices (unified interface to 40+ services). Use this for direct HTTP/REST work or UniProt-specific control.
umap-learn
UMAP dimensionality reduction. Fast nonlinear manifold learning for 2D/3D visualization, clustering preprocessing (HDBSCAN), supervised/parametric UMAP, for high-dimensional data.
treatment-plans
Generate concise (3-4 page), focused medical treatment plans in LaTeX/PDF format for all clinical specialties. Supports general medical treatment, rehabilitation therapy, mental health care, chronic disease management, perioperative care, and pain management. Includes SMART goal frameworks, evidence-based interventions with minimal text citations, regulatory compliance (HIPAA), and professional formatting. Prioritizes brevity and clinical actionability.
transformers
This skill should be used when working with pre-trained transformer models for natural language processing, computer vision, audio, or multimodal tasks. Use for text generation, classification, question answering, translation, summarization, image classification, object detection, speech recognition, and fine-tuning models on custom datasets.
torchdrug
PyTorch-native graph neural networks for molecules and proteins. Use when building custom GNN architectures for drug discovery, protein modeling, or knowledge graph reasoning. Best for custom model development, protein property prediction, retrosynthesis. For pre-trained models and diverse featurizers use deepchem; for benchmark datasets use pytdc.