browser-automation

Browser automation for accessing scientific databases that lack REST APIs. Uses the browser-use Python framework (81k+ GitHub stars) to control a real browser via LLM vision. Enables data extraction from web-only databases like GEPIA2, GeneCards advanced features, COSMIC public data, and journal full-text access. Use as a fallback when curl-based API access fails or when the target database has no programmatic API. Requires pip install browser-use and a Chromium browser.

42 stars

by Zaoqu-Liu


Best use case

browser-automation is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Teams using browser-automation should expect more consistent output, faster repeated execution, and less prompt rewriting.

When to use this skill

You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

You only need a quick one-off answer and do not need a reusable workflow.
You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$ curl -o ~/.claude/skills/browser-automation/SKILL.md --create-dirs "https://raw.githubusercontent.com/Zaoqu-Liu/ScienceClaw/main/skills/browser-automation/SKILL.md"

Manual Installation

Download SKILL.md from GitHub
Place it in .claude/skills/browser-automation/SKILL.md inside your project
Restart your AI agent — it will auto-discover the skill

How browser-automation Compares

Feature / Agent	browser-automation	Standard Approach
Platform Support	Not specified	Limited / Varies
Context Awareness	High	Baseline
Installation Complexity	Unknown	N/A

Frequently Asked Questions

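What does this skill do?

It uses the browser-use Python framework to control a real browser and extract data from scientific databases that lack REST APIs. It is meant as a fallback when curl-based API access fails or no programmatic API exists.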

Where can I find the source code?

You can find the source code on GitHub using the link provided at the top of the page.

SKILL.md Source

# Browser Automation for Scientific Data Collection

Access scientific databases that have no REST API by controlling a real browser programmatically. Uses the browser-use framework (vision-based LLM browser automation).

## When to Use

- Target database has **no REST API** (e.g., GEPIA2, some COSMIC pages)
- curl returns **403/captcha/login required** and the data is publicly viewable in a browser
- Need to **navigate multi-step web forms** (e.g., TIMER2.0 correlation analysis)
- Need to **download files** from web interfaces (e.g., GEO supplementary data)
- API exists but is **severely rate-limited** and web access is faster

**When NOT to use**:
- REST API is available and working → use `curl`
- Data requires paid subscription → do not circumvent paywalls
- Data can be obtained from an alternative open API → prefer the API
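
The rules above can be read as a small decision function. The following is a minimal sketch, not part of the skill itself: `choose_access_method` and its parameters are hypothetical names, and treating only HTTP 403 as the "blocked" status is an assumption taken from the list above.

```python
from typing import Optional

# Assumption: 403 is the only status treated as "curl is blocked", per the list above.
BLOCKED_STATUSES = {403}


def choose_access_method(
    has_rest_api: bool,
    api_status: Optional[int] = None,
    rate_limited: bool = False,
    requires_subscription: bool = False,
    open_alternative_api: bool = False,
) -> str:
    """Return "api", "browser", or "skip" following the When to Use rules."""
    if requires_subscription:
        return "skip"  # never circumvent paywalls
    if open_alternative_api:
        return "api"  # an open API is always preferred
    if has_rest_api and not rate_limited and api_status not in BLOCKED_STATUSES:
        return "api"  # a working REST API means plain curl is enough
    return "browser"  # no API, blocked, or severely rate-limited


print(choose_access_method(has_rest_api=False))
print(choose_access_method(has_rest_api=True, api_status=403))
```

Checking the paywall rule first keeps the browser fallback from ever being chosen for subscription-only data.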

---

## Installation Check

Before using browser automation, verify the environment:

```bash
python3 -c "
try:
    import browser_use
    print('✅ browser-use installed')
except ImportError:
    print('❌ browser-use not installed')
    print('   Install: pip install browser-use')

import shutil
if shutil.which('chromium') or shutil.which('chromium-browser') or shutil.which('google-chrome'):
    print('✅ Chromium/Chrome found')
else:
    print('⚠️  No Chromium/Chrome found')
    print('   Install: apt-get install chromium-browser (Linux)')
    print('   Or: brew install --cask chromium (macOS)')

try:
    import playwright
    print('✅ Playwright installed')
except ImportError:
    print('❌ Playwright not installed')
    print('   Install: pip install playwright && python -m playwright install chromium')
"
```

If not installed:
```bash
pip install -q browser-use playwright && python -m playwright install chromium
```

---

## Usage Pattern

### Basic: Extract data from a web page

```python
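import asyncio
import os  # needed for os.environ below

# These import paths match browser-use releases that export Browser/BrowserConfig;
# other releases may differ, so check them against your installed version.
from browser_use import Agent, Browser, BrowserConfig
from langchain_openai import ChatOpenAI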

async def extract_gepia2_data(gene: str, cancer: str):
    """Extract gene expression data from GEPIA2 (no API available)."""
    browser = Browser(config=BrowserConfig(headless=True))
    llm = ChatOpenAI(model="gpt-4o", api_key=os.environ["OPENAI_API_KEY"])

    agent = Agent(
        task=f"""Go to http://gepia2.cancer-pku.cn/#analysis
        1. Click on 'Expression DIY' in the left menu
        2. In the gene input box, type '{gene}'
        3. Select '{cancer}' from the cancer type dropdown
        4. Click 'Plot' button
        5. Wait for the plot to load
        6. Extract the median expression values for Tumor and Normal from the plot
        7. Return the values as JSON: {{"gene": "{gene}", "cancer": "{cancer}", "tumor_median": X, "normal_median": Y}}
        """,
        llm=llm,
        browser=browser,
    )

    try:
        result = await agent.run()
    finally:
        # Always release the browser, even if the agent raises
        await browser.close()
    return result

result = asyncio.run(extract_gepia2_data("THBS2", "PAAD"))
print(result)
```
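
The task above asks the agent to return JSON, but an LLM may wrap that JSON in extra prose. The sketch below shows one way to recover the first JSON object from such text. It assumes you have already obtained the agent's final output as a string (how to do that depends on the installed browser-use version), and `extract_json_object` is a hypothetical helper, not part of browser-use.

```python
import json
from typing import Optional


def extract_json_object(text: str) -> Optional[dict]:
    """Return the first JSON object embedded in text, or None if there is none."""
    decoder = json.JSONDecoder()
    for start, char in enumerate(text):
        if char != "{":
            continue
        try:
            # raw_decode parses one JSON value starting at `start` and ignores trailing text
            value, _end = decoder.raw_decode(text, start)
        except json.JSONDecodeError:
            continue  # this brace did not open valid JSON; try the next one
        if isinstance(value, dict):
            return value
    return None


# Illustrative string only; these values are made up, not real GEPIA2 output.
sample = 'Result: {"gene": "THBS2", "cancer": "PAAD", "tumor_median": 1.0, "normal_median": 2.0}'
print(extract_json_object(sample))
```

Returning `None` rather than raising lets the caller record a failed extraction and move on.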

### Batch: Collect data across multiple databases

```python
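import asyncio

# Same imports as the basic example; paths may differ between browser-use releases.
from browser_use import Agent, Browser, BrowserConfig
from langchain_openai import ChatOpenAI


async def collect_multi_source(gene: str):
    """Collect gene info from multiple web-only sources."""
    browser = Browser(config=BrowserConfig(headless=True))
    llm = ChatOpenAI(model="gpt-4o")  # reads OPENAI_API_KEY from the environment

    tasks = [
        {
            "source": "GeneCards",
            "url": f"https://www.genecards.org/cgi-bin/carddisp.pl?gene={gene}",
            "extract": "Gene summary, aliases, protein class, pathways, diseases"
        },
        {
            "source": "GEPIA2",
            "url": "http://gepia2.cancer-pku.cn/#analysis",
            "extract": f"Expression of {gene} across TCGA cancer types"
        }
    ]

    results = {}
    try:
        for i, task in enumerate(tasks):
            if i > 0:
                # Pause between sources, per the rate-limiting rule under Safety and Ethics
                await asyncio.sleep(2)
            agent = Agent(
                task=f"Navigate to {task['url']} and extract: {task['extract']}. Return as structured JSON.",
                llm=llm,
                browser=browser,
            )
            results[task["source"]] = await agent.run()
    finally:
        # Always release the browser, even if one source fails
        await browser.close()
    return results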
```

---

## Target Database Recipes

### GEPIA2 (no API)

```
Task: Go to http://gepia2.cancer-pku.cn/#analysis
1. Select 'Expression DIY' → 'Box Plot'
2. Enter gene symbol: {GENE}
3. Select cancer types or 'All'
4. Click Plot
5. Extract expression values from the resulting visualization
```

### GeneCards (enhanced data)

```
Task: Navigate to https://www.genecards.org/cgi-bin/carddisp.pl?gene={GENE}
1. Extract: Gene summary paragraph
2. Extract: Protein expression table (tissues)
3. Extract: Pathways & interactions section
4. Extract: Disorders associated section
5. Return all as structured JSON
```

### TIMER2.0 (immune analysis, web-only)

```
Task: Go to http://timer.cistrome.org/
1. Select 'Gene' module
2. Enter gene symbol: {GENE}
3. Select cancer type: {CANCER}
4. Select immune cell types: all
5. Click Submit
6. Extract correlation coefficients and p-values from the result table
```

### HPA (Human Protein Atlas)

```
Task: Navigate to https://www.proteinatlas.org/{ENSEMBL_ID}-{GENE}/pathology
1. Extract cancer expression data table
2. Extract prognostic significance across cancer types
3. Extract immunohistochemistry images metadata
```
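
The recipes above are templates with `{GENE}`-style placeholders. One way to turn them into concrete task strings is sketched below. `RECIPES` and `build_task` are hypothetical names chosen for illustration, and only two of the recipes are copied in.

```python
import string

# Templates copied from the recipes above; add the others in the same way.
RECIPES = {
    "TIMER2.0": (
        "Go to http://timer.cistrome.org/\n"
        "1. Select 'Gene' module\n"
        "2. Enter gene symbol: {GENE}\n"
        "3. Select cancer type: {CANCER}\n"
        "4. Select immune cell types: all\n"
        "5. Click Submit\n"
        "6. Extract correlation coefficients and p-values from the result table"
    ),
    "GeneCards": (
        "Navigate to https://www.genecards.org/cgi-bin/carddisp.pl?gene={GENE}\n"
        "1. Extract: Gene summary paragraph\n"
        "2. Extract: Protein expression table (tissues)\n"
        "3. Extract: Pathways & interactions section\n"
        "4. Extract: Disorders associated section\n"
        "5. Return all as structured JSON"
    ),
}


def build_task(recipe_name: str, **fields: str) -> str:
    """Fill a recipe template, failing loudly if a placeholder is left unset."""
    template = RECIPES[recipe_name]
    # Collect every placeholder name that appears in the template
    required = {name for _, name, _, _ in string.Formatter().parse(template) if name}
    missing = required - set(fields)
    if missing:
        raise ValueError(f"Missing fields for {recipe_name}: {sorted(missing)}")
    return template.format(**fields)


print(build_task("TIMER2.0", GENE="THBS2", CANCER="PAAD"))
```

The resulting string can be passed as the `task` argument of an agent, as in the usage examples earlier in this document.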

---

## Safety and Ethics

1. **Respect robots.txt**: Check before scraping any site
2. **Rate limiting**: Wait 2-5 seconds between page navigations
3. **No credential storage**: Never save login credentials to disk
4. **Public data only**: Do not circumvent paywalls or access restrictions
5. **Attribution**: Record the source URL and access date for every data extraction
6. **Minimize requests**: Cache extracted data in the project `data/` directory
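
Several of these rules can be enforced in code. The sketch below uses only the Python standard library: it checks a robots.txt body with `urllib.robotparser` and caches an extraction together with its source URL and access date. The helper names, the `data/` layout, and the JSON record shape are assumptions made for illustration.

```python
import json
from datetime import datetime, timezone
from pathlib import Path
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt: str, url: str, user_agent: str = "*") -> bool:
    """Check a URL against an already-downloaded robots.txt body."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


def cache_extraction(data: dict, source_url: str, name: str, data_dir: str = "data") -> Path:
    """Write extracted data plus attribution to <data_dir>/<name>.json."""
    record = {
        "source_url": source_url,
        "accessed": datetime.now(timezone.utc).isoformat(),  # access date for attribution
        "data": data,
    }
    out_dir = Path(data_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    path = out_dir / f"{name}.json"
    path.write_text(json.dumps(record, indent=2), encoding="utf-8")
    return path


# Illustrative robots.txt body, not taken from any real site.
robots = "User-agent: *\nDisallow: /private/\n"
print(is_allowed(robots, "https://example.org/private/page"))
print(is_allowed(robots, "https://example.org/public/page"))
```

Before starting a new extraction, check whether the cache file already exists so the page is not requested twice.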

---

## Integration with Research Recipes

When a recipe step fails due to API unavailability:

```
curl API call for [DATABASE] failed (404/no API).
Attempting browser-based extraction via browser-use...
```

The browser fallback should:
1. Try the browser approach
2. If browser-use is not installed, suggest installation
3. If the browser approach also fails, document what was attempted and move on
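
The fallback steps above can be sketched as a small wrapper. This is one possible arrangement under stated assumptions: `fetch_with_fallback` and `FetchOutcome` are hypothetical names, and the two fetchers are passed in as plain callables so the logic stays independent of curl and browser-use.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, List, Optional


@dataclass
class FetchOutcome:
    data: Optional[Any] = None
    method: Optional[str] = None  # "api", "browser", or None if both failed
    attempts: List[str] = field(default_factory=list)  # log of what was tried


def fetch_with_fallback(
    database: str,
    api_fetch: Callable[[], Any],
    browser_fetch: Callable[[], Any],
) -> FetchOutcome:
    """Try the API first, then the browser; always record what was attempted."""
    outcome = FetchOutcome()
    try:
        outcome.data = api_fetch()
        outcome.method = "api"
        outcome.attempts.append(f"API call for {database} succeeded")
        return outcome
    except Exception as exc:
        outcome.attempts.append(f"curl API call for {database} failed ({exc})")

    try:
        outcome.data = browser_fetch()
        outcome.method = "browser"
        outcome.attempts.append(f"browser extraction for {database} succeeded")
    except ImportError:
        # Raised when the browser fetcher cannot import browser_use
        outcome.attempts.append("browser-use is not installed; install with: pip install browser-use")
    except Exception as exc:
        outcome.attempts.append(f"browser extraction for {database} failed ({exc})")
    return outcome


def failing_api():
    raise RuntimeError("404/no API")


result = fetch_with_fallback("GEPIA2", failing_api, lambda: {"status": "stub"})
print(result.method, result.attempts)
```

When both paths fail, `method` stays `None` and `attempts` holds the record of what was tried, so the recipe can document it and move on.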

---

## Limitations

- Requires a display server or headless Chromium (may not work in minimal Docker containers)
- Slower than API calls (5-30 seconds per page vs <1 second for curl)
- Vision-based extraction may misread complex layouts
- Some sites actively block automation (detect and skip gracefully)
- Requires an LLM API key for the browser agent (the examples in this skill use GPT-4o)
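
For the "detect and skip gracefully" point, a simple heuristic check on the returned page is often enough to decide whether to give up on a site. The sketch below is an assumption-laden example: the marker phrases and status codes are hypothetical choices to tune per site, and `looks_blocked` is not part of browser-use.

```python
from typing import Optional

# Hypothetical markers; tune them to the sites you actually visit.
BLOCK_MARKERS = ("captcha", "access denied", "unusual traffic", "verify you are human")
# Assumption: these status codes indicate the site is refusing automated access.
BLOCK_STATUSES = (403, 429)


def looks_blocked(page_text: str, status_code: Optional[int] = None) -> bool:
    """Heuristically decide whether a page is an anti-automation block page."""
    if status_code in BLOCK_STATUSES:
        return True
    lowered = page_text.lower()
    return any(marker in lowered for marker in BLOCK_MARKERS)


print(looks_blocked("Please complete the CAPTCHA to continue"))
print(looks_blocked("Gene expression profile", status_code=200))
```

When the check returns `True`, record the attempt and skip the site rather than retrying.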
