llm-scientific-discovery-guide

Survey of LLM agents for biomedical scientific discovery

191 stars

Best use case

llm-scientific-discovery-guide is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Survey of LLM agents for biomedical scientific discovery

Teams using llm-scientific-discovery-guide should expect a more consistent output, faster repeated execution, less prompt rewriting.

When to use this skill

  • You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

  • You only need a quick one-off answer and do not need a reusable workflow.
  • You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$curl -o ~/.claude/skills/llm-scientific-discovery-guide/SKILL.md --create-dirs "https://raw.githubusercontent.com/wentorai/research-plugins/main/skills/research/deep-research/llm-scientific-discovery-guide/SKILL.md"

Manual Installation

  1. Download SKILL.md from GitHub
  2. Place it in .claude/skills/llm-scientific-discovery-guide/SKILL.md inside your project
  3. Restart your AI agent — it will auto-discover the skill

How llm-scientific-discovery-guide Compares

Feature / Agentllm-scientific-discovery-guideStandard Approach
Platform SupportNot specifiedLimited / Varies
Context Awareness High Baseline
Installation ComplexityUnknownN/A

Frequently Asked Questions

What does this skill do?

Survey of LLM agents for biomedical scientific discovery

Where can I find the source code?

You can find the source code on GitHub using the link provided at the top of the page.

Related Guides

SKILL.md Source

# LLM Agents for Scientific Discovery Guide

## Overview

A curated survey of how LLM-based agents are being applied to scientific discovery, with a focus on biomedical research. Covers hypothesis generation, experiment design, lab automation, literature synthesis, and multi-agent scientific collaboration. Tracks papers, tools, and frameworks across the spectrum from fully autonomous to human-in-the-loop systems.

## Landscape

```
LLM Agents for Scientific Discovery
├── Hypothesis Generation
│   ├── Literature-based (gap identification)
│   ├── Data-driven (pattern discovery)
│   └── Analogy-based (cross-domain transfer)
├── Experiment Design
│   ├── Protocol generation
│   ├── Parameter optimization
│   └── Control selection
├── Lab Automation
│   ├── Robot control (self-driving labs)
│   ├── Equipment programming
│   └── Data collection orchestration
├── Analysis & Interpretation
│   ├── Statistical analysis
│   ├── Visualization
│   └── Result interpretation
└── Communication
    ├── Paper writing
    ├── Presentation generation
    └── Peer review simulation
```

## Key Systems

| System | Domain | Capability |
|--------|--------|-----------|
| **AI Scientist** | ML/AI | Full paper generation pipeline |
| **ChemCrow** | Chemistry | Tool-augmented chemical reasoning |
| **Coscientist** | Chemistry | Autonomous experiment execution |
| **BioPlanner** | Biology | Experiment protocol generation |
| **MedAgent** | Medicine | Clinical trial analysis |
| **GenAgent** | Genomics | Gene expression analysis |
| **DrugAgent** | Pharma | Drug interaction prediction |

## Hypothesis Generation

```python
# LLM-based hypothesis generation pattern
from scientific_agent import HypothesisGenerator

generator = HypothesisGenerator(
    llm_provider="anthropic",
    knowledge_sources=["pubmed", "openalex"],
)

hypotheses = generator.generate(
    domain="oncology",
    context="Recent findings show that gut microbiome "
            "composition correlates with immunotherapy response",
    constraints=[
        "Must be testable in vitro",
        "Should involve specific bacterial species",
        "Must have measurable endpoints",
    ],
    num_hypotheses=5,
)

for h in hypotheses:
    print(f"\nHypothesis: {h.statement}")
    print(f"  Rationale: {h.rationale}")
    print(f"  Supporting evidence: {len(h.evidence)} papers")
    print(f"  Novelty score: {h.novelty_score:.2f}")
    print(f"  Feasibility: {h.feasibility}")
```

## Self-Driving Lab Integration

```python
# Agent controlling automated experiments
from scientific_agent import LabAgent

agent = LabAgent(
    llm_provider="anthropic",
    equipment=["plate_reader", "liquid_handler", "incubator"],
    safety_constraints=["bsl2", "max_volume_1ml"],
)

# Design and run experiment
result = agent.run_experiment(
    objective="Determine IC50 of compound X against cell line Y",
    protocol_type="dose_response",
    parameters={
        "compound": "Compound_X",
        "cell_line": "HeLa",
        "concentrations": "serial_dilution",
        "replicates": 3,
        "readout": "cell_viability",
    },
)

print(f"IC50: {result.ic50:.2f} uM")
print(f"R-squared: {result.r_squared:.3f}")
result.plot_dose_response("dose_response.pdf")
```

## Multi-Agent Scientific Collaboration

```python
# Agents with different scientific roles
from scientific_agent import ScientificTeam

team = ScientificTeam(
    agents={
        "PI": {"role": "research_director",
               "expertise": "oncology"},
        "Experimentalist": {"role": "experiment_design",
                           "expertise": "cell_biology"},
        "Analyst": {"role": "data_analysis",
                   "expertise": "biostatistics"},
        "Writer": {"role": "manuscript_writing",
                  "expertise": "scientific_communication"},
    },
)

# Collaborative research cycle
project = team.start_project(
    title="Microbiome-immunotherapy interaction study",
    timeline_weeks=12,
)

# Agents collaborate: PI directs → Experimentalist designs →
# Analyst processes → Writer documents
```

## Reading Roadmap

```markdown
### Foundational Papers
1. "The AI Scientist" (Lu et al., 2024) — Fully automated ML research
2. "ChemCrow" (Bran et al., 2023) — Chemistry tool-use agent
3. "Coscientist" (Boiko et al., 2023) — Autonomous chemical research
4. "BioPlanner" (Biswas et al., 2024) — Biology protocol generation

### Surveys
5. "Scientific Discovery in the Age of AI" (Wang et al., 2023)
6. "Foundation Models for Science" (Bommasani et al., 2022)
7. "LLM Agents: A Survey" (multiple, 2024)

### Ethics & Limitations
8. "Dual-use concerns of AI in biology" (Sandbrink, 2023)
9. "Can LLMs Generate Novel Research Ideas?" (Si et al., 2024)
```

## Use Cases

1. **Literature mining**: Automated hypothesis from research gaps
2. **Experiment automation**: Self-driving lab orchestration
3. **Drug discovery**: Multi-agent screening and optimization
4. **Research planning**: Protocol and proposal generation
5. **Scientific writing**: Paper drafting with verified claims

## References

- [Awesome-LLM-Agents-Scientific-Discovery](https://github.com/zjlrock777/Awesome-LLM-Agents-Scientific-Discovery)
- [The AI Scientist](https://arxiv.org/abs/2408.06292)
- [ChemCrow](https://arxiv.org/abs/2304.05376)