cactus-cheminformatics-guide

PNNL cheminformatics LLM agent for molecular analysis

191 stars

Best use case

cactus-cheminformatics-guide is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Teams using cactus-cheminformatics-guide should expect more consistent output, faster repeated execution, and less prompt rewriting.

When to use this skill

  • You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

  • You only need a quick one-off answer and do not need a reusable workflow.
  • You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$ curl -o ~/.claude/skills/cactus-cheminformatics-guide/SKILL.md --create-dirs "https://raw.githubusercontent.com/wentorai/research-plugins/main/skills/domains/chemistry/cactus-cheminformatics-guide/SKILL.md"

Manual Installation

  1. Download SKILL.md from GitHub
  2. Place it in .claude/skills/cactus-cheminformatics-guide/SKILL.md inside your project
  3. Restart your AI agent — it will auto-discover the skill

How cactus-cheminformatics-guide Compares

Feature                    cactus-cheminformatics-guide    Standard Approach
Platform Support           Not specified                   Limited / Varies
Context Awareness          High                            Baseline
Installation Complexity    Unknown                         N/A

Frequently Asked Questions

What does this skill do?

It is a guide to CACTUS, the PNNL cheminformatics LLM agent for molecular analysis.

Where can I find the source code?

The skill's SKILL.md is hosted on GitHub in the wentorai/research-plugins repository (see the URL under Installation).

SKILL.md Source

# CACTUS Cheminformatics Agent Guide

## Overview

CACTUS is a cheminformatics LLM agent developed at Pacific Northwest National Laboratory (PNNL) that provides AI-assisted molecular analysis, property prediction, and chemical reasoning. It wraps RDKit, molecular databases, and ML models behind a conversational interface, enabling researchers to query molecular properties, perform similarity searches, and run cheminformatics workflows using natural language.

## Usage

```python
from cactus import ChemAgent

agent = ChemAgent(llm_provider="anthropic")

# Natural language chemistry queries
result = agent.ask(
    "What is the molecular weight and LogP of aspirin? "
    "Is it drug-like by Lipinski's rules?"
)
print(result.answer)
# Example output (exact wording and values depend on the LLM and descriptor backend):
# Aspirin (CC(=O)Oc1ccccc1C(=O)O):
# MW: 180.16, LogP: 1.24
# Lipinski: PASS (MW<500, LogP<5, HBD=1≤5, HBA=4≤10)

# Molecular property calculation
props = agent.calculate_properties(
    smiles="CC(=O)Oc1ccccc1C(=O)O",
    properties=["mw", "logp", "tpsa", "hbd", "hba", "rotatable"],
)
print(props)
```
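
The Lipinski check quoted in the example answer can be sketched in plain Python. This is a minimal illustration of the rule-of-five thresholds only, not CACTUS's implementation; the `Descriptors` class and `lipinski_violations` function are hypothetical names introduced here, and the descriptor values are copied from the example answer above rather than computed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Descriptors:
    """Descriptor values for one molecule (hypothetical container)."""
    mw: float    # molecular weight in g/mol
    logp: float  # octanol-water partition coefficient
    hbd: int     # hydrogen-bond donors
    hba: int     # hydrogen-bond acceptors


def lipinski_violations(d: Descriptors) -> list[str]:
    """Return the names of the rule-of-five criteria the molecule violates."""
    checks = {
        "mw": d.mw <= 500,
        "logp": d.logp <= 5,
        "hbd": d.hbd <= 5,
        "hba": d.hba <= 10,
    }
    return [name for name, ok in checks.items() if not ok]


# Values copied from the example answer above for aspirin (not recomputed).
aspirin = Descriptors(mw=180.16, logp=1.24, hbd=1, hba=4)
violations = lipinski_violations(aspirin)
print("PASS" if not violations else f"FAIL: {violations}")
```

Returning the list of violated criteria, rather than a bare boolean, makes it easy to apply the common relaxed convention of tolerating a single violation.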

## Similarity Search

```python
# Find similar molecules
similar = agent.similarity_search(
    query_smiles="CC(=O)Oc1ccccc1C(=O)O",  # Aspirin
    database="chembl",
    threshold=0.7,  # Tanimoto similarity
    max_results=10,
)

for mol in similar:
    print(f"{mol.name}: {mol.smiles} "
          f"(similarity: {mol.tanimoto:.3f})")
```
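
To make the `threshold` parameter concrete, here is a minimal sketch of Tanimoto similarity over fingerprints represented as sets of on-bit indices. It assumes nothing about how CACTUS or ChEMBL computes fingerprints; the `tanimoto` helper, the molecule names, and the bit sets are all made up for illustration.

```python
def tanimoto(a: set[int], b: set[int]) -> float:
    """Tanimoto (Jaccard) coefficient between two sets of on-bit indices."""
    if not a and not b:
        return 0.0  # convention chosen here for two empty fingerprints
    return len(a & b) / len(a | b)


# Toy fingerprints: invented bit indices, not derived from real molecules.
query = {1, 4, 7, 9, 12}
library = {
    "mol_a": {1, 4, 7, 9, 13},  # 4 shared bits out of 6 distinct bits
    "mol_b": {1, 4, 7, 9, 12},  # identical to the query
    "mol_c": {2, 3, 5},         # no bits in common
}

threshold = 0.7
scored = sorted(
    ((name, tanimoto(query, bits)) for name, bits in library.items()),
    key=lambda item: item[1],
    reverse=True,
)
# Keep only candidates at or above the similarity cutoff.
hits = [(name, score) for name, score in scored if score >= threshold]
print(hits)
```

With a cutoff of 0.7, `mol_a` (4/6, about 0.667) is excluded even though it differs from the query by a single bit, which shows how sensitive the result set is to the threshold choice.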

## Substructure Analysis

```python
# Substructure search
matches = agent.substructure_search(
    pattern="c1ccccc1C(=O)O",  # Benzoic acid motif
    database="drugbank",
    max_results=20,
)

# Functional group identification
groups = agent.identify_functional_groups(
    smiles="CC(=O)Oc1ccccc1C(=O)O"
)
print(groups)
# Example output: ["ester", "carboxylic_acid", "aromatic_ring"]
```

## Use Cases

1. **Molecular analysis**: Property calculation via natural language
2. **Drug screening**: Lipinski/Veber rule checking
3. **Similarity search**: Find analogs in chemical databases
4. **Structure analysis**: Substructure and functional group identification
5. **Chemical education**: Interactive chemistry exploration
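
The Veber rule mentioned under drug screening complements Lipinski by looking at flexibility and polarity: at most 10 rotatable bonds and a topological polar surface area (TPSA) of at most 140 Å². The sketch below is a plain-Python illustration under that definition, not CACTUS code; `passes_veber` is a hypothetical helper and the candidate values are placeholders, not measured or computed data.

```python
def passes_veber(rotatable_bonds: int, tpsa: float) -> bool:
    """Veber criteria: <= 10 rotatable bonds and TPSA <= 140 square angstroms."""
    return rotatable_bonds <= 10 and tpsa <= 140.0


# Placeholder (rotatable bonds, TPSA) pairs for illustration only.
candidates = {
    "compact": (3, 63.6),
    "flexible": (14, 95.0),   # too many rotatable bonds
    "polar": (6, 182.3),      # polar surface area too large
}

results = {name: passes_veber(rot, tpsa) for name, (rot, tpsa) in candidates.items()}
for name, ok in results.items():
    print(f"{name}: {'PASS' if ok else 'FAIL'}")
```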

## References

- [CACTUS GitHub](https://github.com/pnnl/cactus)
- [RDKit](https://www.rdkit.org/)
- [PNNL](https://www.pnnl.gov/)