chemical-structure-converter

Convert between IUPAC names, SMILES strings, molecular formulas, and common names for chemical compounds. Supports SMILES validation, batch processing, structure standardization, and cheminformatics database preparation for drug discovery workflows.

by aipoch

Best use case

chemical-structure-converter is best used when you need a repeatable AI agent workflow rather than a one-off prompt.

Teams using chemical-structure-converter should expect more consistent output, faster repeated execution, and less prompt rewriting.

When to use this skill

You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

You only need a quick one-off answer and do not need a reusable workflow.
You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$ curl -o ~/.claude/skills/chemical-structure-converter/SKILL.md --create-dirs "https://raw.githubusercontent.com/aipoch/medical-research-skills/main/scientific-skills/Other/chemical-structure-converter/SKILL.md"

Manual Installation

Download SKILL.md from GitHub
Place it in .claude/skills/chemical-structure-converter/SKILL.md inside your project
Restart your AI agent — it will auto-discover the skill

How chemical-structure-converter Compares

Feature / Agent	chemical-structure-converter	Standard Approach
Platform Support	Not specified	Limited / Varies
Context Awareness	High	Baseline
Installation Complexity	Unknown	N/A

Frequently Asked Questions

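What does this skill do?

It converts between IUPAC names, SMILES strings, molecular formulas, and common names for chemical compounds, and supports SMILES validation, batch processing, and structure standardization.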

Where can I find the source code?

The source code is on GitHub at https://github.com/aipoch/medical-research-skills.

SKILL.md Source

> **Source**: [https://github.com/aipoch/medical-research-skills](https://github.com/aipoch/medical-research-skills)

# Chemical Structure Converter

Interconvert between different chemical structure representations including IUPAC names, SMILES strings, molecular formulas, and common names. Essential for cheminformatics workflows, database standardization, and compound registration in drug discovery and chemical research.

**Key Capabilities:**
- **Multi-Format Conversion**: IUPAC names, SMILES, InChI, molecular formulas
- **SMILES Validation**: Validate SMILES syntax for structural correctness
- **Batch Processing**: Process multiple compounds for database standardization
- **Identifier Lookup**: Retrieve all available identifiers for known compounds
- **Structure Standardization**: Normalize chemical representations for consistency

---

## Input Validation

This skill accepts: compound names (common or IUPAC), SMILES strings, or InChI identifiers. Batch input via CSV or plain text list is also supported.

If the request does not involve converting or validating chemical structure identifiers — for example, asking to predict biological activity, perform docking, or interpret spectra — do not proceed. Instead respond:
> "Chemical Structure Converter is designed to interconvert chemical identifiers (names, SMILES, formulas). Please provide a compound name or SMILES string. For other cheminformatics tasks, use a more appropriate tool."

---

## Quick Check

```bash
python -m py_compile scripts/main.py
python scripts/main.py --help
```

## Workflow

1. Confirm the input identifier type (name, SMILES, IUPAC) and desired output format.
2. Validate that the request matches the documented scope; stop if the task requires unsupported assumptions.
3. Run the script or apply the documented conversion path with only the inputs available.
4. Return a structured result separating assumptions, deliverables, risks, and unresolved items.
5. If execution fails or inputs are incomplete, switch to the fallback path and state exactly what blocked full completion.

**Fallback:** If no identifier is provided, respond: "No chemical identifier provided. Please supply a compound name (`--name`), SMILES string (`--smiles`), or IUPAC name (`--iupac`). Cannot convert without an input identifier."

---

## Core Capabilities

### 1. Multi-Format Conversion

```python
from scripts.main import ChemicalStructureConverter
converter = ChemicalStructureConverter()
data = converter.name_to_identifiers("aspirin")
# → IUPAC: 2-acetoxybenzoic acid, SMILES: CC(=O)Oc1ccccc1C(=O)O, Formula: C9H8O4, MW: 180.16
```

| From → To | Use Case |
|-----------|----------|
| **Name → SMILES** | Literature to database |
| **SMILES → IUPAC** | Machine to human readable |
| **IUPAC → SMILES** | Chemical registration |
| **SMILES → Formula** | Quick MW calculation |

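The "Quick MW calculation" row can be illustrated with a small formula parser. This is a minimal sketch, not the method used by `scripts/main.py`: `molecular_weight` and `ATOMIC_MASS` are hypothetical names, the masses are rounded standard atomic weights, and only flat formulas without parentheses, hydrates, or charges are handled.

```python
import re

# Rounded standard atomic masses (g/mol); extend the table as needed.
ATOMIC_MASS = {
    "H": 1.008, "C": 12.011, "N": 14.007, "O": 15.999, "F": 18.998,
    "P": 30.974, "S": 32.06, "Cl": 35.45, "Br": 79.904,
}


def molecular_weight(formula: str) -> float:
    """Sum atomic masses for a flat formula such as 'C9H8O4'."""
    tokens = re.findall(r"([A-Z][a-z]?)(\d*)", formula)
    # Reject anything the simple pattern could not consume completely.
    if not tokens or "".join(sym + cnt for sym, cnt in tokens) != formula:
        raise ValueError(f"Unparseable formula: {formula!r}")
    total = 0.0
    for symbol, count in tokens:
        if symbol not in ATOMIC_MASS:
            raise ValueError(f"No mass listed for element {symbol!r}")
        total += ATOMIC_MASS[symbol] * (int(count) if count else 1)
    return round(total, 2)


print(molecular_weight("C9H8O4"))  # aspirin → 180.16
```

The result matches the MW shown for aspirin in the conversion example above.
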
### 2. SMILES Validation

```python
is_valid, message = converter.validate_smiles("CC(=O)Oc1ccccc1C(=O)O")
# → True, "Valid SMILES syntax"
```

| Check | Example Error |
|-------|---------------|
| **Parentheses** | `C(=O` — missing closing |
| **Ring closures** | `C1CC` — ring not closed |
| **Atom validity** | `C@C` (`@` is only valid inside a bracket atom) |

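The first two checks in the table can be sketched in a few lines of standalone Python. This is an illustration under stated assumptions, not the logic of `converter.validate_smiles`: `check_smiles_syntax` is a hypothetical helper, it only tracks parentheses, bracket atoms, and single-digit ring closures, and it does not handle `%nn` two-digit ring labels or check atom symbols.

```python
def check_smiles_syntax(smiles: str) -> tuple[bool, str]:
    """Check balanced parentheses/brackets and paired ring-closure digits."""
    depth = 0            # current branch nesting level
    open_rings = set()   # ring-closure digits seen an odd number of times
    in_bracket = False   # inside a [...] bracket atom
    for ch in smiles:
        if in_bracket:
            # Digits inside brackets are isotopes, H counts, or charges.
            if ch == "]":
                in_bracket = False
            continue
        if ch == "[":
            in_bracket = True
        elif ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                return False, "Unmatched closing parenthesis"
        elif ch in "0123456789":
            open_rings ^= {ch}  # first sighting opens the ring, second closes it
    if in_bracket:
        return False, "Unclosed bracket atom"
    if depth:
        return False, "Missing closing parenthesis"
    if open_rings:
        return False, f"Ring not closed: {', '.join(sorted(open_rings))}"
    return True, "Parentheses and ring closures balanced"


print(check_smiles_syntax("CC(=O)Oc1ccccc1C(=O)O"))
print(check_smiles_syntax("C(=O"))
print(check_smiles_syntax("C1CC"))
```

A passing result here only means the string is balanced; a full parser such as RDKit is still needed to confirm that the structure is chemically valid.
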
### 3. Batch Processing

```python
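from scripts.main import ChemicalStructureConverter

converter = ChemicalStructureConverter()
compound_list = ["aspirin", "caffeine"]  # example names; replace with your own

for compound in compound_list:
    data = converter.name_to_identifiers(compound)
    if not data:  # an empty result means no match in the local database
        print(f"Warning: '{compound}' not found in database")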
```

---

## CLI Usage

```bash
# Convert by compound name
python scripts/main.py --name aspirin

# Convert SMILES to IUPAC
python scripts/main.py --smiles "CC(=O)Oc1ccccc1C(=O)O"

# Validate SMILES
python scripts/main.py --smiles "CCO" --validate

# List all compounds
python scripts/main.py --list
```

---

## Parameters

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `--name`, `-n` | string | No | Compound name |
| `--smiles`, `-s` | string | No | SMILES string |
| `--iupac`, `-i` | string | No | IUPAC name |
| `--validate` | flag | No | Validate SMILES syntax |
| `--list`, `-l` | flag | No | List available compounds |

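The parameter table maps naturally onto `argparse`. The following is a sketch of one possible parser, assuming the flags behave as the table describes; the actual argument handling in `scripts/main.py` may differ, and `build_parser` and `missing_input` are hypothetical names.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser with the flags listed in the parameter table."""
    parser = argparse.ArgumentParser(
        description="Chemical structure converter (illustrative sketch)"
    )
    parser.add_argument("--name", "-n", help="Compound name")
    parser.add_argument("--smiles", "-s", help="SMILES string")
    parser.add_argument("--iupac", "-i", help="IUPAC name")
    parser.add_argument("--validate", action="store_true",
                        help="Validate SMILES syntax")
    parser.add_argument("--list", "-l", action="store_true",
                        help="List available compounds")
    return parser


def missing_input(args: argparse.Namespace) -> bool:
    """True when neither an identifier nor --list was supplied."""
    return not (args.name or args.smiles or args.iupac or args.list)


# Parse an explicit argument list instead of sys.argv for the demo.
args = build_parser().parse_args(["--smiles", "CCO", "--validate"])
print(args.smiles, args.validate, missing_input(args))  # → CCO True False
```
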
---

## Output Requirements

Every final response must make these explicit:

- Objective or requested deliverable
- Inputs used (identifier type and value) and assumptions introduced
- Conversion method applied
- Core result: all available identifiers (SMILES, IUPAC, formula, MW)
- Constraints and risks (local database limited; novel compounds may not be found)
- Unresolved items and next-step checks (validate against PubChem for critical work)

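One way to keep every response aligned with this checklist is to collect the fields in a small data structure before rendering them. The `ConversionReport` class below is a hypothetical sketch, not part of the skill; the example values are taken from the aspirin conversion shown earlier.

```python
from dataclasses import dataclass, field


@dataclass
class ConversionReport:
    """Hypothetical container mirroring the required output fields."""
    objective: str
    inputs: dict        # identifier type -> value
    method: str
    identifiers: dict   # e.g. SMILES, formula, MW
    assumptions: list = field(default_factory=list)
    risks: list = field(default_factory=list)
    unresolved: list = field(default_factory=list)

    def render(self) -> str:
        """Return one labeled line per required field."""
        def joined(items):
            return "; ".join(items) if items else "none"

        rows = [
            ("Objective", self.objective),
            ("Inputs", ", ".join(f"{k}={v}" for k, v in self.inputs.items())),
            ("Assumptions", joined(self.assumptions)),
            ("Method", self.method),
            ("Result", ", ".join(f"{k}: {v}" for k, v in self.identifiers.items())),
            ("Risks", joined(self.risks)),
            ("Unresolved", joined(self.unresolved)),
        ]
        return "\n".join(f"{label}: {value}" for label, value in rows)


report = ConversionReport(
    objective="Convert aspirin to all available identifiers",
    inputs={"name": "aspirin"},
    method="local database lookup",
    identifiers={"SMILES": "CC(=O)Oc1ccccc1C(=O)O", "Formula": "C9H8O4", "MW": 180.16},
    risks=["local database limited; novel compounds may not be found"],
    unresolved=["validate against PubChem for critical work"],
)
print(report.render())
```
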
---

## Error Handling

- If no identifier is provided, list the required input options and request clarification.
- If a compound is not found in the local database, flag it and provide direct lookup URLs: `https://pubchem.ncbi.nlm.nih.gov/compound/{compound_name}` and `https://www.chemspider.com/Search.aspx?q={compound_name}`. For programmatic lookup, query the PubChem REST API: `https://pubchem.ncbi.nlm.nih.gov/rest/pug/compound/name/{name}/JSON`. The script should automatically query this endpoint when a compound is not found locally.
- If `scripts/main.py` fails, report the failure point and provide manual fallback guidance.
- Do not fabricate SMILES strings, molecular weights, or identifiers.
- **Batch mode:** Include a summary line: `X/N compounds converted successfully, Y failed (list failed compound names).`
- **Database versioning:** The local compound database version is tracked in `DB_VERSION` in `scripts/main.py`. To add compounds, update the `COMPOUND_DB` dict and increment `DB_VERSION`.

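The lookup URLs listed for compounds missing from the local database can be built with proper percent-encoding so that names containing spaces or special characters remain valid. This is a minimal sketch: `lookup_urls` is a hypothetical helper, it only constructs the URLs from the templates above, and it performs no network request.

```python
from urllib.parse import quote


def lookup_urls(compound_name: str) -> dict:
    """Build the PubChem and ChemSpider URLs for a compound name."""
    encoded = quote(compound_name, safe="")  # e.g. a space becomes %20
    return {
        "pubchem_page": f"https://pubchem.ncbi.nlm.nih.gov/compound/{encoded}",
        "chemspider_search": f"https://www.chemspider.com/Search.aspx?q={encoded}",
        "pubchem_rest": (
            "https://pubchem.ncbi.nlm.nih.gov/rest/pug/compound/name/"
            f"{encoded}/JSON"
        ),
    }


for label, url in lookup_urls("acetylsalicylic acid").items():
    print(f"{label}: {url}")
```
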
---

## Common Pitfalls

- **Ambiguous names**: Use CAS numbers or specific synonyms for unambiguous lookup
- **Stereochemistry omitted**: Specify @/@@ in SMILES for chiral compounds
- **Hydrates vs anhydrous**: Always specify form (e.g., "caffeine anhydrous")
- **Duplicate entries**: Deduplicate by canonical SMILES when building databases
- **Character encoding**: Use UTF-8 for IUPAC names with special characters

---

## SMILES Quick Reference

- `C` = aliphatic carbon, `c` = aromatic carbon
- `=` = double bond, `#` = triple bond
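- `()` = branching, `[]` = bracket atom (isotope, chirality, hydrogen count, charge)
- `@` = neighbors listed anticlockwise, `@@` = clockwise, viewed from the first neighbor; this follows the written atom order and does not map directly to R/S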

---

## References

- PubChem: https://pubchem.ncbi.nlm.nih.gov
- ChemSpider: http://www.chemspider.com
- SMILES Specification: http://opensmiles.org
- RDKit Documentation: https://www.rdkit.org/docs/

**Known Limitation:** Local database contains common compounds only. Integrate PubChem API for production use.
