protocol-extraction-from-pdf

Extract laboratory protocols from PDF documents using Thoth-Plan to convert experimental procedures into structured text.

Best use case

protocol-extraction-from-pdf is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Extract laboratory protocols from PDF documents using Thoth-Plan to convert experimental procedures into structured text.

Teams using protocol-extraction-from-pdf should expect a more consistent output, faster repeated execution, less prompt rewriting.

When to use this skill

  • You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

  • You only need a quick one-off answer and do not need a reusable workflow.
  • You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$curl -o ~/.claude/skills/protocol-extraction-from-pdf/SKILL.md --create-dirs "https://raw.githubusercontent.com/SpectrAI-Initiative/InnoClaw/main/.claude/skills/protocol-extraction-from-pdf/SKILL.md"

Manual Installation

  1. Download SKILL.md from GitHub
  2. Place it in .claude/skills/protocol-extraction-from-pdf/SKILL.md inside your project
  3. Restart your AI agent — it will auto-discover the skill

How protocol-extraction-from-pdf Compares

Feature / Agentprotocol-extraction-from-pdfStandard Approach
Platform SupportNot specifiedLimited / Varies
Context Awareness High Baseline
Installation ComplexityUnknownN/A

Frequently Asked Questions

What does this skill do?

Extract laboratory protocols from PDF documents using Thoth-Plan to convert experimental procedures into structured text.

Where can I find the source code?

You can find the source code on GitHub using the link provided at the top of the page.

SKILL.md Source

# Protocol Extraction from PDF

## Usage

```python
import asyncio
import json
from contextlib import AsyncExitStack
from mcp.client.streamable_http import streamablehttp_client
from mcp import ClientSession

class ThothClient:
    def __init__(self, server_url: str):
        self.server_url = server_url
        self.session = None

    async def connect(self):
        try:
            self.transport = streamablehttp_client(url=self.server_url, sse_read_timeout=60 * 10)
            self._stack = AsyncExitStack()
            await self._stack.__aenter__()
            self.read, self.write, self.get_session_id = await self._stack.enter_async_context(self.transport)
            self.session_ctx = ClientSession(self.read, self.write)
            self.session = await self._stack.enter_async_context(self.session_ctx)
            await self.session.initialize()
            return True
        except Exception as e:
            return False

    async def disconnect(self):
        """Disconnect from server"""
        try:
            if hasattr(self, '_stack'):
                await self._stack.aclose()
            print("✓ already disconnect")
        except Exception as e:
            print(f"✗ disconnect error: {e}")
    def parse_result(self, result):
        try:
            if hasattr(result, 'content') and result.content:
                content = result.content[0]
                if hasattr(content, 'text'):
                    return json.loads(content.text)
            return str(result)
        except:
            try:
                return result.content[0].text
            except:
                return {"error": "parse error", "raw": str(result)}

## Initialize and use
client = ThothClient("https://scp.intern-ai.org.cn/api/v1/mcp/19/Thoth-Plan")
await client.connect()

# PDF URL (must be publicly accessible)
pdf_url = "https://example.com/protocol.pdf"

result = await client.session.call_tool("extract_protocol_from_pdf", arguments={"pdf_url": pdf_url})
protocol = client.parse_result(result)
print("Extracted Protocol:")
print(protocol)

await client.disconnect()
```

### Tool: `extract_protocol_from_pdf`
- Args: `pdf_url` (str) - Public URL to PDF file
- Returns: Structured protocol text with numbered steps

### Use Cases
- Lab protocol digitization, SOP extraction, experimental workflow documentation

Related Skills

protocol-to-executable-json

370
from SpectrAI-Initiative/InnoClaw

Convert laboratory protocols to executable JSON format using Thoth-OP for automated lab equipment control.

protocol-generation-from-description

370
from SpectrAI-Initiative/InnoClaw

Generate detailed laboratory protocols from natural language descriptions using AI, producing step-by-step experimental procedures ready for lab execution.

lab_protocol_from_literature

370
from SpectrAI-Initiative/InnoClaw

Lab Protocol from Literature - Extract and generate lab protocol: search PubMed, extract protocol from paper, and generate executable protocol. Use this skill for lab science tasks involving pubmed search extract protocol from pdf protocol generation generate executable json. Combines 4 tools from 2 SCP server(s).

wind-site-assessment

370
from SpectrAI-Initiative/InnoClaw

Assess wind energy potential and perform site analysis using atmospheric science calculations.

web_literature_mining

370
from SpectrAI-Initiative/InnoClaw

Scientific Literature Mining - Mine scientific literature: PubMed search, arXiv search, web search, and Tavily deep search. Use this skill for scientific informatics tasks involving pubmed search search literature search web tavily search. Combines 4 tools from 2 SCP server(s).

virus_genomics

370
from SpectrAI-Initiative/InnoClaw

Virus Genomics Analysis - Analyze virus genomics: NCBI virus dataset, annotation, taxonomy, and literature search. Use this skill for virology tasks involving get virus dataset report get virus annotation report get taxonomy search literature. Combines 4 tools from 2 SCP server(s).

virtual_screening

370
from SpectrAI-Initiative/InnoClaw

Virtual Screening Pipeline - Virtual screening: search PubChem by substructure, compute similarity, filter by drug-likeness, and predict binding affinity. Use this skill for drug discovery tasks involving search pubchem by smiles calculate smiles similarity calculate mol drug chemistry boltz binding affinity. Combines 4 tools from 3 SCP server(s).

variant_pathogenicity

370
from SpectrAI-Initiative/InnoClaw

Variant Pathogenicity Assessment - Assess variant pathogenicity: Ensembl VEP prediction, ClinVar lookup, variation details, and gene phenotype associations. Use this skill for clinical genetics tasks involving get vep hgvs clinvar search get variation get phenotype gene. Combines 4 tools from 2 SCP server(s).

variant-population-frequency

370
from SpectrAI-Initiative/InnoClaw

Query gnomAD for variant allele frequency across populations. Uses FAVOR to convert rsID→variant_id first, then queries gnomAD.

variant-pharmacogenomics

370
from SpectrAI-Initiative/InnoClaw

Query PharmGKB (clinPGx) for pharmacogenomic clinical annotations — how a variant affects drug response, dosing, and adverse reactions.

variant-gwas-associations

370
from SpectrAI-Initiative/InnoClaw

Query EBI GWAS Catalog for GWAS statistical associations (p-value, effect size, risk allele) between a variant and traits/diseases.

variant-genomic-location

370
from SpectrAI-Initiative/InnoClaw

Query dbSNP + NCBI Gene to get variant genomic position (chromosome, coordinates, ref/alt alleles, mutation type) and associated gene coordinates.