content-proofreading

An academic proofreading skill for Chinese/English manuscripts, triggered when you need automated checks for spelling, grammar, terminology consistency, and formatting before submission.

by aipoch

Best use case

content-proofreading is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Teams using content-proofreading should expect more consistent output, faster repeated execution, and less prompt rewriting.

When to use this skill

You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

You only need a quick one-off answer and do not need a reusable workflow.
You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$ curl -o ~/.claude/skills/content-proofreading/SKILL.md --create-dirs "https://raw.githubusercontent.com/aipoch/medical-research-skills/main/scientific-skills/Other/content-proofreading/SKILL.md"

Manual Installation

Download SKILL.md from GitHub
Place it in .claude/skills/content-proofreading/SKILL.md inside your project
Restart your AI agent — it will auto-discover the skill

How content-proofreading Compares

Feature / Agent	content-proofreading	Standard Approach
Platform Support	Not specified	Limited / Varies
Context Awareness	High	Baseline
Installation Complexity	Unknown	N/A

Frequently Asked Questions

What does this skill do?

An academic proofreading skill for Chinese/English manuscripts, triggered when you need automated checks for spelling, grammar, terminology consistency, and formatting before submission.

Where can I find the source code?

The source code is on GitHub in the aipoch/medical-research-skills repository (https://github.com/aipoch/medical-research-skills).

SKILL.md Source

> **Source**: [https://github.com/aipoch/medical-research-skills](https://github.com/aipoch/medical-research-skills)

## When to Use

- You are preparing an academic paper for journal/conference submission and need a final language + formatting pass.
- You have bilingual (Chinese/English) content and want consistent punctuation, wording, and style across both languages.
- Your manuscript contains domain terminology (e.g., life sciences) and you need consistent Chinese–English term mapping and abbreviation rules.
- You need to validate references, numbers/units, and heading levels against a required style (APA/MLA/GB/T 7714).
- You want a shareable report (HTML or Markdown annotations) with precise error locations and revision suggestions.

## Key Features

- **English checks**
  - Spelling (including US/UK variants)
  - Grammar (agreement, tense, articles, clause structure)
  - Punctuation conventions (US/UK)
  - Style suggestions (redundancy detection, passive voice optimization)

- **Chinese checks**
  - Typo/misused character detection (dictionary-based)
  - Grammar and collocation checks
  - Chinese vs. English punctuation normalization
  - Academic expression optimization suggestions

- **Terminology consistency**
  - Domain terminology database (life sciences by default)
  - Bidirectional Chinese–English correspondence checks
  - Abbreviation rules (require full form on first occurrence)
  - Synonym unification to preferred standard terms

- **Formatting checks**
  - Reference style validation (APA/MLA/GB/T 7714, etc.)
  - Number and unit normalization
  - Heading level consistency
  - Abbreviation consistency across the document

- **Reporting**
  - HTML interactive report or Markdown annotations
  - Precise error localization
  - Actionable revision suggestions

## Dependencies

- **Python**: `>= 3.8`

- **Python packages** (install via `pip install -r requirements.txt`)
  - `languagetool-python` (version: see `requirements.txt`) — English grammar checking
  - `opencc` (version: see `requirements.txt`) — Traditional/Simplified Chinese conversion
  - `jieba` (version: see `requirements.txt`) — Chinese tokenization
  - `pyenchant` (version: see `requirements.txt`) — spelling checks
  - `markdown` (version: see `requirements.txt`) — Markdown rendering
  - `python-docx` (version: see `requirements.txt`) — `.docx` reading
  - `docx2pdf` (version: see `requirements.txt`) — Word-to-PDF conversion

## Example Usage

### 1) Install

```bash
python -m venv .venv
source .venv/bin/activate  # Windows: .venv\Scripts\activate

pip install -r requirements.txt
```

### 2) Run (basic)

```bash
python scripts/init_run.py --input <paper_file_path> --output <output_path>
```

### 3) Run (advanced)

```bash
python scripts/init_run.py \
  --input paper.md \
  --output report.html \
  --lang en \
  --style apa \
  --terminology biology \
  --format html
```

### 4) CLI parameters

| Parameter | Description | Default |
|---|---|---|
| `--input` | Input file path | Required |
| `--output` | Output report path | Generates an HTML report by default |
| `--lang` | Language to check (`en` / `zh` / `both`) | `both` |
| `--style` | Reference style (`apa` / `mla` / `gb`) | `apa` |
| `--terminology` | Domain terminology set | `biology` |
| `--format` | Output format (`html` / `markdown`) | `html` |
| `--no-pdf` | Skip the optional Word→PDF conversion for `.docx` input | `false` |
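
The table above can be mirrored with a small `argparse` parser. This is only a sketch of how such a CLI could be declared, not the actual contents of `scripts/init_run.py`. The function name `build_parser` is hypothetical, and leaving `--output` as `None` (for the caller to resolve to a default report path) is an assumption.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser that mirrors the documented CLI table (illustrative only)."""
    parser = argparse.ArgumentParser(prog="init_run.py")
    parser.add_argument("--input", required=True, help="Input file path")
    # Assumption: None means "let the tool pick a default report path".
    parser.add_argument("--output", default=None, help="Output report path")
    parser.add_argument("--lang", choices=["en", "zh", "both"], default="both")
    parser.add_argument("--style", choices=["apa", "mla", "gb"], default="apa")
    parser.add_argument("--terminology", default="biology")
    parser.add_argument("--format", choices=["html", "markdown"], default="html")
    # store_true makes the flag default to False, matching the table.
    parser.add_argument("--no-pdf", action="store_true")
    return parser


# Parse a sample command line instead of sys.argv so the sketch runs anywhere.
args = build_parser().parse_args(["--input", "paper.md", "--lang", "en"])
print(args.lang, args.style, args.format, args.no_pdf)  # prints: en apa html False
```

Declaring the allowed values with `choices` makes the parser reject an unsupported language, style, or format before any checker runs.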

### 5) Use as a Python module (end-to-end)

```python
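# Assumes the script is run from the repository root so that `scripts` is importable.
from scripts.english_checker import EnglishChecker
from scripts.chinese_checker import ChineseChecker
from scripts.terminology_manager import TerminologyManager
from scripts.annotation_generator import AnnotationGenerator

# Sample text to check; replace it with the manuscript contents.
text = """
Messenger RNA (mRNA) is transcribed in the nucleus.
"""

# One checker per concern: English, Chinese, and domain terminology.
en_checker = EnglishChecker()
zh_checker = ChineseChecker()
term_manager = TerminologyManager(domain="biology")

# Collect the issues from every checker into a single list.
results = []
results.extend(en_checker.check(text))
results.extend(zh_checker.check(text))
results.extend(term_manager.check(text))

# Render the combined issues as an HTML report.
generator = AnnotationGenerator(output_format="html")
report = generator.generate(results)

# Write the report as UTF-8 so Chinese text is preserved.
with open("report.html", "w", encoding="utf-8") as f:
    f.write(report)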
```

## Implementation Details

### Architecture / Core Modules

- `english_checker.py`
  - Core engine for English spelling/grammar/style checks.
  - Designed to be rule-extensible (add or register new rule sets).

- `chinese_checker.py`
  - Core engine for Chinese typo/grammar/style checks.
  - Includes a library of common academic writing error patterns.

- `terminology_manager.py`
  - Terminology database management (import/export/query/update).
  - Performs term consistency checks, bilingual mapping validation, and abbreviation policy checks.

- `annotation_generator.py`
  - Converts detected issues into a visual report (HTML) or annotated Markdown.
  - Ensures issues include **location**, **type**, and **suggested fix**.

- `word_converter.py`
  - Extracts text from `.docx`.
  - Optionally converts Word to PDF (can be disabled via `--no-pdf`).

### Terminology database format (JSON)

Organized by domain; each entry can include bilingual forms and abbreviation metadata:

```json
{
  "biology": {
    "cell": {
      "en": "cell",
      "abbrev": null,
      "full_form": null
    },
    "mrna": {
      "en": "mRNA",
      "abbrev": "mRNA",
      "full_form": "messenger RNA"
    }
  }
}
```

**Checking logic (typical):**
- If an abbreviation (e.g., `mRNA`) appears, verify the **full form** appears at first mention (e.g., `messenger RNA (mRNA)`).
- If both Chinese and English terms appear, verify they match the configured mapping for the selected domain.
- If synonyms are detected, prefer the standardized term defined in the database.
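
The first rule above (full form at first mention) could be approximated as follows. This is a minimal sketch under stated assumptions, not the logic in `terminology_manager.py`: the `TERMS` dictionary copies the JSON example, and the function name `check_first_mentions` and the issue dictionary keys are hypothetical.

```python
import re
from typing import Dict, List

# In-memory copy of the terminology example above (normally loaded from JSON).
TERMS = {
    "biology": {
        "cell": {"en": "cell", "abbrev": None, "full_form": None},
        "mrna": {"en": "mRNA", "abbrev": "mRNA", "full_form": "messenger RNA"},
    }
}


def check_first_mentions(text: str, domain: str = "biology") -> List[Dict]:
    """Flag abbreviations whose first occurrence lacks 'full form (ABBR)'."""
    issues = []
    for entry in TERMS[domain].values():
        abbrev, full_form = entry["abbrev"], entry["full_form"]
        if not abbrev or not full_form:
            continue  # nothing to enforce for terms without an abbreviation
        # First standalone occurrence; ASCII lookarounds are used so that
        # adjacent Chinese characters do not hide the abbreviation.
        first = re.search(
            r"(?<![A-Za-z0-9])" + re.escape(abbrev) + r"(?![A-Za-z0-9])", text
        )
        if first is None:
            continue
        # The text right before it must end with the full form and "(" or "（".
        intro = re.compile(re.escape(full_form) + r"\s*[(（]\s*$", re.IGNORECASE)
        if not intro.search(text[: first.start()]):
            issues.append(
                {
                    "type": "abbreviation",
                    "location": first.start(),  # 0-based character offset
                    "message": f"'{abbrev}' is used before its full form",
                    "suggestion": f"{full_form} ({abbrev})",
                }
            )
    return issues


print(check_first_mentions("Messenger RNA (mRNA) is transcribed in the nucleus."))  # prints []
print(check_first_mentions("The mRNA was purified."))
```

The sketch only inspects the first occurrence of each abbreviation and accepts both half-width and full-width opening parentheses, since bilingual manuscripts often mix them.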

### Rule database format (JSON)

Rules are grouped by language and category:

```json
{
  "english": {
    "spelling": [],
    "grammar": [],
    "style": []
  },
  "format": {
    "references": [],
    "numbers": [],
    "units": []
  }
}
```

**How rules are applied (high level):**
- Load rule sets by `--lang` and `--style`.
- Run language-specific checks (English/Chinese) and formatting checks.
- Merge results into a unified issue list.
- Render issues into the selected output format (`html` / `markdown`) with location-aware annotations.
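
The merge and render steps above can be sketched like this. The `Issue` dataclass, its field names, and the helper functions are assumptions made for illustration; the real checkers and `annotation_generator.py` may represent issues differently.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Issue:
    """Hypothetical issue record: location, type, and suggested fix."""
    line: int        # 1-based line number
    column: int      # 1-based column number
    kind: str        # e.g. "spelling", "grammar", "format"
    message: str
    suggestion: str


def merge_issues(*groups: Iterable[Issue]) -> List[Issue]:
    """Merge per-checker results, drop exact duplicates, sort by location."""
    unique = {issue for group in groups for issue in group}
    return sorted(unique, key=lambda i: (i.line, i.column, i.kind, i.message))


def render_markdown(issues: List[Issue]) -> str:
    """Render the unified issue list as a Markdown table."""
    rows = ["| Location | Type | Message | Suggestion |", "|---|---|---|---|"]
    for i in issues:
        rows.append(f"| {i.line}:{i.column} | {i.kind} | {i.message} | {i.suggestion} |")
    return "\n".join(rows)


# Two checkers report overlapping results; the duplicate is collapsed.
english = [Issue(3, 10, "spelling", "Unknown word 'recieve'", "receive")]
formatting = [
    Issue(1, 1, "format", "Heading level skipped", "Use '##' after '#'"),
    Issue(3, 10, "spelling", "Unknown word 'recieve'", "receive"),
]
print(render_markdown(merge_issues(english, formatting)))
```

Making `Issue` frozen keeps it hashable, which is what allows the set-based de-duplication when two checkers report the same problem.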

### Extensibility

- **Add new rules**
  1. Create a rule file under `assets/rules/`.
  2. Implement rules following the project’s rule template.
  3. Register the rule set in the rule index.
  4. Run tests to validate precision/recall and avoid false positives.

- **Add new terminology sets**
  1. Create a terminology JSON under `assets/terminology/`.
  2. Follow the domain structure shown above.
  3. Register the new domain in the terminology index so it can be selected via `--terminology`.
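
Before registering a new terminology set, it may help to check that the file follows the domain structure. The validator below is a sketch, not part of the project: `validate_terminology` and `REQUIRED_KEYS` are hypothetical names, the required keys are inferred from the JSON example in this document, and the `chemistry` domain is made-up sample data.

```python
import json
from typing import Any, Dict, List

# Keys seen in the terminology example in this document (an assumption,
# not a published schema).
REQUIRED_KEYS = {"en", "abbrev", "full_form"}


def validate_terminology(data: Dict[str, Any]) -> List[str]:
    """Return human-readable problems found in a terminology mapping."""
    problems = []
    for domain, entries in data.items():
        if not isinstance(entries, dict):
            problems.append(f"{domain}: expected an object of term entries")
            continue
        for key, entry in entries.items():
            if not isinstance(entry, dict):
                problems.append(f"{domain}.{key}: expected an object")
                continue
            missing = REQUIRED_KEYS - set(entry)
            if missing:
                problems.append(f"{domain}.{key}: missing {sorted(missing)}")
            # An abbreviation is only checkable if its full form is known.
            if entry.get("abbrev") and not entry.get("full_form"):
                problems.append(f"{domain}.{key}: abbreviation has no full form")
    return problems


# Hypothetical new domain with an incomplete entry.
sample = json.loads('{"chemistry": {"atp": {"en": "ATP", "abbrev": "ATP"}}}')
for problem in validate_terminology(sample):
    print(problem)
```

Running a check like this before step 3 catches entries whose abbreviation rule could never be enforced because the full form is missing.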
