text-format-organizer

A local text formatting organizer for biomedical/academic writing; use it when you need to clean whitespace/line endings while preserving Markdown structures or when normalizing .docx/.md/.txt before submission or proofreading.

by aipoch

Best use case

text-format-organizer is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Teams using text-format-organizer can expect more consistent output, faster repeated execution, and less prompt rewriting.

When to use this skill

You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

You only need a quick one-off answer and do not need a reusable workflow.
You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$ curl -o ~/.claude/skills/text-format-organizer/SKILL.md --create-dirs "https://raw.githubusercontent.com/aipoch/medical-research-skills/main/scientific-skills/Other/text-format-organizer/SKILL.md"

Manual Installation

1. Download SKILL.md from GitHub.
2. Place it in .claude/skills/text-format-organizer/SKILL.md inside your project.
3. Restart your AI agent; it will auto-discover the skill.

How text-format-organizer Compares

| Feature / Agent | text-format-organizer | Standard Approach |
|---|---|---|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |

Frequently Asked Questions

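What does this skill do?

It cleans whitespace and line endings in .txt, .md, and .docx files while preserving Markdown structures such as lists, tables, and fenced code blocks.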

Where can I find the source code?

You can find the source code on GitHub using the link provided at the top of the page.

SKILL.md Source

> **Source**: [https://github.com/aipoch/medical-research-skills](https://github.com/aipoch/medical-research-skills)

## Validation Shortcut

Run this minimal command first to verify the supported execution path:

```bash
python scripts/init_run.py --help
```

## When to Use

- Cleaning biomedical manuscripts where extra blank lines, trailing spaces, or mixed line endings break journal templates.
- Normalizing Markdown notes (lists/tables/code blocks) before converting to PDF/Word.
- Formatting clinical research reports or protocol records exported from multiple editors (Windows/macOS/Linux).
- Pre-processing `.docx` drafts before running downstream proofreading/QA tools (e.g., `academic-proofreader`).
- Preparing theses/dissertations to enforce consistent indentation and whitespace rules across chapters.

## Key Features

- **Intelligent cleaning**
  - Removes redundant empty lines while keeping paragraph boundaries.
  - Strips trailing whitespace while preserving leading indentation.
  - Unifies line endings (`unix`, `windows`, `mac`).
  - Converts tabs to spaces (configurable indentation size).
- **Structure protection**
  - Preserves Markdown list structures (`-`, `*`, `1.`).
  - Keeps fenced code blocks (triple-backtick fences) unchanged.
  - Preserves Markdown table formatting.
- **Multi-format I/O**
  - Supports `.txt`, `.md`, and `.docx` input/output.
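
The line-ending option can be pictured with a small sketch. It assumes that `unix`, `windows`, and `mac` map to LF, CRLF, and CR respectively, which the source does not spell out. `normalize_line_endings` and `LINE_ENDINGS` are hypothetical names for illustration, not part of the skill's API.

```python
# Assumed mapping from the documented option names to newline sequences.
LINE_ENDINGS = {"unix": "\n", "windows": "\r\n", "mac": "\r"}


def normalize_line_endings(text: str, target: str = "unix") -> str:
    """Convert every CRLF, CR, or LF line break to the target style."""
    if target not in LINE_ENDINGS:
        raise ValueError(f"unknown line ending: {target!r}")
    # Collapse everything to LF first so a CRLF pair is never counted twice.
    unified = text.replace("\r\n", "\n").replace("\r", "\n")
    return unified.replace("\n", LINE_ENDINGS[target])


mixed = "alpha\r\nbeta\rgamma\n"
print(repr(normalize_line_endings(mixed, "windows")))
# prints 'alpha\r\nbeta\r\ngamma\r\n'
```

Collapsing to LF before expanding keeps the conversion idempotent: running it twice with the same target returns the same text.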

## Dependencies

- `python >= 3.8`
- `python-docx >= 1.0.0`

## Example Usage

### 1) Format a Markdown or text file

```bash
python scripts/init_run.py --input input.md --output output.md
```

### 2) Format a Word document

```bash
python scripts/init_run.py -i paper.docx -o paper_clean.docx
```

### 3) Preview changes without writing output

```bash
python scripts/init_run.py -i input.md --preview
```
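
The source does not say what `--preview` prints. As one plausible illustration, a preview could be a unified diff between the original and the formatted text, built with the standard library's `difflib`. The function name `preview_diff` is hypothetical.

```python
import difflib


def preview_diff(original: str, formatted: str, name: str = "input.md") -> str:
    """Return a unified diff showing what formatting would change."""
    diff = difflib.unified_diff(
        original.splitlines(keepends=True),
        formatted.splitlines(keepends=True),
        fromfile=name,
        tofile=f"{name} (formatted)",
    )
    return "".join(diff)


before = "Title  \n\n\n\nBody\n"
after = "Title\n\nBody\n"
print(preview_diff(before, after))
```

An empty string means the formatter would leave the file unchanged.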

### 4) Programmatic usage (core module)

```python
from scripts.text_formatter import TextFormatter, FormatOptions

text = "Line with trailing spaces \n\n\n- item 1\n\t- item 2\n"
options = FormatOptions(
    line_ending="unix",
    indent="spaces",
    indent_size=4,
)

formatter = TextFormatter(options=options)
formatted = formatter.format(text)
print(formatted)
```

### 5) Workflow with an academic proofreading tool

```bash
# Step 1: Format organization
python scripts/init_run.py -i paper.docx -o paper_clean.docx

# Step 2: Content/format checking (separate project)
cd ../academic-proofreader
python scripts/init_run.py -i paper_clean.docx
```

## Implementation Details

### CLI parameters

| Parameter | Description | Default |
|---|---|---|
| `--input` / `-i` | Input file path (`.txt` / `.md` / `.docx`) | Required |
| `--output` / `-o` | Output file path | Auto-generated |
| `--line-ending` | Line ending: `unix` / `windows` / `mac` | `unix` |
| `--indent` | Indentation type: `spaces` / `tabs` | `spaces` |
| `--indent-size` | Number of spaces per indent level | `4` |
| `--preview` | Preview mode (no output written) | `false` |
| `--docx-font` | Font used for Word output | `Times New Roman` |
| `--docx-size` | Font size used for Word output | `12` |

### Formatting rules (high level)

- **Whitespace normalization**
  - Collapses excessive blank lines while preserving paragraph separation.
  - Removes trailing spaces at line ends; does not remove leading indentation.
- **Line ending normalization**
  - Converts all line endings to the selected target (`unix`/`windows`/`mac`).
- **Indentation normalization**
  - Converts tab characters to spaces when `indent=spaces`, using `indent_size`.
- **Markdown-safe processing**
  - Skips transformations inside fenced code blocks.
  - Preserves list markers and table pipes/alignment to avoid structural breakage.
- **DOCX handling**
  - Reads `.docx`, applies the same normalization at the text/paragraph level, then writes a new `.docx` using the configured font and size.
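
The whitespace, indentation, and Markdown-safe rules above can be sketched in one pass over the lines. This is a minimal illustration, not the skill's actual `TextFormatter`: `clean_markdown` is a hypothetical name, the input is assumed to use LF line endings already, runs of blank lines are assumed to collapse to a single blank line, and only triple-backtick fences are recognized.

```python
def clean_markdown(text: str, indent_size: int = 4) -> str:
    """Normalize whitespace outside fenced code blocks."""
    cleaned = []
    in_fence = False
    previous_blank = False
    for line in text.split("\n"):
        if line.lstrip().startswith("```"):
            # A fence line toggles code mode and is copied through as-is.
            in_fence = not in_fence
            cleaned.append(line)
            previous_blank = False
            continue
        if in_fence:
            cleaned.append(line)  # code content stays untouched
            continue
        line = line.rstrip()  # drop trailing spaces and tabs
        body = line.lstrip(" \t")
        # Expand tabs only in the leading indentation, never inside the text.
        lead = line[: len(line) - len(body)].replace("\t", " " * indent_size)
        line = lead + body
        if line == "":
            if previous_blank:
                continue  # skip the second and later blank lines in a run
            previous_blank = True
        else:
            previous_blank = False
        cleaned.append(line)
    return "\n".join(cleaned)


sample = "Line with trailing spaces \n\n\n- item 1\n\t- item 2\n"
print(repr(clean_markdown(sample)))
# prints 'Line with trailing spaces\n\n- item 1\n    - item 2\n'
```

Because only trailing whitespace and leading tabs are touched, list markers and table pipes pass through unchanged.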

## When Not to Use

- Do not use this skill when the required source data, identifiers, files, or credentials are missing.
- Do not use this skill when the user asks for fabricated results, unsupported claims, or out-of-scope conclusions.
- Do not use this skill when a simpler direct answer is more appropriate than the documented workflow.

## Required Inputs

- A clearly specified task goal aligned with the documented scope.
- All required files, identifiers, parameters, or environment variables before execution.
- Any domain constraints, formatting requirements, and expected output destination if applicable.

## Recommended Workflow

1. Validate the request against the skill boundary and confirm all required inputs are present.
2. Select the documented execution path and prefer the simplest supported command or procedure.
3. Produce the expected output using the documented file format, schema, or narrative structure.
4. Run a final validation pass for completeness, consistency, and safety before returning the result.

## Output Contract

- Return a structured deliverable that is directly usable without reformatting.
- If a file is produced, prefer a deterministic output name such as `text_format_organizer_result.md` unless the skill documentation defines a better convention.
- Include a short validation summary describing what was checked, what assumptions were made, and any remaining limitations.

## Validation and Safety Rules

- Validate required inputs before execution and stop early when mandatory fields or files are missing.
- Do not fabricate measurements, references, findings, or conclusions that are not supported by the provided source material.
- Emit a clear warning when credentials, privacy constraints, safety boundaries, or unsupported requests affect the result.
- Keep the output safe, reproducible, and within the documented scope at all times.

## Failure Handling

- If validation fails, explain the exact missing field, file, or parameter and show the minimum fix required.
- If an external dependency or script fails, surface the command path, likely cause, and the next recovery step.
- If partial output is returned, label it clearly and identify which checks could not be completed.
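
An early input check in the spirit of these rules might look like the sketch below. The supported extensions come from the CLI table; the function name `check_input` and the message wording are hypothetical.

```python
from pathlib import Path
from typing import List

# Extensions the documentation lists as supported input formats.
SUPPORTED_SUFFIXES = {".txt", ".md", ".docx"}


def check_input(path: str) -> List[str]:
    """Return a list of problems; an empty list means the input looks usable."""
    problems = []
    candidate = Path(path)
    if candidate.suffix.lower() not in SUPPORTED_SUFFIXES:
        problems.append(
            f"unsupported extension {candidate.suffix!r}; "
            f"expected one of {sorted(SUPPORTED_SUFFIXES)}"
        )
    if not candidate.is_file():
        problems.append(f"input file not found: {candidate}")
    return problems


for message in check_input("missing_draft.pdf"):
    print("validation failed:", message)
```

Collecting every problem before stopping lets the caller report the exact missing file or parameter in one message instead of one failure per run.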

## Quick Validation

Run this minimal verification path before full execution when possible:

```bash
python scripts/init_run.py --help
```

Expected output format:

```text
Result file: text_format_organizer_result.md
Validation summary: PASS/FAIL with brief notes
Assumptions: explicit list if any
```

## Deterministic Output Rules

- Use the same section order for every supported request of this skill.
- Keep output field names stable and do not rename documented keys across examples.
- If a value is unavailable, emit an explicit placeholder instead of omitting the field.
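
One way to keep the three documented fields stable is to render them from a single function that substitutes a placeholder for anything missing. The sketch below is illustrative only: `render_summary` is a hypothetical name, and `N/A` is an assumed placeholder because the source does not name one.

```python
from typing import List, Optional

PLACEHOLDER = "N/A"  # assumed placeholder text, not defined by the source


def render_summary(
    result_file: Optional[str] = None,
    passed: Optional[bool] = None,
    notes: str = "",
    assumptions: Optional[List[str]] = None,
) -> str:
    """Render the summary with fixed field names and a fixed field order."""
    if passed is None:
        status = PLACEHOLDER
    else:
        status = "PASS" if passed else "FAIL"
        if notes:
            status = f"{status} ({notes})"
    lines = [
        f"Result file: {result_file or PLACEHOLDER}",
        f"Validation summary: {status}",
        f"Assumptions: {'; '.join(assumptions) if assumptions else PLACEHOLDER}",
    ]
    return "\n".join(lines)


print(render_summary("text_format_organizer_result.md", True,
                     "line endings unified", ["input is UTF-8"]))
```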

## Completion Checklist

- Confirm all required inputs were present and valid.
- Confirm the supported execution path completed without unresolved errors.
- Confirm the final deliverable matches the documented format exactly.
- Confirm assumptions, limitations, and warnings are surfaced explicitly.

Related Skills

vector-text-fixer

from aipoch/medical-research-skills

Fix garbled text in PDF/SVG vector graphics caused by font encoding issues, making files editable in AI tools. Supports batch processing and JSON export for manual correction.

text-to-technical-roadmap

from aipoch/medical-research-skills

Converts research text into a Mermaid technical roadmap flowchart. Use when the user provides research proposals, experiment designs, or scientific text and asks for a roadmap or flowchart.

fulltext-fetcher

from aipoch/medical-research-skills

Fetch and save the original HTML of scientific literature webpages when given a URL, DOI, or PubMed PMID (triggered when you need archival-grade page HTML for downstream parsing or review).

bib-formatter

from aipoch/medical-research-skills

Convert reference lists and in-text citations between RIS, BibTeX, plain text, and CSL-JSON, triggered when you need to unify bibliography/citation styles before journal submission or compare before/after formatting differences.

article-format-adjustment

from aipoch/medical-research-skills

Adjust academic paper formatting and convert between DOCX/LaTeX/Markdown when you need to meet a journal or school template requirement.

unstructured-medical-text-miner

from aipoch/medical-research-skills

Mine unstructured clinical text from MIMIC-IV to extract diagnostic logic.

meta-screening-fulltext

from aipoch/medical-research-skills

Screen full-text papers against inclusion/exclusion criteria, with optional PubMed metadata check using PMID. Use when the user needs to evaluate a paper for a meta-analysis.

citation-formatter

from aipoch/medical-research-skills

Use when formatting references for journal submission, converting between citation styles (APA, MLA, Vancouver, Chicago), generating bibliographies for manuscripts, or ensuring consistent reference formatting. Automatically formats citations and bibliographies in 1000+ academic styles. Ensures reference accuracy, completeness, and compliance with journal requirements. Supports batch conversion and integration with reference managers.

bioinformatics-translational-opportunity-finder

from aipoch/medical-research-skills

Identifies translationally meaningful paths for bioinformatics findings by mapping omics or computational discoveries to diagnosis, stratification, prognosis, treatment-response, monitoring, or target-nomination use cases, while auditing bridge evidence, assayability, and validation burden. Use this skill when a user wants to know whether a bioinformatics finding can be framed as a stronger translational topic without overclaiming clinical relevance. Always separate statistical signal from translational value, and never imply clinical utility, targetability, or validation depth without explicit evidence support.

latex-manuscript-format-converter

from aipoch/medical-research-skills

Converts existing manuscript content into LaTeX format aligned with a target journal, conference, or template while preserving manuscript meaning and structural integrity.

skill-auditor

from aipoch/medical-research-skills

A comprehensive auditor for any agent skill — including Manus, OpenClaw/ClawHub, Claude, LobeHub, or custom SKILL.md-based skills. Use this skill whenever a user wants to evaluate, audit, review, score, or quality-check an agent skill before publishing, updating, or deploying. Covers two hard veto gates (structural redlines + research integrity redlines), static quality scoring across 25 criteria (ISO 25010 + OpenSSF + Agent), dynamic test input generation, multi-mode execution testing, multi-layer output evaluation with five specialized category rubrics (Evidence Insight / Protocol Design / Data Analysis / Academic Writing / Other), a Research Veto that applies to all four research categories, human eval viewer generation, actionable P0/P1/P2 optimization recommendations, and automatic skill improvement that outputs a polished, production-ready SKILL.md. Also use whenever a user says "audit my skill", "evaluate my skill", "improve my skill", or wants a corrected version after evaluation.

two-sample-mr-research-planner

from aipoch/medical-research-skills

Generates complete two-sample Mendelian randomization (MR) research designs from a user-provided research direction. Use when users want to design, plan, or build a study using two-sample MR to test causal relationships. Triggers: "design a two-sample MR study", "build a publishable MR paper", "test whether this biomarker causally affects this disease", "generate Lite/Standard/Advanced MR plans", "screen multiple exposures with MR", "bidirectional MR design", "causal inference using GWAS summary statistics", or "I want to study X and Y using MR". Always outputs four workload configurations (Lite / Standard / Advanced / Publication+) with a recommended primary plan, step-by-step workflow, figure plan, validation strategy, minimal executable version, and publication upgrade path.