File to Markdown — Skill

## Overview

3,891 stars

Best use case

File to Markdown — Skill is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Teams using File to Markdown — Skill should expect more consistent output, faster repeated execution, and less prompt rewriting.

When to use this skill

  • You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

  • You only need a quick one-off answer and do not need a reusable workflow.
  • You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

curl -o ~/.claude/skills/file-to-markdown/SKILL.md --create-dirs "https://raw.githubusercontent.com/openclaw/skills/main/skills/alaminrifat/file-to-markdown/SKILL.md"

Manual Installation

  1. Download SKILL.md from GitHub
  2. Place it in .claude/skills/file-to-markdown/SKILL.md inside your project
  3. Restart your AI agent — it will auto-discover the skill

How File to Markdown — Skill Compares

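| Feature / Agent         | File to Markdown — Skill | Standard Approach |
| ----------------------- | ------------------------ | ----------------- |
| Platform Support        | Not specified            | Limited / Varies  |
| Context Awareness       | High                     | Baseline          |
| Installation Complexity | Unknown                  | N/A               |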

Frequently Asked Questions

What does this skill do?

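It converts files into clean, structured, AI-ready Markdown using the `markdown.new` API, which is powered by Cloudflare Workers AI `toMarkdown()`.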

Where can I find the source code?

You can find the source code on GitHub using the link provided at the top of the page.

SKILL.md Source

# File to Markdown — Skill

## Overview

Convert files into **clean, structured, AI-ready Markdown** using the `markdown.new` API powered by **Cloudflare Workers AI toMarkdown()**.

Supports 20+ formats including documents, spreadsheets, images, and structured data.

No authentication required (500 requests/day per IP).

---

## When to Use This Skill

Use this skill whenever you need to:

* Extract text from files for LLM processing
* Convert PDFs or Office files into Markdown
* Normalize data into structured text
* Process uploaded user files
* Scrape webpage content into Markdown
* Convert images into AI-generated descriptions + content

Common AI workflows:

* RAG ingestion pipelines
* Knowledge base creation
* Document summarization
* Dataset extraction
* Spreadsheet analysis
* OCR-like extraction from images

---

## Supported Formats

### Documents

* `.pdf`
* `.docx`
* `.odt`

### Spreadsheets

* `.xlsx`
* `.xls`
* `.xlsm`
* `.xlsb`
* `.et`
* `.ods`
* `.numbers`

### Images

* `.jpg`
* `.jpeg`
* `.png`
* `.webp`
* `.svg`

### Text & Structured Data

* `.txt`
* `.md`
* `.csv`
* `.json`
* `.xml`
* `.html`
* `.htm`

Notes:

* Image conversion uses AI object detection + summarization.
* HTML URL conversion uses a web page pipeline.
* Uploaded HTML uses Workers AI conversion.

---

## API Base URL

```
https://markdown.new
```

---

## Endpoints

### 1️⃣ Convert Remote File (Simple GET)

Returns plain Markdown text.

```
GET /:file-url
```

Example:

```bash
curl -s "https://markdown.new/https://example.com/report.pdf"
```

---

### 2️⃣ Convert Remote File (JSON Response)

Returns metadata + Markdown.

```
GET /:file-url?format=json
```

Example:

```bash
curl -s "https://markdown.new/https://example.com/report.pdf?format=json"
```
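As a minimal sketch, the two GET forms above can be built with plain string concatenation, mirroring the curl examples. The names `BASE_URL` and `build_get_url` are illustrative and not part of the API. How the service treats a target URL that already carries its own query string is not stated here, so this sketch makes no attempt to handle that case.

```python
BASE_URL = "https://markdown.new"


def build_get_url(file_url: str, as_json: bool = False) -> str:
    """Prefix the target URL with the API base, as in the curl examples."""
    url = f"{BASE_URL}/{file_url}"
    if as_json:
        # Appended verbatim, matching the documented example.
        # Assumption: the target URL has no query string of its own.
        url += "?format=json"
    return url


print(build_get_url("https://example.com/report.pdf"))
print(build_get_url("https://example.com/report.pdf", as_json=True))
```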

---

### 3️⃣ Convert Remote File via POST

Use this when you want a structured JSON response.

```
POST /
Content-Type: application/json
```

Body:

```json
{
  "url": "https://example.com/report.pdf"
}
```

Example:

```bash
curl -s https://markdown.new/ \
  -H "Content-Type: application/json" \
  -d '{"url": "https://example.com/report.pdf"}'
```

---

### 4️⃣ Upload Local File

Use this when the file is not publicly accessible.

```
POST /convert
multipart/form-data
```

Example:

```bash
curl -s https://markdown.new/convert \
  -F "file=@document.pdf"
```
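The same upload can be sketched in Python using only the standard library. The form field name `file` and the `/convert` path come from the curl example above. The helper names `encode_multipart` and `build_upload_request` are hypothetical, and the generic `application/octet-stream` part type is an assumption, since the section does not say which per-file content types the service expects.

```python
import urllib.request
import uuid
from typing import Tuple


def encode_multipart(field: str, filename: str, data: bytes) -> Tuple[bytes, str]:
    """Encode one file as a multipart/form-data body; returns (body, content_type)."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + data + tail, f"multipart/form-data; boundary={boundary}"


def build_upload_request(filename: str, data: bytes) -> urllib.request.Request:
    """Build (but do not send) the POST /convert request for one file."""
    body, content_type = encode_multipart("file", filename, data)
    return urllib.request.Request(
        "https://markdown.new/convert",
        data=body,
        headers={"Content-Type": content_type},
        method="POST",
    )


req = build_upload_request("document.pdf", b"%PDF-1.4 example bytes")
print(req.get_method(), req.full_url)
# To send it: urllib.request.urlopen(req, timeout=30)
```

Building the request separately from sending it keeps the encoding step easy to inspect and test without touching the network.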

---

## Response Formats

### URL Conversion Response

```json
{
  "success": true,
  "url": "https://example.com/report.pdf",
  "title": "Quarterly Report",
  "content": "# Quarterly Report\n\n...",
  "method": "Workers AI (file)",
  "duration_ms": 1200,
  "tokens": 850
}
```

---

### Upload Conversion Response

```json
{
  "success": true,
  "data": {
    "title": "Q4 Report",
    "content": "# Q4 Report\n\n...",
    "filename": "report.xlsx",
    "file_type": ".xlsx",
    "tokens": 1250,
    "processing_time_ms": 320
  }
}
```
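The two response shapes differ in where `content` lives: at the top level for URL conversions and under `data` for uploads. A small helper can hide that difference from callers. This is a sketch based only on the two sample responses above; `extract_markdown` is a hypothetical name, and real responses may carry fields or error shapes not shown here.

```python
from typing import Any, Dict


def extract_markdown(payload: Dict[str, Any]) -> str:
    """Return the Markdown from either documented response shape."""
    if not payload.get("success"):
        raise ValueError("conversion did not report success")
    # Upload responses nest fields under "data"; URL responses keep them top-level.
    nested = payload.get("data")
    body = nested if isinstance(nested, dict) else payload
    content = body.get("content", "")
    if not content:
        raise ValueError("response contained no Markdown content")
    return content


url_response = {"success": True, "content": "# Quarterly Report\n\n..."}
upload_response = {"success": True, "data": {"content": "# Q4 Report\n\n..."}}
print(extract_markdown(url_response).splitlines()[0])
print(extract_markdown(upload_response).splitlines()[0])
```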

---

## Best Practices for AI Agents

### Prefer GET for Simple Workflows

Use:

```
GET /:url
```

When:

* You only need Markdown text
* Speed is important
* No metadata required

---

### Prefer POST for Structured Pipelines

Use POST when:

* Metadata is needed
* Token counts are required
* Monitoring or logging is implemented
* Building automation workflows

---

### File Upload Strategy

Use `/convert` only if:

* File is local
* File is private
* File requires authentication to access

Otherwise always prefer URL conversion.

---

## Error Handling Strategy

Agents should:

1. Check that the response contains `"success": true`
2. Retry once on network failure
3. Validate that the content length is greater than 0
4. Fall back to an alternate extraction method if needed
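The four steps above can be sketched as one wrapper. To keep the example independent of any HTTP library, the network call and the alternate extractor are passed in as callables. `convert_with_fallback`, `fetch`, and `fallback` are hypothetical names. Treating `OSError` as the network-failure signal is an assumption that fits `urllib` (whose `URLError` subclasses `OSError`); other clients raise different exception types.

```python
from typing import Any, Callable, Dict, Optional


def convert_with_fallback(
    fetch: Callable[[], Dict[str, Any]],
    fallback: Optional[Callable[[], str]] = None,
    retries: int = 1,
) -> str:
    """Check success, retry on network failure, require content, then fall back."""
    for _ in range(retries + 1):
        try:
            payload = fetch()
        except OSError:
            # Network-level failure: try again until retries are exhausted.
            continue
        nested = payload.get("data")
        body = nested if isinstance(nested, dict) else payload
        content = body.get("content", "")
        if payload.get("success") and content:
            return content
        # The service answered but the result is unusable; retrying will not help.
        break
    if fallback is not None:
        return fallback()
    raise RuntimeError("conversion failed and no fallback extractor was provided")


# Usage with a stand-in fetch function instead of a real HTTP call.
print(convert_with_fallback(lambda: {"success": True, "content": "# Title"}))
```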

---

## Rate Limits

* 500 requests/day per IP without an API key
* No signup required

Agents should:

* Cache results when possible
* Avoid duplicate conversions
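One way to follow both recommendations is a small cache keyed by a hash of the source (the URL string or the file bytes), so that a repeated request never reaches the API. The sketch below is in-memory only. `ConversionCache` and its methods are hypothetical names, and a real agent would likely persist entries and expire them.

```python
import hashlib
from typing import Callable, Dict


class ConversionCache:
    """In-memory cache keyed by a SHA-256 hash of the source URL or file bytes."""

    def __init__(self) -> None:
        self._store: Dict[str, str] = {}

    @staticmethod
    def key_for(source: bytes) -> str:
        return hashlib.sha256(source).hexdigest()

    def get_or_convert(self, source: bytes, convert: Callable[[], str]) -> str:
        """Return the cached Markdown, calling convert() only on a cache miss."""
        key = self.key_for(source)
        if key not in self._store:
            self._store[key] = convert()
        return self._store[key]


cache = ConversionCache()
source = b"https://example.com/report.pdf"
first = cache.get_or_convert(source, lambda: "# Report")
second = cache.get_or_convert(source, lambda: "# never computed")
print(first == second)
```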

---

## Integration Examples

### JavaScript (Node.js)

```js
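// Top-level await requires an ES module (or wrap this in an async function).
const res = await fetch("https://markdown.new/", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({
    url: "https://example.com/file.pdf"
  })
});

// Surface HTTP errors before trying to parse the body.
if (!res.ok) throw new Error(`Request failed: HTTP ${res.status}`);

const data = await res.json();
console.log(data.content);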
```

---

### Python

```python
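import requests

# POST the file URL; the JSON response carries metadata alongside the Markdown.
res = requests.post(
    "https://markdown.new/",
    json={"url": "https://example.com/file.pdf"},
    timeout=30,  # avoid hanging indefinitely on a slow conversion
)
res.raise_for_status()  # surface HTTP errors before parsing

data = res.json()
print(data["content"])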
```

---

## Agent Decision Tree

If user provides:

| Input Type      | Action                 |
| --------------- | ---------------------- |
| Public file URL | Use GET or POST        |
| Local file      | Use POST /convert      |
| Image           | Convert then summarize |
| Spreadsheet     | Convert then analyze   |
| Webpage         | Convert URL HTML       |
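The table can be expressed as a small routing function. This is a sketch, not the skill's own logic: `plan_action` and the returned labels are hypothetical. The extension sets are copied from the Supported Formats section, and classifying input by file extension alone is a simplifying assumption.

```python
import os
from typing import Optional, Tuple
from urllib.parse import urlparse

SPREADSHEETS = {".xlsx", ".xls", ".xlsm", ".xlsb", ".et", ".ods", ".numbers"}
IMAGES = {".jpg", ".jpeg", ".png", ".webp", ".svg"}


def plan_action(source: str) -> Tuple[str, Optional[str]]:
    """Return (endpoint, follow_up) for a public URL or a local path."""
    parsed = urlparse(source)
    is_url = parsed.scheme in ("http", "https")
    # For URLs, look at the path only so query strings do not hide the extension.
    path = parsed.path if is_url else source
    ext = os.path.splitext(path)[1].lower()

    endpoint = "GET or POST /" if is_url else "POST /convert"
    if ext in IMAGES:
        follow_up = "summarize"
    elif ext in SPREADSHEETS:
        follow_up = "analyze"
    else:
        follow_up = None
    return endpoint, follow_up


print(plan_action("https://example.com/report.pdf"))
print(plan_action("photos/cat.PNG"))
```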

---

## Output Expectations

The Markdown should be:

* Clean
* Structured
* AI-friendly
* Minimal noise
* Ready for LLM ingestion

---

## Limitations

* Complex PDF layouts may lose formatting
* Large spreadsheets may be truncated
* Images rely on AI interpretation accuracy
* Token limits may apply

---

## Summary

This skill provides a **universal file-to-Markdown conversion layer** for AI systems with:

* No authentication
* Simple HTTP interface
* Multi-format support
* Structured output
* Fast processing

Ideal for document ingestion, RAG pipelines, and automation agents.

---

Related Skills

filesystem

3891
from openclaw/skills

Advanced filesystem operations for listing files, searching content, batch processing, and directory analysis. Supports recursive search, file type filtering, size analysis, and batch operations like copy/move/delete. Use when you need to: list directory contents, search for files by name or content, analyze directory structures, perform batch file operations, or analyze file sizes and distribution.

General Utilities

file-organizer-skill

3891
from openclaw/skills

Organize files in directories by grouping them into folders based on their extensions or date. Includes Dry-Run, Recursive, and Undo capabilities.

General Utilities

file-upload

3891
from openclaw/skills

Upload files to internal BS3 storage (no signing required). Use when user asks to upload files, images, documents to storage, or get a shareable URL for a file.

hinge-profile-optimizer

3891
from openclaw/skills

Comprehensive, research-backed Hinge dating profile optimization. Use when someone wants to improve their Hinge profile, audit an existing profile, write better prompts/captions, select and order photos strategically, or understand why they're not getting quality matches. This is the thorough process (~45 mins) - discovery interview, honest market math, photo strategy, copy creation, settings cleanup, and implementation support. Grounded in peer-reviewed behavioral research, platform data, and signaling theory.

static-files

3891
from openclaw/skills

Host static files on subdomains with optional authentication. Use when you need to serve HTML, images, CSS, JS, or any static content on a dedicated subdomain. Supports file upload, basic auth, quota management, and automatic SSL via Caddy. Commands include sf sites (create/list/delete), sf upload (files/directories), sf files (list/delete).

markdown-extract Skill

3891
from openclaw/skills

Extract clean markdown from any URL using the markdown.new API.

Twitter/X Profile Scraper

3891
from openclaw/skills

A browser-based Twitter/X profile discovery and scraping tool.

TikTok Profile Scraper

3891
from openclaw/skills

A browser-based TikTok profile discovery and scraping tool.

Instagram Profile Scraper

3891
from openclaw/skills

A browser-based Instagram profile discovery and scraping tool.

markdown-sync-pro

3891
from openclaw/skills

One-click sync of Markdown to Notion, GitHub Wiki, Medium, and other platforms.

visual-file-sorter

3891
from openclaw/skills

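Automatically scans the Downloads folder or Desktop, uses a vision model to "look at" each file's contents and rename it, then archives it into the designated category directory.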

durable-files-weekly-review-public

3891
from openclaw/skills

Run a weekly token-optimization audit for durable instruction files in any OpenClaw workspace, generate a markdown report, and propose approval-gated cleanup actions. Use when users want to keep AGENTS/USER/TOOLS/MEMORY-style docs lean without silent deletions.