File to Markdown — Skill

## Overview

Best use case

File to Markdown — Skill is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Teams using this skill should expect more consistent output, faster repeated execution, and less prompt rewriting.

When to use this skill

You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

You only need a quick one-off answer and do not need a reusable workflow.
You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$ curl -o ~/.claude/skills/file-to-markdown/SKILL.md --create-dirs "https://raw.githubusercontent.com/openclaw/skills/main/skills/alaminrifat/file-to-markdown/SKILL.md"

Manual Installation

Download SKILL.md from GitHub
Place it in .claude/skills/file-to-markdown/SKILL.md inside your project
Restart your AI agent — it will auto-discover the skill

How File to Markdown — Skill Compares

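| Feature / Agent         | File to Markdown — Skill | Standard Approach |
| ----------------------- | ------------------------ | ----------------- |
| Platform Support        | Not specified            | Limited / Varies  |
| Context Awareness       | High                     | Baseline          |
| Installation Complexity | Unknown                  | N/A               |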

Frequently Asked Questions

What does this skill do?

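It converts files into clean, structured, AI-ready Markdown using the `markdown.new` API, which is powered by Cloudflare Workers AI toMarkdown().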

Where can I find the source code?

You can find the source code on GitHub using the link provided at the top of the page.

SKILL.md Source

# File to Markdown — Skill

## Overview

Convert files into **clean, structured, AI-ready Markdown** using the `markdown.new` API powered by **Cloudflare Workers AI toMarkdown()**.

Supports 20+ formats including documents, spreadsheets, images, and structured data.

No authentication required (500 requests/day per IP).

---

## When to Use This Skill

Use this skill whenever you need to:

* Extract text from files for LLM processing
* Convert PDFs or Office files into Markdown
* Normalize data into structured text
* Process uploaded user files
* Scrape webpage content into Markdown
* Convert images into AI-generated descriptions + content

Common AI workflows:

* RAG ingestion pipelines
* Knowledge base creation
* Document summarization
* Dataset extraction
* Spreadsheet analysis
* OCR-like extraction from images

---

## Supported Formats

### Documents

* `.pdf`
* `.docx`
* `.odt`

### Spreadsheets

* `.xlsx`
* `.xls`
* `.xlsm`
* `.xlsb`
* `.et`
* `.ods`
* `.numbers`

### Images

* `.jpg`
* `.jpeg`
* `.png`
* `.webp`
* `.svg`

### Text & Structured Data

* `.txt`
* `.md`
* `.csv`
* `.json`
* `.xml`
* `.html`
* `.htm`

Notes:

* Image conversion uses AI object detection + summarization.
* HTML URL conversion uses a web page pipeline.
* Uploaded HTML uses Workers AI conversion.
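
As a rough illustration, an agent could check an input against the lists above before spending a request. The sketch below is only a local lookup; the names `SUPPORTED` and `category_for` are hypothetical and not part of the `markdown.new` API.

```python
from pathlib import PurePosixPath
from typing import Optional
from urllib.parse import urlparse

# Extensions copied from the lists above, grouped by the same categories.
SUPPORTED = {
    "document": {".pdf", ".docx", ".odt"},
    "spreadsheet": {".xlsx", ".xls", ".xlsm", ".xlsb", ".et", ".ods", ".numbers"},
    "image": {".jpg", ".jpeg", ".png", ".webp", ".svg"},
    "text": {".txt", ".md", ".csv", ".json", ".xml", ".html", ".htm"},
}


def category_for(name_or_url: str) -> Optional[str]:
    """Return the format category for a filename or URL, or None if unlisted."""
    # urlparse drops any query string, so "report.pdf?dl=1" still matches.
    path = urlparse(name_or_url).path
    suffix = PurePosixPath(path).suffix.lower()
    for category, extensions in SUPPORTED.items():
        if suffix in extensions:
            return category
    return None
```

A `None` result only means the extension is not listed here; a webpage URL without an extension can still go through the URL pipeline.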

---

## API Base URL

```
https://markdown.new
```

---

## Endpoints

### 1️⃣ Convert Remote File (Simple GET)

Returns plain Markdown text.

```
GET /:file-url
```

Example:

```bash
curl -s "https://markdown.new/https://example.com/report.pdf"
```

---

### 2️⃣ Convert Remote File (JSON Response)

Returns metadata + Markdown.

```
GET /:file-url?format=json
```

Example:

```bash
curl -s "https://markdown.new/https://example.com/report.pdf?format=json"
```
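
The two GET forms above differ only in the trailing `?format=json`. A minimal Python sketch of building those request URLs follows; `build_get_url` and `fetch_markdown` are hypothetical helper names, and the sketch assumes the response body is UTF-8 text and that the target URL carries no query string of its own (the source does not describe that case).

```python
import urllib.request

BASE_URL = "https://markdown.new"


def build_get_url(file_url: str, as_json: bool = False) -> str:
    """Build the GET request URL in the same shape as the curl examples."""
    url = f"{BASE_URL}/{file_url}"
    if as_json:
        # Assumption: file_url has no query string, so "?format=json"
        # is unambiguous.
        url += "?format=json"
    return url


def fetch_markdown(file_url: str, timeout: float = 30.0) -> str:
    """Fetch the plain-Markdown response for a publicly reachable file."""
    request_url = build_get_url(file_url)
    with urllib.request.urlopen(request_url, timeout=timeout) as resp:
        # Assumption: the service returns UTF-8 encoded text.
        return resp.read().decode("utf-8")
```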

---

### 3️⃣ Convert Remote File via POST

Use when you want a structured JSON response.

```
POST /
Content-Type: application/json
```

Body:

```json
{
  "url": "https://example.com/report.pdf"
}
```

Example:

```bash
curl -s https://markdown.new/ \
  -H "Content-Type: application/json" \
  -d '{"url": "https://example.com/report.pdf"}'
```

---

### 4️⃣ Upload Local File

Use when the file is not publicly accessible.

```
POST /convert
multipart/form-data
```

Example:

```bash
curl -s https://markdown.new/convert \
  -F "file=@document.pdf"
```
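
For agents that cannot shell out to curl, the same upload can be approximated with the Python standard library. This is a sketch, not a reference client: `encode_multipart` and `upload_file` are hypothetical names, the form field name `file` is taken from the curl example, and filenames are assumed to contain no quotes or newlines.

```python
import json
import mimetypes
import urllib.request
import uuid
from pathlib import Path
from typing import Tuple


def encode_multipart(field: str, filename: str, payload: bytes) -> Tuple[bytes, str]:
    """Encode one file as a multipart/form-data body; return (body, content type)."""
    boundary = uuid.uuid4().hex
    mime = mimetypes.guess_type(filename)[0] or "application/octet-stream"
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        f"Content-Type: {mime}\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + payload + tail, f"multipart/form-data; boundary={boundary}"


def upload_file(path: str, timeout: float = 60.0) -> dict:
    """POST a local file to /convert and return the parsed JSON response."""
    source = Path(path)
    body, content_type = encode_multipart("file", source.name, source.read_bytes())
    request = urllib.request.Request(
        "https://markdown.new/convert",
        data=body,
        headers={"Content-Type": content_type},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.load(resp)
```

Calling `upload_file("document.pdf")` mirrors the curl command above. If the `requests` library is already a dependency, its `files=` argument does the same encoding with less code.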

---

## Response Formats

### URL Conversion Response

```json
{
  "success": true,
  "url": "https://example.com/report.pdf",
  "title": "Quarterly Report",
  "content": "# Quarterly Report\n\n...",
  "method": "Workers AI (file)",
  "duration_ms": 1200,
  "tokens": 850
}
```

---

### Upload Conversion Response

```json
{
  "success": true,
  "data": {
    "title": "Q4 Report",
    "content": "# Q4 Report\n\n...",
    "filename": "report.xlsx",
    "file_type": ".xlsx",
    "tokens": 1250,
    "processing_time_ms": 320
  }
}
```
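
The two shapes above differ in nesting (`data` wrapper) and in the name of the timing field. A pipeline that handles both may want to flatten them first; the sketch below shows one way, with `normalize_response` and the output key `elapsed_ms` being hypothetical names chosen for illustration.

```python
from typing import Any, Dict


def normalize_response(payload: Dict[str, Any]) -> Dict[str, Any]:
    """Flatten either documented response shape into one common dict."""
    # Upload responses nest their fields under "data"; URL responses are flat.
    nested = payload.get("data")
    body = nested if isinstance(nested, dict) else payload
    return {
        "success": bool(payload.get("success")),
        "title": body.get("title"),
        "content": body.get("content", ""),
        "tokens": body.get("tokens"),
        # The two shapes name the timing field differently.
        "elapsed_ms": body.get("duration_ms", body.get("processing_time_ms")),
    }
```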

---

## Best Practices for AI Agents

### Prefer GET for Simple Workflows

Use:

```
GET /:url
```

When:

* You only need Markdown text
* Speed is important
* No metadata required

---

### Prefer POST for Structured Pipelines

Use POST when:

* Metadata is needed
* Token counts are required
* Monitoring or logging is implemented
* Building automation workflows

---

### File Upload Strategy

Use `/convert` only if:

* File is local
* File is private
* File requires authentication to access

Otherwise always prefer URL conversion.

---

## Error Handling Strategy

Agents should:

1. Check that `"success"` is `true`
2. Retry once on network failure
3. Validate that the content length is greater than 0
4. Fall back to an alternate extraction method if needed
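
The four steps above can be sketched as a small wrapper. The request itself is passed in as a callable so the logic stays independent of any HTTP library; `convert_with_checks`, `fetch`, and `fallback` are hypothetical names, and treating `OSError` as "network failure" is an assumption that fits `urllib` and `requests` exceptions.

```python
from typing import Callable, Dict, Optional


def convert_with_checks(
    fetch: Callable[[], Dict],
    fallback: Optional[Callable[[], str]] = None,
) -> str:
    """Run a conversion request with one retry, validation, and a fallback."""
    payload = None
    for _ in range(2):  # first attempt plus a single retry
        try:
            payload = fetch()
            break
        except OSError:
            # urllib's URLError and socket timeouts are OSError subclasses.
            continue

    content = ""
    if payload and payload.get("success") is True:
        # Upload responses nest the Markdown under "data".
        nested = payload.get("data")
        body = nested if isinstance(nested, dict) else payload
        content = body.get("content") or ""

    if content.strip():
        return content
    if fallback is not None:
        return fallback()
    raise RuntimeError("conversion failed and no fallback extractor was given")
```

Passing `fetch` as a callable also makes the retry path easy to exercise in tests without touching the network.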

---

## Rate Limits

* 500 requests/day per IP without API key
* No signup required

Agents should:

* Cache results when possible
* Avoid duplicate conversions
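
One way to follow both recommendations is an in-memory cache keyed by source URL. The sketch below is illustrative only: `ConversionCache` is a hypothetical class, the `convert` callable stands in for whatever function actually calls the API, and the counter is a local budget that never resets, so a long-running agent would need its own daily reset.

```python
from typing import Callable, Dict


class ConversionCache:
    """In-memory cache keyed by source URL, with a simple request counter."""

    def __init__(self, convert: Callable[[str], str], daily_limit: int = 500):
        self._convert = convert  # hypothetical function that calls the API
        self._results: Dict[str, str] = {}
        self._daily_limit = daily_limit
        self.requests_made = 0

    def get(self, url: str) -> str:
        """Return cached Markdown for url, converting only on a cache miss."""
        if url in self._results:
            return self._results[url]
        if self.requests_made >= self._daily_limit:
            raise RuntimeError("local request budget exhausted")
        self.requests_made += 1
        markdown = self._convert(url)
        self._results[url] = markdown
        return markdown
```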

---

## Integration Examples

### JavaScript (Node.js)

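```js
// Requires Node.js 18+ (built-in fetch) and an ES module for top-level await.
const res = await fetch("https://markdown.new/", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({
    url: "https://example.com/file.pdf"
  })
});

// Surface HTTP-level failures before parsing the body.
if (!res.ok) {
  throw new Error(`Conversion request failed with HTTP ${res.status}`);
}

const data = await res.json();
console.log(data.content);
```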

---

### Python

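```python
import requests

# POST the remote file's URL; the timeout keeps an agent from hanging.
res = requests.post(
    "https://markdown.new/",
    json={"url": "https://example.com/file.pdf"},
    timeout=60,
)
res.raise_for_status()  # raise on HTTP 4xx/5xx before parsing

data = res.json()
print(data["content"])
```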

---

## Agent Decision Tree

If user provides:

| Input Type      | Action                 |
| --------------- | ---------------------- |
| Public file URL | Use GET or POST        |
| Local file      | Use POST /convert      |
| Image           | Convert then summarize |
| Spreadsheet     | Convert then analyze   |
| Webpage         | Convert URL HTML       |
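
The table above could be encoded roughly as follows. This is a sketch under stated assumptions: `plan_conversion` is a hypothetical helper, any `http`/`https` input is assumed to be publicly reachable, and the follow-up step is inferred from the file extension alone.

```python
from typing import Dict
from urllib.parse import urlparse

IMAGE_EXTS = (".jpg", ".jpeg", ".png", ".webp", ".svg")
SHEET_EXTS = (".xlsx", ".xls", ".xlsm", ".xlsb", ".et", ".ods", ".numbers")


def plan_conversion(source: str) -> Dict[str, str]:
    """Pick an endpoint and a follow-up step for a URL or local path."""
    parsed = urlparse(source)
    is_url = parsed.scheme in ("http", "https")
    # For URLs, match on the path only so query strings do not hide the extension.
    path = (parsed.path if is_url else source).lower()

    if path.endswith(IMAGE_EXTS):
        follow_up = "summarize"
    elif path.endswith(SHEET_EXTS):
        follow_up = "analyze"
    else:
        follow_up = "none"

    endpoint = "GET or POST" if is_url else "POST /convert"
    return {"endpoint": endpoint, "follow_up": follow_up}
```

A URL that requires authentication would not fit the first branch; in that case the agent would download the file itself and treat it as a local file.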

---

## Output Expectations

The Markdown should be:

* Clean
* Structured
* AI-friendly
* Minimal noise
* Ready for LLM ingestion

---

## Limitations

* Complex PDF layouts may lose formatting
* Large spreadsheets may be truncated
* Images rely on AI interpretation accuracy
* Token limits may apply

---

## Summary

This skill provides a **universal file-to-Markdown conversion layer** for AI systems with:

* No authentication
* Simple HTTP interface
* Multi-format support
* Structured output
* Fast processing

Ideal for document ingestion, RAG pipelines, and automation agents.

---
