save-article-with-images

Save web articles locally with images. Automatically downloads images, generates Markdown, and converts to PDF. Supports WeChat Official Account articles via subagent isolation. Triggers: save article, save this article, download article, clip article, wechat article.

3,891 stars

Best use case

save-article-with-images is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Teams using save-article-with-images should expect more consistent output, faster repeated execution, and less prompt rewriting.

When to use this skill

  • You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

  • You only need a quick one-off answer and do not need a reusable workflow.
  • You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$ curl -o ~/.claude/skills/save-article-with-images/SKILL.md --create-dirs "https://raw.githubusercontent.com/openclaw/skills/main/skills/barryqin9999/save-article-with-images/SKILL.md"

Manual Installation

  1. Download SKILL.md from GitHub
  2. Place it in .claude/skills/save-article-with-images/SKILL.md inside your project
  3. Restart your AI agent — it will auto-discover the skill

How save-article-with-images Compares

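| Feature / Agent | save-article-with-images | Standard Approach |
|-----------------|--------------------------|-------------------|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |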

Frequently Asked Questions

What does this skill do?

It saves web articles locally together with their images: it downloads the images, generates Markdown with relative image references, and optionally converts the result to PDF. WeChat Official Account articles are supported via subagent isolation.

Where can I find the source code?

You can find the source code on GitHub using the link provided at the top of the page.

SKILL.md Source

# Save Article with Images

Save web articles to local storage, including articles that contain images. Automatically downloads images, generates Markdown, and converts to PDF.

## Triggers

- "save article"
- "save this article"
- "download article"
- "clip article"

---

## Quick Execution

### Articles Without Images

```
1. Fetch article content (Jina Reader or browser)
2. Save to saved-articles/{title}-{date}.md
3. Send file to Feishu
```

### Articles With Images

```
1. Create directory reports/{article-name}/
2. Create images/ subdirectory
3. Download all images to images/
4. Generate Markdown (relative path references)
5. Convert to PDF
6. Send PDF to Feishu
```

---

## Complete Workflow

### Step 1: Check if Article Has Images

**Methods**:
- Jina Reader returns content with `![Image](URL)` format
- Or original webpage has `<img>` tags

**Decision**:
- Images < 3 → Save Markdown directly, don't download images separately
- Images ≥ 3 → Process with image workflow
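
The decision rule above can be sketched in Python as follows. This is a minimal illustration, assuming the content is already Markdown (for example from Jina Reader). The names `extract_image_urls` and `choose_workflow` are hypothetical, and the regex only recognizes the `![alt](URL)` form, not raw `<img>` tags.

```python
import re

# Matches Markdown image syntax: ![alt](url), with an optional "title"
IMAGE_PATTERN = re.compile(r'!\[[^\]]*\]\(([^)\s]+)[^)]*\)')

IMAGE_THRESHOLD = 3  # taken from the decision rule above


def extract_image_urls(markdown_text):
    """Return image URLs in document order (hypothetical helper)."""
    return IMAGE_PATTERN.findall(markdown_text)


def choose_workflow(markdown_text):
    """Return 'simple' for fewer than 3 images, otherwise 'with-images'."""
    count = len(extract_image_urls(markdown_text))
    return 'with-images' if count >= IMAGE_THRESHOLD else 'simple'
```

The same URL list can be passed on to the download step, so the content only needs to be scanned once.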

---

### Step 2: Create Directory Structure

```bash
mkdir -p ~/.openclaw/workspace/reports/{article-name}/images/
```

**Directory Structure**:
```
reports/{article-name}/
├── {article-name}.md      # Markdown file
├── {article-name}.html    # HTML intermediate (optional)
├── {article-name}.pdf     # Final output (optional)
└── images/                # Image directory
    ├── image1.jpg
    ├── image2.png
    └── ...
```
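
For scripts that drive the workflow from Python rather than the shell, the same layout could be created with `pathlib`. This is a sketch only: the function name `create_article_dirs` and the `workspace` parameter are illustrative assumptions, with the default taken from the `mkdir` command above.

```python
from pathlib import Path


def create_article_dirs(article_name, workspace="~/.openclaw/workspace"):
    """Create reports/{article-name}/images/ and return both paths."""
    article_dir = Path(workspace).expanduser() / "reports" / article_name
    images_dir = article_dir / "images"
    # parents=True also creates reports/ and the article directory itself
    images_dir.mkdir(parents=True, exist_ok=True)
    return article_dir, images_dir
```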

---

### Step 3: Fetch Article Content

#### Method A: Jina Reader (Recommended)

```bash
curl -s "https://r.jina.ai/URL"
```

**Pros**: Auto-converts to Markdown, extracts image links
**Cons**: Some sites block Jina Reader
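
A Python equivalent of the `curl` call might look like the sketch below. It uses only the standard library. The helper names are hypothetical, and returning `None` on failure is an assumed convention so that a caller can fall back to Method B.

```python
import urllib.request

JINA_PREFIX = "https://r.jina.ai/"


def jina_reader_url(article_url):
    """Prefix the article URL with the Jina Reader endpoint."""
    return JINA_PREFIX + article_url


def fetch_markdown(article_url, timeout=30):
    """Fetch the article as Markdown; return None if the request fails."""
    try:
        with urllib.request.urlopen(jina_reader_url(article_url),
                                    timeout=timeout) as resp:
            return resp.read().decode("utf-8", errors="replace")
    except OSError:
        # URLError, HTTPError and socket timeouts all derive from OSError
        return None
```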

#### Method B: Browser Fetch

```bash
# Open webpage
browser action=open url=URL

# Get content
browser action=act kind=evaluate fn='() => document.body.innerText'

# Get images
browser action=act kind=evaluate fn='() => {
  const imgs = document.querySelectorAll("img");
  return JSON.stringify(Array.from(imgs).map(img => ({
    src: img.src,
    alt: img.alt
  })));
}'
```

---

### Step 4: Download Images

**Single Image**:

```bash
curl -o "images/image1.jpg" "https://example.com/image.jpg"
```

**Batch Download (Python)**:

```python
import requests
from pathlib import Path
from urllib.parse import urlparse

ALLOWED_EXTS = {'jpg', 'jpeg', 'png', 'gif', 'webp'}

def download_images(image_urls, output_dir):
    """Download each URL to output_dir as image1.ext, image2.ext, ..."""
    output_dir = Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)

    for i, url in enumerate(image_urls, 1):
        try:
            # Take the extension from the URL path only, so query strings
            # and dots in the hostname are ignored; fall back to jpg
            ext = Path(urlparse(url).path).suffix.lstrip('.').lower()
            if ext not in ALLOWED_EXTS:
                ext = 'jpg'

            # Download with a browser-like User-Agent
            resp = requests.get(url, timeout=30, headers={
                'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36'
            })

            if resp.status_code == 200:
                filename = f"image{i}.{ext}"
                (output_dir / filename).write_bytes(resp.content)
                print(f"✅ {filename}")
            else:
                print(f"❌ HTTP {resp.status_code}: {url}")
        except Exception as e:
            print(f"❌ {e}: {url}")

# Usage
# download_images(['url1', 'url2'], 'images/')
```

**Image Naming**:
- Sequential: `image1.jpg`, `image2.png`, ...
- By content: `cover.jpg`, `screenshot.png`, ...

---

### Step 5: Generate Markdown

**Template**:

```markdown
# {Article Title}

> Source: {URL}
> Author: {author}
> Published: {date}

---

![Cover](images/image1.jpg)

{Content}

---

## Images

![Figure 1: {description}](images/image2.jpg)
![Figure 2: {description}](images/image3.png)

---

*Saved: {timestamp}*
```

**Image Reference Format**:
```markdown
![Description](images/filename.ext)
```
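
One way to produce this output programmatically is sketched below, assuming the fetched content still points at remote image URLs and that a mapping from URL to downloaded filename is available. Both function names and the `url_to_filename` mapping are hypothetical, and the rendered layout is a trimmed version of the template above.

```python
import re


def localize_images(markdown_text, url_to_filename):
    """Replace remote image URLs with images/<filename> relative paths."""
    def swap(match):
        alt, url = match.group(1), match.group(2)
        filename = url_to_filename.get(url)
        if filename is None:
            return match.group(0)  # leave images that were not downloaded
        return f"![{alt}](images/{filename})"

    return re.sub(r'!\[([^\]]*)\]\(([^)\s]+)\)', swap, markdown_text)


def render_article(title, source_url, author, published, content, saved_at):
    """Fill a trimmed version of the template; field names are illustrative."""
    return (
        f"# {title}\n\n"
        f"> Source: {source_url}\n"
        f"> Author: {author}\n"
        f"> Published: {published}\n\n"
        "---\n\n"
        f"{content}\n\n"
        "---\n\n"
        f"*Saved: {saved_at}*\n"
    )
```

Leaving non-downloaded images as remote links keeps the Markdown readable even when some downloads fail.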

---

### Step 6: Convert to PDF (Optional)

**Using Preset Styles**:

```bash
# CSS file
CSS_FILE=~/.openclaw/workspace/templates/mobile-friendly.css

# Convert to HTML
pandoc {article-name}.md -o {article-name}.html --standalone --css=$CSS_FILE

# Generate PDF
weasyprint {article-name}.html {article-name}.pdf
```

**PDF Configuration**:
- Body: 16pt, line-height 1.8
- Page: 6×9 inches, margins 1.5cm
- Font: Noto Sans CJK SC

### ⚠️ Image Overflow Solution (Important)

**Problem**: Large images (e.g., 1200px wide) exceed the PDF page width (~432pt / 6 inches) and overflow the page.

**Solution**: Create a CSS file that limits image `max-width`.

**Required CSS**:
```css
/* Prevent image overflow */
img {
  max-width: 100%;
  height: auto;
  display: block;
  margin: 1em auto;
}

/* Images in images/ directory - 90% width */
img[src^="images/"] {
  max-width: 90%;
  margin: 0.5em auto;
}

/* Body styles */
body {
  max-width: 100%;
  padding: 1cm;
}
```

**Correct PDF Generation Flow**:
```bash
# 1. Create CSS file (in article directory)
cat > style.css << 'EOF'
img { max-width: 100%; height: auto; }
img[src^="images/"] { max-width: 90%; }
EOF

# 2. Generate HTML with CSS
pandoc {article-name}.md -o {article-name}.html --standalone --css=style.css

# 3. Generate PDF
weasyprint {article-name}.html {article-name}.pdf
```

**Key Points**:
- ✅ Must add `max-width: 100%` or `max-width: 90%`
- ✅ Use relative paths `images/xxx.jpg`
- ❌ Don't render images at original size (will overflow)
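
The two conversion commands could also be wrapped in Python, for instance when PDF generation is one step in a larger script. The sketch below assumes `pandoc` and `weasyprint` are installed and on `PATH`. The function names are hypothetical, and the flags are exactly those used in the shell flow above.

```python
import shutil
import subprocess


def build_pdf_commands(article_name, css_file="style.css"):
    """Return the pandoc and weasyprint command lines from the flow above."""
    md, html, pdf = (f"{article_name}.{ext}" for ext in ("md", "html", "pdf"))
    return [
        ["pandoc", md, "-o", html, "--standalone", f"--css={css_file}"],
        ["weasyprint", html, pdf],
    ]


def convert_to_pdf(article_name, workdir=".", css_file="style.css"):
    """Run both commands inside workdir so relative image paths resolve."""
    for cmd in build_pdf_commands(article_name, css_file):
        if shutil.which(cmd[0]) is None:
            raise RuntimeError(f"{cmd[0]} is not installed or not on PATH")
        # check=True raises CalledProcessError if a tool exits non-zero
        subprocess.run(cmd, cwd=workdir, check=True)
```

Running both tools from the article directory matters because the `images/xxx.jpg` references are relative to the Markdown file.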

---

### Step 7: Send to Feishu

**Send Markdown**:
```
message action=send channel=feishu target="user:ou_xxx" filePath="path/to/file.md"
```

**Send PDF**:
```
message action=send channel=feishu target="user:ou_xxx" filePath="path/to/file.pdf"
```

---

## Platform-Specific Handling

| Source | Fetch Method | Image Handling |
|--------|--------------|----------------|
| **Twitter/X** | Jina Reader | Download pbs.twimg.com images |
| **WeChat Official Account** | browser + Camoufox | Download mmbiz.qpic.cn images |
| **General Webpages** | Jina Reader | Download all img tags |
| **Login Required Sites** | browser | User takes a manual screenshot |
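
The table can be read as a dispatch on the article's hostname. The sketch below is one possible encoding under that assumption. The hostnames listed for Twitter/X and the function name `pick_strategy` are not from the skill itself, and login-required sites are not detected because that cannot be inferred from the URL alone.

```python
from urllib.parse import urlparse

# Assumed hostnames for Twitter/X articles
TWITTER_HOSTS = {"twitter.com", "www.twitter.com", "mobile.twitter.com",
                 "x.com", "www.x.com"}


def pick_strategy(url):
    """Map an article URL to a fetch method and expected image host."""
    host = (urlparse(url).hostname or "").lower()
    if host == "mp.weixin.qq.com":
        return {"fetch": "browser + Camoufox", "image_host": "mmbiz.qpic.cn"}
    if host in TWITTER_HOSTS:
        return {"fetch": "Jina Reader", "image_host": "pbs.twimg.com"}
    # General webpages: Jina Reader, download every image found
    return {"fetch": "Jina Reader", "image_host": None}
```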

---

## Twitter/X Articles

**Image URL Format**:
```
https://pbs.twimg.com/media/XXXXX?format=jpg&name=small
```

**Download Command**:
```bash
# Get best quality
curl -o "images/image1.jpg" "https://pbs.twimg.com/media/XXXXX?format=jpg&name=large"
```
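
When image URLs are collected automatically, the `name=small` to `name=large` swap can be done with `urllib.parse` instead of editing strings by hand. This is a minimal sketch, and the function name `twitter_large_url` is hypothetical.

```python
from urllib.parse import parse_qs, urlencode, urlparse, urlunparse


def twitter_large_url(image_url):
    """Return the same pbs.twimg.com URL with name=large set."""
    parts = urlparse(image_url)
    # parse_qs returns lists; keep the first value of each parameter
    query = {key: values[0] for key, values in parse_qs(parts.query).items()}
    query["name"] = "large"
    return urlunparse(parts._replace(query=urlencode(query)))
```

Other parameters such as `format=jpg` are preserved, and `name` is added if the URL did not carry one.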

---

## WeChat Official Account Articles

**Problem**: WeChat applies anti-hotlinking protection, so direct image downloads fail.

**Solutions**:
1. Open the article in the browser
2. Save a screenshot
3. Or use the Camoufox tool

```bash
# Use tool from agent-reach
cd ~/.agent-reach/tools/wechat-article-for-ai
python3 main.py "https://mp.weixin.qq.com/s/ARTICLE_ID"
```

---

## Checklist

After saving, verify:

```
□ Markdown file generated
□ All images downloaded successfully
□ Image relative paths correct
□ Images display correctly (local preview)
□ PDF generated successfully (optional)
□ File sent to Feishu
```
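
The "image relative paths correct" item lends itself to an automated check. The sketch below scans a saved Markdown file and reports local image references that do not exist on disk. The function name `find_missing_images` is hypothetical, and remote `http(s)` references are skipped by design.

```python
import re
from pathlib import Path


def find_missing_images(markdown_path):
    """Return relative image paths referenced in the file but absent on disk."""
    markdown_path = Path(markdown_path)
    text = markdown_path.read_text(encoding="utf-8")
    refs = re.findall(r'!\[[^\]]*\]\(([^)\s]+)[^)]*\)', text)
    # Only local references are checked; remote URLs are ignored
    local_refs = [r for r in refs if not r.startswith(("http://", "https://"))]
    return [r for r in local_refs if not (markdown_path.parent / r).is_file()]
```

An empty list means every local image reference resolves relative to the Markdown file.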

---

## Error Handling

| Error | Cause | Solution |
|-------|-------|----------|
| Image download failed | Anti-hotlinking/Network | Fetch via browser, or request a lower-quality image |
| PDF generation failed | Missing fonts/dependencies | Check weasyprint installation |
| Markdown images not showing | Path error | Check relative paths |
| Jina Reader blocked | Site restriction | Use browser fetch |

---

## File Locations

| Type | Directory |
|------|-----------|
| Simple articles | `saved-articles/{title}-{date}.md` |
| Articles with images | `reports/{article-name}/` |
| Temporary files | `/tmp/article-{id}/` |
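
The path patterns in the table could be built as in the sketch below. Several details are assumptions rather than part of the skill: the ISO date format for `{date}`, the character-replacement rule in `safe_name`, and all three function names.

```python
import re
from datetime import date
from pathlib import Path


def safe_name(title):
    """Replace characters that are awkward in filenames (illustrative rule)."""
    cleaned = re.sub(r'[\\/:*?"<>|\s]+', "-", title.strip())
    return cleaned.strip("-") or "untitled"


def simple_article_path(title, saved_on=None):
    """Build saved-articles/{title}-{date}.md, assuming an ISO date."""
    saved_on = saved_on or date.today()
    return Path("saved-articles") / f"{safe_name(title)}-{saved_on.isoformat()}.md"


def image_article_dir(article_name):
    """Build reports/{article-name}/ for articles with images."""
    return Path("reports") / safe_name(article_name)
```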

---

*Skill Version: 1.0.0*
*Created: 2026-03-17*
