Daily Paper Search Skill

Best use case

Daily Paper Search Skill is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Teams using Daily Paper Search Skill should expect more consistent output, faster repeated execution, and less prompt rewriting.

When to use this skill

You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

You only need a quick one-off answer and do not need a reusable workflow.
You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$ curl -o ~/.claude/skills/daily-search/SKILL.md --create-dirs "https://raw.githubusercontent.com/ComeOnOliver/skillshub/main/skills/guhaohao0991/PaperClaw/daily-search/SKILL.md"

Manual Installation

Download SKILL.md from GitHub
Place it in .claude/skills/daily-search/SKILL.md inside your project
Restart your AI agent — it will auto-discover the skill

How Daily Paper Search Skill Compares

| Feature / Agent | Daily Paper Search Skill | Standard Approach |
|------|------|------|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |

Frequently Asked Questions

What does this skill do?

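It searches arXiv every day for the latest papers, deduplicates them against the database of already-evaluated papers, selects the Top N papers for evaluation, and sends a daily search summary.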

Where can I find the source code?

You can find the source code on GitHub in the ComeOnOliver/skillshub repository, under skills/guhaohao0991/PaperClaw/daily-search/.

SKILL.md Source

# Daily Paper Search Skill

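## Description
Automatically searches arXiv every day for the latest papers, deduplicates them against the database of already-evaluated papers, selects the Top N papers for evaluation, and sends a daily search summary.

## Core Workflow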

```
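┌───────────────────────────────────────────────────────────────────┐
│  20:00 Asia/Singapore: triggered automatically                    │
│       ↓                                                           │
│  1. Batch-search arXiv (9 preset keyword groups, 30 papers each)  │
│       ↓                                                           │
│  2. Deduplicate search results (ID + normalized title)            │
│       ↓                                                           │
│  3. Deduplicate against evaluated_papers.json                     │
│       ↓                                                           │
│  4. Score and rank by relevance                                   │
│       ↓                                                           │
│  5. Select the Top 3 papers                                       │
│       ↓                                                           │
│  6. Download PDFs + create metadata                               │
│       ↓                                                           │
│  7. Generate the pending-evaluation task list                     │
│       ↓                                                           │
│  8. Send the Ruliu message summary                                │
│       ↓                                                           │
│  9. Agent runs the paper-review in-depth evaluation               │
└───────────────────────────────────────────────────────────────────┘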
```

## Usage

### Manual Execution

```bash
# Full pipeline (search + download + send message)
python skills/daily-search/scripts/daily_paper_search.py

# Select 5 papers (the default is 3)
python skills/daily-search/scripts/daily_paper_search.py --top 5

# Search only; skip the PDF download
python skills/daily-search/scripts/daily_paper_search.py --skip-download

# Dry run (search only; no download, no message)
python skills/daily-search/scripts/daily_paper_search.py --dry-run
```

### Command-Line Arguments

| Argument | Description |
|------|------|
| `--top N` | Number of papers to select (default: 3) |
| `--skip-download` | Skip the PDF download |
| `--dry-run` | Dry run: search only, with no side effects |
| `--workspace PATH` | Path to the workspace |
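
For readers who want to see how these flags fit together, here is a minimal `argparse` sketch. It only mirrors the table above; `build_parser` is a hypothetical name, and the real `daily_paper_search.py` may define its arguments differently (for example, its default for `--workspace` is not documented here and is assumed to be unset).

```python
import argparse
from pathlib import Path


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the documented flags (not the real script)."""
    parser = argparse.ArgumentParser(prog="daily_paper_search.py")
    parser.add_argument("--top", type=int, default=3, metavar="N",
                        help="number of papers to select (default: 3)")
    parser.add_argument("--skip-download", action="store_true",
                        help="skip the PDF download")
    parser.add_argument("--dry-run", action="store_true",
                        help="search only; no downloads, no messages")
    # Assumption: no default workspace is documented, so None is used here.
    parser.add_argument("--workspace", type=Path, default=None, metavar="PATH",
                        help="path to the workspace")
    return parser


# Parse an explicit argument list instead of sys.argv for demonstration.
args = build_parser().parse_args(["--top", "5", "--dry-run"])
print(args.top, args.skip_download, args.dry_run, args.workspace)
# prints: 5 False True None
```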

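## Output Files

Running the script generates the following files:

| File | Path | Description |
|------|------|------|
| Search log | `search_logs/YYYY-MM-DD_search_log.json` | The day's search statistics and deduplication details |
| Pending-evaluation list | `pending_evaluation_YYYY-MM-DD.json` | Evaluation tasks waiting for the Agent |
| Paper metadata | `papers/{short_title}/metadata.json` | Basic information for each selected paper |
| Paper PDF | `papers/{short_title}/*.pdf` | The downloaded paper PDF |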
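
The naming pattern in the table can be sketched as a small path builder. This is an illustration only: `output_paths` is a hypothetical helper, and it assumes every path in the table is relative to the workspace directory, which the table itself does not state.

```python
from datetime import date
from pathlib import Path


def output_paths(workspace: Path, day: date, short_title: str) -> dict:
    """Build the documented output locations for one day and one paper."""
    stamp = day.isoformat()  # YYYY-MM-DD
    paper_dir = workspace / "papers" / short_title
    return {
        "search_log": workspace / "search_logs" / f"{stamp}_search_log.json",
        "pending": workspace / f"pending_evaluation_{stamp}.json",
        "metadata": paper_dir / "metadata.json",
        "pdf_dir": paper_dir,  # PDFs are stored as papers/{short_title}/*.pdf
    }


paths = output_paths(Path("workspace"), date(2026, 3, 4), "demo_paper")
print(paths["pending"].as_posix())
# prints: workspace/pending_evaluation_2026-03-04.json
```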

## Follow-Up Evaluation Workflow

After the daily search finishes, the Agent completes the paper evaluation with the following steps:

### Step 1: Review the pending-evaluation list

```bash
cat workspace/pending_evaluation_YYYY-MM-DD.json
```

### Step 2: Run an in-depth evaluation of each paper

For each paper in the list, follow the `paper-review` skill workflow:

1. **Fetch the Semantic Scholar data**
```bash
python skills/semantic-scholar/semantic_scholar_api.py paper-by-arxiv "[arxiv_id]" --format json > papers/{short_title}/metadata.json
```

2. **Read the paper and write a summary**
   - Produces `papers/{short_title}/summary.md`

3. **Score the paper on four dimensions**
   - Produces `papers/{short_title}/scores.md`
   - Record the reasoning process in `<think>` tags

4. **Update the evaluated-papers registry**
```bash
python skills/paper-review/scripts/update_registry.py \
  --id "[arxiv_id]" \
  --title "[paper title]" \
  --short_title "[short_title]" \
  --score "[final score]"
```

### Step 3: Confirm the evaluation is complete

Check `evaluated_papers.json` to confirm the paper was added:
```bash
cat workspace/papers/evaluated_papers.json | python -m json.tool | tail -20
```
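
Looking at the last 20 lines works for a quick check, but a direct lookup by arXiv ID is more reliable once the registry grows. The sketch below assumes `evaluated_papers.json` is a JSON array of objects with an `id` field; the actual schema is not shown in this document, and `load_registry` and `is_registered` are hypothetical helpers.

```python
import json
from pathlib import Path


def load_registry(path: Path) -> list:
    """Load the registry; a missing file is treated as an empty registry."""
    if not path.exists():
        return []
    return json.loads(path.read_text(encoding="utf-8"))


def is_registered(entries: list, arxiv_id: str) -> bool:
    """True if any entry carries the given arXiv ID (assumed key: "id")."""
    return any(entry.get("id") == arxiv_id for entry in entries)


# Hypothetical sample data standing in for the file contents.
sample = [{"id": "2401.00001", "title": "Example", "score": "7.5"}]
print(is_registered(sample, "2401.00001"))  # prints True
print(is_registered(sample, "2401.99999"))  # prints False
```

With the real file, the call would look like `is_registered(load_registry(Path("workspace/papers/evaluated_papers.json")), "[arxiv_id]")`.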

## Scheduled Task Configuration

### OpenClaw Cron Configuration

Add the scheduled task to the Agent configuration:

```json
{
  "name": "Daily Paper Search",
  "schedule": {
    "kind": "cron",
    "expr": "0 20 * * *",
    "tz": "Asia/Singapore"
  },
  "payload": {
    "kind": "agentTurn",
    "message": "Run the daily paper search task: run daily_paper_search.py to search for the latest papers, then carry out the full paper-review workflow (summary, scoring, registry update) on the selected Top 3 papers"
  },
  "sessionTarget": "isolated"
}
```

### System Crontab Configuration (Alternative)

```bash
# Edit the crontab
crontab -e

# Add the scheduled job (20:00 Asia/Singapore = 12:00 UTC; assumes the system time zone is UTC)
0 12 * * * cd /home/gem/.openclaw && python skills/daily-search/scripts/daily_paper_search.py >> /var/log/daily_paper_search.log 2>&1
```
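
The crontab entry depends on converting 20:00 Singapore time to UTC by hand. The following sketch shows that arithmetic; `cron_fields_utc` is a hypothetical helper, and it uses a fixed UTC+8 offset rather than a time zone database lookup, which is safe here because Singapore does not observe daylight saving time.

```python
from datetime import datetime, timedelta, timezone

# Singapore is UTC+8 year-round (no daylight saving time).
SGT = timezone(timedelta(hours=8), name="SGT")


def cron_fields_utc(hour: int, minute: int = 0) -> str:
    """Return the 'minute hour' cron fields in UTC for a daily SGT wall-clock time."""
    local = datetime(2026, 1, 1, hour, minute, tzinfo=SGT)  # the date is arbitrary
    utc = local.astimezone(timezone.utc)
    return f"{utc.minute} {utc.hour}"


print(cron_fields_utc(20))  # prints "0 12"
```

If the host's system time zone is not UTC, the hour field has to be converted to that zone instead.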

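## Deduplication Mechanism

### Three-Layer Deduplication Strategy

1. **Deduplication within search results** (`search_arxiv.py`)
   - Deduplicate by arXiv ID
   - Deduplicate by normalized title (version markers such as ++ and -2 are preserved)
   - Exclude unrelated fields

2. **Deduplication against the evaluated-papers registry** (`daily_paper_search.py`)
   - Read `evaluated_papers.json`
   - Compare arXiv IDs
   - Compare titles (case-insensitive)

3. **Deduplication at write time** (`update_registry.py`)
   - The last line of defense
   - Prevents duplicates from concurrent writes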
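
The first two layers can be sketched in a few lines. This is an illustration under stated assumptions, not the code in `search_arxiv.py` or `daily_paper_search.py`: the function names are hypothetical, papers are assumed to be dicts with `id` and `title` keys, the normalization rule (lowercase, keep `+` and `-`, drop other punctuation) is one plausible reading of "normalized title", and stripping the `vN` suffix from arXiv IDs is an added assumption.

```python
import re


def normalize_title(title: str) -> str:
    """Lowercase, drop punctuation except '+' and '-', collapse whitespace."""
    cleaned = re.sub(r"[^\w+\-]+", " ", title.lower())
    return " ".join(cleaned.split())


def base_arxiv_id(arxiv_id: str) -> str:
    """Strip a trailing version suffix such as 'v2' (an assumption)."""
    return re.sub(r"v\d+$", "", arxiv_id.strip())


def dedupe(papers, evaluated=()):
    """Keep the first occurrence of each paper, skipping any already evaluated."""
    seen_ids = {base_arxiv_id(p["id"]) for p in evaluated}
    seen_titles = {normalize_title(p["title"]) for p in evaluated}
    fresh = []
    for paper in papers:
        pid = base_arxiv_id(paper["id"])
        title = normalize_title(paper["title"])
        if pid in seen_ids or title in seen_titles:
            continue  # duplicate by ID or by normalized title
        seen_ids.add(pid)
        seen_titles.add(title)
        fresh.append(paper)
    return fresh


# Hypothetical sample data.
papers = [
    {"id": "2401.00001v1", "title": "PointNet++: Deep Learning"},
    {"id": "2401.00001v2", "title": "PointNet++: Deep  learning"},  # same ID
    {"id": "2401.00002", "title": "PointNet: Deep Learning"},  # distinct: no '++'
]
print([p["id"] for p in dedupe(papers)])
# prints: ['2401.00001v1', '2401.00002']
```

Keeping `+` and `-` during normalization is what lets "PointNet++" and "PointNet" count as different papers.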

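## Notes

1. **API limits**: The arXiv API enforces a request rate limit; the script already waits 3 seconds between requests
2. **Network dependency**: Downloading PDFs and sending Ruliu messages require a network connection
3. **Evaluation time**: An in-depth evaluation of each paper takes Agent time; selecting 3 papers per day is recommended
4. **Storage**: PDF files take up disk space; clean out old papers periodically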
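
The 3-second delay in note 1 can be pictured as a throttled loop over the keyword groups. The sketch below is an assumption about the shape of that loop, not the script's actual code: `throttled_map` is a hypothetical helper, and the `sleep` parameter exists only so the delay can be replaced in tests or dry runs.

```python
import time
from typing import Callable, Iterable, Iterator, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def throttled_map(fetch: Callable[[T], R], items: Iterable[T],
                  delay: float = 3.0,
                  sleep: Callable[[float], None] = time.sleep) -> Iterator[R]:
    """Call fetch on each item, sleeping `delay` seconds between calls."""
    for index, item in enumerate(items):
        if index > 0:
            sleep(delay)  # wait before every call except the first
        yield fetch(item)


# Demonstration with a recording stand-in for time.sleep, so nothing blocks.
pauses = []
results = list(throttled_map(str.upper, ["a", "b", "c"], sleep=pauses.append))
print(results)  # prints ['A', 'B', 'C']
print(pauses)   # prints [3.0, 3.0]
```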

## Changelog

### v1.0 (2026-03-04)
- ✅ Initial release
- ✅ Batch search and deduplication
- ✅ PDF download
- ✅ Ruliu message delivery
- ✅ Pending-evaluation task list generation
