deduplicate
Find and refactor duplicate code. Use this skill when the user wants to find near-duplicate code, check for copy-paste redundancy, or DRY up a codebase — optionally scoped to changed files.
Best use case
deduplicate is best used when you need a repeatable AI agent workflow instead of a one-off prompt.
Teams using deduplicate should expect more consistent output, faster repeated execution, and less prompt rewriting.
When to use this skill
- You want a reusable workflow that can be run more than once with consistent structure.
When not to use this skill
- You only need a quick one-off answer and do not need a reusable workflow.
- You cannot install or maintain the underlying files, dependencies, or repository context.
Installation
Claude Code / Cursor / Codex
Manual Installation
- Download SKILL.md from GitHub
- Place it in .claude/skills/deduplicate/SKILL.md inside your project
- Restart your AI agent; it will auto-discover the skill
How deduplicate Compares
| Feature / Agent | deduplicate | Standard Approach |
|---|---|---|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |
Frequently Asked Questions
What does this skill do?
Find and refactor duplicate code. Use this skill when the user wants to find near-duplicate code, check for copy-paste redundancy, or DRY up a codebase — optionally scoped to changed files.
Where can I find the source code?
You can find the source code on GitHub using the link provided at the top of the page.
SKILL.md Source
# Deduplicate
Find near-duplicate code using tree-sitter semantic similarity analysis, then refactor to eliminate redundancy.
## Process
### 1. Determine scope
Ask yourself: did the user specify files, or do they want to check what's changed?
- **Changed files** — call `git_changes` with no arguments to get the list of files modified on the current branch:
```json
{"op": "git changes"}
```
- **Specific files** — the user named files directly. Use those.
- **Whole codebase** — the user asked for a broad sweep with no file constraint.
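The three-way choice above can be sketched as a small decision function. This is only an illustration: the names `Scope` and `determine_scope` are hypothetical, and the rule that explicitly named files take precedence over a "check my changes" request is an assumption, not something the skill states.

```python
from enum import Enum
from typing import Sequence


class Scope(Enum):
    CHANGED_FILES = "changed_files"
    SPECIFIC_FILES = "specific_files"
    WHOLE_CODEBASE = "whole_codebase"


def determine_scope(named_files: Sequence[str], wants_changes: bool) -> Scope:
    """Pick a scope for duplicate detection.

    Assumption: files the user named directly win over a request to
    check changed files; with neither, fall back to a broad sweep.
    """
    if named_files:
        return Scope.SPECIFIC_FILES
    if wants_changes:
        return Scope.CHANGED_FILES
    return Scope.WHOLE_CODEBASE
```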
### 2. Check index readiness
Before querying, confirm the tree-sitter index is ready:
```json
{"op": "get status"}
```
If the index is not ready, tell the user and wait.
### 3. Find duplicates
Run duplicate detection scoped to the files from step 1.
**For each changed or specified file**, call treesitter scoped to that file:
```json
{"op": "find duplicates", "file": "src/handlers/user.rs"}
```
This finds code in that file that is semantically similar to code elsewhere in the codebase.
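As a rough sketch of the per-file pattern, the snippet below builds one request payload per file, mirroring the JSON shape shown above. The helper name `per_file_requests` is hypothetical, and how the payloads are actually sent to the tool is left out because it depends on the agent runtime.

```python
import json
from typing import Dict, List


def per_file_requests(files: List[str]) -> List[Dict[str, str]]:
    """Build one scoped 'find duplicates' payload per file."""
    return [{"op": "find duplicates", "file": path} for path in files]


# Example: two files from step 1 produce two separate scoped requests.
for request in per_file_requests(["src/handlers/user.rs", "src/handlers/admin.rs"]):
    print(json.dumps(request))
```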
**For a whole-codebase sweep**, call without a file:
```json
{"op": "find duplicates", "min_similarity": 0.85, "min_chunk_bytes": 100}
```
Adjust thresholds based on the user's intent:
- Strict (exact copies): `min_similarity: 0.95`
- Default (near duplicates): `min_similarity: 0.85`
- Loose (similar patterns): `min_similarity: 0.70`
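One way to keep these presets in a single place is a lookup table that feeds the whole-codebase payload. The names `THRESHOLDS` and `sweep_request` are illustrative; the three similarity values and the `min_chunk_bytes` default of 100 come from the examples above.

```python
from typing import Dict, Union

# Similarity presets taken from the list above.
THRESHOLDS: Dict[str, float] = {
    "strict": 0.95,   # exact copies
    "default": 0.85,  # near duplicates
    "loose": 0.70,    # similar patterns
}


def sweep_request(intent: str = "default", min_chunk_bytes: int = 100) -> Dict[str, Union[str, float, int]]:
    """Build a whole-codebase 'find duplicates' payload for a given intent."""
    if intent not in THRESHOLDS:
        raise ValueError(f"unknown intent {intent!r}; expected one of {sorted(THRESHOLDS)}")
    return {
        "op": "find duplicates",
        "min_similarity": THRESHOLDS[intent],
        "min_chunk_bytes": min_chunk_bytes,
    }
```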
### 4. Analyze and report
For each duplicate cluster found, assess:
- **What is duplicated** — summarize the shared logic in one sentence
- **Where it lives** — list every location (file:lines)
- **Severity** — how much code is repeated, and how many copies exist
- **Refactoring opportunity** — propose a concrete extraction: a shared function, trait implementation, helper module, or generic abstraction
Present results grouped by severity (most duplicated first). Skip trivial clusters (boilerplate, single-line patterns, or auto-generated code).
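The ordering and filtering described above might look like the following sketch. The `Location` and `Cluster` shapes are assumptions made for illustration (the tool's real output format is not shown in this skill), and only the single-line check is automated here; spotting boilerplate or auto-generated code still needs judgment.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Location:
    file: str
    start_line: int
    end_line: int  # inclusive

    @property
    def line_count(self) -> int:
        return self.end_line - self.start_line + 1


@dataclass
class Cluster:
    summary: str  # one-sentence description of the shared logic
    locations: List[Location]

    @property
    def repeated_lines(self) -> int:
        # Total lines across all copies: a simple proxy for severity.
        return sum(loc.line_count for loc in self.locations)

    @property
    def is_trivial(self) -> bool:
        # Only single-line matches are detected automatically here.
        return all(loc.line_count <= 1 for loc in self.locations)


def rank_clusters(clusters: List[Cluster]) -> List[Cluster]:
    """Drop trivial clusters, then sort most-duplicated first."""
    actionable = [c for c in clusters if not c.is_trivial]
    return sorted(
        actionable,
        key=lambda c: (c.repeated_lines, len(c.locations)),
        reverse=True,
    )


def format_cluster(cluster: Cluster) -> str:
    """Render one cluster as 'summary (N copies, M lines): file:lines, ...'."""
    where = ", ".join(
        f"{loc.file}:{loc.start_line}-{loc.end_line}" for loc in cluster.locations
    )
    return (
        f"{cluster.summary} ({len(cluster.locations)} copies, "
        f"{cluster.repeated_lines} lines): {where}"
    )
```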
### 5. Refactor
If the user wants to proceed with refactoring:
1. Extract the shared logic into a single location (new function, module, or trait)
2. Replace every duplicate site with a call to the extracted code
3. Run tests after each extraction to confirm nothing broke
4. Re-run duplicate detection on the changed files to verify the duplication is resolved:
```json
{"op": "find duplicates", "file": "src/shared/new_helper.rs"}
```
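To make steps 1 and 2 concrete, here is a minimal before-and-after sketch of an extraction. The skill's own examples point at Rust files; Python is used here only to keep the illustration short, and every function name below is hypothetical.

```python
# Before: the same cleanup and validation logic is copy-pasted in two handlers.
def create_user_before(email: str) -> dict:
    cleaned = email.strip().lower()
    if "@" not in cleaned:
        raise ValueError(f"invalid email: {email!r}")
    return {"action": "create", "email": cleaned}


def invite_user_before(email: str) -> dict:
    cleaned = email.strip().lower()
    if "@" not in cleaned:
        raise ValueError(f"invalid email: {email!r}")
    return {"action": "invite", "email": cleaned}


# After: the shared logic lives in one helper (step 1) ...
def normalize_email(email: str) -> str:
    cleaned = email.strip().lower()
    if "@" not in cleaned:
        raise ValueError(f"invalid email: {email!r}")
    return cleaned


# ... and each duplicate site calls it (step 2). The handler signatures
# are unchanged, so existing callers do not need to be touched.
def create_user(email: str) -> dict:
    return {"action": "create", "email": normalize_email(email)}


def invite_user(email: str) -> dict:
    return {"action": "invite", "email": normalize_email(email)}
```

Comparing the old and new handlers on the same inputs is a cheap way to cover step 3 before re-running detection.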
## Guidelines
- Always scope to changed files when the user says "check my changes" or "what I've been working on" — use `git_changes` to get the file list
- When scoping to changed files, run `find duplicates` once per file — do not run a single unscoped scan and filter afterward
- Report only actionable duplication. Ignore: test fixtures, generated code, trait impl boilerplate, and single-line matches
- Prefer the smallest extraction that removes the duplication. Do not over-abstract
- When refactoring, preserve the public API — callers should not need to change unless the user explicitly wants an API change
- If duplicate code exists across different crates or packages, note the dependency implications before extracting