deduplicate

Find and refactor duplicate code. Use this skill when the user wants to find near-duplicate code, check for copy-paste redundancy, or DRY up a codebase — optionally scoped to changed files.

16 stars

Best use case

deduplicate is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Teams using deduplicate can expect more consistent output, faster repeated execution, and less prompt rewriting.

When to use this skill

  • You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

  • You only need a quick one-off answer and do not need a reusable workflow.
  • You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$ curl -o ~/.claude/skills/deduplicate/SKILL.md --create-dirs "https://raw.githubusercontent.com/diegosouzapw/awesome-omni-skill/main/skills/development/deduplicate/SKILL.md"

Manual Installation

  1. Download SKILL.md from GitHub
  2. Place it in .claude/skills/deduplicate/SKILL.md inside your project
  3. Restart your AI agent — it will auto-discover the skill

How deduplicate Compares

| Feature / Agent | deduplicate | Standard Approach |
| --- | --- | --- |
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |

SKILL.md Source

# Deduplicate

Find near-duplicate code using tree-sitter semantic similarity analysis, then refactor to eliminate redundancy.

## Process

### 1. Determine scope

Ask yourself: did the user specify files, or do they want to check what's changed?

- **Changed files** — call `git_changes` with no arguments to get the list of files modified on the current branch:

```json
{"op": "git changes"}
```

- **Specific files** — the user named files directly. Use those.
- **Whole codebase** — the user asked for a broad sweep with no file constraint.

### 2. Check index readiness

Before querying, confirm the tree-sitter index is ready:

```json
{"op": "get status"}
```

If the index is not ready, tell the user and wait.
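The wait can be sketched as a bounded polling loop. This is a minimal sketch under stated assumptions: `call_tool` is a hypothetical callable that sends a payload to the tree-sitter tool and returns a dict, and the `ready` field is an assumed response shape, since this skill does not specify what `get status` returns.

```python
import time
from typing import Callable


def wait_for_index(
    call_tool: Callable[[dict], dict],
    timeout_s: float = 60.0,
    poll_interval_s: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll the status op until the index reports ready or the timeout elapses.

    `call_tool` and the `ready` key are assumptions made for illustration.
    Returns True once ready, False if the timeout is reached first.
    """
    waited = 0.0
    while True:
        status = call_tool({"op": "get status"})
        if status.get("ready"):
            return True
        if waited >= timeout_s:
            return False
        sleep(poll_interval_s)
        waited += poll_interval_s
```

Passing `sleep` as a parameter keeps the loop testable without real delays. A `False` result is the point at which to tell the user the index is still building.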

### 3. Find duplicates

Run duplicate detection scoped to the files from step 1.

**For each changed or specified file**, call treesitter scoped to that file:

```json
{"op": "find duplicates", "file": "src/handlers/user.rs"}
```

This finds code in that file that is semantically similar to code elsewhere in the codebase.
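Building the per-file calls might look like the sketch below. The helper name `per_file_requests` is hypothetical; only the payload shape (`op` and `file`) comes from the example above.

```python
from __future__ import annotations


def per_file_requests(files: list[str]) -> list[dict]:
    """Build one scoped `find duplicates` payload per file.

    Keeps the input order and drops repeated paths so each file is
    scanned once.
    """
    seen: set[str] = set()
    requests: list[dict] = []
    for path in files:
        if path in seen:
            continue
        seen.add(path)
        requests.append({"op": "find duplicates", "file": path})
    return requests
```

For example, `per_file_requests(["src/handlers/user.rs"])` yields a single payload identical to the one shown above.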

**For a whole-codebase sweep**, call without a file:

```json
{"op": "find duplicates", "min_similarity": 0.85, "min_chunk_bytes": 100}
```

Adjust thresholds based on the user's intent:
- Strict (exact copies): `min_similarity: 0.95`
- Default (near duplicates): `min_similarity: 0.85`
- Loose (similar patterns): `min_similarity: 0.70`
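One way to keep these thresholds consistent is a small preset table. This is a sketch only: the preset names mirror the labels above, while `PRESETS` and `sweep_request` are hypothetical helpers rather than part of the tool.

```python
# Threshold values taken from the list above; the key names are illustrative.
PRESETS = {"strict": 0.95, "default": 0.85, "loose": 0.70}


def sweep_request(intent: str = "default", min_chunk_bytes: int = 100) -> dict:
    """Build a whole-codebase `find duplicates` payload for a named preset."""
    if intent not in PRESETS:
        raise ValueError(f"unknown intent {intent!r}; expected one of {sorted(PRESETS)}")
    return {
        "op": "find duplicates",
        "min_similarity": PRESETS[intent],
        "min_chunk_bytes": min_chunk_bytes,
    }
```

Calling `sweep_request()` with no arguments reproduces the default whole-codebase payload shown earlier in this step.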

### 4. Analyze and report

For each duplicate cluster found, assess:

- **What is duplicated** — summarize the shared logic in one sentence
- **Where it lives** — list every location (file:lines)
- **Severity** — how much code is repeated, and how many copies exist
- **Refactoring opportunity** — propose a concrete extraction: a shared function, trait implementation, helper module, or generic abstraction

Present results grouped by severity (most duplicated first). Skip trivial clusters (boilerplate, single-line patterns, or auto-generated code).
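The filtering and ordering can be sketched as follows. The result shape here is an assumption: the skill does not document what `find duplicates` returns, so `Location` and the list-of-locations cluster are hypothetical stand-ins. Severity is approximated as total repeated lines, then number of copies.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Location:
    """One copy of a duplicated chunk (hypothetical shape)."""
    file: str
    start_line: int
    end_line: int

    @property
    def line_count(self) -> int:
        return self.end_line - self.start_line + 1


def format_location(loc: Location) -> str:
    """Render a location in the file:lines form used in the report."""
    return f"{loc.file}:{loc.start_line}-{loc.end_line}"


def rank_clusters(clusters: list[list[Location]]) -> list[list[Location]]:
    """Drop trivial clusters, then order the rest most-duplicated first."""

    def is_actionable(cluster: list[Location]) -> bool:
        # Needs at least two copies, and at least one copy longer than a line.
        return len(cluster) >= 2 and max(loc.line_count for loc in cluster) > 1

    def severity(cluster: list[Location]) -> tuple[int, int]:
        repeated_lines = sum(loc.line_count for loc in cluster)
        return (repeated_lines, len(cluster))

    actionable = [c for c in clusters if is_actionable(c)]
    return sorted(actionable, key=severity, reverse=True)
```

Boilerplate and auto-generated code cannot be detected from line counts alone, so those still need a judgment call when reading each cluster.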

### 5. Refactor

If the user wants to proceed with refactoring:

1. Extract the shared logic into a single location (new function, module, or trait)
2. Replace every duplicate site with a call to the extracted code
3. Run tests after each extraction to confirm nothing broke
4. Re-run duplicate detection on the changed files to verify the duplication is resolved:

```json
{"op": "find duplicates", "file": "src/shared/new_helper.rs"}
```
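As an illustration of steps 1 and 2, the sketch below shows the state after an extraction. It is a made-up Python example (the function names and the email logic are hypothetical, and the skill's own examples reference Rust paths): two public functions that each used to carry their own copy of the normalization logic now delegate to one private helper, and their signatures are unchanged.

```python
def _normalize_email(raw: str) -> str:
    """Shared logic extracted from both call sites."""
    email = raw.strip().lower()
    if "@" not in email:
        raise ValueError(f"invalid email: {raw!r}")
    return email


def register_user(raw_email: str) -> dict:
    # Previously contained an inline copy of the normalization above.
    return {"action": "register", "email": _normalize_email(raw_email)}


def invite_user(raw_email: str) -> dict:
    # Previously contained a second, near-identical copy.
    return {"action": "invite", "email": _normalize_email(raw_email)}
```

Because `register_user` and `invite_user` keep their names, parameters, and return shapes, existing callers do not need to change.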

## Guidelines

- Always scope to changed files when the user says "check my changes" or "what I've been working on" — use `git_changes` to get the file list
- When scoping to changed files, run `find duplicates` once per file — do not run a single unscoped scan and filter afterward
- Report only actionable duplication. Ignore: test fixtures, generated code, trait impl boilerplate, and single-line matches
- Prefer the smallest extraction that removes the duplication. Do not over-abstract
- When refactoring, preserve the public API — callers should not need to change unless the user explicitly wants an API change
- If duplicate code exists across different crates or packages, note the dependency implications before extracting
