full-text-search

SQLite FTS5 full-text search with prefix matching, ranked results, and highlighted snippets

7 stars

Best use case

full-text-search is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

SQLite FTS5 full-text search with prefix matching, ranked results, and highlighted snippets

Teams using full-text-search should expect a more consistent output, faster repeated execution, less prompt rewriting.

When to use this skill

  • You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

  • You only need a quick one-off answer and do not need a reusable workflow.
  • You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$curl -o ~/.claude/skills/full-text-search/SKILL.md --create-dirs "https://raw.githubusercontent.com/heldernoid/agentic-build-templates/main/projects/web-applications/knowledge-base/skills/full-text-search/SKILL.md"

Manual Installation

  1. Download SKILL.md from GitHub
  2. Place it in .claude/skills/full-text-search/SKILL.md inside your project
  3. Restart your AI agent — it will auto-discover the skill

How full-text-search Compares

Feature / Agentfull-text-searchStandard Approach
Platform SupportNot specifiedLimited / Varies
Context Awareness High Baseline
Installation ComplexityUnknownN/A

Frequently Asked Questions

What does this skill do?

SQLite FTS5 full-text search with prefix matching, ranked results, and highlighted snippets

Where can I find the source code?

You can find the source code on GitHub using the link provided at the top of the page.

SKILL.md Source

# full-text-search skill

## Overview

knowledge-base uses SQLite FTS5 (Full-Text Search version 5) to provide fast, typo-tolerant keyword search across all published document titles and summaries. The FTS5 virtual table is kept synchronized with the `docs` table via triggers. Searches return ranked results with `<mark>`-wrapped snippet highlights.

## Search endpoint

**Search published docs**
```bash
curl 'http://localhost:3000/api/pub/search?q=docker+compose&page=1&per_page=10'
```

Query params:
- `q`: search query string (required, 2-200 characters)
- `page`: 1-indexed page number (default: 1)
- `per_page`: results per page, 1-50 (default: 10)
- `category_id`: restrict to a specific category (optional)

Response:
```json
{
  "results": [
    {
      "id": 7,
      "title": "Deploying with Docker Compose",
      "slug": "deploying-with-docker-compose",
      "summary": "This guide walks through deploying Acme App on any Linux server using Docker Compose.",
      "snippet": "...deploy Acme App on any Linux server using <mark>Docker</mark> <mark>Compose</mark>. You will need Docker 24...",
      "category_id": 2,
      "category_name": "Guides",
      "rank": -4.2
    }
  ],
  "total": 3,
  "page": 1,
  "per_page": 10,
  "query": "docker compose"
}
```

## FTS5 schema

```sql
CREATE VIRTUAL TABLE docs_fts USING fts5(
  title,
  summary,
  content='docs',
  content_rowid='id',
  tokenize='porter unicode61'
);
```

The `porter` tokenizer applies Porter stemming so "deploying", "deployed", and "deploy" all match the same stem. `unicode61` handles accented characters and non-ASCII input.

`content='docs'` declares this as a content table - FTS5 reads document text from the `docs` table rather than storing a duplicate copy. The triggers below keep the FTS index in sync.

## Sync triggers

These three triggers keep the FTS5 index synchronized automatically without any manual sync calls.

```sql
-- Insert trigger
CREATE TRIGGER docs_ai AFTER INSERT ON docs BEGIN
  INSERT INTO docs_fts(rowid, title, summary)
  VALUES (new.id, new.title, new.summary);
END;

-- Delete trigger
CREATE TRIGGER docs_ad AFTER DELETE ON docs BEGIN
  INSERT INTO docs_fts(docs_fts, rowid, title, summary)
  VALUES ('delete', old.id, old.title, old.summary);
END;

-- Update trigger
CREATE TRIGGER docs_au AFTER UPDATE ON docs BEGIN
  INSERT INTO docs_fts(docs_fts, rowid, title, summary)
  VALUES ('delete', old.id, old.title, old.summary);
  INSERT INTO docs_fts(rowid, title, summary)
  VALUES (new.id, new.title, new.summary);
END;
```

The delete-then-insert pattern in the update trigger is required by FTS5 content tables because in-place updates are not supported.

## Search query construction

```typescript
function buildFtsQuery(input: string): string {
  // Split on whitespace, remove empty tokens
  const terms = input.trim().split(/\s+/).filter(Boolean);
  // Wrap each term in double quotes to treat as phrase, append * for prefix match
  return terms.map(t => `"${t.replace(/"/g, '')}"*`).join(' ');
}
```

For input `"docker comp"` this produces `"docker"* "comp"*`, which matches "docker", "docker-compose", "compose", and "composition".

## SQL search query

```sql
SELECT
  d.id,
  d.title,
  d.slug,
  d.summary,
  snippet(docs_fts, 1, '<mark>', '</mark>', '...', 20) AS snippet,
  d.category_id,
  c.name AS category_name,
  docs_fts.rank
FROM docs_fts
JOIN docs d ON docs_fts.rowid = d.id
LEFT JOIN categories c ON d.category_id = c.id
WHERE docs_fts MATCH ?
  AND d.status = 'published'
ORDER BY rank
LIMIT ? OFFSET ?;
```

`snippet(docs_fts, 1, '<mark>', '</mark>', '...', 20)` generates a snippet from column index 1 (`summary`), wraps matched terms in `<mark>` tags, uses `...` as the ellipsis separator, and returns up to 20 tokens around each match.

`rank` is a negative float; values closer to 0 are weaker matches. Ordering by `rank` ASC puts the best matches first.

## Total count query

```sql
SELECT COUNT(*) AS total
FROM docs_fts
JOIN docs d ON docs_fts.rowid = d.id
WHERE docs_fts MATCH ?
  AND d.status = 'published';
```

Run this query with the same FTS query string before the paginated results query.

## Prefix matching

Appending `*` to each quoted term enables prefix matching:

| Input | Matches |
|-------|---------|
| `"auth"*` | authentication, authorize, auth |
| `"deploy"*` | deploying, deployment, deployed, deploy |
| `"api"*` | api, apis |

Single-character queries are rejected with a 400 error before reaching FTS5.

## Minimum query length

Reject queries shorter than 2 characters:
```typescript
if (q.trim().length < 2) {
  return res.status(400).json({ error: 'QUERY_TOO_SHORT', message: 'Search query must be at least 2 characters' });
}
```

## Rebuilding the index

The admin endpoint `POST /api/admin/settings/rebuild-fts` performs a full index rebuild:

```sql
INSERT INTO docs_fts(docs_fts) VALUES('rebuild');
```

This re-reads all rows from the `docs` content table and rebuilds the FTS index from scratch. Use this to recover from index corruption or after bulk data imports that bypassed triggers.

After rebuild, update `settings.fts_last_rebuilt` with the current ISO timestamp.

## Disabling search

Set `settings.search_enabled = 0` via the settings API. When disabled:
- `GET /api/pub/search` returns `{ error: 'SEARCH_DISABLED', message: 'Search is not enabled for this site' }` with HTTP 503.
- The public search bar is hidden in the React UI.
- FTS triggers remain active; the index stays current for when search is re-enabled.

## Snippet sanitization

The `snippet()` function output contains `<mark>` tags only. Before sending to the client, strip any other HTML that may be present in the summary:

```typescript
import DOMPurify from 'dompurify';

function sanitizeSnippet(raw: string): string {
  // Allow only <mark> tags in snippets
  return DOMPurify.sanitize(raw, { ALLOWED_TAGS: ['mark'] });
}
```

## Special characters

FTS5 MATCH syntax uses `"`, `*`, `(`, `)`, `-`, and `^` as operators. The query builder already wraps terms in double quotes and strips literal `"` from user input, which neutralizes all operator characters. No additional escaping is needed.

## Performance notes

- The FTS5 index is stored in the same SQLite file as the main tables. On typical knowledge-base sizes (up to 10,000 documents) query latency is under 5 ms.
- WAL mode (`PRAGMA journal_mode=WAL`) must be enabled on the database for concurrent reads during FTS writes.
- FTS5 segment merges run automatically. No manual `OPTIMIZE` calls are needed in normal operation.

## Troubleshooting

**Search returns no results for a known term**

1. Check that the document is published (`status = 'published'`).
2. Run `SELECT * FROM docs_fts WHERE docs_fts MATCH '"term"*';` directly in the SQLite shell.
3. If empty, run the rebuild endpoint: `POST /api/admin/settings/rebuild-fts`.
4. Verify `settings.search_enabled = 1`.

**FTS query syntax error from SQLite**

Cause: user input contained unmatched quotes or FTS operator characters that were not neutralized. Ensure `buildFtsQuery` is called on all user input before passing to MATCH.

**Snippet shows wrong column**

`snippet(docs_fts, 1, ...)` references the second FTS column (index 1 = `summary`). If the schema changes column order, update the column index accordingly.

Related Skills

log-search

7
from heldernoid/agentic-build-templates

Search stored Docker container logs for a text query using container-log-viewer. Covers clv search CLI and the /api/logs/search HTTP endpoint.

Skill: Uptime Monitoring

7
from heldernoid/agentic-build-templates

## Overview

Skill: Status Page

7
from heldernoid/agentic-build-templates

## Overview

Skill: unit-conversion

7
from heldernoid/agentic-build-templates

## Overview

Skill: recipe-scaler

7
from heldernoid/agentic-build-templates

## Overview

reading-list

7
from heldernoid/agentic-build-templates

Operate the reading-list API to save, manage, tag, search, and export articles.

email-digest

7
from heldernoid/agentic-build-templates

Configure, test, and troubleshoot the reading-list daily email digest delivered via nodemailer.

websocket-realtime

7
from heldernoid/agentic-build-templates

Use the WebSocket connection in poll-builder to receive live vote updates. Use when you need to stream real-time poll results, monitor a poll for new votes, or build a live dashboard. Triggers include "live results", "real-time updates", "stream votes", "watch poll", or "WebSocket".

poll-builder

7
from heldernoid/agentic-build-templates

Self-hosted poll creation tool with real-time results. Use when you need to create a poll, check vote counts, close a poll, export results, or get the shareable link for a poll. Triggers include "create poll", "vote", "poll results", "survey", "collect votes", "share poll", or any task involving polling or voting.

Skill: personal-finance

7
from heldernoid/agentic-build-templates

## Overview

Skill: csv-import

7
from heldernoid/agentic-build-templates

## Overview

Skill: Syntax Highlighting

7
from heldernoid/agentic-build-templates

## Purpose