nosql-databases

Apply NoSQL best practices for MongoDB, Convex, and document databases. Use when designing schemas, writing queries, optimizing performance, or building applications with non-relational databases. Use with database-expert for query optimization and DBA-level tuning (20+ years experience).

by diegosouzapw

Best use case

nosql-databases is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Teams using nosql-databases should expect more consistent output, faster repeated execution, and less prompt rewriting.

When to use this skill

You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

You only need a quick one-off answer and do not need a reusable workflow.
You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$ curl -o ~/.claude/skills/nosql-databases/SKILL.md --create-dirs "https://raw.githubusercontent.com/diegosouzapw/awesome-omni-skill/main/skills/development/nosql-databases/SKILL.md"

Manual Installation

Download SKILL.md from GitHub
Place it in .claude/skills/nosql-databases/SKILL.md inside your project
Restart your AI agent — it will auto-discover the skill

How nosql-databases Compares

| Feature / Agent | nosql-databases | Standard Approach |
|-----------------|-----------------|-------------------|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |

Frequently Asked Questions

What does this skill do?

Where can I find the source code?

You can find the source code on GitHub using the link provided at the top of the page.

SKILL.md Source

# NoSQL Databases (MongoDB, Convex, Document Stores)

**Expertise**: Senior database administrator with 20+ years of experience in document stores, key-value systems, and non-relational data modeling. Focus on query optimization, indexing strategy, and data access best practices.

---

## General NoSQL Principles

### Document Design
- **Embed vs Reference**: Embed when data is always read together and rarely grows unbounded; reference when data is shared, large, or updated independently
- **Avoid unbounded arrays**: Documents with arrays that grow without limit cause performance degradation; use separate collections with references
- **Denormalize for read patterns**: Optimize for how data is read; duplicate a field when that improves query performance and the copies can tolerate being briefly out of sync
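
One way to combine these guidelines is to keep a small, capped slice of child data embedded for the common read and store the full set in its own collection. The sketch below is an in-memory illustration only; the `Post` and `Comment` shapes, the `addComment` helper, and the `RECENT_LIMIT` cap are hypothetical names chosen for this example, not part of any database API.

```typescript
// Hypothetical document shapes; field names are illustrative only.
interface Comment {
  id: string;
  postId: string; // reference back to the parent post
  text: string;
  createdAt: number;
}

interface Post {
  id: string;
  title: string;
  recentComments: Comment[]; // embedded, but capped so the array stays bounded
  commentCount: number;
}

const RECENT_LIMIT = 3; // assumed cap; tune it to the read pattern

// Returns the updated post plus the comment to store in a separate collection.
function addComment(post: Post, comment: Comment): { post: Post; stored: Comment } {
  // Newest first, truncated to the cap.
  const recentComments = [comment, ...post.recentComments].slice(0, RECENT_LIMIT);
  return {
    post: { ...post, recentComments, commentCount: post.commentCount + 1 },
    stored: comment,
  };
}
```

The post document never grows past `RECENT_LIMIT` embedded comments, while the complete history stays queryable through the `postId` reference.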

### Query Patterns
- **Index every query path**: Queries without indexes cause full collection scans; at scale, indexed queries are orders of magnitude faster
- **Project only needed fields**: Reduce network and memory by projecting only required fields (`projection` in MongoDB, selective fields in Convex)
- **Paginate large result sets**: Never `.collect()` or `.find()` without limits when result sets can be large (e.g. >1000 documents)
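
Pagination is often done with a cursor (keyset) rather than an offset: each page asks for rows after the last key seen. The sketch below shows the idea over an in-memory array; `pageAfter` and `Page` are hypothetical names, and it assumes rows are already sorted ascending by a unique `createdAt`. Against a real store the same shape would typically be a range filter on the sort key plus a limit.

```typescript
interface Page<T> {
  items: T[];
  nextCursor: number | null; // null means there is nothing more to fetch
}

// Keyset pagination over rows sorted ascending by a unique numeric key.
function pageAfter<T extends { createdAt: number }>(
  sorted: T[],
  cursor: number | null,
  limit: number,
): Page<T> {
  // Only rows strictly after the cursor are candidates for this page.
  const candidates = cursor === null ? sorted : sorted.filter((row) => row.createdAt > cursor);
  const items = candidates.slice(0, limit);
  const last = items[items.length - 1];
  // A full page may have a successor; a short page is the final one.
  const nextCursor = items.length === limit && last !== undefined ? last.createdAt : null;
  return { items, nextCursor };
}
```

When the remaining rows exactly fill a page, the caller gets one extra empty page with a `null` cursor; that trade-off keeps the sketch free of a look-ahead read.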

### Consistency
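- **Understand read-your-writes**: Some document stores serve eventually consistent reads by default or from replicas (for example, DynamoDB's default reads and MongoDB reads from secondaries); choose a stronger read setting, such as a MongoDB read concern or a DynamoDB strongly consistent read, when a read must reflect the latest write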
- **Design for idempotency**: Retries and eventual consistency make duplicate operations possible; design mutations to be idempotent
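
A common way to make a mutation idempotent is to attach a client-supplied key to each request and record the outcome under that key, so a retry returns the recorded result instead of applying the change twice. The following is a minimal in-memory sketch; `createLedger`, `credit`, and the key format are hypothetical. In a real database the key would usually be enforced by a unique index or a conditional write rather than a `Map`.

```typescript
interface Ledger {
  credit(idempotencyKey: string, amount: number): number;
  balance(): number;
}

function createLedger(openingBalance: number): Ledger {
  let balance = openingBalance;
  // idempotency key -> balance that resulted from the first application
  const applied = new Map<string, number>();
  return {
    credit(idempotencyKey, amount) {
      const earlier = applied.get(idempotencyKey);
      if (earlier !== undefined) return earlier; // retry: do not apply again
      balance += amount;
      applied.set(idempotencyKey, balance);
      return balance;
    },
    balance: () => balance,
  };
}
```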

---

## MongoDB

### Index Types and When to Use

| Type | Use Case | Example |
|------|----------|---------|
| **Single-field** | Equality, sort on one field | `{ userId: 1 }` |
| **Compound** | Multi-field queries; order matters | `{ channel: 1, createdAt: -1 }` |
| **Multikey** | Arrays (one index entry per array element) | `{ tags: 1 }` |
| **Text** | Full-text search | `{ content: "text" }` |
| **Geospatial** | Location queries | `2dsphere`, `2d` |

### Index Rules

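1. **Index fields used in query filters and sorts** to avoid full collection scans; include projected fields in the index only when you want a covered query
2. **Compound index order**: equality → sort → range (the ESR guideline); among equality fields, put the most selective first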
   ```javascript
   // Good for db.collection.find({ channel: "x" }).sort({ createdAt: -1 })
   db.collection.createIndex({ channel: 1, createdAt: -1 });
   ```
3. **Covered queries**: When the query filter and projection use only indexed fields (and the projection excludes `_id` unless it is part of the index), MongoDB can answer from the index alone with no document fetch
4. **Avoid low-selectivity operators**: `$nin`, `$ne`, `$exists: false` often match large portions of the index
5. **Limit indexes per collection**: MongoDB allows at most 64 indexes per collection, and each index adds write cost, so measure before adding one
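
The equality, sort, range ordering in rule 2 can be made mechanical. The helper below is a small sketch that assembles a compound index key document in that order; `esrIndexKeys` and `EsrFields` are hypothetical names, and the result is a plain object of the shape you would pass to `createIndex`. It relies on JavaScript preserving insertion order for non-numeric string keys.

```typescript
type Direction = 1 | -1;

interface EsrFields {
  equality: string[];               // fields matched with exact values
  sort: Array<[string, Direction]>; // fields used in the sort, with direction
  range: string[];                  // fields matched with $gt, $lt, and similar
}

// Builds index keys in equality -> sort -> range order, skipping duplicates.
function esrIndexKeys(fields: EsrFields): Record<string, Direction> {
  const keys: Record<string, Direction> = {};
  for (const field of fields.equality) keys[field] = 1;
  for (const [field, direction] of fields.sort) {
    if (!(field in keys)) keys[field] = direction;
  }
  for (const field of fields.range) {
    if (!(field in keys)) keys[field] = 1;
  }
  return keys;
}
```

For a query that filters on `channel`, sorts by `createdAt` descending, and range-filters on a hypothetical `score` field, the helper yields keys in the order `channel`, `createdAt`, `score`.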

### Aggregation Pipeline Optimization
- Put `$match` as early as possible so it can use an index and shrink the working set; use `$project` to drop fields that later stages do not need
- Use `$indexStats` and `$queryStats` to analyze query patterns and index usage
- Prefer `$lookup` with `pipeline` and `let` for complex joins; avoid unbounded `$lookup` on large collections
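
As an illustration of these points, the sketch below builds a pipeline as plain stage objects: it filters first, bounds the result before joining, and uses `$lookup` with `let` and a sub-pipeline so the join returns only the needed field. The function name `recentMessagesPipeline`, the `users` collection, and the `name` field are assumptions for the example; the resulting array is what you would hand to `aggregate()`.

```typescript
type Stage = Record<string, unknown>;

function recentMessagesPipeline(channel: string, limit: number): Stage[] {
  return [
    { $match: { channel } },      // filter first so an index on channel can be used
    { $sort: { createdAt: -1 } },
    { $limit: limit },            // bound the set before the join runs
    {
      $lookup: {
        from: "users",            // assumed collection name
        let: { uid: "$userId" },
        pipeline: [
          { $match: { $expr: { $eq: ["$_id", "$$uid"] } } },
          { $project: { name: 1 } }, // return only what the caller needs
        ],
        as: "author",
      },
    },
    { $project: { text: 1, createdAt: 1, author: 1 } },
  ];
}
```

Because `$limit` precedes `$lookup`, the join runs at most `limit` times regardless of how large the channel is.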

### Explain and Profiling
```javascript
db.collection.find({ userId: "x" }).explain("executionStats");
// Check: stage "IXSCAN" (index scan) vs "COLLSCAN" (full scan) in queryPlanner.winningPlan
// Review executionStats: totalKeysExamined, totalDocsExamined, nReturned, executionTimeMillis
```
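
Checking for `COLLSCAN` can be automated in a test or CI step by walking the winning plan. The sketch below assumes the classic plan shape, where each node has a `stage` and optional `inputStage` or `inputStages` children; the exact explain layout varies by server version and query engine, so treat `PlanNode`, `collectStages`, and `hasCollectionScan` as hypothetical helpers to adapt.

```typescript
// Assumed minimal shape of a plan node; real explain output has more fields.
interface PlanNode {
  stage: string;
  inputStage?: PlanNode;
  inputStages?: PlanNode[];
}

// Depth-first list of every stage name in the plan tree.
function collectStages(node: PlanNode): string[] {
  const stages = [node.stage];
  const children = node.inputStages ?? (node.inputStage ? [node.inputStage] : []);
  for (const child of children) stages.push(...collectStages(child));
  return stages;
}

function hasCollectionScan(winningPlan: PlanNode): boolean {
  return collectStages(winningPlan).indexOf("COLLSCAN") !== -1;
}
```

A hand-written plan such as `{ stage: "FETCH", inputStage: { stage: "IXSCAN" } }` passes the check, while any tree containing a `COLLSCAN` node fails it.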

### Security
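- Build queries as structured objects from validated, typed values; never concatenate user input into query strings or pass unvalidated request objects as filters (they can carry operators such as `$ne` or `$gt`)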
- Apply principle of least privilege for database users
- Validate and sanitize `$where` and aggregation `$function` inputs
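
One way to apply the first point is to coerce each user-supplied value to the expected primitive type and to reject any object that carries operator-style keys before it reaches a filter. The helpers below are a sketch under that assumption; `userFilter` and `assertNoOperators` are hypothetical names, and rejecting keys that contain a dot is a deliberately conservative choice.

```typescript
// Throws if any key in the value (at any depth) looks like an operator or a dotted path.
function assertNoOperators(value: unknown, path = "input"): void {
  if (Array.isArray(value)) {
    value.forEach((item, index) => assertNoOperators(item, `${path}[${index}]`));
    return;
  }
  if (value !== null && typeof value === "object") {
    for (const [key, child] of Object.entries(value as Record<string, unknown>)) {
      if (key.startsWith("$") || key.includes(".")) {
        throw new Error(`Disallowed key "${key}" at ${path}`);
      }
      assertNoOperators(child, `${path}.${key}`);
    }
  }
}

// Builds a filter from a raw request value, accepting only a plain string.
function userFilter(rawUserId: unknown): { userId: string } {
  if (typeof rawUserId !== "string") {
    throw new Error("userId must be a string");
  }
  return { userId: rawUserId };
}
```

With this in place, a request body such as `{ "userId": { "$ne": null } }` is rejected instead of turning into a filter that matches every document.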

---

## Convex

### Schema and Indexes

Indexes are defined in the schema; every query should use an index via `.withIndex()`:

```typescript
// schema.ts
import { defineSchema, defineTable } from "convex/server";
import { v } from "convex/values";

export default defineSchema({
  messages: defineTable({
    channel: v.string(),
    userId: v.id("users"),
    text: v.string(),
    createdAt: v.number(),
  })
    .index("by_channel", ["channel"])
    .index("by_channel_created", ["channel", "createdAt"])
    .index("by_user", ["userId"]),
});
```

### Query Best Practices

1. **Use `.withIndex()` instead of `.filter()`**: Index-based queries are efficient; `.filter()` scans the table
   ```typescript
   // Good: uses index
   const messages = await ctx.db.query("messages").withIndex("by_channel", q => q.eq("channel", channelId)).collect();
   // Avoid: full table scan (renamed so both examples can sit in one scope)
   const scannedMessages = await ctx.db.query("messages").filter(q => q.eq(q.field("channel"), channelId)).collect();
   ```

2. **Use `.withSearchIndex()` for full-text**: When you need search, define and use search indexes

3. **Paginate with `.paginate()`**: For large result sets, use pagination; avoid `.collect()` on unbounded queries

4. **Staged indexes for large tables**: When adding indexes to tables with substantial data, use staged indexes to avoid slow backfill during deployment

### Index Removal
- Confirm that no query still uses an index before removing it from the schema; deploying a schema that no longer defines an index deletes that index

---

## Other Document Stores (Firestore, DynamoDB, etc.)

### Firestore
- Composite indexes for multi-field queries; define in Firebase Console or `firestore.indexes.json`
- Batch reads with `getAll()` (available in the server SDKs) to reduce round trips
- Use `limit()` and `startAfter()` for pagination

### DynamoDB
- Design for single-table access patterns; partition key + sort key define access
- Use GSIs (Global Secondary Indexes) for alternate query patterns
- Avoid scans; use `Query` with key conditions
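
In a single-table design, the access patterns are encoded in how the partition and sort key strings are built. The sketch below shows one widely used convention with type prefixes; the attribute names `PK` and `SK`, the `USER#` and `ORDER#` prefixes, and the helper names are all assumptions for illustration. The query parameters use `begins_with` on the sort key, written with plain values as a document-style client would accept.

```typescript
interface Key {
  PK: string;
  SK: string;
}

// One partition per user; the sort key distinguishes item types.
const userProfileKey = (userId: string): Key => ({
  PK: `USER#${userId}`,
  SK: "PROFILE",
});

// ISO dates sort lexicographically, so orders come back in date order.
const userOrderKey = (userId: string, isoDate: string, orderId: string): Key => ({
  PK: `USER#${userId}`,
  SK: `ORDER#${isoDate}#${orderId}`,
});

// Parameters for "all orders of one user in a given month" as a Query, not a Scan.
function ordersInMonth(userId: string, yearMonth: string) {
  return {
    KeyConditionExpression: "PK = :pk AND begins_with(SK, :prefix)",
    ExpressionAttributeValues: {
      ":pk": `USER#${userId}`,
      ":prefix": `ORDER#${yearMonth}`,
    },
  };
}
```

Because the profile and the orders share a partition key, one `Query` on `PK` alone can also fetch a user together with their orders.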

---

## Performance Checklist

- [ ] Every query path has a supporting index
- [ ] No full collection/table scans in hot paths (verify with explain/profiler)
- [ ] Projections limit returned fields
- [ ] Large result sets use pagination
- [ ] Unbounded arrays avoided in document design
- [ ] Read/write patterns inform embedding vs referencing
- [ ] Mutations are idempotent where retries are possible
