mongodb-schema-design

Master MongoDB schema design and data modeling patterns. Learn embedding vs referencing, relationships, normalization, and schema evolution. Use when designing databases, normalizing data, or optimizing queries.

16 stars

bydiegosouzapw

View on GitHub Installation ↓

Best use case

mongodb-schema-design is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Teams using mongodb-schema-design should expect a more consistent output, faster repeated execution, less prompt rewriting.

When to use this skill

You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

You only need a quick one-off answer and do not need a reusable workflow.
You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$curl -o ~/.claude/skills/mongodb-schema-design/SKILL.md --create-dirs "https://raw.githubusercontent.com/diegosouzapw/awesome-omni-skill/main/skills/data-ai/mongodb-schema-design/SKILL.md"

Manual Installation

Download SKILL.md from GitHub
Place it in .claude/skills/mongodb-schema-design/SKILL.md inside your project
Restart your AI agent — it will auto-discover the skill

How mongodb-schema-design Compares

Feature / Agent	mongodb-schema-design	Standard Approach
Platform Support	Not specified	Limited / Varies
Context Awareness	High	Baseline
Installation Complexity	Unknown	N/A

Frequently Asked Questions

What does this skill do?

Where can I find the source code?

You can find the source code on GitHub using the link provided at the top of the page.

SKILL.md Source

# MongoDB Schema Design

Master data modeling and schema patterns.

## Quick Start

### One-to-One: Embedded
```javascript
// User with single address - embed if always accessed together
{
  _id: ObjectId('...'),
  name: 'John',
  email: 'john@example.com',
  address: {
    street: '123 Main St',
    city: 'New York',
    zip: '10001'
  }
}
```

### One-to-Many: Embed Array
```javascript
// User with multiple tags - embed if limited size
{
  _id: ObjectId('...'),
  name: 'John',
  tags: ['mongodb', 'database', 'nosql'],
  posts: [
    { _id: 1, title: 'Post 1', content: '...' },
    { _id: 2, title: 'Post 2', content: '...' }
  ]
}
```

### One-to-Many: Reference
```javascript
// User with many orders - reference if potentially large
{
  _id: ObjectId('user1'),
  name: 'John',
  email: 'john@example.com'
}

// Orders collection
{
  _id: ObjectId('order1'),
  customerId: ObjectId('user1'),
  total: 99.99
}
```

### Many-to-Many: Array of References
```javascript
// Products with categories
{
  _id: ObjectId('product1'),
  name: 'Laptop',
  categoryIds: [
    ObjectId('electronics'),
    ObjectId('computers')
  ]
}

// Categories collection
{
  _id: ObjectId('electronics'),
  name: 'Electronics'
}
```

## Schema Patterns

### Attribute Pattern
```javascript
// Store variant attributes flexibly
{
  _id: ObjectId('...'),
  productName: 'T-Shirt',
  attributes: [
    { key: 'color', value: 'blue' },
    { key: 'size', value: 'L' },
    { key: 'material', value: 'cotton' }
  ]
}
```

### Polymorphic Pattern
```javascript
// Different document types in same collection
{
  _id: ObjectId('...'),
  type: 'email',
  to: 'user@example.com',
  subject: 'Hello'
}

{
  _id: ObjectId('...'),
  type: 'sms',
  phoneNumber: '+1234567890',
  message: 'Hi there'
}
```

### Tree Structures: Adjacency List
```javascript
// Parent-child relationships
{
  _id: ObjectId('...'),
  name: 'Electronics',
  parent: null
}

{
  _id: ObjectId('...'),
  name: 'Computers',
  parent: ObjectId('electronics')
}
```

### Versioned Pattern
```javascript
// Track document history
{
  _id: ObjectId('...'),
  name: 'Product',
  description: 'Latest description',
  versions: [
    { v: 1, name: 'Product', description: 'Original', date: ISODate(...) },
    { v: 2, name: 'Product', description: 'Updated', date: ISODate(...) }
  ]
}
```

## Design Principles

### Embedding Advantages
- Single query to fetch related data
- Atomic updates for related documents
- No joins needed

### Referencing Advantages
- Avoid data duplication
- Smaller documents
- Flexible relationships
- Can grow independently

### Decision Tree
```
Does the related data grow unbounded?
  YES → Use referencing
  NO → Consider embedding

Is the related data frequently accessed separately?
  YES → Use referencing
  NO → Consider embedding

Do updates need to be atomic across documents?
  YES → Use embedding
  NO → Use referencing
```

## Python Design Example

```python
# User with embedded address
users.insert_one({
    'name': 'John',
    'email': 'john@example.com',
    'address': {
        'street': '123 Main St',
        'city': 'New York'
    }
})

# User with references to orders
users.insert_one({
    '_id': ObjectId('...'),
    'name': 'John'
})

orders.insert_one({
    'userId': ObjectId('...'),
    'total': 99.99
})

# Query with $lookup
users.aggregate([
    { '$lookup': {
        'from': 'orders',
        'localField': '_id',
        'foreignField': 'userId',
        'as': 'orders'
    }}
])
```

## Best Practices

✅ Embed when data is always accessed together
✅ Reference for unbounded arrays
✅ Keep document size under 16MB
✅ Consider query patterns when designing
✅ Denormalize carefully for performance
✅ Plan for schema evolution
✅ Use validation schemas
✅ Document your design decisions

Related Skills

mongodb-expert

from diegosouzapw/awesome-omni-skill

Expert-level MongoDB database design, aggregation pipelines, indexing, replication, and production operations

domain-driven-design

from diegosouzapw/awesome-omni-skill

Plan and route Domain-Driven Design work from strategic modeling to tactical implementation and evented architecture patterns.

data-designer

from diegosouzapw/awesome-omni-skill

Generate high-quality synthetic datasets using statistical samplers and Claude's native LLM capabilities. Use when users ask to create synthetic data, generate datasets, create fake/mock data, generate test data, training data, or any data generation task. Supports CSV, JSON, JSONL, Parquet output. Adapted from NVIDIA NeMo DataDesigner (Apache 2.0).

analytics-design

from diegosouzapw/awesome-omni-skill

Design data analysis from purpose clarification to visualization. Use when analyzing data, exploring BigQuery schemas, building queries, or creating Looker Studio reports.

---name: aav-vector-design-agent

from diegosouzapw/awesome-omni-skill

description: AI-powered adeno-associated virus (AAV) vector design for gene therapy including capsid engineering, promoter selection, and tropism optimization.

Schema Migration

from diegosouzapw/awesome-omni-skill

Create safe, zero-downtime schema migrations with rollback procedures

Schema Evolution Impact Analysis

from diegosouzapw/awesome-omni-skill

Analyze the impact of model/schema changes on downstream code — affected repositories, services, handlers, tests, and migration requirements

Schema Design

from diegosouzapw/awesome-omni-skill

Migration-ready database schema design with normalization and indexing strategies

mongodb_usage

from diegosouzapw/awesome-omni-skill

This skill should be used when user asks to "query MongoDB", "show database collections", "get collection schema", "list MongoDB databases", "search records in MongoDB", or "check database indexes".

mongodb

from diegosouzapw/awesome-omni-skill

MongoDB - NoSQL document database with flexible schema design, aggregation pipelines, indexing strategies, and Spring Data integration

database-design

from diegosouzapw/awesome-omni-skill

Database design principles and decision-making. Schema design, indexing strategy, ORM selection, serverless databases.

azure-mgmt-mongodbatlas-dotnet

from diegosouzapw/awesome-omni-skill

Manage MongoDB Atlas Organizations as Azure ARM resources using Azure.ResourceManager.MongoDBAtlas SDK. Use when creating, updating, listing, or deleting MongoDB Atlas organizations through Azure M...