data-model-design-patterns

Use when designing, reviewing, or troubleshooting Salesforce object relationships and field type choices — lookup vs master-detail, junction object modeling, indexing strategy, and data model anti-patterns. NOT for object creation steps (use object-creation-and-design). NOT for bulk data loading operations.

Best use case

data-model-design-patterns is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Teams using data-model-design-patterns should expect more consistent output, faster repeated execution, and less prompt rewriting.

When to use this skill

  • You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

  • You only need a quick one-off answer and do not need a reusable workflow.
  • You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$ curl -o ~/.claude/skills/data-model-design-patterns/SKILL.md --create-dirs "https://raw.githubusercontent.com/PranavNagrecha/AwesomeSalesforceSkills/main/skills/data/data-model-design-patterns/SKILL.md"

Manual Installation

  1. Download SKILL.md from GitHub
  2. Place it in .claude/skills/data-model-design-patterns/SKILL.md inside your project
  3. Restart your AI agent — it will auto-discover the skill

SKILL.md Source

# Data Model Design Patterns

This skill activates when a practitioner needs to choose or review Salesforce object relationship types, field types, or indexing strategy. It covers the full decision surface from green-field modeling through production performance troubleshooting.

---

## Before Starting

Gather this context before working on anything in this domain:

| Context | What to confirm |
|---|---|
| Object list and volumes | Which objects are involved? Record counts now, in 1 year, in 3 years? |
| Query patterns | What fields appear in WHERE, ORDER BY, or GROUP BY in the most frequent SOQL queries? |
| Relationship requirements | Is cascade delete acceptable? Are rollup summaries needed? Can a child exist without a parent? |
| Integration requirements | Does any external system upsert records? If so, what is the natural key? |
| Common wrong assumption | Lookup relationships do NOT support rollup summary fields; only master-detail children do. |
| Key limits in play | 2 master-detail per child object; 40 lookup per object; 25 external ID fields per object; custom indexes via Salesforce Support case (not self-serve). |

---

## Core Concepts

### Lookup Relationships

A lookup relationship is a loosely coupled reference from a child object to a parent object. By default the child record can exist without a parent: the lookup field is optional unless you mark it required in the field definition or on the page layout. Deleting the parent does not automatically delete child records; by default the lookup field is simply cleared, or you can configure the field to restrict the deletion or cascade it to the child. Lookups do not support rollup summary fields on the parent.

Each object supports up to 40 lookup relationship fields. Standard relationship fields (OwnerId, CreatedById, LastModifiedById, RecordTypeId) are indexed by default and do not count toward the custom lookup limit.

**Use lookup when:** The child record has independent business meaning, the parent may be deleted without removing children, or you need more than 2 parent references on the same child.

### Master-Detail Relationships (MDR)

A master-detail relationship is a tightly coupled parent-child relationship. The parent field on the child is required, so a child record cannot exist without a parent. Deleting the parent cascade-deletes all detail records; they move to the Recycle Bin with the master and come back only if the master is undeleted before the bin is emptied or the records expire. Master-detail relationships enable rollup summary fields (COUNT, SUM, MIN, MAX) on the parent object.

Each child object supports a maximum of **2** master-detail relationships. You cannot convert a lookup to master-detail if records already exist with a blank parent field (all existing children must have a populated parent). MDR fields are indexed automatically.

**Cascade delete chain risk**: If Object A is the master of Object B, and Object B is the master of Object C, deleting an A record cascades through B into C. Chains deeper than two levels are a production risk — a single parent delete can remove thousands of grandchild records silently.

**Use master-detail when:** The child's entire existence depends on the parent (e.g., Order Line Item depends on Order), rollup summaries are required, and you are certain cascade delete is the desired behavior.
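
The cascade chain risk above can be checked mechanically. The following is a minimal sketch, assuming you have already extracted a mapping from each child object to its master objects; the object names, the `masters_of` structure, and the `max_levels` threshold are hypothetical and not part of the skill's own tooling.

```python
from typing import Dict, List


def cascade_depth(masters_of: Dict[str, List[str]], obj: str) -> int:
    """Return how many master-detail levels sit above `obj` (0 if it has no master).

    Assumes the relationship graph is acyclic.
    """
    parents = masters_of.get(obj, [])
    if not parents:
        return 0
    return 1 + max(cascade_depth(masters_of, p) for p in parents)


def deep_chains(masters_of: Dict[str, List[str]], max_levels: int = 2) -> List[str]:
    """List objects whose master-detail chain is deeper than `max_levels`."""
    return [o for o in sorted(masters_of) if cascade_depth(masters_of, o) > max_levels]


# Hypothetical model: D is a detail of C, C of B, B of A.
masters_of = {"B__c": ["A__c"], "C__c": ["B__c"], "D__c": ["C__c"]}
print(deep_chains(masters_of))
```

Treating "deeper than two levels" as more than two master-detail hops is one reading of the rule; lower `max_levels` if your team wants to flag the A, B, C chain as well.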

### Many-to-Many: Junction Objects

Salesforce does not support native many-to-many relationships. The standard pattern is a **junction object**: a custom object with two master-detail relationships, one to each side of the many-to-many. This gives you:

- Rollup summaries on both parent objects (if both sides are MDR)
- Cascade delete: deleting either parent deletes the junction records
- A place to store attributes about the relationship itself (e.g., a role, a quantity, a date)

If you model the junction with two **lookup** fields instead of two MDR fields, you lose rollup summary capability on both sides and the junction records survive even after one parent is deleted — which creates orphan records and referential integrity problems.

### Field Type Selection for Data Integrity

Choosing the wrong field type loses platform-enforced validation, formatting, and UI affordances:

| Data | Correct Type | Wrong Type | What You Lose |
|---|---|---|---|
| Phone number | Phone | Text(255) | Click-to-dial, mobile formatting |
| Email address | Email | Text(255) | Click-to-email, email validation |
| Percentage | Percent | Number | Automatic % display, decimal semantics |
| Currency amounts | Currency | Number | Multi-currency support, locale formatting |
| Long unformatted text | Long Text Area | Text(255) | Capacity: Text fields hold at most 255 characters |
| Structured rich content | Rich Text Area | Long Text Area | Formatting, inline images |
| Unique record key | External ID (+ Unique) | Text | Indexed by default, available for upsert |
| True/False | Checkbox | Picklist | Compact storage, formula compatibility |

### Indexing Strategy

Salesforce automatically indexes a small set of standard fields on every object: `Id`, `Name`, `OwnerId`, `CreatedDate`, `SystemModstamp`, `RecordTypeId`, and all standard and custom relationship fields.

**Custom indexes** extend selective filtering to other fields. They are not self-serve: you request them via a Salesforce Support case. For a custom index to be created and actually used by the query optimizer:
- The field must not be a non-deterministic formula field
- The field must not be an encrypted field
- The field must not be a multi-select picklist
- The query filter must be selective (typically, the filter must match fewer than 10% of total records, or fewer than 333,000 records for large objects)
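
As a rough illustration of the selectivity rule in the last bullet, the sketch below treats the threshold as the smaller of 10% of the object's rows and 333,000 rows. That combination is an interpretation of the rule as stated here, and the function name and defaults are hypothetical; the optimizer's real behavior depends on more than these two numbers.

```python
def is_selective(matching_rows: int, total_rows: int,
                 pct_threshold: float = 0.10, row_cap: int = 333_000) -> bool:
    """Estimate whether a filter is selective enough for a custom index to help.

    Assumption: the filter must match fewer rows than
    min(pct_threshold * total_rows, row_cap).
    """
    if total_rows <= 0:
        return False
    limit = min(total_rows * pct_threshold, row_cap)
    return matching_rows < limit


# 40k of 500k rows: under the 10% mark (50k), so likely selective.
print(is_selective(40_000, 500_000))
# 400k of 5M rows: under 10% (500k) but over the 333k cap, so not selective.
print(is_selective(400_000, 5_000_000))
```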

**Skinny tables** are a Salesforce-managed performance optimization for large objects with frequent queries on a specific set of fields. A skinny table is a denormalized projection of selected fields, maintained internally. You request skinny tables via Salesforce Support. They are most useful when a single object has millions of records and a small fixed set of non-indexed fields is repeatedly queried together.

---

## Common Patterns

### Pattern: Junction Object for Many-to-Many with MDR Fields

**When to use:** Two objects need a many-to-many relationship and you need rollup summaries or tight referential integrity.

**How it works:**
1. Create a custom junction object (e.g., `Contact_Campaign__c` to link Contact and Campaign).
2. Add a master-detail field to the first parent (e.g., `Contact__c` MDR to Contact).
3. Add a second master-detail field to the second parent (e.g., `Campaign__c` MDR to Campaign).
4. Add any relationship-attribute fields to the junction (e.g., `Role__c`, `Response_Date__c`).
5. Optionally add rollup summary fields on each parent to aggregate junction data.

**Why not two lookups:** Lookup-based junction objects cannot have rollup summary fields on either parent, and orphan junction records accumulate when parents are deleted.
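
To make the two-MDR versus two-lookup distinction concrete, here is a small sketch that classifies an object from its exported `CustomField` metadata XML. It is not the skill's `check_data_model.py`; the helper names, return labels, and sample fields are hypothetical, and the heuristic only looks at relationship field counts.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Namespace used by Salesforce Metadata API XML files.
NS = {"md": "http://soap.sforce.com/2006/04/metadata"}


def relationship_counts(field_xml_docs):
    """Count MasterDetail and Lookup fields across CustomField XML strings."""
    counts = Counter()
    for doc in field_xml_docs:
        field_type = ET.fromstring(doc).findtext("md:type", default="", namespaces=NS)
        if field_type in ("MasterDetail", "Lookup"):
            counts[field_type] += 1
    return counts


def classify_junction(field_xml_docs):
    """Heuristic label for a candidate junction object."""
    counts = relationship_counts(field_xml_docs)
    if counts["MasterDetail"] == 2:
        return "mdr-junction"
    if counts["MasterDetail"] == 0 and counts["Lookup"] == 2:
        return "lookup-junction"  # worth flagging for review
    return "not-a-simple-junction"


def sample_field(name, field_type, target):
    """Build a minimal CustomField document for the example below."""
    return (
        '<CustomField xmlns="http://soap.sforce.com/2006/04/metadata">'
        f"<fullName>{name}</fullName><type>{field_type}</type>"
        f"<referenceTo>{target}</referenceTo></CustomField>"
    )


mdr_fields = [sample_field("Contact__c", "MasterDetail", "Contact"),
              sample_field("Campaign__c", "MasterDetail", "Campaign")]
lookup_fields = [sample_field("Contact__c", "Lookup", "Contact"),
                 sample_field("Campaign__c", "Lookup", "Campaign")]
print(classify_junction(mdr_fields), classify_junction(lookup_fields))
```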

### Pattern: External ID for Integration Upsert

**When to use:** An external system needs to create or update Salesforce records without knowing the Salesforce `Id`, using a natural key from the source system.

**How it works:**
1. Create a custom field with type `Text` (or `Number`), mark it as `External ID` and `Unique`.
2. The field becomes indexed automatically — no Support case required.
3. Use the REST API `upsert` endpoint or Bulk API 2.0 upsert with the external ID field as the key.
4. Salesforce matches on the external ID value: if found, updates; if not found, inserts.

**Limit:** Each object supports up to 25 external ID fields. External ID fields are indexed by default; standard text fields are not.

**Why not use Name as the key:** Name is not guaranteed unique and is not marked as an external ID, so the upsert API will not match on it.
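
The upsert call in step 3 can be sketched as follows. This only builds the HTTP request (it does not send it) and assumes the REST API's upsert-by-external-ID resource, which is a `PATCH` to `sobjects/<Object>/<ExternalIdField>/<value>`. The instance URL, API version, field name `ERP_Id__c`, and token are placeholders invented for the example.

```python
import json
from urllib.parse import quote
from urllib.request import Request


def build_upsert_request(instance_url, api_version, sobject,
                         ext_id_field, ext_id_value, fields, access_token):
    """Build (but do not send) a REST upsert request keyed on an external ID."""
    path = (
        f"/services/data/v{api_version}/sobjects/"
        f"{sobject}/{ext_id_field}/{quote(str(ext_id_value), safe='')}"
    )
    return Request(
        url=instance_url.rstrip("/") + path,
        data=json.dumps(fields).encode("utf-8"),
        method="PATCH",  # the record is updated if the key matches, inserted otherwise
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
    )


# Hypothetical values throughout.
req = build_upsert_request(
    "https://example.my.salesforce.com", "60.0",
    "Account", "ERP_Id__c", "A-1001",
    {"Name": "Acme"}, "<access-token>",
)
print(req.get_method(), req.full_url)
```

Passing the request to `urllib.request.urlopen` would send it; authentication and error handling are left out of this sketch.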

### Pattern: Custom Index for Selective Filter on Non-Standard Field

**When to use:** A large object (500k+ records) is queried frequently with a WHERE filter on a field that is not a standard indexed field, and the query is timing out or performing full table scans.

**How it works:**
1. Confirm the filter is selective (the filtered result is less than ~10% of total records, or under 333k records).
2. Confirm the field is eligible: not a non-deterministic formula, encrypted, or multi-select picklist field.
3. Open a Salesforce Support case requesting a custom index on the specific field and object.
4. After the index is created, re-run the query and confirm the query plan shows index use (request a plan through the API's `explain` parameter or the Query Plan tool in Developer Console).

**Why not just add an external ID:** External ID fields are indexed, but they carry uniqueness semantics. Use custom index requests for non-unique filter fields.
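
For the verification in step 4, one option is to request a query plan and check which operation the cheapest plan leads with. The sketch below assumes the REST API query resource's `explain` parameter and a response containing a `plans` list with `leadingOperationType` and `relativeCost` keys. The sample payload is illustrative rather than captured from a real org, and the URL values and field names are placeholders.

```python
from urllib.parse import quote


def explain_url(instance_url, api_version, soql):
    """Build the URL that asks for a query plan instead of query results."""
    return (
        f"{instance_url.rstrip('/')}/services/data/v{api_version}"
        f"/query/?explain={quote(soql, safe='')}"
    )


def cheapest_plan_uses_index(explain_response):
    """Return True if the lowest-cost plan leads with an index operation."""
    plans = explain_response.get("plans", [])
    if not plans:
        return False
    best = min(plans, key=lambda plan: plan["relativeCost"])
    return best["leadingOperationType"] == "Index"


# Illustrative payload shaped like an explain response (values are made up).
sample_response = {
    "plans": [
        {"leadingOperationType": "TableScan", "relativeCost": 1.3,
         "fields": [], "sobjectType": "Invoice__c"},
        {"leadingOperationType": "Index", "relativeCost": 0.2,
         "fields": ["Region_Code__c"], "sobjectType": "Invoice__c"},
    ]
}
print(explain_url("https://example.my.salesforce.com", "60.0",
                  "SELECT Id FROM Invoice__c WHERE Region_Code__c = 'EMEA'"))
print(cheapest_plan_uses_index(sample_response))
```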

---

## Decision Guidance

| Situation | Recommended Approach | Reason |
|---|---|---|
| Child record must not exist without a parent | Master-Detail | Parent field is required; enforce referential integrity at the platform level |
| Rollup summary needed on parent | Master-Detail | Rollup summary fields only work on MDR parent objects |
| Child record has independent meaning; parent may be deleted | Lookup | Child survives parent deletion; parent field is optional |
| Many-to-many relationship required | Junction object with two MDR fields | Enables rollup summaries and cascade delete on both sides |
| Many-to-many but rollup not needed and orphan records are acceptable | Junction object with two Lookups | Looser coupling; junction survives parent deletion |
| Phone or email data to display in UI | Phone / Email field type | Platform formatting, click-to-dial/email, mobile display |
| Integration upsert with a source-system natural key | External ID field (Unique) | Automatically indexed; supported by upsert API endpoints |
| Large object with slow non-indexed filter queries | Custom index (via Support case) | Selective index on the filter field reduces full-table scans |
| Very large object, small set of repeatedly queried fields | Skinny table (via Support) | Denormalized projection eliminates wide-row scans |
| Need to store more than 255 characters | Long Text Area | Text fields are capped at 255 characters |
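
The relationship rows of the table above can be read as a small decision function. This is a simplified sketch with hypothetical parameter names; it encodes only the four relationship rows and ignores volume, sharing, and other factors a real review would weigh.

```python
def recommend_relationship(child_requires_parent: bool,
                           needs_rollup: bool,
                           many_to_many: bool = False,
                           orphans_acceptable: bool = False) -> str:
    """Map the relationship rows of the decision table to a recommendation."""
    if many_to_many:
        if not needs_rollup and orphans_acceptable:
            return "Junction object with two Lookups"
        return "Junction object with two MDR fields"
    if child_requires_parent or needs_rollup:
        return "Master-Detail"
    return "Lookup"


# An order line that cannot exist without its order and feeds a rollup total.
print(recommend_relationship(child_requires_parent=True, needs_rollup=True))
# A loosely related record that should survive deletion of its parent.
print(recommend_relationship(child_requires_parent=False, needs_rollup=False))
```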

---

## Mode 1: Build from Scratch

Use this mode when modeling a new set of objects or relationships before any data exists.

1. **Map entities and cardinality**: For each entity pair, determine whether the relationship is one-to-one, one-to-many, or many-to-many.
2. **Select relationship type**: Apply the Decision Guidance table above. Default to lookup; choose master-detail only when rollup summaries are needed, or when you are certain the child's existence fully depends on the parent and cascade delete is acceptable.
3. **Select field types**: For each attribute, choose the most semantically specific field type (Phone over Text, Currency over Number, etc.).
4. **Plan external IDs**: For every object that an external system will write to, identify the natural key and create an External ID field.
5. **Identify index candidates early**: List fields likely to appear in WHERE clauses. Standard fields are already indexed. Flag any custom fields for a future Support case if the object is expected to grow past 500k records.
6. **Document the model**: Fill in the `data-model-design-patterns-template.md` with the decisions made and their rationale.

## Mode 2: Review an Existing Model

Use this mode when auditing an org's data model for quality or before a major integration or migration project.

1. Run `scripts/check_data_model.py` against the exported metadata to detect structural anti-patterns automatically.
2. Check all junction objects: do they use two MDR fields or two lookups? Flag lookup-based junctions.
3. Check for Text fields storing phone or email data by inspecting field labels and common values.
4. Check external ID coverage: does each object integrated with an external system have an External ID field?
5. Check for MDR chains deeper than 2 levels — these create hidden cascade delete risk.
6. Check the SOQL query log (from Debug Logs or Event Monitoring) for queries performing full table scans on large objects. Cross-reference against available indexes.
7. Fill the Review Checklist below and report findings using the template.
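
Step 3 of this review can be partly automated with a label heuristic. The sketch below assumes you have a list of `(api_name, label, type)` tuples from the exported metadata; the keyword patterns and sample fields are hypothetical, and sampling actual field values would still be needed to confirm each finding.

```python
import re

# Hypothetical keyword patterns that hint at a more specific field type.
SUSPECT_PATTERNS = {
    "Phone": re.compile(r"phone|mobile", re.IGNORECASE),
    "Email": re.compile(r"e-?mail", re.IGNORECASE),
}


def suggest_field_types(fields):
    """Return (api_name, suggested_type) for Text fields that look mis-typed.

    `fields` is an iterable of (api_name, label, field_type) tuples.
    """
    findings = []
    for api_name, label, field_type in fields:
        if field_type != "Text":
            continue  # already a specific type, or not a candidate
        for suggested, pattern in SUSPECT_PATTERNS.items():
            if pattern.search(label) or pattern.search(api_name):
                findings.append((api_name, suggested))
                break
    return findings


sample_fields = [
    ("Alt_Phone__c", "Alternate Phone", "Text"),
    ("Billing_Email__c", "Billing E-mail", "Text"),
    ("Notes__c", "Notes", "Text"),
    ("Work_Phone__c", "Work Phone", "Phone"),
]
print(suggest_field_types(sample_fields))
```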

## Mode 3: Troubleshoot

Use this mode when a production issue traces back to the data model.

| Symptom | Likely cause / next step |
|---|---|
| Unexpected record deletion | Check for MDR cascade delete chains. A deleted grandparent may have silently removed grandchildren. |
| Rollup summary not available | Relationship is likely a lookup, not a master-detail. Rollups require master-detail. |
| SOQL timeout or governor limit on large object | Check query execution plan. If full table scan, identify filter field and request a custom index if eligible. |
| Upsert matching not working | Confirm the target field is marked External ID. Upsert won't match on non-External-ID fields. |
| Missing data after parent delete | Check the lookup's delete behavior: "Clear" blanks the child's parent field, while "Cascade" removes the child record. Confirm the expected behavior and update the field. |
| Text field values getting truncated | Field is likely Text(255). Migrate to Long Text Area. |

---


## Recommended Workflow

Step-by-step instructions for an AI agent or practitioner activating this skill:

1. Gather context — confirm the org edition, relevant objects, and current configuration state
2. Review official sources — check the references in this skill's well-architected.md before making changes
3. Implement or advise — apply the patterns from Core Concepts and Common Patterns sections above
4. Validate — run the skill's checker script and verify against the Review Checklist below
5. Document — record any deviations from standard patterns and update the template if needed

---

## Review Checklist

Run through these before marking data model work complete:

- [ ] All many-to-many relationships are modeled with junction objects using two MDR fields (not two lookups) unless rollup summaries are explicitly not required
- [ ] No MDR chain is deeper than 2 levels (master → detail → detail-of-detail)
- [ ] All objects integrated with external systems have at least one External ID field (Unique)
- [ ] Phone, Email, Currency, and Percent data uses the correct platform field type (not Text)
- [ ] Fields expected to appear in large-volume WHERE clauses are either standard-indexed or have a custom index request filed
- [ ] Large objects (500k+ records) with narrow repeated query patterns have a skinny table request evaluated
- [ ] The data model template is filled and reviewed with the stakeholder

---

## Salesforce-Specific Gotchas

Non-obvious platform behaviors that cause real production problems:

1. **MDR cascade delete is silent and far-reaching**: Deleting a master record immediately deletes all detail records, including any detail-of-detail records two levels down, with no per-record confirmation. Cascade-deleted details go to the Recycle Bin with the master and are recovered by undeleting the master; once the bin is emptied or the records expire, they are gone. A single accidental parent delete can wipe thousands of related records.

2. **Converting lookup to master-detail requires all children to have a parent** — You cannot convert an existing lookup field to master-detail if any child records have a null parent field. You must first populate the parent field on all records, then perform the conversion — which may require a data migration step on large datasets.

3. **Lookup filter conditions add to query cost, not reduce it** — Lookup filters restrict which records appear in the lookup search but do not create a database index. If the filter references a non-indexed field on the parent object, the lookup search itself may perform slowly at scale.

4. **External ID limit is 25 per object, but only 3 are indexed in older API versions** — While the current platform allows up to 25 external ID fields per object (all indexed), integrations using older API versions (pre-Spring '16) may only respect the first 3. Always verify API version compatibility when adding more than 3 external ID fields.

5. **Skinny tables are not updated in real time for all operations** — Skinny tables can lag during bulk data loads. If you run a report or query immediately after a large Bulk API job, the skinny table projection may not reflect the freshest data until the background refresh completes.

6. **Junction object with two lookups loses rollup summary on both sides**: This is the most common many-to-many modeling mistake. Once a junction is built with lookup fields and populated with data, converting to MDR requires every junction record to have both parent fields populated first, which can mean a non-trivial data cleanup or migration.

---

## Output Artifacts

| Artifact | Description |
|---|---|
| `data-model-design-patterns-template.md` | Fill-in-the-blank output document capturing relationship decisions, field type choices, indexing plan, and anti-pattern findings |
| `check_data_model.py` | stdlib Python checker that validates junction object MDR structure in exported Salesforce metadata |

---

## Related Skills

- `object-creation-and-design` — use for step-by-step object creation and page layout setup; this skill covers design decisions, not UI creation steps
- `large-data-volumes` — use when the primary concern is bulk data loading, archival, or Bulk API throughput at scale
- `soql-query-optimization` — use when the focus is query tuning, EXPLAIN plan analysis, or governor limit debugging on existing queries
