data-cloud-grounding-for-agentforce

Use when grounding an Agentforce agent with Data Cloud retrievers, DMO selection, chunking, and freshness windows. Triggers: agent grounding, retriever, DMO, data graph, RAG, vector index, citations. Does NOT cover Data Cloud ingestion pipelines or Data Cloud identity resolution tuning.


by PranavNagrecha


Best use case

data-cloud-grounding-for-agentforce is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Teams using data-cloud-grounding-for-agentforce should expect more consistent output, faster repeated execution, and less prompt rewriting.

When to use this skill

You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

You only need a quick one-off answer and do not need a reusable workflow.
You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$ curl -o ~/.claude/skills/data-cloud-grounding-for-agentforce/SKILL.md --create-dirs "https://raw.githubusercontent.com/PranavNagrecha/AwesomeSalesforceSkills/main/skills/agentforce/data-cloud-grounding-for-agentforce/SKILL.md"

Manual Installation

1. Download SKILL.md from GitHub
2. Place it in .claude/skills/data-cloud-grounding-for-agentforce/SKILL.md inside your project
3. Restart your AI agent; it will auto-discover the skill

How data-cloud-grounding-for-agentforce Compares

Feature / Agent	data-cloud-grounding-for-agentforce	Standard Approach
Platform Support	Not specified	Limited / Varies
Context Awareness	High	Baseline
Installation Complexity	Unknown	N/A

Frequently Asked Questions

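What does this skill do?

It guides grounding an Agentforce agent with Data Cloud retrievers, covering DMO selection, chunking, and freshness windows. It does not cover Data Cloud ingestion pipelines or identity resolution tuning.
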
Where can I find the source code?

You can find the source code on GitHub using the link provided at the top of the page.

SKILL.md Source

# Data Cloud Grounding For Agentforce

## Purpose

Agentforce answers are only as good as the data they can reach. Grounding with
Data Cloud lets an agent retrieve context from unified customer profiles,
engagement events, knowledge articles, and structured or unstructured sources,
then cite them in the answer. Without a deliberate grounding design the agent
either hallucinates (too little context), over-retrieves (latency and cost
spike), or leaks data the calling user should not see (sharing ignored at the
retriever level).

This skill covers picking the right DMOs and data graphs, chunking and
filtering for relevance, enforcing field-level and record-level visibility at
query time, setting a freshness SLA that fits the use case, and returning
answers that cite their sources.

## Recommended Workflow

1. **List the questions the agent must answer.** Work backwards from real user
   utterances. If you cannot list 10 sample questions, grounding is premature.
2. **Map questions to DMOs and data graphs.** For each question, identify the
   DMO(s) and fields required. Promote gaps into Data Cloud ingestion work
   before wiring a retriever.
3. **Pick retriever type per question bucket.** Structured retriever for
   records (account, contact, case). Vector/unstructured retriever for
   Knowledge, call transcripts, documents. Hybrid when both are needed.
4. **Decide chunking.** For unstructured, chunk by semantic boundary (article
   section, call segment) not fixed token count when possible. Preserve a
   stable doc_id + section_id in metadata for citation.
5. **Enforce sharing at retrieval time.** Apply user-context filters so the
   retriever never returns rows the running user cannot see. Never rely on the
   LLM to redact.
6. **Set a freshness SLA.** State how stale data can be before the answer is
   wrong. Align Data Cloud refresh cadence to that SLA, not vice versa.
7. **Return citations.** Every grounded answer should include source doc_ids
   or record Ids the user can open.
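
Step 4 can be sketched as follows. This is a minimal illustration, not a Data
Cloud chunking API: it assumes articles arrive as Markdown-style text where
`## ` headings mark section boundaries. `Chunk`, `chunk_by_section`, and the
slug-based `section_id` format are hypothetical names chosen for the example.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    doc_id: str      # stable id of the source article (caller-supplied)
    section_id: str  # slug of the heading; stable while the heading is unchanged
    title: str
    text: str


def _slug(title: str) -> str:
    """Lowercase the heading and collapse non-alphanumerics into hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")


def chunk_by_section(doc_id: str, body: str) -> list[Chunk]:
    """Split an article on '## ' headings instead of a fixed token count."""
    chunks: list[Chunk] = []
    title = "Introduction"  # assumed label for text before the first heading
    lines: list[str] = []

    def flush() -> None:
        text = "\n".join(lines).strip()
        if text:  # skip sections with no body text
            chunks.append(Chunk(doc_id, _slug(title), title, text))

    for line in body.splitlines():
        if line.startswith("## "):
            flush()  # close the previous section before starting a new one
            title, lines = line[3:].strip(), []
        else:
            lines.append(line)
    flush()  # close the final section
    return chunks
```

Deriving `section_id` from the heading rather than from position keeps
citations valid when sections are reordered, at the cost of breaking them when
a heading is renamed.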

## Retriever Selection

| Question Type | Retriever | Notes |
|---|---|---|
| "What is this customer's status?" | Structured (DMO) | Filter by UnifiedIndividualId |
| "What did we tell the customer last?" | Structured (Engagement DMO) | Order by timestamp DESC limit 5 |
| "How do I handle policy X?" | Vector (Knowledge) | Chunk by section |
| "What does the transcript of the last call say?" | Vector + metadata filter | Filter by call_id |
| Blend ("account summary + last case note") | Hybrid | Two retrievers, ranked and fused |
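
The Hybrid row says results are "ranked and fused" without naming a method.
One common option is reciprocal rank fusion (RRF). The sketch below uses RRF as
an assumption, not as a description of what Data Cloud does internally. The
function name, the record ids, and the constant `rrf_k=60` (a conventional RRF
default) are illustrative.

```python
from collections import defaultdict


def reciprocal_rank_fusion(
    result_lists: list[list[str]], rrf_k: int = 60, top_n: int = 5
) -> list[str]:
    """Fuse ranked id lists: each id scores sum(1 / (rrf_k + rank))."""
    scores: dict[str, float] = defaultdict(float)
    for results in result_lists:
        for rank, item_id in enumerate(results, start=1):
            scores[item_id] += 1.0 / (rrf_k + rank)
    # Highest score first; ties broken by id so output is deterministic.
    ranked = sorted(scores, key=lambda item_id: (-scores[item_id], item_id))
    return ranked[:top_n]


# Hypothetical outputs of a structured and a vector retriever.
structured_hits = ["acct-1", "case-9"]
vector_hits = ["case-9", "kb-3"]
fused = reciprocal_rank_fusion([structured_hits, vector_hits])
```

RRF only needs ranks, so it avoids comparing raw scores from a structured
query against similarity scores from a vector index.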

## Grounding Strategy Per Topic

For each topic, classify each fact you want the agent to use:

- **Instructional (in topic prompt):** unchanging, short, domain rules.
- **Grounded (retriever):** account- or case-specific, volatile, or too big
  for a prompt. 
- **Action-derived (from an action call):** live data that must be fetched at
  answer time (balance, entitlement, real-time inventory).

Over-packing the topic prompt with facts is the #1 source of token waste.
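
The three-way classification above can be written as a small decision
function. This is a sketch under stated assumptions: the flag names and the
200-token prompt budget are hypothetical and should be tuned per topic.

```python
from enum import Enum


class FactSource(Enum):
    INSTRUCTIONAL = "topic prompt"
    GROUNDED = "retriever"
    ACTION_DERIVED = "action call"


def classify_fact(
    *,
    needs_live_value: bool,
    record_specific: bool,
    volatile: bool,
    approx_tokens: int,
    prompt_budget_tokens: int = 200,  # assumed cutoff for "too big for a prompt"
) -> FactSource:
    """Decide where a fact should live for a given topic."""
    if needs_live_value:
        # Must be fetched at answer time (balance, entitlement, inventory).
        return FactSource.ACTION_DERIVED
    if record_specific or volatile or approx_tokens > prompt_budget_tokens:
        return FactSource.GROUNDED
    # Unchanging, short, domain-level rule.
    return FactSource.INSTRUCTIONAL
```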

## Sharing Enforcement

Three layers:

1. **Data Cloud data space / sharing rules** — baseline visibility.
2. **Retriever filter** — always pass the calling user's identifiers so the
   retriever limits to rows they are allowed to see.
3. **Agent response scrubbing** — last line of defense, not primary.

If the retriever returns data the user should not see, you have a compliance
incident, not a UX bug.
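
Layer 2 can be illustrated with an in-memory sketch. It is not the Data Cloud
retriever API: `Row`, `visible_to`, and `retrieve_for_user` are hypothetical
names, and the visibility set stands in for whatever user-context filter your
data space exposes. The point is that filtering happens before ranking and
before anything reaches the LLM.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Row:
    record_id: str
    text: str
    visible_to: frozenset[str]  # user ids allowed to see this row (hypothetical)
    score: float                # relevance score assigned by the retriever


def retrieve_for_user(rows: list[Row], user_id: str, k: int = 5) -> list[Row]:
    """Return the top-k rows the calling user is allowed to see."""
    if not user_id:
        # Fail closed: never fall back to an unfiltered query.
        raise ValueError("user_id is required for retrieval")
    allowed = [row for row in rows if user_id in row.visible_to]
    return sorted(allowed, key=lambda row: row.score, reverse=True)[:k]
```

Filtering before the top-k cut also means hidden rows cannot crowd visible
ones out of the result set.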

## Freshness

Ingestion latency + retriever cache TTL = worst-case staleness. State this
number explicitly in the topic design. Examples:

- Agent topic for "what's my order status": SLA = 5 min; the Data Cloud stream
  job must run ≤ 3 min, leaving at most 2 min for the retriever cache TTL.
- Agent topic for "what did we email last week": SLA = 24h; a daily batch is
  fine.
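
The staleness formula is simple enough to check in code during topic design.
A minimal sketch follows; the function names are hypothetical, and the cache
TTL values in the usage lines are assumptions, since the examples above only
state the SLA and the ingestion bound.

```python
from datetime import timedelta


def worst_case_staleness(ingestion_latency: timedelta, cache_ttl: timedelta) -> timedelta:
    """Worst case: data lands just after a cache fill and waits a full TTL."""
    return ingestion_latency + cache_ttl


def meets_sla(ingestion_latency: timedelta, cache_ttl: timedelta, sla: timedelta) -> bool:
    """True when worst-case staleness fits inside the freshness SLA."""
    return worst_case_staleness(ingestion_latency, cache_ttl) <= sla


# Order-status topic: 3 min ingestion plus an assumed 2 min cache TTL fits 5 min.
order_status_ok = meets_sla(
    timedelta(minutes=3), timedelta(minutes=2), sla=timedelta(minutes=5)
)
# Same topic with an assumed 5 min cache TTL breaks the SLA (8 min worst case).
order_status_too_stale = meets_sla(
    timedelta(minutes=3), timedelta(minutes=5), sla=timedelta(minutes=5)
)
```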

## Citation Pattern

Every retriever must emit stable ids back to the agent. The agent's response
template then includes "Source: <title> (<id>)". This enables:

- Transparency for the user.
- Debugging for the designer.
- Measurable retrieval quality (did the cited doc actually contain the fact?).
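
A sketch of the citation template and a crude retrieval-quality check. The
`Source` type and both function names are hypothetical. The verbatim substring
match is a deliberately simple stand-in for a real evaluation and will miss
paraphrased facts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Source:
    source_id: str  # stable doc_id or record Id emitted by the retriever
    title: str
    text: str


def format_citations(sources: list[Source]) -> str:
    """Render one 'Source: <title> (<id>)' line per cited source."""
    return "\n".join(f"Source: {s.title} ({s.source_id})" for s in sources)


def citation_supports(fact: str, source: Source) -> bool:
    """Case-insensitive check that the cited text contains the fact verbatim."""
    return fact.lower() in source.text.lower()


policy = Source("KA-001#refund-policy", "Refund Policy", "Refunds are accepted within 30 days.")
footer = format_citations([policy])
```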

## Anti-Patterns (see references/llm-anti-patterns.md)

- Stuffing facts into topic instructions that belong in a retriever.
- Returning answers with no citations.
- Filtering sharing in the agent response instead of at retrieval.
- Setting retriever k to 20+ "just in case."
- Vectorizing everything, including structured data.

## Official Sources Used

- Agentforce — Ground Your Agent — https://help.salesforce.com/s/articleView?id=sf.agentforce_grounding.htm
- Data Cloud retriever — https://help.salesforce.com/s/articleView?id=sf.c360_a_data_cloud_retriever.htm
- Data Cloud DMOs — https://help.salesforce.com/s/articleView?id=sf.c360_a_data_model_objects.htm
- Salesforce Architects — Data Cloud guidance — https://architect.salesforce.com/

Related Skills

sandbox-data-masking

from PranavNagrecha/AwesomeSalesforceSkills

Use this skill when configuring or reviewing Salesforce Data Mask to protect PII/PHI in partial or full copy sandboxes after a refresh. Trigger keywords: data mask, sandbox masking, PII in sandbox, GDPR sandbox, HIPAA non-production, mask contacts, obfuscate fields non-production. NOT for sandbox refresh mechanics (use sandbox-refresh-and-templates), NOT for production data anonymization, NOT for Shield Platform Encryption at rest.

gdpr-data-privacy

from PranavNagrecha/AwesomeSalesforceSkills

Use this skill when implementing GDPR or CCPA data privacy controls in Salesforce: Individual sObject linkage, consent tracking, Right to Be Forgotten (RTBF) requests, data subject request handling, and Privacy Center configuration. Trigger keywords: GDPR, data privacy, consent management, right to erasure, Individual object, ContactPointConsent, ShouldForget, data subject request, Privacy Center, data portability. NOT for general data quality cleanup, duplicate management, field-level encryption (see platform-encryption skill), or sandbox data masking (see sandbox-data-masking skill).

experience-cloud-security

from PranavNagrecha/AwesomeSalesforceSkills

Use when configuring access controls, sharing, or site security for authenticated or guest Experience Cloud (community) users: external OWD, Sharing Sets, Share Groups, CSP, clickjack protection, guest user record access. NOT for internal sharing model configuration (use sharing-and-visibility).

data-classification-labels

from PranavNagrecha/AwesomeSalesforceSkills

Classify Salesforce fields by data sensitivity and compliance category using the four built-in classification attributes (SecurityClassification, ComplianceGroup, BusinessOwnerId, BusinessStatus). Covers Metadata API deployment, Tooling API querying, and Einstein Data Detect recommendations. NOT for data masking, Shield Platform Encryption, or runtime access control enforcement.

customer-data-request-workflow

from PranavNagrecha/AwesomeSalesforceSkills

Implement GDPR/CCPA data subject rights (access, deletion, rectification) using Salesforce Privacy Center and/or custom workflow. NOT for general backup or org-level data retention policy.

omnistudio-deployment-datapacks

from PranavNagrecha/AwesomeSalesforceSkills

Use when exporting, importing, or version-controlling OmniStudio components using DataPacks via the OmniStudio DataPacks tool or vlocity CLI. Covers DataPack export/import, Git version control integration, CI/CD for OmniStudio. NOT for SFDX-based metadata deployment of non-OmniStudio components.

omnistudio-asynchronous-data-operations

from PranavNagrecha/AwesomeSalesforceSkills

Use Integration Procedures queues, DataRaptor Chain, and Remote Actions with async patterns for long-running OmniStudio flows. NOT for simple DataRaptor reads.

dataraptor-transform-optimization

from PranavNagrecha/AwesomeSalesforceSkills

Use when DataRaptor Transform operations are slow, hit governor limits, or use Apex where formula fields would suffice. Covers formula vs Apex expressions, bulk transform sizing, and chained transform composition. Triggers: 'dataraptor transform slow', 'dataraptor formula vs apex', 'dataraptor bulk transform', 'dr governor limit'. NOT for DataRaptor Extract or Load performance.

dataraptor-patterns

from PranavNagrecha/AwesomeSalesforceSkills

Use when designing or reviewing OmniStudio DataRaptors, especially Extract versus Turbo Extract versus Transform versus Load, field mapping strategy, performance tradeoffs, and when to move work into Integration Procedures or Apex. Triggers: 'DataRaptor Extract', 'Turbo Extract', 'DataRaptor Load', 'DataRaptor Transform', 'OmniStudio data mapping'. NOT for overall OmniScript journey design or Integration Procedure sequencing when the main question is not the DataRaptor shape itself.

lwc-datatable-advanced

from PranavNagrecha/AwesomeSalesforceSkills

Advanced lightning-datatable patterns — inline edit + draftValues, custom cell types via extending LightningDatatable, sortable columns, infinite scroll with onloadmore, row-level errors, and the cost of large data sets. NOT for read-only display of small lists (plain lightning-datatable suffices) or fully custom grids (use a third-party library).

lwc-data-table

from PranavNagrecha/AwesomeSalesforceSkills

Use when designing or reviewing `lightning-datatable` usage in Lightning Web Components, including column configuration, stable `key-field` values, inline editing, row actions, infinite loading, and custom cell types. Triggers: 'lightning datatable inline edit', 'row actions in lwc datatable', 'key field missing', 'infinite loading in datatable'. NOT for highly custom virtualized grids or broad page-performance work outside the datatable boundary.

lwc-custom-datatable-types

from PranavNagrecha/AwesomeSalesforceSkills

Use when you need to extend `lightning-datatable` with custom cell renderings: status pills, progress bars, image thumbnails, action cells, editable picklists, rich-text, or any column that `lightning-datatable` does not ship out of the box. Triggers: 'custom cell type lightning datatable', 'progress bar column', 'image column', 'inline edit picklist in datatable', 'rich text column'. NOT for basic datatable usage (see `lwc-data-table`) and NOT for tree-grid or large-dataset virtualization (see `virtualized-lists`).