agentforce-cost-optimization

Use when Agentforce run costs are climbing, you need to forecast scale, or you want to reduce tokens per conversation without hurting quality. Covers topic design impact on cost, prompt/template reuse, grounding size discipline, caching, and model-tier selection. Triggers: 'agentforce cost', 'tokens per conversation too high', 'reduce agentforce runs spend', 'forecast agentforce scale cost', 'einstein trust layer tokens'. NOT for general LLM pricing strategy outside Salesforce.


by PranavNagrecha


Best use case

agentforce-cost-optimization is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Teams using agentforce-cost-optimization should expect more consistent output, faster repeated execution, and less prompt rewriting.

When to use this skill

You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

You only need a quick one-off answer and do not need a reusable workflow.
You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$ curl -o ~/.claude/skills/agentforce-cost-optimization/SKILL.md --create-dirs "https://raw.githubusercontent.com/PranavNagrecha/AwesomeSalesforceSkills/main/skills/agentforce/agentforce-cost-optimization/SKILL.md"

Manual Installation

Download SKILL.md from GitHub
Place it in .claude/skills/agentforce-cost-optimization/SKILL.md inside your project
Restart your AI agent — it will auto-discover the skill

How agentforce-cost-optimization Compares

Feature / Agent	agentforce-cost-optimization	Standard Approach
Platform Support	Not specified	Limited / Varies
Context Awareness	High	Baseline
Installation Complexity	Unknown	N/A

Frequently Asked Questions

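What does this skill do?

It guides measuring and reducing Agentforce tokens per conversation without hurting quality, covering topic design impact on cost, prompt/template reuse, grounding size discipline, caching, and model-tier selection.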

Where can I find the source code?

You can find the source code on GitHub using the link provided at the top of the page.

SKILL.md Source

# Agentforce Cost Optimization

Agentforce cost looks like "we'll just pay per run" right up until volume meets reality. A customer-service agent handling 200,000 conversations/month can consume 10× the tokens of a well-tuned version of the same agent — same quality, same topics, different token discipline. The cost drivers are predictable: topic instruction length, prompt template verbosity, grounding payload size, tool-call round-trips, and model tier. None of these are free to change, but they all respond to focused work.

The job is to measure first, then optimize the top three contributors. Most orgs find that topic instructions and grounding dominate — often 60-80% of tokens per conversation. Once those are disciplined, the remaining optimizations (template reuse, model-tier selection) become viable.
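
To make the scale effect concrete, the arithmetic can be sketched as follows. The per-conversation token counts (8,000 and 800) are illustrative assumptions chosen to show a 10× gap, not measurements, and `monthly_tokens` is a hypothetical helper.

```python
def monthly_tokens(conversations_per_month, tokens_per_conversation):
    """Total tokens consumed in a month at a given per-conversation average."""
    return conversations_per_month * tokens_per_conversation

# Assumed figures for illustration only; replace with your measured averages.
untuned = monthly_tokens(200_000, 8_000)
tuned = monthly_tokens(200_000, 800)
print(f"untuned: {untuned:,}  tuned: {tuned:,}  ratio: {untuned // tuned}x")
```

Multiplying either total by your contracted rate gives a spend forecast; the rate is left out here because it depends on your agreement.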

---

## Before Starting

- Pull 7 days of Agentforce runs; compute average and p95 token counts per conversation.
- Inventory topics, prompt templates, and grounding sources.
- Confirm model tier currently in use and any rate-limit headroom.
- Confirm business tolerance for quality-vs-cost tradeoffs.
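
A minimal sketch of the average and p95 calculation from the first step, assuming you have already exported one token total per conversation. The `token_summary` helper and the sample numbers are hypothetical; p95 here uses the nearest-rank method, and other percentile definitions give slightly different values.

```python
from statistics import mean

def token_summary(tokens_per_conversation):
    """Return (average, p95) for a list of per-conversation token totals."""
    if not tokens_per_conversation:
        raise ValueError("no conversations in the sample")
    ordered = sorted(tokens_per_conversation)
    # Nearest-rank p95: smallest value with at least 95% of samples at or below it.
    rank = (95 * len(ordered) + 99) // 100  # integer ceil(0.95 * n)
    return mean(ordered), ordered[rank - 1]

# Hypothetical sample: 18 typical conversations and 2 very long ones.
sample = [2_000] * 18 + [20_000] * 2
average, p95 = token_summary(sample)
print(average, p95)
```

The gap between the two numbers is the point: a modest average can hide a long tail that drives a large share of spend.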

## Core Concepts

### What Tokens Are You Paying For?

Every conversation pays for:

1. **System prompt** — the framework-level Agentforce prompt.
2. **Topic instructions** — active topic's instructions injected verbatim.
3. **Prompt template** — any custom template rendered per turn.
4. **Grounding** — retrieved content from Data Cloud, Knowledge, or explicit variables.
5. **Conversation history** — full turn history on each call.
6. **Tool output** — action results returned into context.

### The 80/20 Rule

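For most agents, topic instructions + grounding = 60-80% of token spend. Because the full history is resent on every call, the history portion of each call grows linearly with turn count, so total history tokens over a long session grow roughly quadratically. Tool output is lumpy but occasionally large (SOQL result sets dumped raw into context).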
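
One way to check this split for your own agent is to rank the six token sources by share, sketched below. The `contributor_shares` helper and the token counts in `breakdown` are hypothetical placeholders for your measured averages.

```python
def contributor_shares(tokens_by_source):
    """Return (source, share) pairs sorted from largest to smallest share."""
    total = sum(tokens_by_source.values())
    if total == 0:
        raise ValueError("no tokens recorded")
    shares = {source: count / total for source, count in tokens_by_source.items()}
    return sorted(shares.items(), key=lambda item: item[1], reverse=True)

# Hypothetical average tokens per conversation, by source.
breakdown = {
    "system_prompt": 400,
    "topic_instructions": 1800,
    "prompt_template": 500,
    "grounding": 2700,
    "conversation_history": 1100,
    "tool_output": 500,
}
ranked = contributor_shares(breakdown)
top_three = [source for source, _ in ranked[:3]]
for source, share in ranked:
    print(f"{source:22s} {share:.1%}")
```

With these assumed numbers, topic instructions plus grounding come to about 64% of the total, inside the range described above.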

### Reducing Topic Instruction Tokens

- Delete department-name preamble ("As a customer service agent working for Acme Insurance...").
- Collapse redundant examples; 2 good examples outperform 10 mediocre ones.
- Externalize static policy ("always use formal English") into the system prompt instead of per-topic.

### Reducing Grounding Tokens

- Retrieve k=3, not k=10, unless evaluation shows that a larger k improves quality.
- Chunk size: 300-500 tokens usually beats 1000-2000.
- Apply a reranker before final injection when using Data Cloud retrievers.
- Strip boilerplate (legal footers, headers) from Knowledge articles before indexing.
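
The combined effect of k and chunk size is simple multiplication, sketched here. `grounding_tokens` is a hypothetical helper, and the chunk sizes (1,500 and 400 tokens) are assumed midpoints of the ranges above rather than measured values.

```python
def grounding_tokens(k, chunk_tokens):
    """Approximate grounding tokens injected per retrieval."""
    return k * chunk_tokens

baseline = grounding_tokens(k=10, chunk_tokens=1_500)
tuned = grounding_tokens(k=3, chunk_tokens=400)
reduction = 1 - tuned / baseline
print(f"baseline={baseline} tuned={tuned} reduction={reduction:.0%}")
```

This only estimates token volume; whether answer quality holds at the smaller setting still has to come from evaluation.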

### Conversation History Discipline

Long sessions inflate every turn's token count. Patterns:
- Summarize older turns ("Summary of first 5 turns: …") rather than sending verbatim.
- Archive turns beyond a threshold; keep only the last N in active context.
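
The effect of keeping only the last N turns can be estimated with a short calculation. This is a sketch under simplifying assumptions: every turn is the same size, and the tokens spent on any summary are ignored. `history_tokens_sent` is a hypothetical helper.

```python
def history_tokens_sent(turns, tokens_per_turn, window=None):
    """Total history tokens resent across a session.

    On turn t the call carries the prior t-1 turns, capped at `window`
    turns when a window is set.
    """
    total = 0
    for turn in range(1, turns + 1):
        prior = turn - 1
        if window is not None:
            prior = min(prior, window)
        total += prior * tokens_per_turn
    return total

full = history_tokens_sent(turns=20, tokens_per_turn=150)
windowed = history_tokens_sent(turns=20, tokens_per_turn=150, window=5)
print(full, windowed)
```

For a 20-turn session at 150 tokens per turn, the sketch gives 28,500 history tokens with full resend and 12,750 with a five-turn window.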

### Model Tier Selection

Not every action needs the most capable model. Use tiered routing:
- Classification / intent detection → smaller model.
- Reasoning / final response → larger model.
- Tool-calling / structured output → mid-tier is often enough.

### Caching Opportunities

- Topic instructions are stable across conversations, so the framework should cache them; you don't need to change anything unless your template is dynamic.
- Grounding retrieval can cache per query; watch freshness needs.
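
Per-query caching with a freshness limit could look like the sketch below. It assumes retrieval happens in code you control; `RetrievalCache` and the `fetch` callable are hypothetical and not part of any Salesforce API. The time-to-live is the knob that encodes the freshness requirement.

```python
import time

class RetrievalCache:
    """Cache retrieval results per normalized query for a limited time."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self._entries = {}

    def get_or_fetch(self, query, fetch):
        # Normalize case and whitespace so trivially different queries share an entry.
        key = " ".join(query.lower().split())
        now = self.clock()
        hit = self._entries.get(key)
        if hit is not None and now - hit[0] < self.ttl:
            return hit[1]
        result = fetch(query)
        self._entries[key] = (now, result)
        return result

# Usage with a stand-in retriever that records how often it is called.
calls = []
def fake_fetch(query):
    calls.append(query)
    return [f"chunk for {query}"]

cache = RetrievalCache(ttl_seconds=300)
cache.get_or_fetch("Reset password", fake_fetch)
cache.get_or_fetch("reset  password", fake_fetch)  # served from cache
print(len(calls))
```

A cache like this reduces retrieval calls rather than injected tokens, so it complements the k and chunk-size changes instead of replacing them.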

---

## Common Patterns

### Pattern 1: Topic Instruction Audit And Trim

For each topic, measure the instruction token count. Target 150-300 tokens per topic instruction, and trim anything above 500 unless there is a compelling reason to keep it.
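
A rough audit along these lines is sketched below. Two assumptions are baked in: token counts are estimated at about four characters per token, which is only a heuristic for English text and not the model's real tokenizer, and the "review" band between 300 and 500 tokens is an added convenience rather than part of the pattern. `estimate_tokens` and `audit_topics` are hypothetical names.

```python
def estimate_tokens(text):
    """Rough estimate (assumption): about 4 characters per token."""
    return max(1, len(text) // 4)

def audit_topics(instructions_by_topic, target_max=300, hard_max=500):
    """Map each topic to (estimated_tokens, status)."""
    report = {}
    for topic, text in instructions_by_topic.items():
        tokens = estimate_tokens(text)
        if tokens > hard_max:
            status = "trim"
        elif tokens > target_max:
            status = "review"
        else:
            status = "ok"
        report[topic] = (tokens, status)
    return report

# Hypothetical topics with placeholder instruction text of different lengths.
topics = {
    "order_status": "x" * 800,
    "returns": "x" * 1600,
    "billing_disputes": "x" * 2400,
}
for topic, (tokens, status) in audit_topics(topics).items():
    print(f"{topic:18s} ~{tokens:4d} tokens  {status}")
```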

### Pattern 2: k=3 Retriever With Reranker

Retrieve 10 candidates, rerank them, and inject the top 3. This cuts grounding tokens by roughly 70% versus retrieve-10-inject-10, assuming chunks of similar size.
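
The retrieve-then-rerank shape can be sketched in a few lines. The scoring function here is a deliberately simple stand-in (counting shared words) and is an assumption for illustration; a real reranker would be a model or service, and `overlap_score` and `select_grounding` are hypothetical names.

```python
def overlap_score(query, chunk):
    """Stand-in for a reranker: number of query terms present in the chunk."""
    terms = set(query.lower().split())
    return len(terms & set(chunk.lower().split()))

def select_grounding(query, candidates, inject=3):
    """Rank all retrieved candidates and keep only the top `inject` for the prompt."""
    ranked = sorted(candidates, key=lambda c: overlap_score(query, c), reverse=True)
    return ranked[:inject]

# Ten hypothetical retrieved chunks.
candidates = [
    "how to reset your password from the login page",
    "billing cycle dates and invoices",
    "password reset email not received",
    "update your shipping address",
    "reset a locked account password with an admin",
    "store opening hours",
    "refund policy overview",
    "contact support by phone",
    "data export options",
    "two factor authentication setup",
]
top = select_grounding("reset password", candidates)
print(top)
```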

### Pattern 3: Conversation Summarization Trigger

After N turns or M tokens of history, replace older turns with a one-line summary.
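
A minimal sketch of such a trigger, assuming history is held as a list of turn strings and that a `summarize` callable is supplied by the caller (in practice this might call a smaller model). The thresholds, the four-characters-per-token estimate, and the function names are all hypothetical.

```python
def rough_tokens(text):
    """Rough estimate (assumption): about 4 characters per token."""
    return max(1, len(text) // 4)

def compact_history(turns, summarize, max_turns=10, max_tokens=2_000, keep_recent=4):
    """Replace older turns with one summary once either threshold is exceeded."""
    total = sum(rough_tokens(t) for t in turns)
    if len(turns) <= max_turns and total <= max_tokens:
        return list(turns)
    older, recent = turns[:-keep_recent], turns[-keep_recent:]
    if not older:
        return list(turns)
    return [summarize(older)] + list(recent)

def placeholder_summary(older_turns):
    # Stand-in only; a real implementation would produce an actual summary.
    return f"Summary of first {len(older_turns)} turns: (condensed)"

history = [f"turn {i}: some message text" for i in range(1, 13)]
compacted = compact_history(history, placeholder_summary)
print(len(history), "->", len(compacted))
```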

### Pattern 4: Tiered Model Routing

Route classification / intent steps to a smaller model; reasoning/response to the capable model.
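
As a sketch, the routing can be a plain lookup from step type to tier. The step names and the tier labels `small`, `mid`, and `large` are hypothetical placeholders, not Agentforce model identifiers.

```python
# Hypothetical step-to-tier mapping; adjust to the steps your agent actually has.
ROUTING = {
    "classification": "small",
    "intent_detection": "small",
    "tool_calling": "mid",
    "structured_output": "mid",
    "reasoning": "large",
    "final_response": "large",
}

def pick_tier(step, default="large"):
    """Return the model tier for a step, falling back to the capable tier."""
    return ROUTING.get(step, default)

print(pick_tier("classification"), pick_tier("final_response"), pick_tier("unknown_step"))
```

Defaulting unknown steps to the larger tier is a design choice that favors quality over cost until a step has been evaluated.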

### Pattern 5: Tool Output Projection

When a tool returns a large payload (e.g. SOQL result), project the fields the agent actually needs instead of dumping the full response.
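
Field projection can be sketched as below, assuming the tool result is available as a list of dictionaries before it is returned to the agent. `project_records` and the sample records are hypothetical.

```python
def project_records(records, fields):
    """Keep only the named fields from each record dict."""
    return [{name: rec[name] for name in fields if name in rec} for rec in records]

# Hypothetical wide records with a large field the agent does not need.
rows = [
    {"Id": "001A", "Name": "Acme", "Description": "x" * 2000, "Phone": "555-0100"},
    {"Id": "001B", "Name": "Globex", "Description": "y" * 2000, "Phone": "555-0101"},
]
slim = project_records(rows, ["Id", "Name"])
print(slim)
```

Where you control the query, selecting only those fields in the SOQL itself achieves the same result earlier in the pipeline.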

---

## Decision Guidance

| Situation | Recommended Approach | Reason |
|---|---|---|
| Token usage high, unknown contributor | Instrument and measure first | Avoid guessing |
| Topic instructions > 500 tokens | Trim (Pattern 1) | Biggest win |
| Grounding k ≥ 5 without evaluation | Reduce k + rerank (Pattern 2) | Second biggest win |
| Long conversations | Summarize (Pattern 3) | Linear savings per turn |
| Classification step using largest model | Switch to smaller tier (Pattern 4) | Cheap wins |
| Tool returns wide records | Project fields (Pattern 5) | Eliminates silent waste |

## Review Checklist

- [ ] Per-conversation token metrics collected and dashboarded.
- [ ] Top 3 token contributors identified per agent.
- [ ] Topic instruction length audited.
- [ ] Grounding k and chunk size justified.
- [ ] Long-conversation strategy exists.
- [ ] Model tier routing considered.
- [ ] Tool output projection in place.

## Recommended Workflow

1. Measure — 7 days of run data broken down by token source.
2. Identify top 3 contributors.
3. Optimize topic instructions first.
4. Optimize grounding second.
5. Add conversation summarization if sessions are long.
6. Apply tier routing where quality allows.
7. Re-measure; document cost savings.

---

## Salesforce-Specific Gotchas

1. Trust Layer adds tokens — masking, citation, guardrails all add context weight.
2. Grounding sources can include large boilerplate (Knowledge article footers); index selectively.
3. Tool output is counted even if the agent ignores it.
4. Managed topics may have opaque instruction length; audit via runtime logs.
5. Switching model tier changes quality — do not do this without A/B evaluation.

## Proactive Triggers

- Topic instruction > 500 tokens → Flag High.
- Retriever k ≥ 10 without reranker → Flag High.
- Average conversation > 20 turns with no summarization → Flag Medium.
- Classification step on flagship model → Flag Medium.
- Token growth > 15%/month without volume growth → Flag High.
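
These triggers translate directly into rule checks. The sketch below assumes agent metrics are available as a dictionary; every key name is hypothetical, and "without volume growth" is interpreted here as conversation volume growth of zero or less.

```python
def proactive_flags(agent):
    """Return (severity, message) pairs for a dict of agent metrics."""
    flags = []
    for topic, tokens in agent["topic_instruction_tokens"].items():
        if tokens > 500:
            flags.append(("High", f"topic '{topic}' instruction is {tokens} tokens"))
    if agent["retriever_k"] >= 10 and not agent["uses_reranker"]:
        flags.append(("High", "retriever k >= 10 without reranker"))
    if agent["avg_turns"] > 20 and not agent["summarizes_history"]:
        flags.append(("Medium", "average conversation > 20 turns with no summarization"))
    if agent["classification_tier"] == "flagship":
        flags.append(("Medium", "classification step on flagship model"))
    if agent["token_growth_pct"] > 15 and agent["volume_growth_pct"] <= 0:
        flags.append(("High", "token growth > 15%/month without volume growth"))
    return flags

# Hypothetical metrics for one agent.
example_agent = {
    "topic_instruction_tokens": {"returns": 620, "order_status": 240},
    "retriever_k": 10,
    "uses_reranker": False,
    "avg_turns": 12,
    "summarizes_history": False,
    "classification_tier": "flagship",
    "token_growth_pct": 18,
    "volume_growth_pct": 0,
}
for severity, message in proactive_flags(example_agent):
    print(severity, "-", message)
```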

## Output Artifacts

| Artifact | Description |
|---|---|
| Cost model | Tokens per conversation by contributor |
| Optimization plan | Prioritized trim list with expected savings |
| Tier routing design | Step → model mapping |

## Related Skills

- `agentforce/agent-topic-design` — topic structure quality.
- `agentforce/prompt-builder-templates` — prompt template hygiene.
- `agentforce/data-cloud-grounding-for-agentforce` — grounding retrieval.
- `agentforce/agentforce-observability` — measurement infrastructure.

Related Skills

dataraptor-transform-optimization

from PranavNagrecha/AwesomeSalesforceSkills

Use when DataRaptor Transform operations are slow, hit governor limits, or use Apex where formula fields would suffice. Covers formula vs Apex expressions, bulk transform sizing, and chained transform composition. Triggers: 'dataraptor transform slow', 'dataraptor formula vs apex', 'dataraptor bulk transform', 'dr governor limit'. NOT for DataRaptor Extract or Load performance.

flow-performance-optimization

from PranavNagrecha/AwesomeSalesforceSkills

Tune Flow runtime performance: pick Before-Save over After-Save, consolidate Get Records, eliminate loop-DML, cache lookups, split with Scheduled Paths, and measure actual runtime. Covers benchmarking methodology, profiling tools, and the 80/20 wins. NOT for governor-limit math (use flow-governor-limits-deep-dive). NOT for LDV strategy (use flow-large-data-volume-patterns).

flow-get-records-optimization

from PranavNagrecha/AwesomeSalesforceSkills

Optimize Get Records elements in Flow: filter sharpness, field selection, sort-and-limit placement, caching via formula resources, and avoiding repeated queries in loops. Trigger keywords: get records, flow soql, flow query limit, flow performance, record lookup. Does NOT cover Apex SOQL, Data Cloud queries, or external object lookups.

soql-query-optimization

from PranavNagrecha/AwesomeSalesforceSkills

Use when a SOQL query is running slowly, causing timeouts, or returning UNABLE_TO_LOCK_ROW errors in large data volume orgs. Covers index-aware query writing, selectivity rules, the Query Plan tool, skinny tables, and dynamic field-set queries. Triggers: slow soql query, query timeout, non-selective query, query plan tool, index usage, soql optimization, large object performance. NOT for Apex CPU or heap governor limit issues (use apex-cpu-and-heap-optimization) or for writing basic SOQL (use soql-fundamentals).

cpq-performance-optimization

from PranavNagrecha/AwesomeSalesforceSkills

Use when diagnosing or resolving slow CPQ quote calculation, QLEx timeouts, or governor limit errors on large quotes. Trigger keywords: Large Quote Mode, QCP field declaration, quote calculation performance, SBQQ calculation timeout, async pricing. NOT for generic Apex performance tuning, CPQ pricing rule logic design, or billing engine performance.

analytics-dataset-optimization

from PranavNagrecha/AwesomeSalesforceSkills

Use this skill when tuning CRM Analytics dataset performance through field selection, date granularity choices, dataset splitting strategy, and run-budget optimization. Trigger keywords: dataset too many fields, SAQL timeseries slow, epoch vs date storage, dataset field count limit, dataset partition, split dataset by year, CRM Analytics performance tuning. NOT for SOQL optimization, Salesforce report tuning, Data Cloud segmentation performance, or choosing between analytics tools.

license-optimization-strategy

from PranavNagrecha/AwesomeSalesforceSkills

Auditing, right-sizing, and reclaiming Salesforce licenses to reduce cost and ensure compliant allocation. Trigger keywords: license audit, license cost reduction, unused licenses, permission set license, login-based license, inactive users, license reclamation, right-size licenses. NOT for provisioning net-new licenses (contact AE). NOT for Experience Cloud community license troubleshooting. NOT for permission set assignment logic outside of license gating.

fsl-optimization-architecture

from PranavNagrecha/AwesomeSalesforceSkills

Use this skill when designing or evaluating the FSL scheduling engine architecture: optimization mode selection (Global/In-Day/Resource/Reshuffle), ESO adoption strategy, territory sizing for optimization, and fallback planning. Trigger keywords: FSL optimization engine, ESO enhanced scheduling, global optimization timeout, in-day optimization, OAAS architecture, territory optimization design. NOT for admin-level scheduling policy configuration, scheduling rule setup in Setup, or per-appointment scheduling API calls (covered by apex/fsl-scheduling-api).

einstein-bots-to-agentforce-migration

from PranavNagrecha/AwesomeSalesforceSkills

Use when migrating an existing Einstein Bot (legacy or Enhanced) to Agentforce: feature mapping, conversation design translation, cutover planning, hybrid bot/agent architecture, and context handoff. Triggers: 'migrate einstein bot to agentforce', 'convert legacy bot to agentforce', 'einstein bot retiring deadline', 'hybrid bot agentforce pattern', 'bot dialog to topic migration'. NOT for new Agentforce setup with no existing bot — use agentforce/agentforce-agent-creation instead.

data-cloud-grounding-for-agentforce

from PranavNagrecha/AwesomeSalesforceSkills

Use when grounding an Agentforce agent with Data Cloud retrievers, DMO selection, chunking, and freshness windows. Triggers: agent grounding, retriever, DMO, data graph, RAG, vector index, citations. Does NOT cover Data Cloud ingestion pipelines or Data Cloud identity resolution tuning.

agentforce-tool-use-patterns

from PranavNagrecha/AwesomeSalesforceSkills

Pick the right tool shape for each agent action: Apex invocable vs Flow action vs External Service vs Prompt Template vs Data Cloud retrieval. Covers action selection by use case, argument design for LLM clarity, return-shape contracts, error-surfacing, cost implications, and when to chain tools vs keep a single action. NOT for authoring a specific action (use custom-agent-actions-apex). NOT for topic design (use agent-topic-design).

agentforce-testing-strategy

from PranavNagrecha/AwesomeSalesforceSkills

Design Agentforce testing: topic coverage, action unit tests, deterministic golden sets, adversarial prompts, and regression harness. Trigger keywords: agentforce testing, agent eval, agent regression suite, prompt golden set, action unit test agentforce. Does NOT cover: generic LLM evaluation academia, human-labeled RLHF pipelines, or Einstein Classify accuracy.