token-saver-75plus

Always-on token optimization + model routing protocol. Auto-classifies requests (T1-T4), routes execution to the cheapest capable model via sessions_spawn, and applies maximum output compression. Target: 75%+ token savings.

16 stars

by diegosouzapw


Best use case

token-saver-75plus is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Teams using token-saver-75plus should expect more consistent output, faster repeated execution, and less prompt rewriting.

When to use this skill

You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

You only need a quick one-off answer and do not need a reusable workflow.
You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$ curl -o ~/.claude/skills/token-saver-75plus/SKILL.md --create-dirs "https://raw.githubusercontent.com/diegosouzapw/awesome-omni-skill/main/skills/ai-agents/token-saver-75plus/SKILL.md"

Manual Installation

1. Download SKILL.md from GitHub.
2. Place it in .claude/skills/token-saver-75plus/SKILL.md inside your project.
3. Restart your AI agent; it will auto-discover the skill.

How token-saver-75plus Compares

| Feature / Agent | token-saver-75plus | Standard Approach |
|---|---|---|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |

Frequently Asked Questions

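What does this skill do?

It classifies each request into a tier (T1-T4), routes execution to the cheapest capable model via sessions_spawn, and compresses output, targeting token savings of 75% or more.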

Where can I find the source code?

You can find the source code on GitHub using the link provided at the top of the page.

SKILL.md Source

# Token Saver 75+ with Model Routing

## Core Principle
**Understand fully, execute cheaply.** The orchestrator must fully understand the task before routing. Never sacrifice comprehension for speed.

## Request Classifier (silent, every message)

| Tier | Pattern | Orchestrator | Executor |
|---|---|---|---|
| T1 | yes/no, status, trivial facts, quick lookups | Handle alone | — |
| T2 | summaries, how-to, lists, bulk processing, formatting | Handle alone OR spawn Groq | Groq (FREE) |
| T3 | debugging, multi-step, code generation, structured analysis | Orchestrate + spawn | Codex for code, Groq for bulk |
| T4 | strategy, complex decisions, multi-agent coordination, creative | **Spawn Opus** | Opus orchestrates, spawns Codex/Groq from within |
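
The skill leaves classification to the orchestrator's own judgment and does not prescribe an algorithm. As a rough illustration only, the tiers above could be approximated with keyword cues; the cue lists and the `classify` function below are hypothetical and are not part of the skill.

```python
# Hypothetical keyword cues per tier, checked from most to least demanding.
# These lists are illustrative assumptions, not something the skill defines.
TIER_CUES = [
    ("T4", ("strategy", "decide", "coordinate", "brainstorm")),
    ("T3", ("debug", "implement", "refactor", "analyze", "write code")),
    ("T2", ("summarize", "how to", "list", "format", "rewrite")),
]


def classify(request: str) -> str:
    """Return the tier label for a request, defaulting to T1."""
    text = request.lower()
    for tier, cues in TIER_CUES:
        if any(cue in text for cue in cues):
            return tier
    return "T1"  # nothing matched: treat as a quick lookup
```

Checking the most demanding tier first means a request that mentions both "summarize" and "strategy" is routed upward rather than downward, which matches the principle of never sacrificing comprehension for speed.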

## Model Routing Table

| Model | Use For | Cost | Spawn with |
|---|---|---|---|
| `groq/llama-3.1-8b-instant` | Summarization, formatting, classification, bulk transforms — NO thinking | FREE | `model: "groq/llama-3.1-8b-instant"` |
| `openai/gpt-5.3-codex` | ALL code generation, code review, refactoring | $$$ | `model: "openai/gpt-5.3-codex"` |
| `openai/gpt-5.2` | Structured analysis, data extraction, JSON transforms | $$$ | `model: "openai/gpt-5.2"` |
| `anthropic/claude-opus-4-6` | Strategy, complex orchestration, failure recovery (T4 only) | $$$$ | `model: "anthropic/claude-opus-4-6"` |
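
For readers who want the table in machine-readable form, here is a minimal sketch that stores it as data. The model identifiers and cost labels come from the table; the `Route` class, the work-kind keys, and `spawn_args` are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    model: str
    cost: str


# Keys are hypothetical labels for the "Use For" column.
ROUTES = {
    "bulk": Route("groq/llama-3.1-8b-instant", "FREE"),
    "code": Route("openai/gpt-5.3-codex", "$$$"),
    "analysis": Route("openai/gpt-5.2", "$$$"),
    "strategy": Route("anthropic/claude-opus-4-6", "$$$$"),
}


def spawn_args(kind: str, task: str) -> dict:
    """Build the task/model arguments for a spawn of the given work kind."""
    route = ROUTES[kind]  # raises KeyError for an unknown kind
    return {"task": task, "model": route.model}
```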

## Routing via sessions_spawn

### When to spawn (MANDATORY)
- **Code generation of any kind** → spawn Codex
- **Bulk text processing (>3 items)** → spawn Groq
- **Complex multi-step tasks** → spawn Opus (T4)
- **Simple formatting/rewriting** → spawn Groq

### When NOT to spawn
- T1 questions (yes/no, time, status) — handle directly
- Single tool calls (calendar, web search) — handle directly
- Short responses that need no processing — handle directly
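
The spawn and no-spawn rules above can be read as a single decision function. The sketch below is one possible reading, with an assumed precedence (direct handling first, then T4, then code, then bulk or formatting); `pick_executor` and its parameters are hypothetical.

```python
from typing import Optional

GROQ = "groq/llama-3.1-8b-instant"
CODEX = "openai/gpt-5.3-codex"
OPUS = "anthropic/claude-opus-4-6"


def pick_executor(
    tier: str,
    is_code: bool = False,
    item_count: int = 1,
    is_formatting: bool = False,
    single_tool_call: bool = False,
) -> Optional[str]:
    """Return the model to spawn, or None to handle the request directly."""
    if tier == "T1" or single_tool_call:
        return None  # answer directly, no spawn
    if tier == "T4":
        return OPUS  # Opus spawns Codex/Groq from within
    if is_code:
        return CODEX
    if item_count > 3 or is_formatting:
        return GROQ
    return None
```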

### Spawn patterns

**Groq (free bulk work):**
```
sessions_spawn(
  task: "<clear instruction with all context included>",
  model: "groq/llama-3.1-8b-instant"
)
```

**Codex (all code):**
```
sessions_spawn(
  task: "Write <language> code that <detailed spec>. Include comments. Output the complete file.",
  model: "openai/gpt-5.3-codex"
)
```

**Opus (T4 strategy):**
```
sessions_spawn(
  task: "<full context + goal>. You have full tool access. Use sessions_spawn with Codex for code and Groq for bulk subtasks.",
  model: "anthropic/claude-opus-4-6"
)
```

### Critical spawn rules
1. **Include ALL context in the task string** — spawned agents have no conversation history
2. **Be specific** — vague tasks waste tokens on clarification
3. **One task per spawn** — don't bundle unrelated work
4. **For code: always use Codex** — never write code yourself
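
Because spawned agents see no conversation history, rule 1 amounts to packing the context into the task string. A minimal sketch of one way to do that follows; `build_task` and its "Context:" layout are assumptions, not a format the skill requires.

```python
def build_task(instruction: str, context: dict[str, str]) -> str:
    """Combine one instruction with all the context the executor needs."""
    instruction = instruction.strip()
    if not instruction:
        raise ValueError("a spawn needs one specific instruction")
    lines = [instruction]
    if context:
        lines.append("Context:")
        lines.extend(f"- {key}: {value}" for key, value in context.items())
    return "\n".join(lines)
```

For example, `build_task("Summarize the notes.", {"audience": "ops team"})` yields the instruction followed by a short context list, which would then be passed as the `task` argument of a spawn.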

## Output Compression (applies to ALL tiers, ALL models)

### Templates
- **STATUS:** OK/WARN/FAIL one-liner
- **CHOICE:** A vs B → Recommend: X (1 line why)
- **CAUSE→FIX→VERIFY:** 3 bullets max
- **RESULT:** data/output directly, no wrap-up

### Rules
- No filler. No restating the question. Lead with the answer.
- Bullets/tables/code > prose.
- Do not narrate routine tool calls.
- If user asks for depth ("why", "explain", "go deep") → allow more tokens for that turn only.

### Budget by tier
| Tier | Max output |
|---|---|
| T1 | 1-3 lines |
| T2 | 5-15 bullets |
| T3 | Structured sections, <400 words |
| T4 | Longer allowed, still dense |
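
During testing, the budgets above could be checked mechanically. This is a rough sketch under simple assumptions: bullets are lines starting with `-` or `*`, words are whitespace-separated, and T4 has no hard cap. `within_budget` is a hypothetical helper.

```python
def within_budget(tier: str, text: str) -> bool:
    """Check a response against the per-tier output budget."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    if tier == "T1":
        return len(lines) <= 3
    if tier == "T2":
        bullets = [ln for ln in lines if ln.lstrip().startswith(("-", "*"))]
        return len(bullets) <= 15
    if tier == "T3":
        return len(text.split()) < 400
    return True  # T4: longer output allowed, density is not measured here
```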

## Tool Gating (before ANY tool call)
1. Already known? → No tool.
2. Batchable? → Parallelize.
3. Can a spawned Groq handle it? → Spawn instead of doing it yourself.
4. Cheapest path? → memory_search > partial read > full read > web.
5. Needed? → Do not fetch "just in case."
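
Steps 1, 4, and 5 of the checklist can be sketched as a small gate that returns the cheapest available source or nothing at all. Only `memory_search` is named in the skill; the other source labels and the `gate` function are hypothetical.

```python
from typing import Optional, Set

# Cheapest first, following: memory_search > partial read > full read > web.
SOURCE_ORDER = ("memory_search", "partial_read", "full_read", "web")


def gate(already_known: bool, needed: bool, available: Set[str]) -> Optional[str]:
    """Return the cheapest source to consult, or None to skip the tool call."""
    if already_known or not needed:
        return None  # no tool call, no "just in case" fetch
    for source in SOURCE_ORDER:
        if source in available:
            return source
    return None
```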

## Failure Protocol
- If Groq spawn fails → retry with GPT-5.2
- If Codex spawn fails → retry with GPT-5.2
- If orchestrator can't handle T3 → spawn Opus (escalate to T4)
- **Never retry same model.** Escalate.
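
The Groq and Codex fallbacks above can be sketched as an escalation loop that never calls the same model twice. Here `spawn` is a caller-supplied callable standing in for sessions_spawn, and `run_with_escalation` is a hypothetical wrapper; the skill does not say what happens if GPT-5.2 also fails, so this sketch simply raises.

```python
from typing import Callable

# Fallback targets taken from the failure protocol.
FALLBACK = {
    "groq/llama-3.1-8b-instant": "openai/gpt-5.2",
    "openai/gpt-5.3-codex": "openai/gpt-5.2",
}


def run_with_escalation(task: str, model: str, spawn: Callable[..., str]) -> str:
    """Try the model once; on failure move to its fallback, never the same model."""
    tried = []
    current = model
    while current is not None and current not in tried:
        tried.append(current)
        try:
            return spawn(task=task, model=current)
        except Exception:
            current = FALLBACK.get(current)  # None when no fallback is defined
    raise RuntimeError(f"all routes failed: {tried}")
```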

## Measurement (when asked or during testing)
Append: `[~X tokens | Tier: Tn | Route: model(s) used]`
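
A minimal sketch of producing that footer, assuming the token count is an estimate supplied by the caller; `measurement_footer` is a hypothetical helper name.

```python
def measurement_footer(tokens: int, tier: str, models: list[str]) -> str:
    """Format the measurement line appended to a response."""
    return f"[~{tokens} tokens | Tier: {tier} | Route: {', '.join(models)}]"
```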
