prompt-assemble
Token-safe prompt assembly with memory orchestration. Use for any agent that needs to construct LLM prompts with memory retrieval. Guarantees no API failure due to token overflow. Implements two-phase context construction, memory safety valve, and hard limits on memory injection.
Best use case
prompt-assemble is best used when you need a repeatable AI agent workflow instead of a one-off prompt.
Token-safe prompt assembly with memory orchestration. Use for any agent that needs to construct LLM prompts with memory retrieval. Guarantees no API failure due to token overflow. Implements two-phase context construction, memory safety valve, and hard limits on memory injection.
Teams using prompt-assemble should expect a more consistent output, faster repeated execution, less prompt rewriting.
When to use this skill
- You want a reusable workflow that can be run more than once with consistent structure.
When not to use this skill
- You only need a quick one-off answer and do not need a reusable workflow.
- You cannot install or maintain the underlying files, dependencies, or repository context.
Installation
Claude Code / Cursor / Codex
Manual Installation
- Download SKILL.md from GitHub
- Place it in
.claude/skills/prompt-assemble/SKILL.mdinside your project - Restart your AI agent — it will auto-discover the skill
How prompt-assemble Compares
| Feature / Agent | prompt-assemble | Standard Approach |
|---|---|---|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |
Frequently Asked Questions
What does this skill do?
Token-safe prompt assembly with memory orchestration. Use for any agent that needs to construct LLM prompts with memory retrieval. Guarantees no API failure due to token overflow. Implements two-phase context construction, memory safety valve, and hard limits on memory injection.
Where can I find the source code?
You can find the source code on GitHub using the link provided at the top of the page.
SKILL.md Source
# Prompt Assemble
## Overview
A standardized, token-safe prompt assembly framework that guarantees API stability. Implements **Two-Phase Context Construction** and **Memory Safety Valve** to prevent token overflow while maximizing relevant context.
**Design Goals:**
- ✅ Never fail due to memory-related token overflow
- ✅ Memory is always discardable enhancement, never rigid dependency
- ✅ Token budget decisions centralized at prompt assemble layer
## When to Use
Use this skill when:
1. Building or modifying any agent that constructs prompts
2. Implementing memory retrieval systems
3. Adding new prompt-related logic to existing agents
4. Any scenario where token budget safety is required
## Core Workflow
```
User Input
↓
Need-Memory Decision
↓
Minimal Context Build
↓
Memory Retrieval (Optional)
↓
Memory Summarization
↓
Token Estimation
↓
Safety Valve Decision
↓
Final Prompt → LLM Call
```
## Phase Details
### Phase 0: Base Configuration
```python
# Model Context Windows (2026-02-04)
# - MiniMax-M2.1: 204,000 tokens (default)
# - Claude 3.5 Sonnet: 200,000 tokens
# - GPT-4o: 128,000 tokens
MAX_TOKENS = 204000 # Set to your model's context limit
SAFETY_MARGIN = 0.75 * MAX_TOKENS # Conservative: 75% threshold = 153,000 tokens
MEMORY_TOP_K = 3 # Max 3 memories
MEMORY_SUMMARY_MAX = 3 lines # Max 3 lines per memory
```
**Design Philosophy**:
- Leave 25% buffer for safety (model overhead, estimation errors, spikes)
- Better to underutilize capacity than to overflow
### Phase 1: Minimal Context
- System prompt
- Recent N messages (N=3, trimmed)
- Current user input
- **No memory by default**
### Phase 2: Memory Need Decision
```python
def need_memory(user_input):
triggers = [
"previously",
"earlier we discussed",
"do you remember",
"as I mentioned before",
"continuing from",
"before we",
"last time",
"previously mentioned"
]
for trigger in triggers:
if trigger.lower() in user_input.lower():
return True
return False
```
### Phase 3: Memory Retrieval (Optional)
```python
memories = memory_search(query=user_input, top_k=MEMORY_TOP_K)
for mem in memories:
summarized_memories.append(summarize(mem, max_lines=MEMORY_SUMMARY_MAX))
```
### Phase 4: Token Estimation
Calculate estimated tokens for base_context + summarized_memories.
### Phase 5: Safety Valve (Critical)
```python
if estimated_tokens > SAFETY_MARGIN:
base_context.append("[System Notice] Relevant memory skipped due to token budget.")
return assemble(base_context)
```
**Hard Rules:**
- ❌ Never downgrade system prompt
- ❌ Never truncate user input
- ❌ No "lucky splicing"
- ✅ Only memory layer is expendable
### Phase 6: Final Assembly
```python
final_prompt = assemble(base_context + summarized_memories)
return final_prompt
```
## Memory Data Standards
### Allowed in Long-Term Memory
- ✅ User preferences / identity / long-term goals
- ✅ Confirmed important conclusions
- ✅ System-level settings and rules
### Forbidden in Long-Term Memory
- ❌ Raw conversation logs
- ❌ Reasoning traces
- ❌ Temporary discussions
- ❌ Information recoverable from chat history
## Quick Start
Copy `scripts/prompt_assemble.py` to your agent and use:
```python
from prompt_assemble import build_prompt
# In your agent's prompt construction:
final_prompt = build_prompt(user_input, memory_search_fn, get_recent_dialog_fn)
```
## Resources
### scripts/
- `prompt_assemble.py` - Complete implementation with all phases (PromptAssembler class)
### references/
- `memory_standards.md` - Detailed memory content guidelines
- `token_estimation.md` - Token counting strategiesRelated Skills
use-prompt-templates-generative-ai-on-vertex-ai-go-b5765981
. Optional: To view different results, adjust the prompt, model, or parameters, and click Submit. Learn more about prompting strategies. Learn about responsible AI best practices and Vertex AI
use-prompt-templates-generative-ai-on-vertex-ai-go-b2e80920
. In the animal_activity column, enter
the-ultimate-guide-to-writing-effective-ai-prompts-e33e853e
Write an email to [contact] to schedule a meeting on [day]
the-ultimate-guide-to-writing-effective-ai-prompts-65c2a3a3
write a simple prompt like “Write an email to [contact] welcoming them to the company,” but you can add more context with a persona in order to get a better r
the-2026-guide-to-prompt-engineering-ibm-4a7e73bd
s GPT, DALL-E, Stable Diffusion, Anthropic
10-of-my-most-popular-text-to-image-series-prompts-78b0897e
generate a bunch of images, then you curate the results to handpick the best ones
stable-diffusion-prompt-guide-basic-to-advanced-ex-750475b1
t forget to detail the preferred ... image, with common prompts like front view, side view, back view, looking back, eye contact, from above, portrait, headshot, close-up, and bird
responsible-prompting-course-llm-prompt-templates--cd3cd6fd
.join( [role, do, context, content, dont, output, assessment, iteration ] )\n
r-promptengineering-on-reddit-ai-prompting-tips-fr-cad7c366
write a framework first, then use that framework to generate the content
r-promptengineering-on-reddit-ai-prompting-tips-fr-6be40b35
Assignment: Write an analysis of how automation is changing the job market
r-promptengineering-on-reddit-after-1000-hours-of--e2cf1489
ve got me and my Unicode keyboard. I think I need to get hired because phew if that
prompts-workflow
Automated workflow for collecting, converting, and publishing AI prompts to ClawdHub. Collects from multiple sources (Reddit, GitHub, Hacker News, SearXNG), converts prompts into Clawdbot Skills, and publishes them automatically.