memory-ingest
Ingest a source into any consumer's semantic memory by reading the topology contract
Best use case
memory-ingest is best used when you need a repeatable AI agent workflow instead of a one-off prompt.
It is a strong fit for teams already working in Codex.
Teams using memory-ingest should expect more consistent output, faster repeated execution, and less prompt rewriting.
When to use this skill
- You want a reusable workflow that can be run more than once with consistent structure.
When not to use this skill
- You only need a quick one-off answer and do not need a reusable workflow.
- You cannot install or maintain the underlying files, dependencies, or repository context.
Installation
Claude Code / Cursor / Codex
Manual Installation
- Download SKILL.md from GitHub
- Place it in `.claude/skills/memory-ingest/SKILL.md` inside your project
- Restart your AI agent; it will auto-discover the skill
How memory-ingest Compares
| Feature / Agent | memory-ingest | Standard Approach |
|---|---|---|
| Platform Support | Codex | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |
Frequently Asked Questions
What does this skill do?
Ingest a source into any consumer's semantic memory by reading the topology contract
Which AI agents support this skill?
This skill is designed for Codex.
Where can I find the source code?
You can find the source code on GitHub using the link provided at the top of the page.
SKILL.md Source
# memory-ingest
Ingest an external source into a consumer framework's semantic memory. Reads the consumer's `memory.topology` contract to know where pages live, then extracts, summarizes, integrates, and cross-references — all topology-agnostic.
## When to Use
Use this skill when new knowledge (a document, paper, URL, config file, or directory of files) needs to enter a consumer's semantic memory. This is the primary write path for external information.
## Parameters
### source (required)
Path to the source material. Supports: markdown (`.md`), PDF (`.pdf`), HTML (`.html`), YAML (`.yaml`/`.yml`), JSON (`.json`), a directory of files, or a URL.
### --consumer (optional)
Consumer ID to ingest into. Resolved via ADR-021 D4 precedence:
1. **Explicit** — `--consumer research-complete`
2. **Wrapper** — set by a calling skill or orchestrator
3. **Auto-detect** — cwd detection or active framework in `.aiwg/frameworks/registry.json`
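The precedence order above can be sketched as a small function. This is a minimal illustration, not the skill's actual implementation: `resolve_consumer` and its parameters are hypothetical names, and the auto-detect step is passed in as a callable because the document does not specify how cwd or registry detection works.

```python
from typing import Callable, Optional


def resolve_consumer(
    explicit: Optional[str],
    wrapper: Optional[str],
    auto_detect: Callable[[], Optional[str]],
) -> str:
    """Return the first consumer ID found, checking sources in precedence order."""
    # 1. An explicit --consumer value always wins.
    if explicit:
        return explicit
    # 2. Otherwise use the value set by a calling skill or orchestrator.
    if wrapper:
        return wrapper
    # 3. Fall back to detection; the callable only runs when it is needed.
    detected = auto_detect()
    if detected:
        return detected
    raise LookupError("No consumer could be resolved; pass --consumer explicitly.")
```

Passing detection as a callable keeps the potentially slow filesystem lookup off the path when an explicit or wrapper value is already available.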
### --dry-run (optional)
Preview what would be created/modified without writing any files. Outputs the planned page list, cross-references, and contradiction flags.
### --non-interactive (optional)
Skip the discussion step and proceed directly to extraction and page writing. Use for batch ingestion or CI pipelines.
## Operation
### 1. Resolve consumer
Determine which consumer's memory to target using ADR-021 D4 precedence. Fail with a clear error if no consumer can be resolved.
### 2. Load schema
Read `memory.topology` from the consumer's `manifest.json`. Extract:
- `rootDir` — base path for all memory pages
- `derivedPages.summary` — where summary pages are written
- `pageTemplate` — structure the summary must conform to
- `crossRefStyle` — how cross-references are formatted (e.g., wiki-links, markdown links)
- `indexPath` — location of the consumer's memory index
- `log` — path to `.log.jsonl`
- `ingestRequires` — optional list of required post-ingest actions (e.g., `"provenance"`)
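A rough sketch of this step follows. It assumes `memory.topology` is stored as a nested object (`manifest["memory"]["topology"]`), which the document does not confirm; `extract_topology`, `load_topology`, and `EXPECTED_KEYS` are hypothetical names. Missing fields are reported to the caller so they can be warned about and logged.

```python
import json
from pathlib import Path

# Keys named in the list above. "ingestRequires" is optional, so it is not checked.
EXPECTED_KEYS = ("rootDir", "derivedPages", "pageTemplate", "crossRefStyle", "indexPath", "log")


def extract_topology(manifest: dict) -> tuple[dict, list[str]]:
    """Return (topology, missing_keys) from a parsed manifest.json."""
    # Assumption: `memory.topology` is a nested object under the "memory" key.
    topology = dict(manifest.get("memory", {}).get("topology", {}))
    missing = [key for key in EXPECTED_KEYS if key not in topology]
    # Treat an absent ingestRequires as "no post-ingest actions required".
    topology.setdefault("ingestRequires", [])
    return topology, missing


def load_topology(manifest_path: str) -> tuple[dict, list[str]]:
    """Read manifest.json from disk and extract its topology."""
    manifest = json.loads(Path(manifest_path).read_text(encoding="utf-8"))
    return extract_topology(manifest)
```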
### 3. Read source
Parse the source material based on type:
- **Markdown/HTML** — extract text, headings, and structure
- **PDF** — extract text content (use page ranges for large documents)
- **YAML/JSON** — parse structured data, identify key entities
- **Directory** — recursively read all supported files, treating each as a sub-source
- **URL** — fetch content, then parse based on content type
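Before parsing, the source has to be classified into one of the types listed above. The sketch below shows one way that dispatch might look; `classify_source` and `SUFFIX_KINDS` are hypothetical names, and treating only `http`/`https` schemes as URLs is an assumption.

```python
from pathlib import Path
from urllib.parse import urlparse

# File extensions listed under the `source` parameter, mapped to a parser kind.
SUFFIX_KINDS = {
    ".md": "markdown",
    ".pdf": "pdf",
    ".html": "html",
    ".yaml": "yaml",
    ".yml": "yaml",
    ".json": "json",
}


def classify_source(source: str) -> str:
    """Return the kind of source: url, directory, or a file kind by extension."""
    # Assumption: only http(s) sources count as URLs.
    if urlparse(source).scheme in ("http", "https"):
        return "url"
    path = Path(source)
    if path.is_dir():
        return "directory"
    kind = SUFFIX_KINDS.get(path.suffix.lower())
    if kind is None:
        raise ValueError(f"Unsupported source type: {source}")
    return kind
```

A directory source would then call `classify_source` again on each file it contains, treating each as a sub-source.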
### 4. Discuss (interactive default)
**Default behavior** (no `--non-interactive` flag):
1. Present a concise summary of the source to the user
2. Highlight key takeaways, entities, and concepts found
3. Ask the user what to emphasize, de-prioritize, or reframe
4. Incorporate user guidance into the extraction strategy
This discussion-first pattern ensures the memory reflects human judgment, not just mechanical extraction.
### 5. Extract and summarize
Use an LLM to produce a structured summary conforming to the consumer's `pageTemplate`. The summary captures:
- Key claims and findings
- Named entities (people, systems, concepts)
- Relationships between entities
- Source metadata (title, author, date, URI)
### 6. Integrate
- **Write summary page** to `derivedPages.summary` path
- **Update entity/concept pages** — for each entity or concept mentioned, update or create the relevant page under the consumer's entity directory, adding the new information with source attribution
- **Insert cross-references** — link the summary page to entity pages and vice versa, using the consumer's `crossRefStyle`
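To illustrate the cross-reference step, here is a minimal sketch of formatting a link according to `crossRefStyle` and producing the pair of links for both directions. The style values `"wiki"` and `"markdown"` are assumed spellings (the document only says wiki-links and markdown links are possible), and both function names are hypothetical.

```python
def format_cross_ref(title: str, target_path: str, style: str) -> str:
    """Format one link in the consumer's cross-reference style."""
    if style == "wiki":  # assumed value for wiki-links
        return f"[[{title}]]"
    if style == "markdown":  # assumed value for markdown links
        return f"[{title}]({target_path})"
    raise ValueError(f"Unknown crossRefStyle: {style!r}")


def link_both_ways(
    summary: tuple[str, str], entity: tuple[str, str], style: str
) -> dict[str, str]:
    """Return the link to add to each page, keyed by the page that receives it.

    Each argument is a (title, path) pair.
    """
    summary_title, summary_path = summary
    entity_title, entity_path = entity
    return {
        summary_path: format_cross_ref(entity_title, entity_path, style),
        entity_path: format_cross_ref(summary_title, summary_path, style),
    }
```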
### 7. Contradiction detection
Compare new claims against existing pages. When a contradiction is found:
- **Flag inline** on the affected existing page using a callout:
```markdown
> [!contradiction]
> Source "paper.pdf" (2026-04-14) claims X, but this page states Y.
> Ingested via memory-ingest — awaiting human resolution.
```
- **Log the contradiction** in `.log.jsonl` with `"contradictions"` count and details
- **Do not auto-resolve** — surface contradictions for human judgment
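The inline callout shown above could be generated by a helper like the following. This is only a sketch: `render_contradiction` is a hypothetical name, and how claims are compared against existing pages is out of scope here.

```python
def render_contradiction(source: str, date: str, new_claim: str, existing_claim: str) -> str:
    """Build the callout block that flags a contradiction on an existing page."""
    lines = [
        "> [!contradiction]",
        f'> Source "{source}" ({date}) claims {new_claim}, but this page states {existing_claim}.',
        # \u2014 is the dash used in the callout template above.
        "> Ingested via memory-ingest \u2014 awaiting human resolution.",
    ]
    return "\n".join(lines)
```

The helper only renders text; it never edits the conflicting statement itself, which matches the rule that contradictions are surfaced rather than auto-resolved.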
### 8. Update index
Regenerate the entry for the new summary page in the consumer's index at `indexPath`. Include title, source reference, date, and cross-ref targets.
### 9. Append log
Call `memory-log-append` with:
```
--consumer <resolved> --op ingest --data '{"source":"<path>","pages_touched":[...],"contradictions":<n>,"cross_refs_added":<n>}'
```
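Building the `--data` payload by hand is error-prone because of JSON and shell quoting. The sketch below assembles the argument list shown above; `build_log_args` is a hypothetical helper, and how `memory-log-append` is actually invoked (subprocess, skill call, or otherwise) is not specified in this document.

```python
import json
import shlex


def build_log_args(
    consumer: str,
    source: str,
    pages_touched: list[str],
    contradictions: int,
    cross_refs_added: int,
) -> list[str]:
    """Return the argument list for an ingest log entry."""
    data = {
        "source": source,
        "pages_touched": pages_touched,
        "contradictions": contradictions,
        "cross_refs_added": cross_refs_added,
    }
    # Compact separators keep the payload on one line, as in the template above.
    payload = json.dumps(data, separators=(",", ":"))
    return ["--consumer", consumer, "--op", "ingest", "--data", payload]


# shlex.join quotes the JSON payload safely if the arguments must be shown as one string.
args = build_log_args("research-complete", "docs/paper.pdf", ["summaries/paper.md"], 1, 3)
print(shlex.join(args))
```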
### 10. Optional provenance
If `ingestRequires` includes `"provenance"`, create a W3C PROV record documenting:
- `prov:Entity` — the new summary page
- `prov:Activity` — the ingest operation
- `prov:wasDerivedFrom` — the source material
- `prov:wasGeneratedBy` — this skill invocation
- `prov:wasAttributedTo` — the actor (model + user)
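As a rough illustration, the record could be assembled as a dictionary whose layout loosely follows the PROV-JSON serialization. The skill's real record format is not given here, so the identifier scheme (`page:`, `source:`, `agent:` prefixes), the blank-node keys, and the function name `build_prov_record` are all assumptions.

```python
def build_prov_record(
    summary_page: str, source: str, activity_id: str, model: str, user: str
) -> dict:
    """Assemble a PROV-style record for one ingest operation."""
    # Hypothetical identifier scheme; a real deployment would define its own.
    entity_id = f"page:{summary_page}"
    source_id = f"source:{source}"
    agent_id = f"agent:{model}+{user}"
    return {
        "entity": {entity_id: {}, source_id: {}},
        "activity": {activity_id: {}},
        "agent": {agent_id: {"model": model, "user": user}},
        "wasDerivedFrom": {
            "_:d1": {"prov:generatedEntity": entity_id, "prov:usedEntity": source_id}
        },
        "wasGeneratedBy": {
            "_:g1": {"prov:entity": entity_id, "prov:activity": activity_id}
        },
        "wasAttributedTo": {
            "_:a1": {"prov:entity": entity_id, "prov:agent": agent_id}
        },
    }
```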
### 11. Report
Output a summary:
- Pages created or updated (with paths)
- Contradictions flagged (count and locations)
- Cross-references added (count)
- Provenance record path (if created)
## Error Handling
- **Consumer not found** — fail with actionable message listing available consumers
- **Source unreadable** — fail with format-specific guidance (e.g., "PDF extraction requires the Read tool with page ranges")
- **Schema missing fields** — warn and use sensible defaults; log the gap for `memory-lint` to catch
- **Log write failure** — non-blocking; report primary operation result regardless
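The non-blocking log rule can be sketched as follows. `finish_ingest` and the `log_warning` field are hypothetical, and catching `OSError` assumes the log write fails as a filesystem error.

```python
from typing import Callable


def finish_ingest(result: dict, append_log: Callable[[dict], None]) -> dict:
    """Attempt the log write, but return the ingest result either way."""
    try:
        append_log(result)
    except OSError as exc:
        # Non-blocking: attach a warning and keep the primary result intact.
        return {**result, "log_warning": f"log write failed: {exc}"}
    return result
```

The pages have already been written by the time the log is appended, so failing the whole operation at this point would misreport what actually happened.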
## Examples
```
# Interactive ingest of a research paper
memory-ingest docs/papers/distributed-consensus.pdf --consumer research-complete
# Batch ingest a directory of meeting notes
memory-ingest .aiwg/working/meeting-notes/ --consumer sdlc-complete --non-interactive
# Dry run to preview what would change
memory-ingest https://example.com/api-spec.html --consumer sdlc-complete --dry-run
# Explicit consumer override
memory-ingest design-doc.md --consumer media-marketing-kit --non-interactive
```
## Related Skills
- `memory-log-append` — log write primitive (called in step 9)
- `memory-lint` — validates memory page structure and cross-ref integrity
- `memory-query-capture` — captures query patterns for memory optimization
- `provenance-create` — W3C PROV record creation (called in step 10 when required)
Related Skills
ralph-memory
Manage AI semantic memory entries: list, query, and clear lessons learned across loop iterations
memory-query-capture
Capture query synthesis as durable pages in semantic memory
memory-log-render
Generate a human-readable Markdown view from a consumer's JSON Lines event log
memory-log-append
Append a structured event to a consumer's semantic memory log
memory-forensics
Volatility 3 memory forensics workflows covering acquisition with LiME and WinPmem, and structured analysis using Volatility 3 plugin reference
kb-ingest
Ingest a source (URL, file, or freeform note) into the knowledge base. Creates a source summary and updates or creates relevant entity and concept pages.
grade-on-ingest
Trigger GRADE quality assessment automatically when new research sources or findings enter the corpus
debug-memory
Query and manage the executable feedback debug memory
aiwg-orchestrate
Route structured artifact work to AIWG workflows via MCP with zero parent context cost
venv-manager
Create, manage, and validate Python virtual environments. Use for project isolation and dependency management.
pytest-runner
Execute Python tests with pytest, supporting fixtures, markers, coverage, and parallel execution. Use for Python test automation.
vitest-runner
Execute JavaScript/TypeScript tests with Vitest, supporting coverage, watch mode, and parallel execution. Use for JS/TS test automation.