add-datalake-consumer
Adds an event consumer that writes to Azure Data Lake (Parquet), following the BI_SALES_RISK plan. Creates `events/consumers/[Name]DataLakeCollector.ts`, which subscribes to RabbitMQ, builds Parquet rows, and writes them to `/path_prefix/year=YYYY/month=MM/day=DD/`. Use when adding a DataLakeCollector in logging or similar “event to Data Lake” pipelines.
Best use case
add-datalake-consumer is best used when you need a repeatable AI agent workflow instead of a one-off prompt.
Teams using add-datalake-consumer should expect more consistent output, faster repeated execution, and less prompt rewriting.
When to use this skill
- You want a reusable workflow that can be run more than once with consistent structure.
When not to use this skill
- You only need a quick one-off answer and do not need a reusable workflow.
- You cannot install or maintain the underlying files, dependencies, or repository context.
Installation
Claude Code / Cursor / Codex
Manual Installation
- Download SKILL.md from GitHub
- Place it in `.claude/skills/add-datalake-consumer/SKILL.md` inside your project
- Restart your AI agent; it will auto-discover the skill
How add-datalake-consumer Compares
| Feature / Agent | add-datalake-consumer | Standard Approach |
|---|---|---|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |
Frequently Asked Questions
What does this skill do?
It adds an event consumer that subscribes to RabbitMQ and writes Parquet rows to Azure Data Lake under date-partitioned paths (`/path_prefix/year=YYYY/month=MM/day=DD/`), following the BI_SALES_RISK plan. The consumer is created at `events/consumers/[Name]DataLakeCollector.ts`.
Where can I find the source code?
You can find the source code on GitHub using the link provided at the top of the page.
SKILL.md Source
# Add Data Lake Consumer
Event consumer that subscribes to RabbitMQ and writes to **Azure Data Lake** (Parquet). Pattern: logging’s DataLakeCollector for `risk.evaluated` (BI_SALES_RISK_IMPLEMENTATION_PLAN §3.5, §9.1). **BI Sales Risk:** Paths and Parquet columns MUST match `documentation/requirements/BI_SALES_RISK_DATA_LAKE_LAYOUT.md` (§2.1 risk.evaluated, §2.2 ml_outcomes, §4 config).
## 1. Consumer
**Path:** `src/events/consumers/[Name]DataLakeCollector.ts`
- `EventConsumer` with `queue`, `exchange: coder_events`, `bindings`: e.g. `['risk.evaluated','ml.prediction.completed','opportunity.updated','forecast.generated']`.
- Handler: map event to row. For risk.evaluated use columns in Data Lake Layout §2.1. Build path: `{path_prefix}/year={YYYY}/month={MM}/day={DD}/...` (Layout §1).
- Write via `@azure/storage-blob` (BlockBlob). For Parquet output, combine `@azure/storage-blob` with `parquetjs` (or Arrow). Buffer/batch by time or count if needed.
- Config: `data_lake.connection_string`, `data_lake.container`, `data_lake.path_prefix` (e.g. `/risk_evaluations`).
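The path and batching bullets above can be sketched as follows. This is a minimal illustration, not the project's implementation: `buildPartitionPath` and `RowBatcher` are hypothetical names, the use of UTC for the partition date is an assumption (the Layout document may specify otherwise), and the file name after `day=DD/` is deliberately left out because the source elides it.

```typescript
/** Zero-pad a number to two digits, e.g. 3 -> "03". */
function pad2(n: number): string {
  return n < 10 ? "0" + n : String(n);
}

/**
 * Build the date-partitioned directory for an event timestamp:
 *   {path_prefix}/year=YYYY/month=MM/day=DD/
 * Assumption: partitions are derived from the UTC date.
 */
function buildPartitionPath(pathPrefix: string, when: Date): string {
  const prefix = pathPrefix.replace(/\/+$/, ""); // drop trailing slashes
  const year = when.getUTCFullYear();
  const month = pad2(when.getUTCMonth() + 1); // getUTCMonth() is 0-based
  const day = pad2(when.getUTCDate());
  return `${prefix}/year=${year}/month=${month}/day=${day}/`;
}

/**
 * Hypothetical count-based batcher: collects rows and passes them to
 * `flushFn` once `maxRows` rows have accumulated.
 */
class RowBatcher<Row> {
  private rows: Row[] = [];
  private readonly maxRows: number;
  private readonly flushFn: (rows: Row[]) => Promise<void>;

  constructor(maxRows: number, flushFn: (rows: Row[]) => Promise<void>) {
    this.maxRows = maxRows;
    this.flushFn = flushFn;
  }

  /** Number of rows currently buffered. */
  get size(): number {
    return this.rows.length;
  }

  /** Buffer one row; flush when the batch is full. */
  async add(row: Row): Promise<void> {
    this.rows.push(row);
    if (this.rows.length >= this.maxRows) {
      await this.flush();
    }
  }

  /** Hand all buffered rows to flushFn and reset the buffer. */
  async flush(): Promise<void> {
    if (this.rows.length === 0) return;
    const batch = this.rows;
    this.rows = []; // swap first so rows arriving during the write are kept
    await this.flushFn(batch);
  }
}
```

In a real collector, `flushFn` is where the rows would be encoded as Parquet and uploaded as a block blob under the path returned by `buildPartitionPath`. A time-based flush would typically call `flush()` from a timer and once more on shutdown.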
## 2. Config
**config/default.yaml:**
```yaml
data_lake:
  connection_string: ${DATA_LAKE_CONNECTION_STRING}
  container: ${DATA_LAKE_CONTAINER:-risk}
  path_prefix: ${DATA_LAKE_PATH_PREFIX:-/risk_evaluations}
rabbitmq:
  url: ${RABBITMQ_URL}
  exchange: coder_events
  queue: [module]_data_lake
  bindings:
    - risk.evaluated
    - ml.prediction.completed
    # ...
```
**config/schema.json:** add `data_lake` with `connection_string`, `container`, `path_prefix`.
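To make the `${VAR:-default}` placeholders and the three `data_lake` keys concrete, here is a small sketch of how they might be resolved. The project's actual config loader is not shown in the source, so `resolvePlaceholders` and `loadDataLakeConfig` are hypothetical helpers. The sketch assumes shell-like semantics: the default applies when the variable is unset or empty, and a placeholder without a default is treated as required.

```typescript
type Env = Record<string, string | undefined>;

/** Shape of the data_lake section, as listed for config/schema.json. */
interface DataLakeConfig {
  connection_string: string;
  container: string;
  path_prefix: string;
}

/**
 * Resolve ${VAR} and ${VAR:-default} placeholders against `env`.
 * Assumption: a missing variable with no default is an error.
 */
function resolvePlaceholders(value: string, env: Env): string {
  const pattern = /\$\{([A-Za-z0-9_]+)(?::-([^}]*))?\}/g;
  return value.replace(
    pattern,
    (_match: string, name: string, fallback: string | undefined): string => {
      const found = env[name];
      if (found !== undefined && found !== "") return found;
      if (fallback !== undefined) return fallback;
      throw new Error("Missing required environment variable " + name);
    },
  );
}

/** Build the data_lake config from the same placeholders as default.yaml. */
function loadDataLakeConfig(env: Env): DataLakeConfig {
  return {
    connection_string: resolvePlaceholders("${DATA_LAKE_CONNECTION_STRING}", env),
    container: resolvePlaceholders("${DATA_LAKE_CONTAINER:-risk}", env),
    path_prefix: resolvePlaceholders("${DATA_LAKE_PATH_PREFIX:-/risk_evaluations}", env),
  };
}
```

Passing the environment in as an argument, rather than reading a global, keeps the helper easy to test. With only `DATA_LAKE_CONNECTION_STRING` set, `container` falls back to `risk` and `path_prefix` to `/risk_evaluations`.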
## 3. Server
In `server.ts`: `await dataLakeCollector.start()` after RabbitMQ connect.
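The ordering can be sketched as below. The `Connectable` and `Startable` interfaces and the `startServer` function are hypothetical stand-ins; the real RabbitMQ client and collector in the project may expose different names.

```typescript
/** Hypothetical minimal interface for the RabbitMQ connection. */
interface Connectable {
  connect(): Promise<void>;
}

/** Hypothetical minimal interface for the Data Lake collector. */
interface Startable {
  start(): Promise<void>;
}

/**
 * Start the collector only after the RabbitMQ connection is established,
 * so the consumer never tries to bind its queue on a closed connection.
 */
async function startServer(rabbit: Connectable, dataLakeCollector: Startable): Promise<void> {
  await rabbit.connect();
  await dataLakeCollector.start();
}
```

If `connect()` rejects, `start()` is never called, which gives a clear startup failure instead of a collector that silently consumes nothing.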
## 4. Checklist
- [ ] Consumer in `events/consumers/`, subscribe to RabbitMQ (no Azure Service Bus)
- [ ] Path: `{path_prefix}/year=.../month=.../day=.../`; format Parquet
- [ ] Config: `data_lake.*` and schema; `rabbitmq` queue and bindings
- [ ] Start collector in server