fabric-lakehouse
Use this skill to get context about Fabric Lakehouse and its features for software systems and AI-powered functions. It offers descriptions of Lakehouse data components, organization with schemas and shortcuts, access control, and code examples. This skill supports users in designing, building, and optimizing Lakehouse solutions using best practices.
Best use case
fabric-lakehouse is best used when you need a repeatable AI agent workflow instead of a one-off prompt.
Use this skill to get context about Fabric Lakehouse and its features for software systems and AI-powered functions. It offers descriptions of Lakehouse data components, organization with schemas and shortcuts, access control, and code examples. This skill supports users in designing, building, and optimizing Lakehouse solutions using best practices.
Teams using fabric-lakehouse should expect a more consistent output, faster repeated execution, less prompt rewriting.
When to use this skill
- You want a reusable workflow that can be run more than once with consistent structure.
When not to use this skill
- You only need a quick one-off answer and do not need a reusable workflow.
- You cannot install or maintain the underlying files, dependencies, or repository context.
Installation
Claude Code / Cursor / Codex
Manual Installation
- Download SKILL.md from GitHub
- Place it in
.claude/skills/fabric-lakehouse/SKILL.mdinside your project - Restart your AI agent — it will auto-discover the skill
How fabric-lakehouse Compares
| Feature / Agent | fabric-lakehouse | Standard Approach |
|---|---|---|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |
Frequently Asked Questions
What does this skill do?
Use this skill to get context about Fabric Lakehouse and its features for software systems and AI-powered functions. It offers descriptions of Lakehouse data components, organization with schemas and shortcuts, access control, and code examples. This skill supports users in designing, building, and optimizing Lakehouse solutions using best practices.
Where can I find the source code?
You can find the source code on GitHub using the link provided at the top of the page.
SKILL.md Source
# When to Use This Skill Use this skill when you need to: - Generate a document or explanation that includes definition and context about Fabric Lakehouse and its capabilities. - Design, build, and optimize Lakehouse solutions using best practices. - Understand the core concepts and components of a Lakehouse in Microsoft Fabric. - Learn how to manage tabular and non-tabular data within a Lakehouse. # Fabric Lakehouse ## Core Concepts ### What is a Lakehouse? Lakehouse in Microsoft Fabric is an item that gives users a place to store their tabular data (like tables) and non-tabular data (like files). It combines the flexibility of a data lake with the management capabilities of a data warehouse. It provides: - **Unified storage** in OneLake for structured and unstructured data - **Delta Lake format** for ACID transactions, versioning, and time travel - **SQL analytics endpoint** for T-SQL queries - **Semantic model** for Power BI integration - Support for other table formats like CSV, Parquet - Support for any file formats - Tools for table optimization and data management ### Key Components - **Delta Tables**: Managed tables with ACID compliance and schema enforcement - **Files**: Unstructured/semi-structured data in the Files section - **SQL Endpoint**: Auto-generated read-only SQL interface for querying - **Shortcuts**: Virtual links to external/internal data without copying - **Fabric Materialized Views**: Pre-computed tables for fast query performance ### Tabular data in a Lakehouse Tabular data in a form of tables are stored under "Tables" folder. Main format for tables in Lakehouse is Delta. Lakehouse can store tabular data in other formats like CSV or Parquet, these formats are only available for Spark querying. Tables can be internal, when data is stored under "Tables" folder, or external, when only reference to a table is stored under "Tables" folder but the data itself is stored in a referenced location. Tables are referenced through Shortcuts, which can be internal (pointing to another location in Fabric) or external (pointing to data stored outside of Fabric). ### Schemas for tables in a Lakehouse When creating a lakehouse, users can choose to enable schemas. Schemas are used to organize Lakehouse tables. Schemas are implemented as folders under the "Tables" folder and store tables inside of those folders. The default schema is "dbo" and it can't be deleted or renamed. All other schemas are optional and can be created, renamed, or deleted. Users can reference a schema located in another lakehouse using a Schema Shortcut, thereby referencing all tables in the destination schema with a single shortcut. ### Files in a Lakehouse Files are stored under "Files" folder. Users can create folders and subfolders to organize their files. Any file format can be stored in Lakehouse. ### Fabric Materialized Views Set of pre-computed tables that are automatically updated based on a schedule. They provide fast query performance for complex aggregations and joins. Materialized views are defined using PySpark or Spark SQL and stored in an associated Notebook. ### Spark Views Logical tables defined by a SQL query. They do not store data but provide a virtual layer for querying. Views are defined using Spark SQL and stored in Lakehouse next to Tables. ## Security ### Item access or control plane security Users can have workspace roles (Admin, Member, Contributor, Viewer) that provide different levels of access to Lakehouse and its contents. Users can also get access permission using sharing capabilities of Lakehouse. ### Data access or OneLake Security For data access use OneLake security model, which is based on Microsoft Entra ID (formerly Azure Active Directory) and role-based access control (RBAC). Lakehouse data is stored in OneLake, so access to data is controlled through OneLake permissions. In addition to object-level permissions, Lakehouse also supports column-level and row-level security for tables, allowing fine-grained control over who can see specific columns or rows in a table. ## Lakehouse Shortcuts Shortcuts create virtual links to data without copying: ### Types of Shortcuts - **Internal**: Link to other Fabric Lakehouses/tables, cross-workspace data sharing - **ADLS Gen2**: Link to ADLS Gen2 containers in Azure - **Amazon S3**: AWS S3 buckets, cross-cloud data access - **Dataverse**: Microsoft Dataverse, business application data - **Google Cloud Storage**: GCS buckets, cross-cloud data access ## Performance Optimization ### V-Order Optimization For faster data read with semantic model enable V-Order optimization on Delta tables. This presorts data in a way that improves query performance for common access patterns. ### Table Optimization Tables can also be optimized using the OPTIMIZE command, which compacts small files into larger ones and can also apply Z-ordering to improve query performance on specific columns. Regular optimization helps maintain performance as data is ingested and updated over time. The Vacuum command can be used to clean up old files and free up storage space, especially after updates and deletes. ## Lineage The Lakehouse item supports lineage, which allows users to track the origin and transformations of data. Lineage information is automatically captured for tables and files in Lakehouse, showing how data flows from source to destination. This helps with debugging, auditing, and understanding data dependencies. ## PySpark Code Examples See [PySpark code](references/pyspark.md) for details. ## Getting data into Lakehouse See [Get data](references/getdata.md) for details.
Related Skills
genderapi-io-automation
Automate Genderapi IO tasks via Rube MCP (Composio). Always search tools first for current schemas.
gender-api-automation
Automate Gender API tasks via Rube MCP (Composio). Always search tools first for current schemas.
fred-economic-data
Query FRED (Federal Reserve Economic Data) API for 800,000+ economic time series from 100+ sources. Access GDP, unemployment, inflation, interest rates, exchange rates, housing, and regional data. Use for macroeconomic analysis, financial research, policy studies, economic forecasting, and academic research requiring U.S. and international economic indicators.
fidel-api-automation
Automate Fidel API tasks via Rube MCP (Composio). Always search tools first for current schemas.
fastapi-templates
Create production-ready FastAPI projects with async patterns, dependency injection, and comprehensive error handling. Use when building new FastAPI applications or setting up backend API projects.
fastapi-router-py
Create FastAPI routers with CRUD operations, authentication dependencies, and proper response models. Use when building REST API endpoints, creating new routes, implementing CRUD operations, or add...
fastapi-pro
Build high-performance async APIs with FastAPI, SQLAlchemy 2.0, and Pydantic V2. Master microservices, WebSockets, and modern Python async patterns.
expo-api-routes
Guidelines for creating API routes in Expo Router with EAS Hosting
esm
Comprehensive toolkit for protein language models including ESM3 (generative multimodal protein design across sequence, structure, and function) and ESM C (efficient protein embeddings and representations). Use this skill when working with protein sequences, structures, or function prediction; designing novel proteins; generating protein embeddings; performing inverse folding; or conducting protein engineering tasks. Supports both local model usage and cloud-based Forge API for scalable inference.
eodhd-apis-automation
Automate Eodhd Apis tasks via Rube MCP (Composio). Always search tools first for current schemas.
dotnet-backend
Build ASP.NET Core 8+ backend services with EF Core, auth, background jobs, and production API patterns.
dotnet-backend-patterns
Master C#/.NET backend development patterns for building robust APIs, MCP servers, and enterprise applications. Covers async/await, dependency injection, Entity Framework Core, Dapper, configuratio...