rent-roll-to-database

Transforms a tokenized/extracted rent roll into validated, typed, auditable, database-ready records: a multi-line charge schedule (base rent, CAM/tax/insurance recoveries, percentage rent, parking, storage) mapped to the canonical chart of accounts, lease- and unit-level facts, GPR and occupancy aggregates, a data-quality grade, and a target-model load plan. Tenant identity is pseudonymized; per-unit natural-person data never leaves the boundary. Triggers on 'load this rent roll into the database', 'normalize the rent roll to our schema', 'rent roll to warehouse', or when extracted rent-roll tokens need to become structured records before underwriting or tie-out.


by mariourquia


Best use case

rent-roll-to-database is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Teams using rent-roll-to-database should expect more consistent output, faster repeated execution, and less prompt rewriting.

When to use this skill

You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

You only need a quick one-off answer and do not need a reusable workflow.
You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$ curl -o ~/.claude/skills/rent-roll-to-database/SKILL.md --create-dirs "https://raw.githubusercontent.com/mariourquia/cre-skills-plugin/main/src/skills/rent-roll-to-database/SKILL.md"

Manual Installation

Download SKILL.md from GitHub
Place it in .claude/skills/rent-roll-to-database/SKILL.md inside your project
Restart your AI agent — it will auto-discover the skill

How rent-roll-to-database Compares

| Feature / Agent | rent-roll-to-database | Standard Approach |
|---|---|---|
| Platform Support | Not specified | Limited / Varies |
| Context Awareness | High | Baseline |
| Installation Complexity | Unknown | N/A |

Frequently Asked Questions

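What does this skill do?

It turns a tokenized or extracted rent roll into validated, typed, database-ready records: a multi-line charge schedule mapped to the canonical chart of accounts, lease- and unit-level facts, GPR and occupancy aggregates, a data-quality grade, and a target-model load plan.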

Where can I find the source code?

You can find the source code on GitHub using the link provided at the top of the page.

SKILL.md Source

# Rent Roll to Database

You are a CRE data engineer who turns messy, extracted rent-roll content into trustworthy, source-cited, database-ready records. You model a rent roll as a CONTRACT- and CHARGE-level cash-flow source, not a single rent number, because the strategic value is tying contractual charges to T-12 account-level actuals. You never guess: an ambiguous charge is flagged for human review, not silently mapped. You never emit a resident's name or per-unit identity; tenant identity is pseudonymized.

This skill is backed by deterministic, stdlib-only calculators in `src/calculators/` (it is not a black box): `normalize_tokens.py`, `map_charge_codes.py`, `validate_payload.py`, and `grade_ingestion.py`. Every number it produces is reproducible from its inputs.

## When to Activate

Explicit triggers:
- "load / ingest / normalize this rent roll into the database (or warehouse)"
- "map the rent roll charges to our chart of accounts"
- "get this rent roll ready for the rent roll <-> T-12 tie-out"

Implicit triggers:
- Extracted rent-roll tokens (from `document-to-data-room-extractor`, `rent-roll-analyzer`, or `rent-roll-formatter`) exist and must become typed, validated records with provenance before downstream underwriting / reconciliation.

Do NOT activate for:
- Rent-roll ANALYSIS (rollover, WALT, mark-to-market) — use `rent-roll-analyzer`.
- Standardizing a rent roll to an underwriting template — use `rent-roll-formatter`.
- Operating-statement ingestion — use `t12-to-database`.

## Input Schema

A tokenized rent roll passed to `normalize_tokens.py` via `--json` (or stdin). Selectors live in the payload, never as argv flags.

| Field | Type | Required | Notes |
|---|---|---|---|
| `doc_type` | string | no | `rent_roll` (or `auto`) |
| `as_of` | string | yes | ISO date; flows into `created_at`/`updated_at`. No wall clock is used. |
| `run_id` | string | no | Salts tenant pseudonyms; stamps `extraction_run_id`. |
| `tenant_id` | string | no | Tenancy/workspace label (path-validated; NOT an auth token). |
| `source` | object | no | `{document_id, file_name, document_type, table_id}` for provenance. |
| `property` | object | no | `{property_id, property_type, rentable_sf, units, market}`. |
| `rows` | array | yes | One object per unit/suite (see below). |

Each `rows[]` object: `unit`, `rentable_sf`, `unit_status` (occupied / vacant_available / leased_not_occupied / down / model / admin / employee / owner_occupied), `lease_status` (active / mtm / holdover / in_default / future_commencement / terminated), `tenant_name` (pseudonymized on ingest), `lease_start`, `lease_expire`, `lease_type`, `recovery_method` (nnn / modified_gross / full_service / base_year_stop / expense_stop), `base_year`, `expense_stop_psf`, `free_rent_months`, `escalation` (`{escalation_type, escalation_amount, next_escalation_date, ...}`), and `charges[]` — one object per charge line: `{charge_code, description, monthly_amount, annual_amount, frequency, is_recoverable, is_estimate, effective_start, effective_end}`.
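
The schema says `tenant_name` is pseudonymized on ingest and that `run_id` salts the pseudonym. The calculators' actual scheme is not shown here, so the following is only a minimal sketch of one way to do it, assuming a keyed HMAC-SHA256 digest. The function name `pseudonymize_tenant`, the `tenant_` prefix, and the 12-character truncation are all hypothetical.

```python
import hashlib
import hmac


def pseudonymize_tenant(tenant_name: str, run_id: str) -> str:
    """Return a stable, non-reversible label for a tenant within one run.

    Assumption: run_id is used as the HMAC key (the "salt"), so the same
    tenant gets a different pseudonym in a different run.
    """
    # Collapse case and whitespace so trivial spelling variants agree.
    normalized = " ".join(tenant_name.lower().split())
    digest = hmac.new(
        run_id.encode("utf-8"),
        normalized.encode("utf-8"),
        hashlib.sha256,
    ).hexdigest()
    # Hypothetical label format: fixed prefix plus a truncated digest.
    return f"tenant_{digest[:12]}"


label = pseudonymize_tenant("Example Tenant LLC", "run-0001")
```

Because the label is derived only from the normalized name and the run identifier, re-running with the same inputs reproduces it, which is consistent with the skill's claim that outputs are reproducible from inputs.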

See `references/rent-roll-field-dictionary.md` for the full field dictionary and accepted ranges, and `../document-to-database/references/charge-code-account-framework.md` for the charge-code/account mapping framework.
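
As an illustration of the input schema above, here is a minimal payload built in Python and serialized to JSON. Every value is invented for the example: the charge codes `BASE` and `CAM`, the `office` property type, the IDs, and the amounts are placeholders, not values taken from the field dictionary.

```python
import json

# Illustrative payload only; all values below are made up for the example.
payload = {
    "doc_type": "rent_roll",
    "as_of": "2024-06-30",
    "run_id": "run-0001",
    "property": {
        "property_id": "prop-demo",
        "property_type": "office",
        "rentable_sf": 10000,
        "units": 2,
    },
    "rows": [
        {
            "unit": "100",
            "rentable_sf": 6000,
            "unit_status": "occupied",
            "lease_status": "active",
            "tenant_name": "Example Tenant LLC",
            "lease_start": "2022-01-01",
            "lease_expire": "2026-12-31",
            "recovery_method": "nnn",
            "charges": [
                {
                    "charge_code": "BASE",
                    "description": "Base rent",
                    "monthly_amount": 15000.0,
                    "annual_amount": 180000.0,
                    "frequency": "monthly",
                    "is_recoverable": False,
                    "is_estimate": False,
                },
                {
                    "charge_code": "CAM",
                    "description": "CAM recovery estimate",
                    "monthly_amount": 2000.0,
                    "annual_amount": 24000.0,
                    "frequency": "monthly",
                    "is_recoverable": True,
                    "is_estimate": True,
                },
            ],
        },
        {
            "unit": "200",
            "rentable_sf": 4000,
            "unit_status": "vacant_available",
            "charges": [],
        },
    ],
}

# The schema table marks only `as_of` and `rows` as required.
REQUIRED_TOP_LEVEL = ("as_of", "rows")
missing = [key for key in REQUIRED_TOP_LEVEL if key not in payload]

serialized = json.dumps(payload, indent=2)
```

The `serialized` string is what would be handed to the normalizer; how exactly `--json` consumes it (path or inline string) is not specified in this document.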

## Process

### Step 1: Normalize
Run `normalize_tokens.py` with `doc_type: rent_roll`. It decomposes each lease into typed charge-schedule lines, pseudonymizes tenant identity, computes GPR (in-place) and physical occupancy, and emits inline structural issues (lease expiry < start, vacant unit with an active lease, negative SF). Reuse the canonical charge categories and chart of accounts — do not invent a parallel taxonomy.
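
The structural checks and the occupancy aggregate named in this step can be sketched as follows. This is not the code in `normalize_tokens.py`; the function names and issue codes are hypothetical, and the occupancy definition (count of rows with `unit_status == "occupied"` over all rows) is an assumption, since the real calculator may weight by square footage or treat model/admin/down units differently.

```python
from datetime import date
from typing import List, Optional


def structural_issues(row: dict) -> List[str]:
    """Return issue codes for the three inline checks named in Step 1."""
    issues = []
    start, expire = row.get("lease_start"), row.get("lease_expire")
    if start and expire and date.fromisoformat(expire) < date.fromisoformat(start):
        issues.append("lease_expiry_before_start")
    if row.get("unit_status") == "vacant_available" and row.get("lease_status") == "active":
        issues.append("vacant_unit_with_active_lease")
    sf = row.get("rentable_sf")
    if sf is not None and sf < 0:
        issues.append("negative_sf")
    return issues


def physical_occupancy_pct(rows: List[dict]) -> Optional[float]:
    """Assumed definition: occupied rows divided by all rows, as a percent."""
    if not rows:
        return None  # No units means occupancy is undefined, not zero.
    occupied = sum(1 for r in rows if r.get("unit_status") == "occupied")
    return round(100.0 * occupied / len(rows), 2)


rows = [
    {"unit": "100", "rentable_sf": 6000, "unit_status": "occupied",
     "lease_status": "active", "lease_start": "2022-01-01", "lease_expire": "2026-12-31"},
    {"unit": "200", "rentable_sf": -50, "unit_status": "vacant_available",
     "lease_status": "active", "lease_start": "2024-01-01", "lease_expire": "2023-01-01"},
]
issues_by_unit = {r["unit"]: structural_issues(r) for r in rows}
```

In this example unit 100 is clean and unit 200 trips all three checks, so a reviewer sees every contradiction on a row at once instead of only the first.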

### Step 2: Map charges to accounts
For any charge whose code is unknown, `map_charge_codes.py` infers a category from the description at MEDIUM confidence and flags it for review. A charge with neither a known code nor a matchable description is `unmapped` and routed to human review — never guessed.
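
The three outcomes described above (known code, description-inferred at MEDIUM confidence, unmapped) can be sketched like this. The code table, keyword list, and category names are hypothetical stand-ins for the canonical chart of accounts, and the `HIGH` confidence for a known code is an assumption; the source only states that inferred categories are MEDIUM.

```python
# Hypothetical lookup tables; the real ones live in the charge-code framework.
KNOWN_CODES = {"BASE": "base_rent", "CAM": "cam_recovery"}
DESCRIPTION_KEYWORDS = [
    ("parking", "parking"),
    ("storage", "storage"),
    ("insurance", "insurance_recovery"),
    ("tax", "tax_recovery"),
]


def map_charge(charge: dict) -> dict:
    """Map one charge line to a category without ever guessing."""
    code = (charge.get("charge_code") or "").strip().upper()
    if code in KNOWN_CODES:
        # Assumption: a known code is treated as high confidence.
        return {"category": KNOWN_CODES[code], "confidence": "HIGH", "needs_review": False}

    description = (charge.get("description") or "").lower()
    for keyword, category in DESCRIPTION_KEYWORDS:
        if keyword in description:
            # Inferred from the description: MEDIUM confidence, flagged for review.
            return {"category": category, "confidence": "MEDIUM", "needs_review": True}

    # Neither a known code nor a matchable description: route to a human.
    return {"category": "unmapped", "confidence": None, "needs_review": True}


known = map_charge({"charge_code": "base", "description": "Base rent"})
inferred = map_charge({"charge_code": "X17", "description": "Reserved parking stall"})
unmapped = map_charge({"charge_code": "", "description": "Misc adj 7"})
```

The design point is that the fallback branch returns an explicit `unmapped` result rather than a best guess, so the review queue is the only path by which an ambiguous charge acquires an account.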

### Step 3: Validate
Run `validate_payload.py`. It checks `annual == monthly*12` within $1 (SKIPPED for free-rent / abatement / in-period-step leases, where the point-in-time identity legitimately fails), PSF reconciliation (branching on property type — PSF for commercial, per-unit for multifamily), occupancy in [0, 100], non-negative SF, and lease-date consistency. IMPOSSIBLE data (negative SF, occupancy > 100%, expiry < start) is CRITICAL; IMPLAUSIBLE data (a trophy-asset PSF outlier) is a WARNING that lowers confidence, never a hard rejection.
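
A minimal sketch of two of the rules above: the annualization identity with its skip condition, and the split between CRITICAL (impossible) and WARNING (implausible) findings. The names `in_period_step`, `annual_rent_psf`, and `psf_ceiling` are hypothetical, as are the finding codes; `validate_payload.py` may use different fields and thresholds.

```python
from datetime import date
from typing import List, Optional, Tuple


def check_annualization(charge: dict, lease: dict, tolerance: float = 1.0) -> str:
    """Check annual == monthly * 12 within `tolerance` dollars, or skip."""
    # Free-rent and in-period-step leases legitimately break the identity.
    if lease.get("free_rent_months") or lease.get("in_period_step"):
        return "skipped"
    diff = abs(charge["annual_amount"] - charge["monthly_amount"] * 12)
    return "passed" if diff <= tolerance else "failed"


def classify_findings(
    row: dict, occupancy_pct: float, psf_ceiling: Optional[float] = None
) -> List[Tuple[str, str]]:
    """Return (severity, code) pairs: impossible is CRITICAL, implausible is WARNING."""
    findings = []
    sf = row.get("rentable_sf")
    if sf is not None and sf < 0:
        findings.append(("CRITICAL", "negative_sf"))
    if not 0 <= occupancy_pct <= 100:
        findings.append(("CRITICAL", "occupancy_out_of_range"))
    start, expire = row.get("lease_start"), row.get("lease_expire")
    if start and expire and date.fromisoformat(expire) < date.fromisoformat(start):
        findings.append(("CRITICAL", "expiry_before_start"))
    psf = row.get("annual_rent_psf")  # hypothetical derived field
    if psf_ceiling is not None and psf is not None and psf > psf_ceiling:
        # An outlier lowers confidence; it is never a hard rejection.
        findings.append(("WARNING", "psf_outlier"))
    return findings


skipped = check_annualization({"monthly_amount": 0.0, "annual_amount": 135000.0},
                              {"free_rent_months": 3})
passed = check_annualization({"monthly_amount": 15000.0, "annual_amount": 180000.5}, {})
failed = check_annualization({"monthly_amount": 15000.0, "annual_amount": 170000.0}, {})
```

Returning `skipped` as a distinct status, instead of silently passing, keeps the audit trail honest: a reader can tell the identity was never evaluated for that lease.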

### Step 4: Grade and route
Run `grade_ingestion.py`. The weakest-link A/B/C grade is primary (a single C caps the grade); a 0-100 score is secondary. Merge requires a score of at least 85, no C grade, and no critical failure; a PII-redaction breach is a non-overridable block. Unmapped charges and low-confidence inferences are surfaced in the human-review queue.
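
The merge gate described in this step can be sketched as a small decision function. This is an illustration, not the logic of `grade_ingestion.py`: the dimension names, blocker codes, and return shape are hypothetical, and how the 0-100 score itself is computed is not covered here.

```python
from typing import Dict

VALID_GRADES = ("A", "B", "C")


def weakest_link_grade(dimension_grades: Dict[str, str]) -> str:
    """Overall grade is the worst per-dimension grade (a single C caps it)."""
    if not dimension_grades:
        raise ValueError("at least one graded dimension is required")
    for grade in dimension_grades.values():
        if grade not in VALID_GRADES:
            raise ValueError(f"unknown grade: {grade!r}")
    # "A" < "B" < "C" as strings, so max() picks the weakest link.
    return max(dimension_grades.values())


def merge_decision(
    dimension_grades: Dict[str, str],
    score: float,
    critical_failures: int,
    pii_breach: bool,
) -> dict:
    """Apply the gate: score >= 85, no C, no critical failure, no PII breach."""
    grade = weakest_link_grade(dimension_grades)
    blockers = []
    if pii_breach:
        blockers.append("pii_redaction_breach")
    if critical_failures > 0:
        blockers.append("critical_failure")
    if grade == "C":
        blockers.append("grade_c")
    if score < 85:
        blockers.append("score_below_85")
    return {
        "grade": grade,
        "score": score,
        "merge": not blockers,
        "blockers": blockers,
        # Only the PII breach is stated to be non-overridable.
        "non_overridable": pii_breach,
    }


ok = merge_decision({"completeness": "A", "mapping": "B"}, 91, 0, False)
capped = merge_decision({"completeness": "A", "mapping": "C"}, 95, 0, False)
```

Note that `capped` is blocked despite a score of 95: the letter grade is primary, so a high numeric score cannot outvote a single C.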

### Step 5: Hand off
The canonical payload feeds `rent-roll-t12-tieout` (the reconciliation step) and, for the chosen target-model profile, `map_to_target_model.py`, `emit_sql_ddl.py`, and `emit_load_plan.py`.

## Output Format

A canonical rent-roll payload: `{doc_type, records (charge-schedule lines), leases, units, aggregates (gpr_in_place_annual, physical_occupancy_pct, ...), issues}`. Each record carries the provenance bundle (a superset of the 8-column warehouse contract) with `source_ref` in `data-room/<doc>#<anchor>` form, `pii_class`, and `redaction_status`. The payload is accompanied by an account-mapping report, a validation report, a data-quality grade (A/B/C plus a 0-100 score), a human-review queue, and a target-model load plan.
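
To make the `data-room/<doc>#<anchor>` form concrete, here is a small sketch that builds and checks a `source_ref` and stamps it onto a record. The helper names, the regular expression (which assumes neither part contains whitespace or `#`), and the example values for `pii_class` and `redaction_status` are all hypothetical; the full provenance bundle has more columns than are shown.

```python
import re

# Assumed shape: "data-room/" + document + "#" + anchor, no spaces or extra "#".
SOURCE_REF_PATTERN = re.compile(r"^data-room/[^#\s]+#[^#\s]+$")


def make_source_ref(document: str, anchor: str) -> str:
    """Build a source_ref and refuse to return a malformed one."""
    ref = f"data-room/{document}#{anchor}"
    if not SOURCE_REF_PATTERN.match(ref):
        raise ValueError(f"malformed source_ref: {ref!r}")
    return ref


def attach_provenance(
    record: dict, document: str, anchor: str, pii_class: str, redaction_status: str
) -> dict:
    """Return a copy of the record with three of the provenance fields set."""
    stamped = dict(record)
    stamped["source_ref"] = make_source_ref(document, anchor)
    stamped["pii_class"] = pii_class
    stamped["redaction_status"] = redaction_status
    return stamped


record = attach_provenance(
    {"unit": "100", "charge_code": "BASE", "annual_amount": 180000.0},
    document="rent_roll.pdf",
    anchor="table1-row3",
    pii_class="none",               # placeholder value
    redaction_status="not_required",  # placeholder value
)
```

Raising on a malformed reference mirrors the fail-closed stance described for the orchestrator: a record that cannot be cited is not emitted.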

## Red Flags

- A charge collapsed to a single rent number — recoveries and percentage rent cannot then tie to the T-12. Model the multi-line charge schedule.
- `annual == monthly*12` hard-failing a free-rent or stepped lease — that identity does not hold mid-abatement; it must be skipped, not failed.
- A resident name, per-unit actual rent tied to a named person, or a guarantor name appearing in any output — a hard-stop PII breach. Halt; do not deliver a partially redacted payload.
- A `vacant_available` unit carrying an active lease or in-place charges — a data-integrity contradiction.
- An unmapped charge silently dropped or guessed — flag it; never fabricate a mapping.
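
The PII hard stop in the list above can be illustrated with a final guard that scans the outgoing payload for any raw name seen on input and halts if one is found. This is a sketch under the assumption that a simple case-insensitive substring scan is acceptable; `PiiBreachError` and `assert_no_raw_names` are hypothetical names, and a production check would likely be stricter.

```python
import json
from typing import Iterable


class PiiBreachError(Exception):
    """Raised when a raw name is found in an outgoing payload."""


def assert_no_raw_names(output_payload: dict, raw_names: Iterable[str]) -> None:
    """Halt if any raw tenant or guarantor name appears anywhere in the output."""
    # ensure_ascii=False keeps non-ASCII names searchable as written.
    text = json.dumps(output_payload, ensure_ascii=False).lower()
    leaked = [
        name for name in raw_names
        if name and name.strip() and name.strip().lower() in text
    ]
    if leaked:
        # Report only a count so the error message does not repeat the PII.
        raise PiiBreachError(f"{len(leaked)} raw name(s) found in output; halting")


clean = {"records": [{"tenant": "tenant_ab12cd34ef56", "unit": "100"}]}
assert_no_raw_names(clean, ["Jane Doe"])  # passes silently
```

The guard raises instead of stripping the offending value, matching the rule that a partially redacted payload must not be delivered.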

## Chain Notes

Upstream (produce the tokens this skill ingests): `document-to-data-room-extractor`, `rent-roll-analyzer`, `rent-roll-formatter`.
Downstream (consume this skill's records): `rent-roll-t12-tieout` (reconciliation), `document-to-database` (orchestration + target-model emission), `acquisition-underwriting-engine` (the contractual cash-flow spine).

Related Skills

t12-to-database

from mariourquia/cre-skills-plugin

Transforms a tokenized/extracted trailing-twelve-month operating statement into validated account-level monthly records. A constrained preset over operating-statement-to-database: it asserts line_type=actual over a (possibly partial) twelve-month trailing window, excludes Total/YTD aggregate columns, flags partial-year/lease-up as a warning (never synthesizing missing months), normalizes the expense sign convention so NOI is not inflated, and maps accounts to the canonical chart of accounts for the NOI bridge and the rent-roll tie-out. Triggers on 'load this T-12', 'normalize the trailing twelve into the database', or 'T-12 to warehouse'.

Market Rent Refresh

from mariourquia/cre-skills-plugin

Checks freshness of market rent and concession references against overlay staleness threshold; organizes a refresh plan (comp sources, shop list, 3rd-party export, submarket coverage); produces the intake bundle for `workflows/rent_comp_intake` to process. Outputs a current-state memo and a refreshed benchmark view on completion.

rent-roll-t12-tieout

from mariourquia/cre-skills-plugin

Reconciles a normalized rent roll against a normalized T-12 on a stated, consistent basis (annualized contractual vs recognized accrual; collected cash out of scope) and never forces a tie. Reconciles base rent, recoveries and other income (jointly, per the canonical chart), occupancy, and the EGI / NOI-revenue bridge; classifies each gap as mapping, timing, or missing on a deterministic signature; surfaces residual_unexplained; and routes every untied dimension to human review. Triggers on 'tie out the rent roll to the T-12', 'reconcile contractual rent to actuals', 'NOI bridge from the rent roll', or 'revenue leakage check'.

rent-roll-formatter

from mariourquia/cre-skills-plugin

Standardizes rent roll data from any source format into a consistent underwriting template, validates data integrity (SF reconciliation, revenue reconciliation, date consistency, rent reasonableness), and calculates derived analytics (WALT, rollover, concentration, mark-to-market).

rent-roll-analyzer

from mariourquia/cre-skills-plugin

Ingests raw rent rolls (pasted table, CSV, or PDF extract) and produces a clean dataset with layered analytics: rollover schedule, mark-to-market waterfall, tenant concentration risk, WALT, rent benchmarking, MTM exposure, and data quality flags. Triggers on 'analyze this rent roll', 'clean up this rent roll', or when rent roll data needs preprocessing before underwriting.

rent-optimization-planner

from mariourquia/cre-skills-plugin

Quantitative rent optimization framework with loss-to-lease waterfall analysis, renewal probability modeling, effective rent NPV comparison across aggressive/moderate/retention strategies, valuation impact quantification, and market cycle overlay. Maximizes long-term property value, not just next-quarter revenue. Triggers on 'rent raise plan', 'rent optimization', 'loss-to-lease', 'renewal pricing', or when planning rent increases across a portfolio.

operating-statement-to-database

from mariourquia/cre-skills-plugin

Transforms a tokenized/extracted operating statement (any line type: actual, budget, reforecast, prior-year; any period grain; monthly, quarterly, annual-summary, or multi-scenario) into validated, account-level, period-level database records mapped to the canonical chart of accounts. Format-aware period handling excludes Total/YTD aggregate columns, normalizes the expense sign convention, detects duplicate/subtotal lines, and keeps capex and debt service below the NOI line. Triggers on 'load this operating statement', 'normalize the P&L to our accounts', or when extracted operating-statement tokens must become governed account-level records.

document-to-database

from mariourquia/cre-skills-plugin

Executable orchestrator that turns tokenized/extracted CRE document content (rent rolls, T-12s, operating statements, Prose Frontier narrative artifacts) into validated, typed, auditable, target-model-ready database payloads. Canonical flow: classify, identify fields, coerce types, normalize, map charge codes to the chart of accounts, validate, score confidence, emit an issue report, map to a target database model, emit optional SQL DDL and a load plan, self-grade, and route ambiguous items to a human-review queue. Backed by deterministic stdlib calculators; fail-closed when a citation cannot be made; tenant identity pseudonymized. Triggers on 'turn these documents into a database', 'ingest this data room to our schema', 'document to warehouse', or when extracted tokens must become governed structured data.

workout-playbook

from mariourquia/cre-skills-plugin

Produces a lender-side workout and restructuring playbook for distressed CRE loans. Maps all resolution paths (forbearance, A/B note split, DPO, deed-in-lieu, foreclosure, note sale), models NPV of each, assesses borrower leverage, and recommends optimal strategy with timeline.

Work Order Triage

from mariourquia/cre-skills-plugin

Classifies work order urgency from free-text descriptions, assigns priority (P1-P4) with SLA deadlines, estimates cost, checks lease responsibility, and routes to the correct approval path.

warehouse-to-exhibit-mapper

from mariourquia/cre-skills-plugin

Maps validated, warehouse-ready tabular datasets into deck-ready EXHIBIT specifications and slide inputs. Selects table vs. chart per exhibit, names axes and series, maps source dataset columns to exhibit fields, binds each exhibit to a target slide, and carries provenance THROUGH so every exhibit cell keeps its source_ref and classification. Triggers on 'map this to exhibits', 'turn the dataset into slides', 'build the exhibit specs', or when a validated dataset must become charts and tables for a committee deck. It specifies exhibits; it does not render pixels or compose the full deck.

vendor-invoice-validator

from mariourquia/cre-skills-plugin

Validates vendor invoices against contract terms, scope of work, and market rates. Checks arithmetic, rate compliance, scope authorization, duplicate detection, GL coding, and NTE/cap limits. Assigns APPROVED, APPROVED WITH FLAGS, or HOLD FOR REVIEW verdict.