2026-legal-research-agent

Expert legal research agent for finding and scraping expungement data state by state. Knows authoritative sources, URL patterns, Firecrawl configuration, and 2026 legal landscape. Activate on "find expungement data", "scrape state laws", "legal research", "court URLs", "statute sources", "Clean Slate laws", "automatic expungement research". NOT for interpreting laws (use national-expungement-expert), building UI, or legal advice.

85 stars

Best use case

2026-legal-research-agent is best used when you need a repeatable AI agent workflow instead of a one-off prompt.

Expert legal research agent for finding and scraping expungement data state by state. Knows authoritative sources, URL patterns, Firecrawl configuration, and 2026 legal landscape. Activate on "find expungement data", "scrape state laws", "legal research", "court URLs", "statute sources", "Clean Slate laws", "automatic expungement research". NOT for interpreting laws (use national-expungement-expert), building UI, or legal advice.

Teams using 2026-legal-research-agent should expect a more consistent output, faster repeated execution, less prompt rewriting.

When to use this skill

  • You want a reusable workflow that can be run more than once with consistent structure.

When not to use this skill

  • You only need a quick one-off answer and do not need a reusable workflow.
  • You cannot install or maintain the underlying files, dependencies, or repository context.

Installation

Claude Code / Cursor / Codex

$curl -o ~/.claude/skills/2026-legal-research-agent/SKILL.md --create-dirs "https://raw.githubusercontent.com/curiositech/some_claude_skills/main/.claude/skills/2026-legal-research-agent/SKILL.md"

Manual Installation

  1. Download SKILL.md from GitHub
  2. Place it in .claude/skills/2026-legal-research-agent/SKILL.md inside your project
  3. Restart your AI agent — it will auto-discover the skill

How 2026-legal-research-agent Compares

Feature / Agent2026-legal-research-agentStandard Approach
Platform SupportNot specifiedLimited / Varies
Context Awareness High Baseline
Installation ComplexityUnknownN/A

Frequently Asked Questions

What does this skill do?

Expert legal research agent for finding and scraping expungement data state by state. Knows authoritative sources, URL patterns, Firecrawl configuration, and 2026 legal landscape. Activate on "find expungement data", "scrape state laws", "legal research", "court URLs", "statute sources", "Clean Slate laws", "automatic expungement research". NOT for interpreting laws (use national-expungement-expert), building UI, or legal advice.

Where can I find the source code?

You can find the source code on GitHub using the link provided at the top of the page.

Related Guides

SKILL.md Source

# 2026 Legal Research Agent

---

## When to Use This Skill

Use this skill when you need to:

- **Find authoritative legal sources** for a specific state's expungement laws
- **Configure Firecrawl jobs** to scrape court systems, legislatures, or legal aid sites
- **Validate scraped data** for accuracy and completeness
- **Research 2026 law changes** including Clean Slate acts and marijuana expungement
- **Build URL patterns** for systematic state-by-state data collection
- **Identify gaps** in existing scraped data coverage

**Do NOT use this skill for:**
- Interpreting what laws mean (use `national-expungement-expert`)
- Building user interfaces or components
- Providing legal advice to users
- General web scraping unrelated to legal data

---

## Core Instructions

### 1. Authoritative Source Hierarchy

When researching expungement laws, prioritize sources in this order:

```
Tier 1 (Primary Authority):
├── State Legislature websites (statute text)
├── State Court Administrative Office
└── State Attorney General publications

Tier 2 (Official Secondary):
├── State Bar Association guides
├── Court self-help centers
└── Public law databases (public.law, justia.com)

Tier 3 (Tertiary but Valuable):
├── Legal aid organizations (LSC grantees)
├── Law school clinics
└── Reentry organizations (CCRC, NACDL)

Tier 4 (Verification Only):
├── Commercial legal databases
├── News articles about law changes
└── Attorney blog posts
```

**Shibboleth**: A novice scrapes the first Google result. An expert knows that `courts.{state}.gov` contains the self-help forms while `legislature.{state}.gov` contains the statute text—and both are needed.

### 2. URL Pattern Knowledge by State Type

States organize their legal resources differently. Know the patterns:

**Unified Court Systems** (courts own everything):
```
California: courts.ca.gov/selfhelp-expungement.htm
Oregon: courts.oregon.gov/programs/exp/Pages/default.aspx
Washington: courts.wa.gov/forms/?fa=forms.contribute&formID=101
```

**Split Systems** (legislature + court separate):
```
Texas: txcourts.gov (forms) + texas.public.law (statutes)
New York: nycourts.gov (forms) + nysenate.gov/legislation/laws (statutes)
Florida: flcourts.gov (forms) + leg.state.fl.us/statutes (statutes)
```

**Public.law States** (excellent statute hosting):
```
oregon.public.law, california.public.law, texas.public.law
michigan.public.law, washington.public.law
```

**Shibboleth**: Knowing that `apps.leg.wa.gov/RCW/` is Washington's statute database while `leg.wa.gov` is the general legislature site—the RCW subdomain is where the actual law text lives.

### 3. 2026 Legal Landscape Awareness

As of 2026, these major changes affect research:

**Clean Slate States** (automatic expungement passed):
- Pennsylvania (2018), Utah (2019), New Jersey (2019), Michigan (2020)
- California (2020), Connecticut (2021), Delaware (2021), Virginia (2021)
- Oklahoma (2022), Colorado (2022), New York (2023), Minnesota (2023)
- Maryland (2024), Illinois (2024), Oregon (2025)

**Marijuana Expungement** (specific statutes):
- Most states now have separate marijuana expungement provisions
- Search for "cannabis conviction" alongside "expungement"
- Check for retroactive application dates

**2025-2026 Law Changes to Verify**:
- Oregon HB 2316 (expanded eligibility)
- California AB 1076 (automatic relief expansion)
- Check CCRC's Restoration of Rights Project for current status

**Shibboleth**: Knowing that "automatic expungement" doesn't mean immediate—Pennsylvania's Clean Slate has a 10-year waiting period for arrests and varies by offense. Research must capture these nuances.

### 4. Firecrawl Configuration Expertise

When configuring scrape jobs:

**Extraction Schema Design**:
```typescript
// For statute pages, extract:
{
  statuteCitation: "string",   // e.g., "ORS 137.225"
  title: "string",             // e.g., "Setting aside conviction"
  fullText: "string",          // Complete statute text
  effectiveDate: "string",     // When current version took effect
  lastAmended: "string",       // Most recent amendment date
  subsections: "array",        // Parsed subsections
}

// For court self-help pages, extract:
{
  stateName: "string",
  expungementPageUrl: "string",
  formsLibraryUrl: "string",
  selfHelpUrl: "string",
  contactPhone: "string",
  feeScheduleUrl: "string",
}

// For forms, extract:
{
  formNumber: "string",        // e.g., "MC-440"
  formTitle: "string",
  pdfUrl: "string",
  applicableTo: "array",       // ["misdemeanor", "arrest"]
  lastUpdated: "string",
}
```

**Rate Limiting for Government Sites**:
```typescript
rateLimit: 2,  // 2 requests/second max for .gov sites
timeout: 90000,  // Government sites can be slow
maxRetries: 3,  // Retry on timeout
waitFor: 3000,  // Wait for JavaScript on modern court sites
```

**Shibboleth**: Knowing to set `onlyMainContent: true` for statute pages (to skip navigation chrome) but `onlyMainContent: false` for forms pages (where the form links are often in sidebars).

### 5. Data Validation Checklist

After scraping, validate:

```
□ Statute citations match official format (e.g., "ORS" not "Or. Rev. Stat.")
□ Effective dates are parseable and reasonable (not future, not too old)
□ URLs are live and return 200 status
□ PDF form links actually download PDFs (not HTML error pages)
□ Phone numbers are in consistent format
□ Fee amounts are numeric and reasonable ($0-$500 typical range)
□ State code extracted correctly (watch for ambiguous URLs)
```

**Common Extraction Errors**:
- "oregon.public.law" matching "la" (Louisiana) instead of "or" (Oregon)
- Statute text truncated at 10,000 characters (increase limit)
- Form "last updated" dates in inconsistent formats
- County-specific URLs mistaken for state-level

### 6. Gap Analysis Process

To identify missing data for a state:

```bash
# Check what we have
ls src/data/scraped/states/{state}/

# Expected files for complete coverage:
# - statutes.json (eligibility rules from statute text)
# - court-system.json (court URLs, contacts, forms links)
# - forms/ (actual PDF forms)
# - fees.json (filing fee amounts)
# - counties/ (county-specific court data)

# Cross-reference with state data file
grep -l "waitingPeriods\|eligibilityRules" src/data/states/{state}.ts
```

**Priority order for filling gaps**:
1. Statutes (foundation for all rules)
2. Court forms (what users actually need to file)
3. Fee information (users need to budget)
4. County contacts (where to file)
5. Spanish resources (accessibility)

---

## Anti-Patterns

### Never Do These:

1. **Scrape without rate limiting** - Government sites will block you
2. **Trust secondary sources for statute text** - Always verify against primary
3. **Assume URL patterns are consistent** - Each state is different
4. **Ignore effective dates** - Laws change; scraped data needs timestamps
5. **Scrape county sites without state context** - County rules supplement, not replace, state law
6. **Skip the self-help sections** - Often have the clearest eligibility summaries
7. **Treat all states the same** - Clean Slate states have fundamentally different processes

### Common Mistakes:

```
❌ Scraping Wikipedia for statute text
✅ Scraping the state legislature's official code

❌ Using findlaw.com as primary source
✅ Using findlaw.com to find the citation, then scraping the official source

❌ Assuming "expungement" is the only term
✅ Searching for: expungement, sealing, set-aside, dismissal, destruction, pardons

❌ Treating waiting periods as simple numbers
✅ Capturing offense-specific waiting periods (felonies vs misdemeanors vs arrests)
```

---

## Project-Specific Context

This skill is designed for the **National Expungement Guide** project:

### Existing Infrastructure

- **Firecrawl scripts**: `scripts/firecrawl/`
- **Job definitions**: `scripts/firecrawl/jobs.ts` (P0-P4 priority jobs)
- **URL config**: `scripts/firecrawl/config.ts` (all 50 states)
- **Output path**: `src/data/scraped/states/{state}/`
- **State data**: `src/data/states/` (TypeScript files per state)

### Running Scrapes

```bash
# Set API key first
export FIRECRAWL_API_KEY=your_key

# Run P0 (state statutes + courts) - ~$0.20 cost
npx tsx scripts/firecrawl/run-p0.ts

# Dry run to preview
npx tsx scripts/firecrawl/run-p0.ts --dry-run

# Check reports
cat scripts/firecrawl/reports/p0-*.json
```

### Data Flow

```
Firecrawl scrape → src/data/scraped/{state}/*.json
       ↓
Manual review + cleanup
       ↓
Integrated into src/data/states/{state}.ts
       ↓
Used by eligibility wizard + PDF generator
```

---

## References

See `references/` folder for:
- `url-patterns-by-state.md` - Complete URL patterns for all 50 states
- `clean-slate-timeline.md` - When each Clean Slate law passed and took effect
- `firecrawl-schemas.md` - All extraction schemas used

---

## Example Workflow

**User request**: "Research California's 2026 expungement laws and scrape the latest data"

**Agent workflow**:
1. Check existing data: `ls src/data/scraped/states/ca/`
2. Verify current statute version at `california.public.law`
3. Check for 2025-2026 law changes via CCRC or news search
4. Update `scripts/firecrawl/config.ts` if URLs changed
5. Run targeted scrape: add CA-specific URLs to P0 job
6. Validate extracted data against known statute citations
7. Document any gaps or changes found

Related Skills

3d-cv-labeling-2026

85
from curiositech/some_claude_skills

Expert in 3D computer vision labeling tools, workflows, and AI-assisted annotation for LiDAR, point clouds, and sensor fusion. Covers SAM4D/Point-SAM, human-in-the-loop architectures, and vertical-specific training strategies. Activate on '3D labeling', 'point cloud annotation', 'LiDAR labeling', 'SAM 3D', 'SAM4D', 'sensor fusion annotation', '3D bounding box', 'semantic segmentation point cloud'. NOT for 2D image labeling (use clip-aware-embeddings), general ML training (use ml-engineer), video annotation without 3D (use computer-vision-pipeline), or VLM prompt engineering (use prompt-engineer).

research-craft

85
from curiositech/some_claude_skills

Systematic research methodology for formulating problems, building evidence-based arguments, and communicating findings in academic contexts. For scholarly writing, thesis development, and knowledge contribution. NOT for creative writing, journalistic reporting, or business analytics.

research-analyst

85
from curiositech/some_claude_skills

Conducts thorough landscape research, competitive analysis, best practices evaluation, and evidence-based recommendations. Expert in market research and trend analysis.

recovery-app-legal-terms

85
from curiositech/some_claude_skills

Generate legally-sound terms of service, privacy policies, and medical disclaimers for recovery and wellness applications. Expert in HIPAA, GDPR, CCPA compliance. Activate on 'terms of service', 'privacy policy', 'legal terms', 'medical disclaimer', 'HIPAA', 'user agreement'. NOT for contract negotiation (use attorney), app development (use domain skills), or moderation (use recovery-community-moderator).

modern-auth-2026

85
from curiositech/some_claude_skills

Modern authentication implementation for 2026 - passkeys (WebAuthn), OAuth (Google, Apple), magic links, and cross-device sync. Use for passwordless-first authentication, social login setup, Supabase Auth, Next.js auth flows, and multi-factor authentication. Activate on "passkeys", "WebAuthn", "Google Sign-In", "Apple Sign-In", "magic link", "passwordless", "authentication", "login", "OAuth", "social login". NOT for session management without auth (use standard JWT docs), authorization/RBAC (use security-auditor), or API key management (use api-architect).

skill-coach

85
from curiositech/some_claude_skills

Guides creation of high-quality Agent Skills with domain expertise, anti-pattern detection, and progressive disclosure best practices. Use when creating skills, reviewing existing skills, or when users mention improving skill quality, encoding expertise, or avoiding common AI tooling mistakes. Activate on keywords: create skill, review skill, skill quality, skill best practices, skill anti-patterns. NOT for general coding advice or non-skill Claude Code features.

wisdom-accountability-coach

85
from curiositech/some_claude_skills

Longitudinal memory tracking, philosophy teaching, and personal accountability with compassion. Expert in pattern recognition, Stoicism/Buddhism, and growth guidance. Activate on 'accountability', 'philosophy', 'Stoicism', 'Buddhism', 'personal growth', 'commitment tracking', 'wisdom teaching'. NOT for therapy or mental health treatment (refer to professionals), crisis intervention, or replacing professional coaching credentials.

windows-95-web-designer

85
from curiositech/some_claude_skills

Modern web applications with authentic Windows 95 aesthetic. Gradient title bars, Start menu paradigm, taskbar patterns, 3D beveled chrome. Extrapolates Win95 to AI chatbots, mobile UIs, responsive layouts. Activate on 'windows 95', 'win95', 'start menu', 'taskbar', 'retro desktop', '95 aesthetic', 'clippy'. NOT for Windows 3.1 (use windows-3-1-web-designer), vaporwave/synthwave, macOS, flat design.

windows-3-1-web-designer

85
from curiositech/some_claude_skills

Modern web applications with authentic Windows 3.1 aesthetic. Solid navy title bars, Program Manager navigation, beveled borders, single window controls. Extrapolates Win31 to AI chatbots (Cue Card paradigm), mobile UIs (pocket computing). Activate on 'windows 3.1', 'win31', 'program manager', 'retro desktop', '90s aesthetic', 'beveled'. NOT for Windows 95 (use windows-95-web-designer - has gradients, Start menu), vaporwave/synthwave, macOS, flat design.

win31-pixel-art-designer

85
from curiositech/some_claude_skills

Expert in Windows 3.1 era pixel art and graphics. Creates icons, banners, splash screens, and UI assets with authentic 16/256-color palettes, dithering patterns, and Program Manager styling. Activate on 'win31 icons', 'pixel art 90s', 'retro icons', '16-color', 'dithering', 'program manager icons', 'VGA palette'. NOT for modern flat icons, vaporwave art, or high-res illustrations.

win31-audio-design

85
from curiositech/some_claude_skills

Expert in Windows 3.1 era sound vocabulary for modern web/mobile apps. Creates satisfying retro UI sounds using CC-licensed 8-bit audio, Web Audio API, and haptic coordination. Activate on 'win31 sounds', 'retro audio', '90s sound effects', 'chimes', 'tada', 'ding', 'satisfying UI sounds'. NOT for modern flat UI sounds, voice synthesis, or music composition.

wedding-immortalist

85
from curiositech/some_claude_skills

Transform thousands of wedding photos and hours of footage into an immersive 3D Gaussian Splatting experience with theatre mode replay, face-clustered guest roster, and AI-curated best photos per person. Expert in 3DGS pipelines, face clustering, aesthetic scoring, and adaptive design matching the couple's wedding theme (disco, rustic, modern, LGBTQ+ celebrations). Activate on "wedding photos", "wedding video", "3D wedding", "Gaussian Splatting wedding", "wedding memory", "wedding immortalize", "face clustering wedding", "best wedding photos". NOT for general photo editing (use native-app-designer), non-wedding 3DGS (use drone-inspection-specialist), or event planning (not a wedding planner).